mirror of https://github.com/citusdata/citus.git
Mitigate segfault in connection statemachine (#4551)
As described in the code comment, we have observed crashes in production
due to a segfault caused by dereferencing a NULL pointer in our
connection state machine.
As a mitigation that prevents system crashes, we now raise an error with
a short explanation of the issue. Unfortunately, the case cannot be
reproduced reliably yet, so no tests are added.
DESCRIPTION: Prevent segfaults when SAVEPOINT handling cannot recover from connection failures
(cherry picked from commit d127516dc8)
pull/4578/head
parent 49ce36fe8b
commit 2efeed412a
@@ -3297,6 +3297,25 @@ TransactionStateMachine(WorkerSession *session)
 			case REMOTE_TRANS_SENT_COMMAND:
 			{
 				TaskPlacementExecution *placementExecution = session->currentTask;
+				if (placementExecution == NULL)
+				{
+					/*
+					 * We have seen accounts in production where the placementExecution
+					 * could inadvertently be not set. Investigation documented on
+					 * https://github.com/citusdata/citus-enterprise/issues/493
+					 * (due to sensitive data in the initial report it is not discussed
+					 * in our community repository)
+					 *
+					 * Currently we don't have a reliable way of reproducing this issue.
+					 * Erroring here seems to be a more desirable approach compared to a
+					 * SEGFAULT on the dereference of placementExecution, with a possible
+					 * crash recovery as a result.
+					 */
+					ereport(ERROR, (errmsg(
+										"unable to recover from inconsistent state in "
+										"the connection state machine on coordinator")));
+				}
+
 				ShardCommandExecution *shardCommandExecution =
 					placementExecution->shardCommandExecution;
 				Task *task = shardCommandExecution->task;
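The guard added in this commit follows a common defensive pattern: check a pointer that should never be NULL before its first dereference, and fail with an error instead of crashing the process. The sketch below shows that pattern in isolation, outside PostgreSQL. The struct definitions are minimal hypothetical stand-ins that reuse the names from the diff, and `HandleSentCommand` and `StepResult` are invented for illustration only. Where the real code calls `ereport(ERROR, ...)`, which aborts the current transaction, the sketch returns an error code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the Citus structs; only the fields used here. */
typedef struct Task { int taskId; } Task;
typedef struct ShardCommandExecution { Task *task; } ShardCommandExecution;
typedef struct TaskPlacementExecution
{
	ShardCommandExecution *shardCommandExecution;
} TaskPlacementExecution;
typedef struct WorkerSession { TaskPlacementExecution *currentTask; } WorkerSession;

/* Hypothetical result code; the real code raises an error via ereport(). */
typedef enum { STEP_OK = 0, STEP_INCONSISTENT_STATE = 1 } StepResult;

/*
 * Mirrors the shape of the guarded branch: validate currentTask before
 * the first dereference, and report an error rather than segfault.
 * On success, *taskOut points at the task of the current placement.
 */
static StepResult
HandleSentCommand(const WorkerSession *session, const Task **taskOut)
{
	assert(session != NULL);

	TaskPlacementExecution *placementExecution = session->currentTask;
	if (placementExecution == NULL)
	{
		/* Stand-in for ereport(ERROR, (errmsg(...))) in the real code. */
		fprintf(stderr, "unable to recover from inconsistent state in "
						"the connection state machine on coordinator\n");
		return STEP_INCONSISTENT_STATE;
	}

	/* Safe to dereference from here on. */
	*taskOut = placementExecution->shardCommandExecution->task;
	return STEP_OK;
}
```

Calling `HandleSentCommand` with a session whose `currentTask` is NULL returns `STEP_INCONSISTENT_STATE` and leaves `*taskOut` untouched; with a fully populated session it returns `STEP_OK`. As in the commit, this only turns a crash into an error and does not address why the pointer was unset.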