mirror of https://github.com/citusdata/citus.git
In #6314 I refactored the connection cleanup to be simpler to
understand and use. However, by doing so I introduced a use-after-free
possibility (that valgrind luckily picked up):
In the `ShouldShutdownConnection` path of
`AfterXactHostConnectionHandling`
we free connections without removing the `transactionNode` from the
dlist that it might be part of. Before the refactoring this wasn't a
problem, because the dlist would be completely reset quickly after in
`ResetGlobalVariables` (without reading or writing the dlist entries).
The refactoring changed this by moving the `dlist_delete` call to
`ResetRemoteTransaction`, which in turn was called in the
`!ShouldShutdownConnection` path of `AfterXactHostConnectionHandling`.
Thus this `!ShouldShutdownConnection` path would now delete from the
`dlist`, but the `ShouldShutdownConnection` path would not. Thus to
remove itself the deleting path would sometimes update nodes in the list
that were freed right before.
There's two ways of fixing this:
1. Call `dlist_delete` from **both** of paths.
2. Call `dlist_delete` from **neither** of the paths.
This commit implements the second approach, and #6684 implements the
first. We need to choose which approach we prefer.
To make calling `dlist_delete` from both paths actually work, we also need
to use a slightly different check to determine if we need to call dlist_delete.
Various regression tests showed that there can be cases where the
`transactionState` is something else than `REMOTE_TRANS_NOT_STARTED`
but the connection was not added to the `InProgressTransactions` list
One example of such a case is when running `TransactionStateMachine`
without calling `StartRemoteTransactionBegin` beforehand. In those
cases the connection won't be added to `InProgressTransactions`, but
the `transactionState` is changed to `REMOTE_TRANS_SENT_COMMAND`.
Sidenote: This bug already existed in 11.1, but valgrind didn't catch it
back then. My guess is that this happened because #6314 was merged after
the initial release branch was cut.
Fixes #6638
(cherry picked from commit
|
||
---|---|---|
.. | ||
columnar | ||
distributed | ||
.gitignore | ||
citus_config.h.in | ||
citus_version.h.in | ||
pg_version_compat.h |