Change worker list sort: coord before workers

Before this change, we sorted the list of worker nodes only by host name
and port number. That was sufficient while the metadata contained only
worker nodes, but now that the coordinator can also be in the metadata,
it is harder to prevent distributed deadlocks. Hence, we now make sure
that coordinator nodes are always sorted to the beginning of the list of
worker nodes.

A DDL command that is run on the coordinator node takes locks on the
coordinator first and, if successful, attempts to take locks on the
worker nodes. Similarly, a DDL command that is run on a worker node
takes locks on itself first, and then attempts to take locks on the
remaining workers. If the worker nodes are sorted before the
coordinator node, we can end up in a situation where the coordinator
node is waiting for a lock on a worker node while that worker node is
waiting for a lock on the coordinator node. This is a distributed
deadlock.
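
The ordering described above can be sketched in isolation as follows.
This is a minimal illustration, not the code from this commit:
SketchNode, SketchNodeIsCoordinator, SketchNodeCompare, SketchSortNodes
and SKETCH_COORDINATOR_GROUP_ID are hypothetical stand-ins for the real
types and helpers. The sketch assumes that group id 0 marks the
coordinator, and it puts a node first only when exactly one side is a
coordinator, so that the comparator stays symmetric, which qsort
expects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: group id 0 identifies the coordinator. */
#define SKETCH_COORDINATOR_GROUP_ID 0

/* Hypothetical, simplified stand-in for a worker node entry. */
typedef struct SketchNode
{
    int groupId;
    const char *name;
    int port;
} SketchNode;

static bool
SketchNodeIsCoordinator(const SketchNode *node)
{
    return node->groupId == SKETCH_COORDINATOR_GROUP_ID;
}

/*
 * qsort-style comparator: coordinators sort before all other nodes;
 * nodes of the same kind are ordered by host name, then by port.
 */
static int
SketchNodeCompare(const void *lhsKey, const void *rhsKey)
{
    const SketchNode *lhs = (const SketchNode *) lhsKey;
    const SketchNode *rhs = (const SketchNode *) rhsKey;
    bool lhsIsCoordinator = SketchNodeIsCoordinator(lhs);
    bool rhsIsCoordinator = SketchNodeIsCoordinator(rhs);

    /* Only decide on the coordinator flag when the two sides differ. */
    if (lhsIsCoordinator != rhsIsCoordinator)
    {
        return lhsIsCoordinator ? -1 : 1;
    }

    int nameCompare = strcmp(lhs->name, rhs->name);
    if (nameCompare != 0)
    {
        return nameCompare;
    }

    /* Yields -1, 0 or 1 without risking integer overflow. */
    return (lhs->port > rhs->port) - (lhs->port < rhs->port);
}

/* Sorts the array in place so that coordinators come first. */
static void
SketchSortNodes(SketchNode *nodes, size_t count)
{
    assert(nodes != NULL || count == 0);
    qsort(nodes, count, sizeof(SketchNode), SketchNodeCompare);
}
```

Calling SketchSortNodes on a list that holds one coordinator and several
workers moves the coordinator to index 0 regardless of its host name,
and leaves the workers ordered by name and port after it.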
coord-first
Hanefi Onaldi 2023-06-20 14:47:36 +03:00
parent a849570f3f
commit 658b8c4ec2
GPG Key ID: F18CDB10BA0DFDC7
2 changed files with 14 additions and 2 deletions

@@ -108,7 +108,7 @@ ActiveReadableNodeCount(void)
  * NodeIsCoordinator returns true if the given node represents the coordinator.
  */
 bool
-NodeIsCoordinator(WorkerNode *node)
+NodeIsCoordinator(const WorkerNode *node)
 {
     return node->groupId == COORDINATOR_GROUP_ID;
 }
@@ -352,6 +352,9 @@ CompareWorkerNodes(const void *leftElement, const void *rightElement)
  * WorkerNodeCompare compares two worker nodes by their host name and port
  * number. Two nodes that only differ by their rack locations are considered to
  * be equal to each other.
+ *
+ * This function also makes sure that coordinator nodes are always considered
+ * lexicographically smaller than other worker nodes.
  */
 int
 WorkerNodeCompare(const void *lhsKey, const void *rhsKey, Size keySize)
@@ -359,6 +362,15 @@ WorkerNodeCompare(const void *lhsKey, const void *rhsKey, Size keySize)
     const WorkerNode *workerLhs = (const WorkerNode *) lhsKey;
     const WorkerNode *workerRhs = (const WorkerNode *) rhsKey;
+    if (NodeIsCoordinator(workerLhs))
+    {
+        return -1;
+    }
+    if (NodeIsCoordinator(workerRhs))
+    {
+        return 1;
+    }
     return NodeNamePortCompare(workerLhs->workerName, workerRhs->workerName,
                                workerLhs->workerPort, workerRhs->workerPort);
 }

@@ -95,7 +95,7 @@ extern bool NodeIsPrimaryAndRemote(WorkerNode *worker);
 extern bool NodeIsPrimary(WorkerNode *worker);
 extern bool NodeIsSecondary(WorkerNode *worker);
 extern bool NodeIsReadable(WorkerNode *worker);
-extern bool NodeIsCoordinator(WorkerNode *node);
+extern bool NodeIsCoordinator(const WorkerNode *node);
 extern WorkerNode * SetWorkerColumn(WorkerNode *workerNode, int columnIndex, Datum value);
 extern WorkerNode * SetWorkerColumnOptional(WorkerNode *workerNode, int columnIndex, Datum
                                             value);