Don't reassign global PID when already assigned (#6412)

DESCRIPTION: Fix bug in global PID assignment for rebalancer
sub-connections

In CI our isolation_shard_rebalancer_progress test would sometimes fail
like this:
```diff
+isolationtester: canceling step s1-rebalance-c1-block-writes after 60 seconds
 step s1-rebalance-c1-block-writes:
  SELECT rebalance_table_shards('colocated1', shard_transfer_mode:='block_writes');
- <waiting ...>
+
+ERROR:  canceling statement due to user request
 step s7-get-progress:
```

Source:
https://app.circleci.com/pipelines/github/citusdata/citus/27855/workflows/2a7e335a-f3e8-46ed-b6bd-6920d42f7214/jobs/831710

It turned out this was an actual bug in the way our assignment of global
PIDs interacts with the way we connect to ourselves as the shard
rebalancer. The first command the shard rebalancer sends is a SET
command to change the application_name to `citus_rebalancer`. If
`StartupCitusBackend` is called after this command is processed, it
overwrites the global PID that was extracted from the previous
application_name. This change makes sure that we don't do that, and
continue to use the original global PID. While it might seem that we
only call `StartupCitusBackend` once for each query backend, this isn't
actually the case: whenever pg_dist_partition gets ANALYZEd by
autovacuum, we invalidate the metadata cache and thereby indirectly
call `StartupCitusBackend` again.
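
To make the failure mode concrete, the sketch below shows the sequence
from the perspective of the rebalancer's internal sub-connection. It is
illustrative only: the rebalancer issues the SET itself, and the ANALYZE
normally comes from autovacuum rather than being run by hand.

```sql
-- Minimal sketch of the problematic sequence (illustration, not a test).
-- The sub-connection authenticates with an application_name that encodes the
-- originator's gpid, so InitializeBackendData assigns that gpid at startup.

-- The rebalancer then renames its own backend, dropping the gpid from the name:
SET application_name TO 'citus_rebalancer';

-- If autovacuum now ANALYZEs pg_dist_partition, the metadata cache is
-- invalidated and StartupCitusBackend runs again. Before this fix, that second
-- run generated a fresh gpid, clobbering the originator's gpid and breaking
-- distributed deadlock detection for this backend.
ANALYZE pg_dist_partition;
```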

In passing this fixes two other things as well:
1. It sets `distributedCommandOriginator` correctly in
   `AssignGlobalPID` by using `IsExternalClientBackend()`. This doesn't
   matter much anymore, since in this PR `AssignGlobalPID` effectively
   becomes a no-op for any non-external client backend.
2. It passes the application_name to InitializeBackendData in
   StartupCitusBackend, instead of INVALID_CITUS_INTERNAL_BACKEND_GPID
   (which effectively got cast to NULL). In practice this doesn't
   change the behaviour of the call, since the call is a no-op for every
   backend except the maintenance daemon, and for the maintenance daemon
   the behaviour is the same for NULL as for its application_name.
Jelte Fennema 2022-10-11 16:41:01 +02:00 committed by GitHub
parent b5d70d2e11
commit cb34adf7ac
5 changed files with 74 additions and 9 deletions

```diff
@@ -4622,7 +4622,13 @@ InvalidateDistRelationCacheCallback(Datum argument, Oid relationId)
 	/*
 	 * If pg_dist_partition is being invalidated drop all state
 	 * This happens pretty rarely, but most importantly happens during
-	 * DROP EXTENSION citus;
+	 * DROP EXTENSION citus; This isn't the only time when this happens
+	 * though, it can happen for multiple other reasons, such as an
+	 * autovacuum running ANALYZE on pg_dist_partition. Such an ANALYZE
+	 * wouldn't really need a full Metadata cache invalidation, but we
+	 * don't know how to differentiate between DROP EXTENSION and ANALYZE.
+	 * So for now we simply drop it in both cases and take the slight
+	 * temporary performance hit.
 	 */
 	if (relationId == MetadataCache.distPartitionRelationId)
 	{
```

```diff
@@ -649,7 +649,9 @@ multi_log_hook(ErrorData *edata)
  *
  * NB: All code here has to be able to cope with this routine being called
  * multiple times in the same backend. This will e.g. happen when the
- * extension is created or upgraded.
+ * extension is created, upgraded or dropped. Due to the way we detect the
+ * extension being dropped this can also happen when autovacuum runs ANALYZE on
+ * pg_dist_partition, see InvalidateDistRelationCacheCallback for details.
  */
 void
 StartupCitusBackend(void)
@@ -657,13 +659,18 @@ StartupCitusBackend(void)
 	InitializeMaintenanceDaemonBackend();
 
 	/*
-	 * For queries this will be a no-op. But for background daemons we might
-	 * still need to initialize the backend data. For those backaground daemons
-	 * it doesn't really matter that we temporarily assign
-	 * INVALID_CITUS_INTERNAL_BACKEND_GPID, since we override it again two
-	 * lines below.
+	 * For query backends this will be a no-op, because InitializeBackendData
+	 * is already called from the CitusAuthHook. But for background workers we
+	 * still need to initialize the backend data.
 	 */
-	InitializeBackendData(INVALID_CITUS_INTERNAL_BACKEND_GPID);
+	InitializeBackendData(application_name);
+
+	/*
+	 * If this is an external connection or a background worker this will
+	 * generate the global PID for this connection. For internal connections
+	 * this is a no-op, since InitializeBackendData will already have extracted
+	 * the gpid from the application_name.
+	 */
 	AssignGlobalPID();
 	RegisterConnectionCleanup();
 }
```

```diff
@@ -871,17 +871,40 @@ AssignDistributedTransactionId(void)
  * If this is a Citus initiated backend, which means it is distributed part of a distributed
  * query, then this function assigns the global pid extracted from the application name.
  * If not, this function assigns a new generated global pid.
+ *
+ * If a global PID is already assigned to this backend, then this function is a
+ * no-op. In most scenarios this would already be the case, because a newly
+ * assigned global PID would be the same as a previously assigned one. But
+ * there are two important cases where the newly assigned global PID would be
+ * different from the previous one:
+ * 1. The current backend is an internal backend and in the meantime the
+ *    application_name was changed to one without a gpid, e.g.
+ *    citus_rebalancer. In this case we don't want to throw away the original
+ *    gpid of the query originator, because that would mess up distributed
+ *    deadlock detection involving this backend.
+ * 2. The current backend is an external backend and the node id of the current
+ *    node changed. Updating the gpid to match the nodeid might actually seem
+ *    like a desirable property, but that's not the case. Updating the gpid
+ *    with the new nodeid would mess up distributed deadlock and originator
+ *    detection of queries too, because if this backend already opened
+ *    connections to other nodes, those backends will still have the old
+ *    gpid.
  */
 void
 AssignGlobalPID(void)
 {
+	if (GetGlobalPID() != INVALID_CITUS_INTERNAL_BACKEND_GPID)
+	{
+		return;
+	}
+
 	uint64 globalPID = INVALID_CITUS_INTERNAL_BACKEND_GPID;
 	bool distributedCommandOriginator = false;
 
 	if (!IsCitusInternalBackend())
 	{
 		globalPID = GenerateGlobalPID();
-		distributedCommandOriginator = true;
+		distributedCommandOriginator = IsExternalClientBackend();
 	}
 	else
 	{
```
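
For reference, the effect can be observed from SQL as sketched below.
This assumes the `citus_backend_gpid()` UDF and the `global_pid` column
of `citus_stat_activity` (both part of Citus 11.x, not of this change):
a rebalancer sub-connection should keep reporting the originator's gpid
even after its application_name was changed to `citus_rebalancer`.

```sql
-- Sketch: inspect global PIDs (assumes Citus 11.x helpers, not part of this
-- commit). Internal backends should keep the gpid extracted from the
-- application_name they connected with.
SELECT citus_backend_gpid();

SELECT global_pid, application_name, state
FROM citus_stat_activity
WHERE global_pid IS NOT NULL;
```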

```diff
@@ -1,3 +1,19 @@
+-- add coordinator in idempotent way, this is needed so we get a nodeid
+SET client_min_messages TO ERROR;
+SELECT 1 FROM master_add_node('localhost', :master_port, groupid => 0);
+ ?column?
+---------------------------------------------------------------------
+        1
+(1 row)
+
+RESET client_min_messages;
+-- Kill maintenance daemon so it gets restarted and gets a gpid containing our
+-- nodeid
+SELECT pg_terminate_backend(pid)
+FROM pg_stat_activity
+WHERE application_name = 'Citus Maintenance Daemon' \gset
+-- reconnect to make sure we get a session with the gpid containing our nodeid
+\c - - - -
 CREATE SCHEMA global_cancel;
 SET search_path TO global_cancel;
 SET citus.next_shard_id TO 56789000;
```

```diff
@@ -1,3 +1,16 @@
+-- add coordinator in idempotent way, this is needed so we get a nodeid
+SET client_min_messages TO ERROR;
+SELECT 1 FROM master_add_node('localhost', :master_port, groupid => 0);
+RESET client_min_messages;
+
+-- Kill maintenance daemon so it gets restarted and gets a gpid containing our
+-- nodeid
+SELECT pg_terminate_backend(pid)
+FROM pg_stat_activity
+WHERE application_name = 'Citus Maintenance Daemon' \gset
+
+-- reconnect to make sure we get a session with the gpid containing our nodeid
+\c - - - -
 CREATE SCHEMA global_cancel;
 SET search_path TO global_cancel;
 SET citus.next_shard_id TO 56789000;
```