citus/src/bin/pg_send_cancellation
Jelte Fennema 184c7c0bce
Make enterprise features open source (#6008)
This PR makes all of the features open source that were previously only
available in Citus Enterprise.

Features that this adds:
1. Non blocking shard moves/shard rebalancer
   (`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
   the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
   authentication options for different users, e.g. you can store
   passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
   in between coordinator and workers
10. Tracking distributed query execution times using
   citus_stat_statements (`citus.stat_statements_max`,
   `citus.stat_statements_purge_interval`,
   `citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
2022-06-16 00:23:46 -07:00
.gitignore
Makefile
README.md
pg_send_cancellation.c

README.md

pg_send_cancellation

pg_send_cancellation is a program for manually sending a cancellation to a Postgres endpoint. It is effectively a command-line version of PQcancel in libpq, but it can use any PID or cancellation key.

We use pg_send_cancellation primarily to propagate cancellations between pgbouncers behind a load balancer. Since the cancellation protocol involves opening a new connection, the new connection may go to a different node that does not recognize the cancellation key. To handle that scenario, we modified pgbouncer to pass unrecognized cancellation keys to a shell command.

Users can configure the cancellation_command, which will be run with:

<cancellation_command> <client ip> <client port> <pid> <cancel key>

Note that pgbouncer does not use actual PIDs. Instead, it generates the PID and cancellation key together as a single random 8-byte number. This makes the chance of collisions exceedingly small.
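
As a rough illustration of the paragraph above, the sketch below draws 8 random bytes and splits them into a 32-bit PID and a 32-bit cancellation key. The function names and the byte order of the split (PID first, then key) are assumptions made for this example, not taken from the pgbouncer source.

```python
import os
import struct
from typing import Tuple


def new_cancel_token() -> bytes:
    """Return 8 random bytes standing in for a generated PID + cancel key."""
    return os.urandom(8)


def split_cancel_token(token: bytes) -> Tuple[int, int]:
    """Split an 8-byte token into (pid, cancel_key).

    Assumption: the first 4 bytes are the PID and the last 4 bytes are the
    cancellation key, both read as unsigned big-endian integers.
    """
    if len(token) != 8:
        raise ValueError("cancel token must be exactly 8 bytes")
    pid, cancel_key = struct.unpack("!II", token)
    return pid, cancel_key
```

Because the "PID" is just part of a random token, it will generally not correspond to any process on the machine.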

By providing pg_send_cancellation as part of Citus, we can use a shell script that pgbouncer invokes to propagate the cancellation to all other worker nodes in the same cluster, for example:

#!/bin/sh
remote_ip=$1
remote_port=$2
pid=$3
cancel_key=$4

postgres_path=/usr/pgsql-14/bin
pgbouncer_port=6432

nodes_query="select nodename from pg_dist_node where groupid > 0 and groupid not in (select groupid from pg_dist_local_group) and nodecluster = current_setting('citus.cluster_name')"

# Get hostnames of other worker nodes in the cluster, and send cancellation to their pgbouncers
$postgres_path/psql -c "$nodes_query" -tAX | xargs -n 1 sh -c "$postgres_path/pg_send_cancellation $pid $cancel_key \$0 $pgbouncer_port"

One thing we need to be careful about is that cancellations do not get forwarded back and forth. pgbouncer handles this by setting the last bit of every generated cancellation key (sent to clients) to 1, and setting the last bit of every forwarded key to 0. That way, when a pgbouncer receives a cancellation key with the last bit set to 0, it knows the key came from another pgbouncer: it should not forward the key any further, and it should set the last bit back to 1 before comparing against its stored cancellation keys.
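
The marking scheme can be sketched with a few bit operations. This is a minimal illustration, not pgbouncer's code: the function names are hypothetical, and it assumes that "last bit" means the least significant bit of a 32-bit key.

```python
KEY_MASK = 0xFFFFFFFF  # assumption: keys are treated as unsigned 32-bit values
MARK_BIT = 0x1         # assumption: "last bit" is the least significant bit


def key_for_client(key: int) -> int:
    """Key handed to a client: last bit set to 1."""
    return (key | MARK_BIT) & KEY_MASK


def key_for_forwarding(key: int) -> int:
    """Key passed on to other pgbouncers: last bit cleared to 0."""
    return key & KEY_MASK & ~MARK_BIT


def was_forwarded(key: int) -> bool:
    """True if the key arrived from another pgbouncer (last bit is 0)."""
    return (key & MARK_BIT) == 0


def key_for_lookup(key: int) -> int:
    """Restore the last bit to 1 before comparing with stored keys."""
    return (key | MARK_BIT) & KEY_MASK
```

With these helpers, a receiving node would forward a key only when `was_forwarded(key)` is false, and would always compare `key_for_lookup(key)` against its stored keys.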

Another thing we need to be careful about is that the integers (the PID and the cancellation key) must be encoded in big-endian byte order on the wire.
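
To make the byte order concrete, here is a small sketch that builds the 16-byte CancelRequest message described in the PostgreSQL frontend/backend protocol: a length field, the cancel request code, the PID, and the cancellation key, each a 32-bit big-endian integer. The helper name `build_cancel_request` is hypothetical, and this is an illustration rather than the code in pg_send_cancellation.c.

```python
import struct

# Cancel request code from the PostgreSQL protocol: 1234 in the high
# 16 bits and 5678 in the low 16 bits (80877102).
CANCEL_REQUEST_CODE = (1234 << 16) | 5678


def build_cancel_request(pid: int, cancel_key: int) -> bytes:
    """Encode a CancelRequest packet with all fields in big-endian order."""
    return struct.pack(
        "!IIII",                  # "!" selects network (big-endian) byte order
        16,                       # total message length in bytes
        CANCEL_REQUEST_CODE,
        pid & 0xFFFFFFFF,         # mask so negative values pack as 32 bits
        cancel_key & 0xFFFFFFFF,
    )
```

Packing with a little-endian format such as `"<IIII"` would produce bytes that the receiving side would not recognize as a cancel request.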