Commit Graph

10 Commits (5a2ec73475a66161e44c25b63d7b7d5396ff4f4b)

Author SHA1 Message Date
Nils Dijk 5a2ec73475 unschedule dependant jobs on failure 2022-08-23 15:24:05 +02:00
Nils Dijk aee5dba634 add dependency tracking between rebalance jobs 2022-08-23 15:24:05 +02:00
Nils Dijk 836950ff07 Add running state to rebalance job with pid reported 2022-08-23 15:24:05 +02:00
Nils Dijk 12aec202d8 add error state + wait for job udf 2022-08-23 15:24:05 +02:00
Nils Dijk 12b0063c31 run task based on sql command, log messages and simple retry 2022-08-23 15:24:04 +02:00
Nils Dijk a7428c4ce6 basic infrastructure for running jobs in a background worker from the maintenance daemon 2022-08-23 15:24:04 +02:00
aykut-bozkurt be06d65721 Nonblocking tenant isolation is supported by using split api. (#6167) 2022-08-17 11:13:07 +03:00
Jelte Fennema 8017693b2f
Allow specifying the shard_transfer_mode when replicating reference tables (#6070)
When using `citus.replicate_reference_tables_on_activate = off`,
reference tables need to be replicated later. This can be done using the
`replicate_reference_tables()` UDF. However, this function only allowed
blocking replication. This changes the function to default to logical
replication instead, and allows choosing any of our existing shard
transfer modes.
2022-08-09 13:21:31 +03:00
Jelte Fennema dd548ee3c7
Use faster custom copy logic for non-blocking shard moves (#6119)
DESCRIPTION: Use faster custom copy logic for non-blocking shard moves

Non-blocking shard moves consist of two main phases:
1. Initial data copy
2. Catchup phase

This changes the first of these phases significantly. Previously we used the
copy logic provided by postgres subscriptions. This meant we didn't have
to implement it ourselves, but it came with the downside of little control.
When implementing shard splits we needed more control to even make it
work, so we implemented our own logic for copying data between nodes.

This PR starts using that logic for non-blocking shard moves. Doing so
has four main advantages:
1. It uses COPY in binary format when possible, which is cheaper to encode
    and decode. It also very often results in less data being sent over the
    network.
2. It allows us to create the primary key (or other replica identity) after doing
    the initial data copy. This should give some speedup over the total run,
    because creating an index in bulk is much faster than building it incrementally.
3. It doesn't require a replication slot per parallel copy. Increasing the maximum
    number of replication slots uses resources in postgres even when the slots are
    not used, so it helps that shard moves now need fewer of them.
4. Logical replication table_sync workers are slow to start up, so the previous
    approach could become quite slow when lots of shards needed to be copied.
    This can happen easily when combining Postgres partitioning with Citus.
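
The two phases described above (initial data copy, then catchup) can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not Citus code: `ToyShard`, `initial_copy`, `catch_up` and `move_shard` are hypothetical names, and a plain Python list stands in for the change stream that logical replication would provide.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Hypothetical stand-in for a source shard (not a Citus structure)."""
    rows: dict = field(default_factory=dict)         # primary key -> value
    change_log: list = field(default_factory=list)   # (position, op, key, value)
    position: int = 0                                # toy analogue of a log position

    def write(self, op, key, value=None):
        """Apply a write and record it in the change log."""
        self.position += 1
        if op == "delete":
            self.rows.pop(key, None)
        else:  # "upsert"
            self.rows[key] = value
        self.change_log.append((self.position, op, key, value))


def initial_copy(source):
    """Phase 1: copy a snapshot and remember the position it was taken at."""
    return dict(source.rows), source.position


def catch_up(source, target_rows, from_position):
    """Phase 2: replay every change recorded after the snapshot position."""
    for position, op, key, value in source.change_log:
        if position <= from_position:
            continue  # already contained in the snapshot
        if op == "delete":
            target_rows.pop(key, None)
        else:
            target_rows[key] = value


def move_shard(source, concurrent_writes):
    """Copy first, let writes continue on the source, then catch up."""
    target_rows, snapshot_position = initial_copy(source)
    for op, key, value in concurrent_writes:
        source.write(op, key, value)  # the source stays writable during the copy
    catch_up(source, target_rows, snapshot_position)
    return target_rows
```

In this model the source keeps accepting writes while the snapshot is copied, and the catchup phase only replays changes made after the snapshot position, which is why the move does not need to block writers during the bulk copy.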
2022-08-08 17:09:43 +02:00
Ahmet Gedemenli 8b68b0b5bb
Fix pg upgrade script for foreign tables (#6100)
Fixes an unexpected error for foreign tables when upgrading pg
2022-08-05 13:35:17 +03:00