citus/src/backend/distributed/operations
SaitTalhaNisanci 4fbed90505
Fix data-race with concurrent calls of DropMarkedShards (#4909)
* Fix problems with concurrent calls of DropMarkedShards

When trying to enable `citus.defer_drop_after_shard_move` by default, it
turned out that DropMarkedShards was not safe to call concurrently.
This could cause serious problems, especially when shards were being
moved at the same time. During tests it was possible to trigger a state
where a shard that had been moved was no longer available on any of the
nodes after the move.

Currently, DropMarkedShards is only called in production by the
maintenance daemon. Since this is only a single process, triggering such
a race is currently impossible in production settings. In future changes,
however, we will want to call DropMarkedShards from other places too.

* Add some isolation tests

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
(cherry picked from commit 93c2dcf3d2)
2021-07-13 05:30:51 +03:00
citus_create_restore_point.c rename node/worker utilities (#4003) 2020-07-09 15:30:35 +03:00
citus_tools.c Fix Semmle errors (#4636) 2021-02-08 18:37:44 +03:00
create_shards.c Fix int32 overflow and use PG macros for INT32_XX (#4061) 2020-07-23 18:30:08 +03:00
delete_protocol.c Rename use -> shouldUse 2021-03-16 10:01:14 +03:00
modify_multiple_shards.c Remove master from file hierarchy 2020-06-16 17:49:09 +02:00
node_protocol.c Add udf citus_get_active_worker_nodes 2021-03-17 14:56:28 +03:00
partitioning.c Add a view for simple (time) partitions and their access methods 2021-01-08 11:28:15 +01:00
repair_shards.c Not mention citus local tables in error messages (#4579) 2021-01-27 12:36:53 +03:00
shard_cleaner.c Fix data-race with concurrent calls of DropMarkedShards (#4909) 2021-07-13 05:30:51 +03:00
shard_rebalancer.c Avoid two race conditions in the rebalance progress monitor (#5050) 2021-06-21 16:42:10 +02:00
split_shards.c Remove master from file hierarchy 2020-06-16 17:49:09 +02:00
stage_protocol.c Reimplement citus_update_table_statistics to detect dist. deadlocks (#4752) 2021-03-03 11:41:31 +03:00
worker_node_manager.c COPY uses adaptive connection management on local node 2021-02-04 09:45:07 +01:00