mirror of https://github.com/citusdata/citus.git
The new shard copy code that was created for shard splits has some advantages over the old shard copy code. The old code used `worker_append_table_to_shard`, which wrote to disk twice, and it did not use binary copy even when that was possible. Both issues are fixed in the new copy code. This PR starts using the new copy logic for shard moves as well, not just for shard splits.

On my local machine I created a single-shard table like this:

```sql
set citus.shard_count = 1;
create table t(id bigint, a bigint);
select create_distributed_table('t', 'id');
INSERT into t(id, a) SELECT i, i from generate_series(1, 100000000) i;
```

I then turned `fsync` off to make sure I wasn't bottlenecked by disk. Finally, I moved this shard between nodes using `citus_move_shard_placement` with `block_writes`. Before this PR a move took ~127s; after this PR it took only ~38s. So for this small test the move spent ~70% less time.

I also ran the same test on a table that contained large strings:

```sql
set citus.shard_count = 1;
create table t(id bigint, a bigint, content text);
select create_distributed_table('t', 'id');
INSERT into t(id, a, content) SELECT i, i, 'aunethautnehoautnheaotnuhetnohueoutnehotnuhetncouhaeohuaeochgrhgd.athbetndairgexdbuhaobulrhdbaetoausnetohuracehousncaoehuesousnaceohuenacouhancoexdaseohusnaetobuetnoduhasneouhaceohusnaoetcuhmsnaetohuacoeuhebtokteaoshetouhsanetouhaoug.lcuahesonuthaseauhcoerhuaoecuh.lg;rcydabsnetabuesabhenth' from generate_series(1, 20000000) i;
```
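The commit message names the move step but does not show the statement that was timed. A minimal sketch of what such a call could look like follows. The shard id `102008`, the node names `worker-1` and `worker-2`, and port `5432` are placeholders, not values from the original test; substitute whatever the lookup query returns on your cluster.

```sql
-- Find the single shard of table t and the node that currently holds it.
SELECT shardid, nodename, nodeport
FROM citus_shards
WHERE table_name = 't'::regclass;

-- Move that shard to another worker while blocking writes.
-- All literal values below are hypothetical placeholders.
SELECT citus_move_shard_placement(
    102008,            -- shard id returned by the query above
    'worker-1', 5432,  -- source node name and port
    'worker-2', 5432,  -- target node name and port
    shard_transfer_mode := 'block_writes');

-- Check the reported reduction in wall-clock time: (127 - 38) / 127.
SELECT round(100 * (127 - 38) / 127.0, 1) AS percent_less_time;
```

The last query evaluates to roughly 70, matching the ~70% figure quoted for the first test. In `psql`, enabling `\timing` before the move is one way to measure the duration.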
| File |
|---|
| citus_create_restore_point.c |
| citus_split_shard_by_split_points.c |
| citus_tools.c |
| create_shards.c |
| delete_protocol.c |
| health_check.c |
| isolate_shards.c |
| modify_multiple_shards.c |
| node_protocol.c |
| partitioning.c |
| repair_shards.c |
| shard_cleaner.c |
| shard_rebalancer.c |
| shard_split.c |
| stage_protocol.c |
| worker_copy_table_to_node_udf.c |
| worker_node_manager.c |
| worker_shard_copy.c |
| worker_split_copy_udf.c |