citus/src/test/regress
Nils Dijk 5df1b49bed
Feature: optionally force master_update_node during failover (#2773)
When `master_update_node` is called to update a node's location, it waits for the appropriate locks to become available. This is useful during normal operation: new operations are blocked until the metadata update completes, while running operations have time to finish.

When `master_update_node` is called after a node failure, waiting for running operations to finish is less useful, because they cannot finish. A held lock indicates an operation that will fail as soon as it attempts to commit, since the machine has already failed. The downside is that the failover is postponed until that operation terminates. Users have observed this to take a significant amount of time, during which the rest of the system appears unavailable.

With this patch, `master_update_node` can be invoked in such situations with two optional arguments:
 - `force` (bool, defaults to `false`): When true, the metadata update is forced to proceed by terminating conflicting backends. A cancel is not enough because the backend might be idle (e.g. an interactive session, or a session going back and forth with an application), so the more intrusive approach of termination is used.
 - `lock_cooldown` (int, defaults to `10000`): The time in milliseconds before conflicting backends are terminated. It gives backends a chance to finish cleanly before they are terminated, and lets the user set an upper bound on the expected time to complete the metadata update, e.g. when performing a failover.
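
As a rough illustration of how a failover script might pass the two optional arguments, the sketch below builds a parameterized call using PostgreSQL named notation. The helper name `build_update_node_call`, the three leading arguments (node id, new host, new port), and the `%s` placeholder style (as used by drivers such as psycopg2) are assumptions for illustration; only `force` and `lock_cooldown` are taken from the description above.

```python
def build_update_node_call(node_id, new_host, new_port,
                           force=False, lock_cooldown_ms=10000):
    """Return (sql, params) for a master_update_node call.

    Hypothetical helper: the three leading arguments are assumed to be
    the node id, the new host name and the new port.
    """
    sql = ("SELECT master_update_node(%s, %s, %s, "
           "force => %s, lock_cooldown => %s)")
    params = (node_id, new_host, new_port, force, lock_cooldown_ms)
    return sql, params


# Failover case: force the update, but give running backends 5 seconds
# to finish cleanly before they are terminated.
sql, params = build_update_node_call(2, "standby.example.com", 5432,
                                     force=True, lock_cooldown_ms=5000)
print(sql)
print(params)
```

Leaving both arguments at their defaults keeps the existing behaviour of waiting for the locks, so callers that do not opt in are unaffected.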

The functionality is implemented by spawning a background worker whose task is to help a given backend acquire its locks. The background worker is terminated either on successful execution of the metadata update, or once the memory context of the expression is reset, e.g. when the statement is cancelled.
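
The cooldown-then-terminate behaviour can be modelled with a toy simulation. This is only a sketch of the idea under stated assumptions, not the patch itself: the real helper is a PostgreSQL background worker, whereas here Python threads stand in for backends, and the names `Backend` and `acquire_with_helper` are hypothetical.

```python
import threading
import time


class Backend:
    """Toy stand-in for a backend holding a conflicting lock."""

    def __init__(self, name, work_seconds):
        self.name = name
        self.work_seconds = work_seconds
        self.done = threading.Event()  # set once the lock is released
        self.terminated = False

    def run(self):
        # Release the lock after work_seconds, or earlier if terminated.
        self.done.wait(timeout=self.work_seconds)
        self.done.set()

    def terminate(self):
        self.terminated = True
        self.done.set()


def acquire_with_helper(backends, force, lock_cooldown_ms):
    """Wait for conflicting backends to release their locks.

    With force=True, backends still holding a lock once the cooldown
    has elapsed are terminated. Returns the names of terminated backends.
    """
    deadline = time.monotonic() + lock_cooldown_ms / 1000.0
    for backend in backends:
        if not force:
            backend.done.wait()  # plain waiting, no upper bound
            continue
        remaining = max(0.0, deadline - time.monotonic())
        if not backend.done.wait(timeout=remaining):
            backend.terminate()
    return [b.name for b in backends if b.terminated]


backends = [Backend("quick", 0.05), Backend("stuck", 5.0)]
for b in backends:
    threading.Thread(target=b.run, daemon=True).start()

killed = acquire_with_helper(backends, force=True, lock_cooldown_ms=300)
print(killed)
```

In this model the "quick" backend finishes within the cooldown and is left alone, while the "stuck" one is terminated once the 300 ms cooldown expires, which bounds the total wait.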
2019-06-21 12:03:15 +02:00
..
bin Make COPY compatible with unified executor. 2019-06-20 19:53:40 +02:00
data Adds colocation check to local join 2018-04-04 22:49:27 +03:00
expected Feature: optionally force master_update_node during failover (#2773) 2019-06-21 12:03:15 +02:00
input Make COPY compatible with unified executor. 2019-06-20 19:53:40 +02:00
mitmscripts Fix misc typos 2019-05-23 17:23:27 -07:00
output Make COPY compatible with unified executor. 2019-06-20 19:53:40 +02:00
specs Feature: optionally force master_update_node during failover (#2773) 2019-06-21 12:03:15 +02:00
sql Feature: optionally force master_update_node during failover (#2773) 2019-06-21 12:03:15 +02:00
.gitignore Ignore test_times.log (#2638) 2019-03-22 10:29:01 -07:00
Makefile Feature: optionally force master_update_node during failover (#2773) 2019-06-21 12:03:15 +02:00
Pipfile Travis uses Pipfile instead of re-specifying deps 2018-09-12 17:37:14 -06:00
Pipfile.lock Travis uses Pipfile instead of re-specifying deps 2018-09-12 17:37:14 -06:00
base_schedule Propagate more ALTER FOREIGN TABLE to workers 2019-05-24 12:54:05 -07:00
failure_base_schedule Implementation for asycn FinishConnectionListEstablishment (#2584) 2019-03-22 17:30:42 +01:00
failure_schedule Implementation for asycn FinishConnectionListEstablishment (#2584) 2019-03-22 17:30:42 +01:00
isolation_schedule Feature: optionally force master_update_node during failover (#2773) 2019-06-21 12:03:15 +02:00
log_test_times Add test-timing script 2019-02-26 23:01:40 -07:00
multi_follower_schedule Allow simple DML commands from hot standby 2018-10-06 10:54:44 +02:00
multi_mx_schedule Support TRUNCATE from the MX worker nodes 2018-09-03 14:06:31 +03:00
multi_schedule Fix join alias resolution 2019-06-12 17:25:07 -07:00
multi_task_tracker_extra_schedule Remove create_insert_proxy_for_table 2019-03-15 14:13:03 -06:00
pg_regress_multi.pl Sort output of RETURNING 2019-04-24 11:51:19 +03:00
worker_schedule Initial commit of Citus 5.0 2016-02-11 04:05:32 +02:00