mirror of https://github.com/citusdata/citus.git
Hopefully reduce flaky tests by disabling the maintenance daemon (#6252)
Sometimes our CI randomly fails on a test in a way similar to this:

```diff
 step s2-drop: DROP TABLE cancel_table;
-
+ <waiting ...>
+step s2-drop: <... completed>
 starting permutation: s1-timeout s1-begin s1-sleep10000 s1-rollback s1-reset s1-drop
```

Source: https://app.circleci.com/pipelines/github/citusdata/citus/26524/workflows/5415b84f-13a3-482f-bef9-648314c79a67/jobs/756377

Another example of a failure like this:

```diff
 stop_session_level_connection_to_node
 -------------------------------------
 (1 row)
 step s3-display: SELECT * FROM ref_table ORDER BY id, value; SELECT * FROM dist_table ORDER BY id, value;
-
+ <waiting ...>
+step s3-display: <... completed>
 id|value
 --+-----
```

Source: https://app.circleci.com/pipelines/github/citusdata/citus/26551/workflows/91dca4b2-bb1c-4cae-b2ef-ce3f9c689ce5/jobs/757781

A step that shouldn't be blocked is temporarily detected as "waiting..." and then gets unblocked automatically immediately after. I'm not certain of the reason for this, but one explanation is that the maintenance daemon is doing something that blocks the query. In the cases shown, my hunch is that it could be the deferred shard deletion. This PR disables all the features of the maintenance daemon during isolation testing to try to prevent processes from randomly being detected as blocking.

NOTE: I'm not certain that this will actually fix the issue. If it persists even after this change, at least we will know that it's not the maintenance daemon that's blocking the query.
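The failure signature described above (an added `<waiting ...>` line followed by an added `<... completed>` line for the same step) can be spotted mechanically. The following is only a rough sketch in Perl, not something taken from the Citus test suite: the helper name `spurious_waits` and the sample input are hypothetical, and the patterns assume the diff lines look exactly like the two excerpts quoted in the message.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Citus): return the names of steps that a
# regression diff reports as unexpectedly waiting and then completing.
sub spurious_waits {
    my ($diff_text) = @_;
    my @steps;
    my $saw_waiting = 0;
    for my $line (split /\n/, $diff_text) {
        if ($line =~ /^\+\s*<waiting \.\.\.>/) {
            # An added "<waiting ...>" marker: the step was seen as blocked.
            $saw_waiting = 1;
        }
        elsif ($saw_waiting && $line =~ /^\+step ([^:]+): <\.\.\. completed>/) {
            # The same diff later adds the completion line for the step.
            push @steps, $1;
            $saw_waiting = 0;
        }
    }
    return @steps;
}

# Sample input modelled on the first excerpt in the commit message.
my $diff = join "\n",
    " step s2-drop: DROP TABLE cancel_table;",
    "-",
    "+ <waiting ...>",
    "+step s2-drop: <... completed>";

print "$_\n" for spurious_waits($diff);
```

A check like this only flags candidates. Whether a given wait is spurious or a real regression still needs a human look at the test.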
parent 813542dfa1
commit 5c64227223
@@ -550,13 +550,21 @@ if($isolationtester)
{
    push(@pgOptions, "citus.worker_min_messages='warning'");
    push(@pgOptions, "citus.log_distributed_deadlock_detection=on");
    push(@pgOptions, "citus.distributed_deadlock_detection_factor=-1");
    push(@pgOptions, "citus.shard_count=4");
    push(@pgOptions, "citus.metadata_sync_interval=1000");
    push(@pgOptions, "citus.metadata_sync_retry_interval=100");
    push(@pgOptions, "client_min_messages='warning'"); # pg12 introduced notice showing during isolation tests
    push(@pgOptions, "citus.running_under_isolation_test=true");

    # Disable all features of the maintenance daemon. Otherwise queries might
    # randomly show temporarily as "waiting..." because they are waiting for the
    # maintenance daemon.
    push(@pgOptions, "citus.distributed_deadlock_detection_factor=-1");
    push(@pgOptions, "citus.recover_2pc_interval=-1");
    push(@pgOptions, "citus.enable_statistics_collection=-1");
    push(@pgOptions, "citus.defer_shard_delete_interval=-1");
    push(@pgOptions, "citus.stat_statements_purge_interval=-1");
    push(@pgOptions, "citus.background_task_queue_interval=-1");
}

# Add externally added options last, so they overwrite the default ones above
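The trailing comment relies on a "last setting wins" rule: defaults are pushed first and externally supplied options are appended afterwards. Below is a minimal Perl sketch of that idea, assuming the options eventually land somewhere that resolves duplicates in favour of the last entry (as `postgresql.conf` does). The helper `effective_options` and the override value `15000` are hypothetical and appear here only for illustration. They are not part of the script shown in the diff.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse a list of "name=value" strings so that the
# last occurrence of each name wins, keeping first-seen order of the names.
sub effective_options {
    my @options = @_;
    my %value_for;
    my @order;
    for my $option (@options) {
        my ($name, $value) = split /=/, $option, 2;
        push @order, $name unless exists $value_for{$name};
        $value_for{$name} = $value;    # later entries overwrite earlier ones
    }
    return map { "$_=$value_for{$_}" } @order;
}

# Defaults, in the style of the isolation-test block above.
my @pgOptions = (
    "citus.defer_shard_delete_interval=-1",
    "citus.recover_2pc_interval=-1",
);

# Assumed externally supplied override, appended last.
push(@pgOptions, "citus.defer_shard_delete_interval=15000");

print "$_\n" for effective_options(@pgOptions);
```

Under this assumption, someone who wants to re-enable a single maintenance daemon feature for one isolation test run could do so by passing an override, without editing the defaults.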