Increase isolation timeout because of shard splits (#6213)

Recently isolation tests involving shard splits have been randomly
failing in CI with timeouts. It's possible that there's an actual bug
here, but it's also quite likely that our timeout is just slightly too
low for the combination of shard splits and the CI VM having a bad day.

Increasing the timeout is fairly low cost and allows us to find out if
there's an actual bug or if it's simply slowness. So that's what this PR
does. If it turns out to be an actual bug, we can decrease the timeout
again when we fix it.

Examples of failed tests:
1. https://app.circleci.com/pipelines/github/citusdata/citus/26241/workflows/9e0bb721-d798-481b-907c-914236b63e38/jobs/742409
2. https://app.circleci.com/pipelines/github/citusdata/citus/26171/workflows/8f352e3b-e6e4-4f7f-b0d0-2543f62a0209/jobs/739470
Jelte Fennema 2022-08-19 21:37:45 +02:00 committed by GitHub
parent 9cfadd7965
commit dfa6c26d7d
1 changed file with 7 additions and 3 deletions

@@ -16,9 +16,13 @@ MAKEFILE_DIR := $(dir $(realpath $(firstword $(MAKEFILE_LIST))))
 export PATH := $(MAKEFILE_DIR)/bin:$(PATH)
 export PG_REGRESS_DIFF_OPTS = -dU10 -w
 # Use lower isolation test timeout, the 5 minute default is waaay too long for
-# us so we use 20 seconds instead. We should detect blockages very quickly and
-# the queries we run are also very fast.
-export PGISOLATIONTIMEOUT = 20
+# us so we use 60 seconds instead. We should detect blockages very quickly and
+# most queries that we run are also very fast. So fast even that 60 seconds is
+# usually too long. However, any commands involving logical replication can be
+# quite slow, especially shard splits and especially on CI. So we still keep
+# this value at the pretty high 60 seconds because even those slow commands are
+# definitely stuck when they take longer than that.
+export PGISOLATIONTIMEOUT = 60
 ##
 ## Citus regression support
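
A hedged aside, not part of the commit itself: as the Makefile comment implies, PGISOLATIONTIMEOUT is read by PostgreSQL's isolation tester to bound how long it waits on a step before treating the run as stuck and failing the test. Because the Makefile sets the value with a plain = assignment, the easiest way to experiment with a different timeout for a one-off local run is a make command-line override, which takes precedence over the assignment while the existing export still passes the value down to the tester. The directory and the check-isolation target name below are assumptions based on the usual Citus regression test layout, not something stated in this diff.

    # Sketch: re-run the isolation suite with a much larger timeout to see
    # whether a "timed out" shard-split test is genuinely stuck or just slow.
    # (check-isolation and the path are assumed from the Citus regress setup.)
    cd src/test/regress
    make check-isolation PGISOLATIONTIMEOUT=300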