Compare commits


53 Commits
main ... v9.5.7

Author SHA1 Message Date
Hanefi Onaldi 209049006c
Bump Citus version to 9.5.7 2021-08-16 17:51:46 +03:00
Hanefi Onaldi a1633ed969
Add changelog entries for 9.5.7
(cherry picked from commit 41b15d8775)
2021-08-16 17:50:34 +03:00
Onder Kalaci d052c19cd8 Guard against hard WaitEventSet errors
In short, add wrappers around Postgres' AddWaitEventToSet() and
ModifyWaitEvent().

AddWaitEventToSet()/ModifyWaitEvent() may throw hard errors. For
example, the underlying socket for a connection may have been closed
by the remote server and the OS may already reflect that, while
Citus has not yet had a chance to learn about it. In that case,
if the replication factor is > 1, Citus can fail over to other nodes
to execute the query. Even with replication factor = 1, Citus
can give much nicer errors.

So CitusAddWaitEventSetToSet()/CitusModifyWaitEvent() simply wrap
AddWaitEventToSet()/ModifyWaitEvent() in a PG_TRY/PG_CATCH block
in order to catch any hard errors, and report the outcome to
the caller.
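
A minimal sketch of that wrapping pattern follows, using a hypothetical
TryAddWaitEventToSet() helper; the real CitusAddWaitEventSetToSet() and
CitusModifyWaitEvent() in Citus may differ in name and signature.

#include "postgres.h"
#include "storage/latch.h"

/*
 * Hypothetical wrapper: add a socket to a WaitEventSet, but turn any hard
 * error thrown by AddWaitEventToSet() into a boolean so that the caller can
 * fail over to another placement (replication factor > 1) or emit a nicer
 * error (replication factor = 1).
 */
static bool
TryAddWaitEventToSet(WaitEventSet *waitEventSet, uint32 events, pgsocket sock,
                     Latch *latch, void *userData, int *eventIndex)
{
    volatile bool success = true;
    MemoryContext savedContext = CurrentMemoryContext;

    PG_TRY();
    {
        *eventIndex = AddWaitEventToSet(waitEventSet, events, sock, latch, userData);
    }
    PG_CATCH();
    {
        /* switch back out of ErrorContext and discard the error data */
        MemoryContextSwitchTo(savedContext);
        FlushErrorState();
        success = false;
    }
    PG_END_TRY();

    return success;
}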
2021-08-10 09:47:08 +02:00
Onder Kalaci fc2272c6bd Adjust the tests to earlier versions
- Drop PRIMARY KEY for Citus 10 compatibility
- Drop columnar for PG 12
- Do not start/stop metadata sync as stop is not implemented in 10.1
- PG 11 parallel query changes explain outputs
2021-08-06 16:47:11 +02:00
Onder Kalaci aeca7b1868 Dropped columns no longer diverge the distribution column for partitioned tables
Before this commit, creating a partition after a DROP COLUMN on the
parent (of a column positioned before the distribution key) could
leave the partition with the wrong distribution column.

(cherry picked from commit 32124efd83)
2021-08-06 16:46:52 +02:00
naisila 74a58270db Bump use of new sql function 2021-08-03 16:45:23 +03:00
naisila edfa976cc8 Fix master_update_table_statistics scripts for 9.5 2021-08-03 16:45:23 +03:00
naisila daf85c6923 Fix master_update_table_statistics scripts for 9.4 2021-08-03 16:45:23 +03:00
naisila 7f11a59863 Reimplement master_update_table_statistics to detect dist. deadlocks
(ALL CODE BORROWED from commit 2f30614fe3)
2021-08-03 16:45:23 +03:00
Onur Tirtir b9df44742f Remove fkey graph visited flags & rework GetConnectedListHelper (#4446)
With this commit, we remove the visited flags from the
ForeignConstraintRelationshipNode struct, since keeping local state in a
global object is both dangerous and meaningless.

To improve readability, this commit also converts the needless recursion
into an iterative DFS, which avoids passing a local hash map as another
parameter to the GetConnectedListHelper function.
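
As a generic illustration of that refactor, recursion replaced by an
explicit stack, here is a minimal, self-contained iterative DFS; the
adjacency-matrix graph is purely hypothetical and unrelated to Citus'
foreign key relationship graph.

#include <stdbool.h>
#include <stdio.h>

#define NODE_COUNT 5

/* a small example graph as an adjacency matrix */
static const bool edges[NODE_COUNT][NODE_COUNT] = {
    { false, true,  true,  false, false },
    { false, false, false, true,  false },
    { false, false, false, true,  false },
    { false, false, false, false, true  },
    { false, false, false, false, false },
};

/* depth-first traversal driven by an explicit stack instead of recursion */
static void
DepthFirstVisit(int startNode)
{
    bool visited[NODE_COUNT] = { false };
    int stack[NODE_COUNT * NODE_COUNT]; /* worst case: one push per edge */
    int stackSize = 0;

    stack[stackSize++] = startNode;

    while (stackSize > 0)
    {
        int node = stack[--stackSize];

        if (visited[node])
        {
            continue;
        }
        visited[node] = true;
        printf("visited node %d\n", node);

        for (int neighbor = 0; neighbor < NODE_COUNT; neighbor++)
        {
            if (edges[node][neighbor] && !visited[neighbor])
            {
                stack[stackSize++] = neighbor;
            }
        }
    }
}

int
main(void)
{
    DepthFirstVisit(0);
    return 0;
}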
(cherry picked from commit 0db21bbe14)
2021-08-03 16:45:23 +03:00
Onur Tirtir 928cee6af6 Refactor foreign_key_relationship.c (#4438)
(cherry picked from commit 3f60b08b11)
2021-08-03 16:45:23 +03:00
naisila 27d74f1540 Add OpenConnectionToNodes and functions that generate shard queries
(ALL CODE BORROWED from commit 724d56f949)
2021-08-03 16:45:23 +03:00
Hanefi Onaldi a529866f93
Bump Citus version to 9.5.6 2021-07-08 15:38:42 +03:00
Hanefi Onaldi 2efbd6e21a
Add changelog entry for 9.5.6
(cherry picked from commit a8a632269af6a276c92bc4852a0c87f0e6467fb3)
2021-07-08 15:38:42 +03:00
Nils Dijk 3e2257fe8f fix 9.5-2 upgrade script to adhere to idempotency 2021-07-08 12:25:18 +02:00
Nils Dijk 2e0f9cff59 Add test for idempotency of citus_prepare_pg_upgrade 2021-07-08 12:25:18 +02:00
Hanefi Onaldi 5ba6edaa48
Bump Citus version to 9.5.5 2021-07-08 09:47:22 +03:00
Hanefi Onaldi 39c4b1045c
Add changelog entry for 9.5.5
(cherry picked from commit d3b6651403)
2021-07-08 09:47:09 +03:00
Onur Tirtir 344ac23f69 Fix lower boundary calculation when pruning range dist table shards (#5082)
This happens only when we have a "<" or "<=" filter on the distribution
column of a range-distributed table and that filter falls between
two shards.

When the filter falls in between two shards:

  If the filter is ">" or ">=", then UpperShardBoundary was
  returning "upperBoundIndex - 1", where upperBoundIndex is the
  exclusive shard index used during the binary search.
  This is expected, since upperBoundIndex is an exclusive
  index.

  If the filter is "<" or "<=", then LowerShardBoundary was
  returning "lowerBoundIndex + 1", where lowerBoundIndex is the
  inclusive shard index used during the binary search.
  However, since lowerBoundIndex is an inclusive index, we
  should just return lowerBoundIndex instead of doing "+ 1".
  Before this commit, we were missing the leftmost shard in
  such queries.
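
The distinction between the two indexes comes from the two standard
binary-search conventions. A generic, self-contained illustration follows
(not the actual Citus shard-pruning code):

#include <stdio.h>

/*
 * LowerBound returns an INCLUSIVE index: the first position whose value is
 * >= searchValue, usable as-is. UpperBound returns an EXCLUSIVE index: the
 * first position whose value is > searchValue, so "- 1" names the last
 * matching element. Applying "+ 1" to the inclusive result skips an
 * element, which is the class of off-by-one fixed by this commit.
 */
static int
LowerBound(const int *values, int count, int searchValue)
{
    int low = 0;
    int high = count;

    while (low < high)
    {
        int mid = low + (high - low) / 2;

        if (values[mid] < searchValue)
        {
            low = mid + 1;
        }
        else
        {
            high = mid;
        }
    }

    return low;     /* inclusive: use as-is */
}

static int
UpperBound(const int *values, int count, int searchValue)
{
    int low = 0;
    int high = count;

    while (low < high)
    {
        int mid = low + (high - low) / 2;

        if (values[mid] <= searchValue)
        {
            low = mid + 1;
        }
        else
        {
            high = mid;
        }
    }

    return low;     /* exclusive: subtract 1 for the last match */
}

int
main(void)
{
    int shardBoundaries[] = { 10, 20, 30, 40 };

    /* the value 15 falls between the first two boundaries */
    printf("lower bound: %d\n", LowerBound(shardBoundaries, 4, 15));    /* prints 1 */
    printf("upper bound: %d\n", UpperBound(shardBoundaries, 4, 15));    /* prints 1 */
    return 0;
}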

* Remove useless conditional branches

The branch that we delete from UpperShardBoundary was obviously useless.

The other one, in LowerShardBoundary, became useless after we removed the
"+ 1" from there.

This is further evidence of what this PR fixes, and how.

* Improve comments and add more

* Add some tests for upper bound calculation too

(cherry picked from commit b118d4188e)

* Also fix a debug message diff for 9.5
2021-07-07 12:49:16 +03:00
Nils Dijk c11d804e56 Bump use of new sql function 2021-07-05 16:18:43 +02:00
Marco Slot 636bdda886 Fix PG upgrade scripts for 9.5 2021-07-05 16:18:43 +02:00
Marco Slot 7dbda5d607 Fix PG upgrade scripts for 9.4 2021-07-05 16:18:43 +02:00
Sait Talha Nisanci 3584fa11b0 Fix a flaky behaviour in shared_connection_stats
With the previous query, we were not pushing down the pg_sleep call, so
the number of connections to a worker could differ from run to run.
2021-07-02 18:37:49 +03:00
Hanefi Onaldi 4aff833ca8 Do not use security flags by default (#4770)
(cherry picked from commit 697bbbd3c6)
2021-03-04 00:54:35 +03:00
Hanefi Onaldi bf345ac49b Add security flags in configure scripts (#4760)
(cherry picked from commit f87107eb6b)
2021-03-04 00:52:34 +03:00
Onur Tirtir cd1e706acb Bump version to 9.5.4 2021-02-19 15:08:31 +03:00
Onur Tirtir 3d4d76fdde Update CHANGELOG for 9.5.4
(cherry picked from commit bb14c5267f)

 Conflicts:
	CHANGELOG.md
2021-02-19 15:08:31 +03:00
SaitTalhaNisanci 45671a1caa Use PROCESS_UTILITY_QUERY in utility calls
When we use PROCESS_UTILITY_TOPLEVEL, it causes problems when
combined with other extensions such as pg_audit. With this commit, we use
PROCESS_UTILITY_QUERY in the codebase to fix those problems.

(cherry picked from commit dcf54eaf2a)

 Conflicts:
	src/backend/distributed/commands/alter_table.c
	src/backend/distributed/commands/cascade_table_operation_for_connected_relations.c
	src/backend/distributed/executor/local_executor.c
	src/backend/distributed/utils/role.c
	src/backend/distributed/worker/worker_create_or_replace.c
	src/backend/distributed/worker/worker_data_fetch_protocol.c
2021-02-19 15:08:31 +03:00
Onur Tirtir 1eec630640 Bump version to 9.5.3 2021-02-16 15:19:53 +03:00
Onur Tirtir 5dc2fae9d6 Update CHANGELOG for 9.5.3
(cherry picked from commit a0de066996)

 Conflicts:
	CHANGELOG.md
2021-02-16 15:18:35 +03:00
Ahmet Gedemenli 0f498ac26d Fix dropping fkey when distributing table
(cherry picked from commit c8e83d1f26)
2021-02-12 18:33:05 +03:00
Onur Tirtir 44459be1ab Implement GetPgDependTuplesForDependingObjects
(cherry picked from commit 04a4167a8a)
2021-02-12 18:32:16 +03:00
Onur Tirtir 8401acb761 Implement ConstraintWithNameIsOfType (#4451)
(cherry picked from commit e91e745dbc)
2021-02-12 18:28:00 +03:00
Onur Tirtir 26556b2bba Refactor ColumnAppearsInForeignKeyToReferenceTable (#4441)
(cherry picked from commit d1b3eaf767)
2021-02-12 18:26:23 +03:00
Sait Talha Nisanci 7480160f4f Call 6 times not 7 in subquery_prepared_statements 2021-02-11 17:28:30 +01:00
Onder Kalaci 23951c562e Do not re-use connections for intermediate results
/*
 * Colocated intermediate results are just files and are not required to
 * use the same connections as their co-located shards. So, we are free to
 * use any connection we can get.
 *
 * Also, the current connection re-use logic does not know how to handle
 * intermediate results, as intermediate results always truncate the
 * existing files. That's why we use one connection per intermediate
 * result.
 */

(cherry picked from commit 5d5a357487)
2021-02-11 16:51:09 +01:00
Onder Kalaci 51560f9644 When the shared pool size is reached, COPY sets the placement access
It looks like we forgot to set the placement accesses, and
this could lead to self-deadlocks on complex transaction blocks.
2021-02-02 10:18:48 +01:00
Onder Kalaci 9f27e398a9 When the executor pool size is reached, COPY sets the placement access
It looks like we forgot to set the placement accesses, and
this could lead to self-deadlocks on complex transaction blocks.

(cherry picked from commit 36bdeef1bb)
2021-02-02 10:18:39 +01:00
Sait Talha Nisanci 5bb4bb4b5f Bump version to 9.5.2 2021-01-26 16:26:25 +03:00
SaitTalhaNisanci a7ff0c5800
Update CHANGELOG for 9.5.2 (#4578) 2021-01-26 12:27:01 +03:00
Nils Dijk 6703b173a0 rework ci
(cherry picked from commit a748729998)
2021-01-25 15:45:05 +03:00
Nils Dijk 2efeed412a Mitigate segfault in connection state machine (#4551)
As described in the comment, we have observed crashes in production
due to a segfault caused by dereferencing a NULL pointer in our
connection state machine.

As a mitigation, to prevent crashes, we raise an error with a short
explanation of the issue instead. Unfortunately, the case has not been
reliably reproduced yet, hence the lack of tests.
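
A minimal sketch of the kind of defensive check described above, with a
hypothetical function and argument name rather than the actual Citus
state-machine code:

#include "postgres.h"

/*
 * Hypothetical guard: if the state machine reaches a point where a pointer
 * it needs is NULL after a connection failure, raise a descriptive error
 * instead of dereferencing it and crashing the backend.
 */
static void
ErrorIfSavepointStateNotRecoverable(void *savepointState)
{
    if (savepointState == NULL)
    {
        ereport(ERROR, (errmsg("cannot recover from a connection failure "
                               "while handling SAVEPOINT commands")));
    }
}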

DESCRIPTION: Prevent segfaults when SAVEPOINT handling cannot recover from connection failures
(cherry picked from commit d127516dc8)
2021-01-25 15:22:41 +03:00
Hadi Moshayedi 49ce36fe8b Reland #4419
(cherry picked from commit bc01c795a2)
2021-01-25 15:15:59 +03:00
Hadi Moshayedi 043c3356ae Faster logical replication tests.
Logical replication status can take wal_receiver_status_interval
seconds to get updated. The default is 10s, which means tests that
use logical replication can take a long time to finish. We reduce it
to 1 second to speed these tests up.

The logical replication apply launcher launches workers every
wal_retrieve_retry_interval, so many consecutive shard moves that use
logical replication are throttled by this parameter. The default is
5s; we reduce it to 1s so the tests finish faster.

(cherry picked from commit 0e0fd6599a)
2021-01-25 15:13:16 +03:00
Onur Tirtir a603ad9cbf Do not consider single-shard hash-distributed tables as replicated (#4413)
(cherry picked from commit 0eb5701658)
2020-12-17 19:02:58 +03:00
Onur Tirtir 4a1255fd10 Update CHANGELOG for 9.5.1
(cherry picked from commit dd3453ced5)

 Conflicts:
	CHANGELOG.md
2020-12-02 11:20:43 +03:00
Onur Tirtir 67004edf43 Bump version to 9.5.1 2020-12-02 11:20:43 +03:00
Onur Tirtir 789d441296 Handle invalid connection hash entries (#4362)
If MemoryContextAlloc errors out (e.g. during an OOM), ConnectionHashEntry->connections
stays NULL.

With this commit, we add an isValid flag to ConnectionHashEntry that is set to true
only after we allocate and initialize the ConnectionHashEntry->connections list properly,
and we check it before accessing ConnectionHashEntry->connections.
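
A minimal sketch of that pattern against PostgreSQL's dynahash API follows;
the entry struct, its fields, and the helper name are illustrative only, not
the actual ConnectionHashEntry definition, and the hash table is assumed to
have been created with hash_create() using uint32 keys.

#include "postgres.h"
#include "nodes/pg_list.h"
#include "utils/hsearch.h"

/* illustrative entry type; the real ConnectionHashEntry differs */
typedef struct ExampleHashEntry
{
    uint32 key;
    bool isValid;       /* set only after the entry is fully initialized */
    List *connections;
} ExampleHashEntry;

static ExampleHashEntry *
FindOrCreateEntry(HTAB *hashTable, uint32 key)
{
    bool found = false;
    ExampleHashEntry *entry =
        (ExampleHashEntry *) hash_search(hashTable, &key, HASH_ENTER, &found);

    if (!found || !entry->isValid)
    {
        /*
         * A previous initialization may have failed half-way (e.g. an OOM
         * during allocation), leaving the entry unusable: (re)initialize it
         * and mark it valid only at the very end.
         */
        entry->isValid = false;
        entry->connections = NIL;
        entry->isValid = true;
    }

    return entry;
}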
(cherry picked from commit 7f3d1182ed)
2020-12-01 11:07:12 +03:00
Önder Kalacı 6d06e9760a Enable parallel query on EXPLAIN ANALYZE (#4325)
It seems that we forgot to pass the relevant
flag to enable Postgres' parallel query
capabilities on the shards when the user runs
EXPLAIN ANALYZE on a distributed table.
(cherry picked from commit b0ddbbd33a)
2020-12-01 11:07:12 +03:00
Onder Kalaci 74f0dd0c25 Do not execute subplans multiple times with cursors
Before this commit, we let AdaptiveExecutorPreExecutorRun()
take effect multiple times, once per FETCH on a cursor.
That does not affect the correctness of the query results,
but it adds significant overhead.

(cherry picked from commit c433c66f2b)
2020-12-01 11:07:12 +03:00
Onder Kalaci e777daad22 Do not cache all the distributed table metadata during CitusTableTypeIdList()
The CitusTableTypeIdList() function iterates over all the entries of pg_dist_partition
and loads all of their metadata into the cache. This can be quite memory intensive,
especially when there are many distributed tables.

When partitioned tables are used, it is common to have many distributed tables,
given that each partition also becomes a distributed table.

CitusTableTypeIdList() is used on every CREATE TABLE .. PARTITION OF.. command
as well. This means that any time a partition is created, Citus loads all the
metadata into the cache, whereas Citus typically loads only the accessed table's
metadata.

(cherry picked from commit 7accbff3f6)

 Conflicts:
	src/test/regress/bin/normalize.sed
2020-12-01 11:07:06 +03:00
Onur Tirtir 4e373fadd8 Update CHANGELOG for 9.5.0
(cherry picked from commit 52a5ab0751)
2020-11-11 16:02:17 +03:00
Hanefi Önaldı 35703d5e61
Bump Citus to 9.5.0 2020-11-09 13:16:05 +03:00
99 changed files with 5748 additions and 730 deletions


@ -5,47 +5,32 @@ orbs:
jobs:
build-11:
build:
description: Build the citus extension
parameters:
pg_major:
description: postgres major version building citus for
type: integer
image:
description: docker image to use for the build
type: string
default: citus/extbuilder
image_tag:
description: tag to use for the docker image
type: string
docker:
- image: 'citus/extbuilder:11.9'
- image: '<< parameters.image >>:<< parameters.image_tag >>'
steps:
- checkout
- run:
name: 'Configure, Build, and Install'
command: build-ext
command: |
./ci/build-citus.sh
- persist_to_workspace:
root: .
paths:
- build-11/*
- install-11.tar
build-12:
docker:
- image: 'citus/extbuilder:12.4'
steps:
- checkout
- run:
name: 'Configure, Build, and Install'
command: build-ext
- persist_to_workspace:
root: .
paths:
- build-12/*
- install-12.tar
build-13:
docker:
- image: 'citus/extbuilder:13.0'
steps:
- checkout
- run:
name: 'Configure, Build, and Install'
command: build-ext
- persist_to_workspace:
root: .
paths:
- build-13/*
- install-13.tar
- build-<< parameters.pg_major >>/*
- install-<<parameters.pg_major >>.tar
check-style:
docker:
@ -91,6 +76,7 @@ jobs:
- run:
name: 'Check if all CI scripts are actually run'
command: ci/check_all_ci_scripts_are_run.sh
check-sql-snapshots:
docker:
- image: 'citus/extbuilder:latest'
@ -99,392 +85,230 @@ jobs:
- run:
name: 'Check Snapshots'
command: ci/check_sql_snapshots.sh
test-11_check-multi:
test-pg-upgrade:
description: Runs postgres upgrade tests
parameters:
old_pg_major:
description: 'postgres major version to use before the upgrade'
type: integer
new_pg_major:
description: 'postgres major version to upgrade to'
type: integer
image:
description: 'docker image to use as for the tests'
type: string
default: citus/pgupgradetester
image_tag:
description: 'docker image tag to use'
type: string
default: latest
docker:
- image: 'citus/exttester:11.9'
- image: '<< parameters.image >>:<< parameters.image_tag >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-multi)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-multi'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_11,multi'
test-11_check-vanilla:
docker:
- image: 'citus/exttester:11.9'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-<< parameters.old_pg_major >>.tar" --directory /
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-<< parameters.new_pg_major >>.tar" --directory /
- run:
name: 'Install and Test (check-vanilla)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-vanilla'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_11,vanilla'
test-11_check-mx:
docker:
- image: 'citus/exttester:11.9'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-mx)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-multi-mx'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_11,mx'
test-11_check-worker:
docker:
- image: 'citus/exttester:11.9'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-worker)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-worker'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_11,worker'
test-11_check-isolation:
docker:
- image: 'citus/exttester:11.9'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-isolation)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-isolation'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_11,isolation'
test-11_check-follower-cluster:
docker:
- image: 'citus/exttester:11.9'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: 'ulimit -c unlimited'
command: |
ulimit -c unlimited
- run:
name: 'Install and Test (follower-cluster)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-follower-cluster'
name: 'Install and test postgres upgrade'
command: |
gosu circleci \
make -C src/test/regress \
check-pg-upgrade \
old-bindir=/usr/lib/postgresql/<< parameters.old_pg_major >>/bin \
new-bindir=/usr/lib/postgresql/<< parameters.new_pg_major >>/bin
no_output_timeout: 2m
- run:
name: 'Regressions'
command: |
if [ -f "src/test/regress/regression.diffs" ]; then
cat src/test/regress/regression.diffs
exit 1
fi
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
cp core.* /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save regressions'
path: src/test/regress/regression.diffs
when: on_fail
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
when: on_fail
- codecov/upload:
flags: 'test_11,follower-cluster'
- store_artifacts:
path: '/tmp/core_dumps'
test-11_check-failure:
flags: 'test_<< parameters.old_pg_major >>_<< parameters.new_pg_major >>,upgrade'
test-citus-upgrade:
description: Runs citus upgrade tests
parameters:
pg_major:
description: "postgres major version"
type: integer
image:
description: 'docker image to use as for the tests'
type: string
default: citus/citusupgradetester
image_tag:
description: 'docker image tag to use'
type: string
docker:
- image: 'citus/failtester:11.9'
- image: '<< parameters.image >>:<< parameters.image_tag >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-failure)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-failure'
no_output_timeout: 2m
test-11-12_check-pg-upgrade:
docker:
- image: 'citus/pgupgradetester:latest'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and test postgres upgrade'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext --target check-pg-upgrade --old-pg-version 11 --new-pg-version 12'
no_output_timeout: 2m
test-12-13_check-pg-upgrade:
docker:
- image: 'citus/pgupgradetester:latest'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and test postgres upgrade'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext --target check-pg-upgrade --old-pg-version 12 --new-pg-version 13'
no_output_timeout: 2m
test-12_check-multi:
docker:
- image: 'citus/exttester:12.4'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-multi)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-multi'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_12,multi'
test-12_check-vanilla:
docker:
- image: 'citus/exttester:12.4'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-vanilla)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-vanilla'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_12,vanilla'
test-12_check-mx:
docker:
- image: 'citus/exttester:12.4'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-mx)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-multi-mx'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_12,mx'
test-12_check-isolation:
docker:
- image: 'citus/exttester:12.4'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-isolation)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-isolation'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_12,isolation'
test-12_check-worker:
docker:
- image: 'citus/exttester:12.4'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-worker)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-worker'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_12,worker'
test-12_check-follower-cluster:
docker:
- image: 'citus/exttester:12.4'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: 'ulimit -c unlimited'
- run:
name: 'Install and Test (follower-cluster)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-follower-cluster'
no_output_timeout: 2m
- run:
command: |
mkdir -p /tmp/core_dumps
cp core.* /tmp/core_dumps
when: on_fail
- codecov/upload:
flags: 'test_12,follower-cluster'
- store_artifacts:
path: '/tmp/core_dumps'
test-12_check-failure:
docker:
- image: 'citus/failtester:12.4'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-failure)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-failure'
no_output_timeout: 2m
test-11_check-citus-upgrade:
docker:
- image: 'citus/citusupgradetester:11.9'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
ulimit -c unlimited
- run:
name: 'Install and test citus upgrade'
command: |
chown -R circleci:circleci /home/circleci
install-and-test-ext --target check-citus-upgrade --citus-pre-tar /install-pg11-citusv8.0.0.tar
install-and-test-ext --target check-citus-upgrade --citus-pre-tar /install-pg11-citusv8.1.0.tar
install-and-test-ext --target check-citus-upgrade --citus-pre-tar /install-pg11-citusv8.2.0.tar
install-and-test-ext --target check-citus-upgrade --citus-pre-tar /install-pg11-citusv8.3.0.tar
# run make check-citus-upgrade for all citus versions
# the image has ${CITUS_VERSIONS} set with all verions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-pre-tar=/install-pg11-citus${citus_version}.tar \
citus-post-tar=/home/circleci/project/install-$PG_MAJOR.tar; \
done;
install-and-test-ext --target check-citus-upgrade-mixed --citus-pre-tar /install-pg11-citusv8.0.0.tar
install-and-test-ext --target check-citus-upgrade-mixed --citus-pre-tar /install-pg11-citusv8.1.0.tar
install-and-test-ext --target check-citus-upgrade-mixed --citus-pre-tar /install-pg11-citusv8.2.0.tar
install-and-test-ext --target check-citus-upgrade-mixed --citus-pre-tar /install-pg11-citusv8.3.0.tar
no_output_timeout: 2m
test-13_check-multi:
docker:
- image: 'citus/exttester:13.0'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-multi)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-multi'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_13,multi'
test-13_check-mx:
docker:
- image: 'citus/exttester:13.0'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-mx)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-multi-mx'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_13,mx'
test-13_check-vanilla:
docker:
- image: 'citus/exttester:13.0'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-vanilla)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-vanilla'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_13,vanilla'
test-13_check-worker:
docker:
- image: 'citus/exttester:13.0'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-worker)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-worker'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_13,worker'
test-13_check-isolation:
docker:
- image: 'citus/exttester:13.0'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-isolation)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-isolation'
no_output_timeout: 2m
- codecov/upload:
flags: 'test_13,isolation'
test-13_check-follower-cluster:
docker:
- image: 'citus/exttester:13.0'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Enable core dumps'
command: 'ulimit -c unlimited'
- run:
name: 'Install and Test (follower-cluster)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-follower-cluster'
# run make check-citus-upgrade-mixed for all citus versions
# the image has ${CITUS_VERSIONS} set with all verions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade-mixed \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-pre-tar=/install-pg11-citus${citus_version}.tar \
citus-post-tar=/home/circleci/project/install-$PG_MAJOR.tar; \
done;
no_output_timeout: 2m
- run:
name: 'Regressions'
command: |
if [ -f "src/test/regress/regression.diffs" ]; then
cat src/test/regress/regression.diffs
exit 1
fi
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
cp core.* /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save regressions'
path: src/test/regress/regression.diffs
when: on_fail
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
when: on_fail
- codecov/upload:
flags: 'test_13,follower-cluster'
- store_artifacts:
path: '/tmp/core_dumps'
flags: 'test_<< parameters.pg_major >>,upgrade'
test-13_check-failure:
test-citus:
description: Runs the common tests of citus
parameters:
pg_major:
description: "postgres major version"
type: integer
image:
description: 'docker image to use as for the tests'
type: string
default: citus/exttester
image_tag:
description: 'docker image tag to use'
type: string
make:
description: "make target"
type: string
docker:
- image: 'citus/failtester:13.0'
- image: '<< parameters.image >>:<< parameters.image_tag >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install and Test (check-failure)'
command: 'chown -R circleci:circleci /home/circleci && install-and-test-ext check-failure'
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-${PG_MAJOR}.tar" --directory /
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Run Test'
command: |
gosu circleci make -C src/test/regress << parameters.make >>
no_output_timeout: 2m
- run:
name: 'Regressions'
command: |
if [ -f "src/test/regress/regression.diffs" ]; then
cat src/test/regress/regression.diffs
exit 1
fi
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save regressions'
path: src/test/regress/regression.diffs
when: on_fail
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
when: on_fail
- codecov/upload:
flags: 'test_<< parameters.pg_major >>,<< parameters.make >>'
when: always
check-merge-to-enterprise:
docker:
@ -495,6 +319,7 @@ jobs:
- run:
command: |
ci/check_enterprise_merge.sh
ch_benchmark:
docker:
- image: buildpack-deps:stretch
@ -509,6 +334,7 @@ jobs:
sh run_hammerdb.sh citusbot_ch_benchmark_rg
name: install dependencies and run ch_benchmark tests
no_output_timeout: 20m
tpcc_benchmark:
docker:
- image: buildpack-deps:stretch
@ -524,7 +350,6 @@ jobs:
name: install dependencies and run ch_benchmark tests
no_output_timeout: 20m
workflows:
version: 2
build_and_test:
@ -536,70 +361,173 @@ workflows:
ignore:
- /release-[0-9]+\.[0-9]+.*/ # match with releaseX.Y.*
- build-11
- build-12
- build-13
- build:
name: build-11
pg_major: 11
image_tag: '11.9'
- build:
name: build-12
pg_major: 12
image_tag: '12.4'
- build:
name: build-13
pg_major: 13
image_tag: '13.0'
- check-style
- check-sql-snapshots
- test-11_check-multi:
- test-citus:
name: 'test-11_check-multi'
pg_major: 11
image_tag: '11.9'
make: check-multi
requires: [build-11]
- test-11_check-vanilla:
- test-citus:
name: 'test-11_check-mx'
pg_major: 11
image_tag: '11.9'
make: check-multi-mx
requires: [build-11]
- test-11_check-isolation:
- test-citus:
name: 'test-11_check-vanilla'
pg_major: 11
image_tag: '11.9'
make: check-vanilla
requires: [build-11]
- test-11_check-mx:
- test-citus:
name: 'test-11_check-isolation'
pg_major: 11
image_tag: '11.9'
make: check-isolation
requires: [build-11]
- test-11_check-worker:
- test-citus:
name: 'test-11_check-worker'
pg_major: 11
image_tag: '11.9'
make: check-worker
requires: [build-11]
- test-11_check-follower-cluster:
- test-citus:
name: 'test-11_check-follower-cluster'
pg_major: 11
image_tag: '11.9'
make: check-follower-cluster
requires: [build-11]
- test-11_check-failure:
- test-citus:
name: 'test-11_check-failure'
pg_major: 11
image: citus/failtester
image_tag: '11.9'
make: check-failure
requires: [build-11]
- test-12_check-multi:
- test-citus:
name: 'test-12_check-multi'
pg_major: 12
image_tag: '12.4'
make: check-multi
requires: [build-12]
- test-12_check-vanilla:
- test-citus:
name: 'test-12_check-mx'
pg_major: 12
image_tag: '12.4'
make: check-multi-mx
requires: [build-12]
- test-12_check-isolation:
- test-citus:
name: 'test-12_check-vanilla'
pg_major: 12
image_tag: '12.4'
make: check-vanilla
requires: [build-12]
- test-12_check-mx:
- test-citus:
name: 'test-12_check-isolation'
pg_major: 12
image_tag: '12.4'
make: check-isolation
requires: [build-12]
- test-12_check-worker:
- test-citus:
name: 'test-12_check-worker'
pg_major: 12
image_tag: '12.4'
make: check-worker
requires: [build-12]
- test-12_check-follower-cluster:
- test-citus:
name: 'test-12_check-follower-cluster'
pg_major: 12
image_tag: '12.4'
make: check-follower-cluster
requires: [build-12]
- test-12_check-failure:
- test-citus:
name: 'test-12_check-failure'
pg_major: 12
image: citus/failtester
image_tag: '12.4'
make: check-failure
requires: [build-12]
- test-13_check-multi:
- test-citus:
name: 'test-13_check-multi'
pg_major: 13
image_tag: '13.0'
make: check-multi
requires: [build-13]
- test-13_check-vanilla:
- test-citus:
name: 'test-13_check-mx'
pg_major: 13
image_tag: '13.0'
make: check-multi-mx
requires: [build-13]
- test-13_check-isolation:
- test-citus:
name: 'test-13_check-vanilla'
pg_major: 13
image_tag: '13.0'
make: check-vanilla
requires: [build-13]
- test-13_check-mx:
- test-citus:
name: 'test-13_check-isolation'
pg_major: 13
image_tag: '13.0'
make: check-isolation
requires: [build-13]
- test-13_check-worker:
- test-citus:
name: 'test-13_check-worker'
pg_major: 13
image_tag: '13.0'
make: check-worker
requires: [build-13]
- test-13_check-follower-cluster:
requires: [build-13]
- test-13_check-failure:
- test-citus:
name: 'test-13_check-follower-cluster'
pg_major: 13
image_tag: '13.0'
make: check-follower-cluster
requires: [build-13]
- test-11-12_check-pg-upgrade:
requires:
- build-11
- build-12
- test-citus:
name: 'test-13_check-failure'
pg_major: 13
image: citus/failtester
image_tag: '13.0'
make: check-failure
requires: [build-13]
- test-12-13_check-pg-upgrade:
requires:
- build-12
- build-13
- test-pg-upgrade:
name: 'test-11-12_check-pg-upgrade'
old_pg_major: 11
new_pg_major: 12
image_tag: latest
requires: [build-11,build-12]
- test-11_check-citus-upgrade:
- test-pg-upgrade:
name: 'test-12-13_check-pg-upgrade'
old_pg_major: 12
new_pg_major: 13
image_tag: latest
requires: [build-12,build-13]
- test-citus-upgrade:
name: test-11_check-citus-upgrade
pg_major: 11
image_tag: '11.9'
requires: [build-11]
- ch_benchmark:


@ -1,3 +1,151 @@
### citus v9.5.7 (August 16, 2021) ###
* Allows more graceful failovers when replication factor > 1
* Fixes a bug that causes partitions to have wrong distribution key after
`DROP COLUMN`
* Improves master_update_table_statistics and provides distributed deadlock
detection
### citus v9.5.6 (July 8, 2021) ###
* Fixes minor bug in `citus_prepare_pg_upgrade` that caused it to lose its
idempotency
### citus v9.5.5 (July 7, 2021) ###
* Adds a configure flag to enforce security
* Fixes a bug that causes pruning incorrect shard of a range distributed table
* Fixes an issue that could cause citus_finish_pg_upgrade to fail
### citus v9.5.4 (February 19, 2021) ###
* Fixes a compatibility issue with pg_audit in utility calls
### citus v9.5.3 (February 16, 2021) ###
* Avoids re-using connections for intermediate results
* Fixes a bug that might cause self-deadlocks when `COPY` used in xact block
* Fixes a crash that occurs when distributing table after dropping foreign key
### citus v9.5.2 (January 26, 2021) ###
* Fixes distributed deadlock detection being blocked by metadata sync
* Prevents segfaults when SAVEPOINT handling cannot recover from connection
failures
* Fixes possible issues that might occur with single shard distributed tables
### citus v9.5.1 (December 1, 2020) ###
* Enables PostgreSQL's parallel queries on EXPLAIN ANALYZE
* Fixes a bug that could cause excessive memory consumption when a partition is
created
* Fixes a bug that triggers subplan executions unnecessarily with cursors
* Fixes a segfault in connection management due to invalid connection hash
entries
### citus v9.5.0 (November 10, 2020) ###
* Adds support for PostgreSQL 13
* Removes the task-tracker executor
* Introduces citus local tables
* Introduces `undistribute_table` UDF to convert tables back to postgres tables
* Adds support for `EXPLAIN (ANALYZE) EXECUTE` and `EXPLAIN EXECUTE`
* Adds support for `EXPLAIN (ANALYZE, WAL)` for PG13
* Sorts the output of `EXPLAIN (ANALYZE)` by execution duration.
* Adds support for CREATE TABLE ... USING table_access_method
* Adds support for `WITH TIES` option in SELECT and INSERT SELECT queries
* Avoids taking multi-shard locks on workers
* Enforces `citus.max_shared_pool_size` config in COPY queries
* Enables custom aggregates with multiple parameters to be executed on workers
* Enforces `citus.max_intermediate_result_size` in local execution
* Improves cost estimation of INSERT SELECT plans
* Introduces delegation of procedures that read from reference tables
* Prevents pull-push execution for simple pushdownable subqueries
* Improves error message when creating a foreign key to a local table
* Makes `citus_prepare_pg_upgrade` idempotent by dropping transition tables
* Disallows `ON TRUE` outer joins with reference & distributed tables when
reference table is outer relation to avoid incorrect results
* Disallows field indirection in INSERT/UPDATE queries to avoid incorrect
results
* Disallows volatile functions in UPDATE subqueries to avoid incorrect results
* Fixes CREATE INDEX CONCURRENTLY crash with local execution
* Fixes `citus_finish_pg_upgrade` to drop all backup tables
* Fixes a bug that cause failures when `RECURSIVE VIEW` joined reference table
* Fixes DROP SEQUENCE failures when metadata syncing is enabled
* Fixes a bug that caused CREATE TABLE with CHECK constraint to fail
* Fixes a bug that could cause VACUUM to deadlock
* Fixes master_update_node failure when no background worker slots are available
* Fixes a bug that caused replica identity to not be propagated on shard repair
* Fixes a bug that could cause crashes after connection timeouts
* Fixes a bug that could cause crashes with certain compile flags
* Fixes a bug that could cause deadlocks on CREATE INDEX
* Fixes a bug with genetic query optimization in outer joins
* Fixes a crash when aggregating empty tables
* Fixes a crash with inserting domain constrained composite types
* Fixes a crash with multi-row & router INSERT's in local execution
* Fixes a possibility of doing temporary file cleanup more than once
* Fixes incorrect setting of join related fields
* Fixes memory issues around deparsing index commands
* Fixes reference table access tracking for sequential execution
* Fixes removal of a single node with only reference tables
* Fixes sending commands to coordinator when it is added as a worker
* Fixes write queries with const expressions and COLLATE in various places
* Fixes wrong cancellation message about distributed deadlock
### citus v9.4.2 (October 21, 2020) ###
* Fixes a bug that could lead to multiple maintenance daemons


@ -86,6 +86,7 @@ endif
# Add options passed to configure or computed therein, to CFLAGS/CPPFLAGS/...
override CFLAGS += @CFLAGS@ @CITUS_CFLAGS@
override BITCODE_CFLAGS := $(BITCODE_CFLAGS) @CITUS_BITCODE_CFLAGS@
ifneq ($(GIT_VERSION),)
override CFLAGS += -DGIT_VERSION=\"$(GIT_VERSION)\"
endif


@ -46,6 +46,16 @@ following:
requires also adding a comment before explaining why this specific use of the
function is safe.
## `build-citus.sh`
This is the script used during the build phase of the extension. Historically this script
was embedded in the docker images. This made maintenance a hassle. Now it lives in tree
with the rest of the source code.
When this script fails you most likely have a build error on the postgres version it was
building at the time of the failure. Fix the compile error and push a new version of your
code to fix.
## `check_enterprise_merge.sh`
This check exists to make sure that we can always merge the `master` branch of

ci/build-citus.sh (new executable file, 47 lines)

@ -0,0 +1,47 @@
#!/bin/bash
# make bash behave
set -euo pipefail
IFS=$'\n\t'
# shellcheck disable=SC1091
source ci/ci_helpers.sh
# read pg major version, error if not provided
PG_MAJOR=${PG_MAJOR:?please provide the postgres major version}
# get codename from release file
. /etc/os-release
codename=${VERSION#*(}
codename=${codename%)*}
# get project from argument
project="${CIRCLE_PROJECT_REPONAME}"
# we'll do everything with absolute paths
basedir="$(pwd)"
# get the project and clear out the git repo (reduce workspace size
rm -rf "${basedir}/.git"
build_ext() {
pg_major="$1"
builddir="${basedir}/build-${pg_major}"
echo "Beginning build of ${project} for PostgreSQL ${pg_major}..." >&2
# do everything in a subdirectory to avoid clutter in current directory
mkdir -p "${builddir}" && cd "${builddir}"
CFLAGS=-Werror "${basedir}/configure" PG_CONFIG="/usr/lib/postgresql/${pg_major}/bin/pg_config" --enable-coverage
installdir="${builddir}/install"
make -j$(nproc) && mkdir -p "${installdir}" && { make DESTDIR="${installdir}" install-all || make DESTDIR="${installdir}" install ; }
cd "${installdir}" && find . -type f -print > "${builddir}/files.lst"
tar cvf "${basedir}/install-${pg_major}.tar" `cat ${builddir}/files.lst`
cd "${builddir}" && rm -rf install files.lst && make clean
}
build_ext "${PG_MAJOR}"

configure (vendored, 128 changed lines)

@ -1,6 +1,6 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for Citus 9.5devel.
# Generated by GNU Autoconf 2.69 for Citus 9.5.7.
#
#
# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
@ -579,8 +579,8 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='Citus'
PACKAGE_TARNAME='citus'
PACKAGE_VERSION='9.5devel'
PACKAGE_STRING='Citus 9.5devel'
PACKAGE_VERSION='9.5.7'
PACKAGE_STRING='Citus 9.5.7'
PACKAGE_BUGREPORT=''
PACKAGE_URL=''
@ -627,8 +627,10 @@ POSTGRES_BUILDDIR
POSTGRES_SRCDIR
CITUS_LDFLAGS
CITUS_CPPFLAGS
CITUS_BITCODE_CFLAGS
CITUS_CFLAGS
GIT_BIN
with_security_flags
EGREP
GREP
CPP
@ -664,6 +666,7 @@ infodir
docdir
oldincludedir
includedir
runstatedir
localstatedir
sharedstatedir
sysconfdir
@ -690,6 +693,7 @@ with_extra_version
enable_coverage
with_libcurl
with_reports_hostname
with_security_flags
'
ac_precious_vars='build_alias
host_alias
@ -740,6 +744,7 @@ datadir='${datarootdir}'
sysconfdir='${prefix}/etc'
sharedstatedir='${prefix}/com'
localstatedir='${prefix}/var'
runstatedir='${localstatedir}/run'
includedir='${prefix}/include'
oldincludedir='/usr/include'
docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
@ -992,6 +997,15 @@ do
| -silent | --silent | --silen | --sile | --sil)
silent=yes ;;
-runstatedir | --runstatedir | --runstatedi | --runstated \
| --runstate | --runstat | --runsta | --runst | --runs \
| --run | --ru | --r)
ac_prev=runstatedir ;;
-runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
| --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
| --run=* | --ru=* | --r=*)
runstatedir=$ac_optarg ;;
-sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
ac_prev=sbindir ;;
-sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@ -1129,7 +1143,7 @@ fi
for ac_var in exec_prefix prefix bindir sbindir libexecdir datarootdir \
datadir sysconfdir sharedstatedir localstatedir includedir \
oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
libdir localedir mandir
libdir localedir mandir runstatedir
do
eval ac_val=\$$ac_var
# Remove trailing slashes.
@ -1242,7 +1256,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures Citus 9.5devel to adapt to many kinds of systems.
\`configure' configures Citus 9.5.7 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@ -1282,6 +1296,7 @@ Fine tuning of the installation directories:
--sysconfdir=DIR read-only single-machine data [PREFIX/etc]
--sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com]
--localstatedir=DIR modifiable single-machine data [PREFIX/var]
--runstatedir=DIR modifiable per-process data [LOCALSTATEDIR/run]
--libdir=DIR object code libraries [EPREFIX/lib]
--includedir=DIR C header files [PREFIX/include]
--oldincludedir=DIR C header files for non-gcc [/usr/include]
@ -1303,7 +1318,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of Citus 9.5devel:";;
short | recursive ) echo "Configuration of Citus 9.5.7:";;
esac
cat <<\_ACEOF
@ -1323,6 +1338,7 @@ Optional Packages:
--with-reports-hostname=HOSTNAME
Use HOSTNAME as hostname for statistics collection
and update checks
--with-security-flags use security flags
Some influential environment variables:
PG_CONFIG Location to find pg_config for target PostgreSQL instalation
@ -1403,7 +1419,7 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
Citus configure 9.5devel
Citus configure 9.5.7
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
@ -1886,7 +1902,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by Citus $as_me 9.5devel, which was
It was created by Citus $as_me 9.5.7, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
@ -4327,6 +4343,48 @@ if test x"$citusac_cv_prog_cc_cflags__Werror_return_type" = x"yes"; then
CITUS_CFLAGS="$CITUS_CFLAGS -Werror=return-type"
fi
# Security flags
# Flags taken from: https://liquid.microsoft.com/Web/Object/Read/ms.security/Requirements/Microsoft.Security.SystemsADM.10203#guide
# We do not enforce the following flag because it is only available on GCC>=8
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports -fstack-clash-protection" >&5
$as_echo_n "checking whether $CC supports -fstack-clash-protection... " >&6; }
if ${citusac_cv_prog_cc_cflags__fstack_clash_protection+:} false; then :
$as_echo_n "(cached) " >&6
else
citusac_save_CFLAGS=$CFLAGS
flag=-fstack-clash-protection
case $flag in -Wno*)
flag=-W$(echo $flag | cut -c 6-)
esac
CFLAGS="$citusac_save_CFLAGS $flag"
ac_save_c_werror_flag=$ac_c_werror_flag
ac_c_werror_flag=yes
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
/* end confdefs.h. */
int
main ()
{
;
return 0;
}
_ACEOF
if ac_fn_c_try_compile "$LINENO"; then :
citusac_cv_prog_cc_cflags__fstack_clash_protection=yes
else
citusac_cv_prog_cc_cflags__fstack_clash_protection=no
fi
rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
ac_c_werror_flag=$ac_save_c_werror_flag
CFLAGS="$citusac_save_CFLAGS"
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $citusac_cv_prog_cc_cflags__fstack_clash_protection" >&5
$as_echo "$citusac_cv_prog_cc_cflags__fstack_clash_protection" >&6; }
if test x"$citusac_cv_prog_cc_cflags__fstack_clash_protection" = x"yes"; then
CITUS_CFLAGS="$CITUS_CFLAGS -fstack-clash-protection"
fi
#
# --enable-coverage enables generation of code coverage metrics with gcov
@ -4468,6 +4526,54 @@ cat >>confdefs.h <<_ACEOF
_ACEOF
# Check whether --with-security-flags was given.
if test "${with_security_flags+set}" = set; then :
withval=$with_security_flags;
case $withval in
yes)
:
;;
no)
:
;;
*)
as_fn_error $? "no argument expected for --with-security-flags option" "$LINENO" 5
;;
esac
else
with_security_flags=no
fi
if test "$with_security_flags" = yes; then
# Flags taken from: https://liquid.microsoft.com/Web/Object/Read/ms.security/Requirements/Microsoft.Security.SystemsADM.10203#guide
# We always want to have some compiler flags for security concerns.
SECURITY_CFLAGS="-fstack-protector-strong -D_FORTIFY_SOURCE=2 -O2 -z noexecstack -fpic -shared -Wl,-z,relro -Wl,-z,now -Wformat -Wformat-security -Werror=format-security"
CITUS_CFLAGS="$CITUS_CFLAGS $SECURITY_CFLAGS"
{ $as_echo "$as_me:${as_lineno-$LINENO}: Blindly added security flags for linker: $SECURITY_CFLAGS" >&5
$as_echo "$as_me: Blindly added security flags for linker: $SECURITY_CFLAGS" >&6;}
# We always want to have some clang flags for security concerns.
# This doesn't include "-Wl,-z,relro -Wl,-z,now" on purpuse, because bitcode is not linked.
# This doesn't include -fsanitize=cfi because it breaks builds on many distros including
# Debian/Buster, Debian/Stretch, Ubuntu/Bionic, Ubuntu/Xenial and EL7.
SECURITY_BITCODE_CFLAGS="-fsanitize=safe-stack -fstack-protector-strong -flto -fPIC -Wformat -Wformat-security -Werror=format-security"
CITUS_BITCODE_CFLAGS="$CITUS_BITCODE_CFLAGS $SECURITY_BITCODE_CFLAGS"
{ $as_echo "$as_me:${as_lineno-$LINENO}: Blindly added security flags for llvm: $SECURITY_BITCODE_CFLAGS" >&5
$as_echo "$as_me: Blindly added security flags for llvm: $SECURITY_BITCODE_CFLAGS" >&6;}
{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: If you run into issues during linking or bitcode compilation, you can use --without-security-flags." >&5
$as_echo "$as_me: WARNING: If you run into issues during linking or bitcode compilation, you can use --without-security-flags." >&2;}
fi
# Check if git is installed, when installed the gitref of the checkout will be baked in the application
# Extract the first word of "git", so it can be a program name with args.
set dummy git; ac_word=$2
@ -4533,6 +4639,8 @@ fi
CITUS_CFLAGS="$CITUS_CFLAGS"
CITUS_BITCODE_CFLAGS="$CITUS_BITCODE_CFLAGS"
CITUS_CPPFLAGS="$CITUS_CPPFLAGS"
CITUS_LDFLAGS="$LIBS $CITUS_LDFLAGS"
@ -5055,7 +5163,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by Citus $as_me 9.5devel, which was
This file was extended by Citus $as_me 9.5.7, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@ -5117,7 +5225,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
Citus config.status 9.5devel
Citus config.status 9.5.7
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"


@ -5,7 +5,7 @@
# everyone needing autoconf installed, the resulting files are checked
# into the SCM.
AC_INIT([Citus], [9.5devel])
AC_INIT([Citus], [9.5.7])
AC_COPYRIGHT([Copyright (c) Citus Data, Inc.])
# we'll need sed and awk for some of the version commands
@ -174,6 +174,10 @@ CITUSAC_PROG_CC_CFLAGS_OPT([-Werror=vla]) # visual studio does not support thes
CITUSAC_PROG_CC_CFLAGS_OPT([-Werror=implicit-int])
CITUSAC_PROG_CC_CFLAGS_OPT([-Werror=implicit-function-declaration])
CITUSAC_PROG_CC_CFLAGS_OPT([-Werror=return-type])
# Security flags
# Flags taken from: https://liquid.microsoft.com/Web/Object/Read/ms.security/Requirements/Microsoft.Security.SystemsADM.10203#guide
# We do not enforce the following flag because it is only available on GCC>=8
CITUSAC_PROG_CC_CFLAGS_OPT([-fstack-clash-protection])
#
# --enable-coverage enables generation of code coverage metrics with gcov
@ -212,11 +216,35 @@ PGAC_ARG_REQ(with, reports-hostname, [HOSTNAME],
AC_DEFINE_UNQUOTED(REPORTS_BASE_URL, "$REPORTS_BASE_URL",
[Base URL for statistics collection and update checks])
PGAC_ARG_BOOL(with, security-flags, no,
[use security flags])
AC_SUBST(with_security_flags)
if test "$with_security_flags" = yes; then
# Flags taken from: https://liquid.microsoft.com/Web/Object/Read/ms.security/Requirements/Microsoft.Security.SystemsADM.10203#guide
# We always want to have some compiler flags for security concerns.
SECURITY_CFLAGS="-fstack-protector-strong -D_FORTIFY_SOURCE=2 -O2 -z noexecstack -fpic -shared -Wl,-z,relro -Wl,-z,now -Wformat -Wformat-security -Werror=format-security"
CITUS_CFLAGS="$CITUS_CFLAGS $SECURITY_CFLAGS"
AC_MSG_NOTICE([Blindly added security flags for linker: $SECURITY_CFLAGS])
# We always want to have some clang flags for security concerns.
# This doesn't include "-Wl,-z,relro -Wl,-z,now" on purpuse, because bitcode is not linked.
# This doesn't include -fsanitize=cfi because it breaks builds on many distros including
# Debian/Buster, Debian/Stretch, Ubuntu/Bionic, Ubuntu/Xenial and EL7.
SECURITY_BITCODE_CFLAGS="-fsanitize=safe-stack -fstack-protector-strong -flto -fPIC -Wformat -Wformat-security -Werror=format-security"
CITUS_BITCODE_CFLAGS="$CITUS_BITCODE_CFLAGS $SECURITY_BITCODE_CFLAGS"
AC_MSG_NOTICE([Blindly added security flags for llvm: $SECURITY_BITCODE_CFLAGS])
AC_MSG_WARN([If you run into issues during linking or bitcode compilation, you can use --without-security-flags.])
fi
# Check if git is installed, when installed the gitref of the checkout will be baked in the application
AC_PATH_PROG(GIT_BIN, git)
AC_CHECK_FILE(.git,[HAS_DOTGIT=yes], [HAS_DOTGIT=])
AC_SUBST(CITUS_CFLAGS, "$CITUS_CFLAGS")
AC_SUBST(CITUS_BITCODE_CFLAGS, "$CITUS_BITCODE_CFLAGS")
AC_SUBST(CITUS_CPPFLAGS, "$CITUS_CPPFLAGS")
AC_SUBST(CITUS_LDFLAGS, "$LIBS $CITUS_LDFLAGS")
AC_SUBST(POSTGRES_SRCDIR, "$POSTGRES_SRCDIR")


@ -1,6 +1,6 @@
# Citus extension
comment = 'Citus distributed database'
default_version = '9.5-1'
default_version = '9.5-3'
module_pathname = '$libdir/citus'
relocatable = false
schema = pg_catalog


@ -569,7 +569,7 @@ ExecuteAndLogDDLCommand(const char *commandString)
ereport(DEBUG4, (errmsg("executing \"%s\"", commandString)));
Node *parseTree = ParseTreeNode(commandString);
CitusProcessUtility(parseTree, commandString, PROCESS_UTILITY_TOPLEVEL,
CitusProcessUtility(parseTree, commandString, PROCESS_UTILITY_QUERY,
NULL, None_Receiver, NULL);
}


@ -368,6 +368,24 @@ CreateDistributedTable(Oid relationId, Var *distributionColumn, char distributio
char replicationModel = DecideReplicationModel(distributionMethod,
viaDeprecatedAPI);
/*
* Due to dropping columns, the parent's distribution key may not match the
* partition's distribution key. The input distributionColumn belongs to
* the parent. That's why we override the distribution column of partitions
* here. See issue #5123 for details.
*/
if (PartitionTable(relationId))
{
Oid parentRelationId = PartitionParentOid(relationId);
char *distributionColumnName =
ColumnToColumnName(parentRelationId, nodeToString(distributionColumn));
distributionColumn =
FindColumnWithNameOnTargetRelation(parentRelationId, distributionColumnName,
relationId);
}
/*
* ColocationIdForNewTable assumes caller acquires lock on relationId. In our case,
* our caller already acquired lock on relationId.
@ -1658,7 +1676,7 @@ UndistributeTable(Oid relationId)
Node *parseTree = ParseTreeNode(tableCreationCommand);
RelayEventExtendNames(parseTree, schemaName, hashOfName);
CitusProcessUtility(parseTree, tableCreationCommand, PROCESS_UTILITY_TOPLEVEL,
CitusProcessUtility(parseTree, tableCreationCommand, PROCESS_UTILITY_QUERY,
NULL, None_Receiver, NULL);
}


@ -64,6 +64,8 @@ static void ForeignConstraintFindDistKeys(HeapTuple pgConstraintTuple,
Var *referencedDistColumn,
int *referencingAttrIndex,
int *referencedAttrIndex);
static List * GetForeignKeyIdsForColumn(char *columnName, Oid relationId,
int searchForeignKeyColumnFlags);
static List * GetForeignConstraintCommandsInternal(Oid relationId, int flags);
static Oid get_relation_constraint_oid_compat(HeapTuple heapTuple);
static List * GetForeignKeyOidsToCitusLocalTables(Oid relationId);
@ -483,6 +485,21 @@ ForeignConstraintFindDistKeys(HeapTuple pgConstraintTuple,
}
/*
* ColumnAppearsInForeignKey returns true if there is a foreign key constraint
* from/to given column. False otherwise.
*/
bool
ColumnAppearsInForeignKey(char *columnName, Oid relationId)
{
int searchForeignKeyColumnFlags = SEARCH_REFERENCING_RELATION |
SEARCH_REFERENCED_RELATION;
List *foreignKeysColumnAppeared =
GetForeignKeyIdsForColumn(columnName, relationId, searchForeignKeyColumnFlags);
return list_length(foreignKeysColumnAppeared) > 0;
}
/*
* ColumnAppearsInForeignKeyToReferenceTable checks if there is a foreign key
* constraint from/to any reference table on the given column.
@ -490,9 +507,45 @@ ForeignConstraintFindDistKeys(HeapTuple pgConstraintTuple,
bool
ColumnAppearsInForeignKeyToReferenceTable(char *columnName, Oid relationId)
{
int searchForeignKeyColumnFlags = SEARCH_REFERENCING_RELATION |
SEARCH_REFERENCED_RELATION;
List *foreignKeyIdsColumnAppeared =
GetForeignKeyIdsForColumn(columnName, relationId, searchForeignKeyColumnFlags);
Oid foreignKeyId = InvalidOid;
foreach_oid(foreignKeyId, foreignKeyIdsColumnAppeared)
{
Oid referencedTableId = GetReferencedTableId(foreignKeyId);
if (IsCitusTableType(referencedTableId, REFERENCE_TABLE))
{
return true;
}
}
return false;
}
/*
* GetForeignKeyIdsForColumn takes columnName and relationId for the owning
* relation, and returns a list of OIDs for foreign constraints that the column
* with columnName is involved according to "searchForeignKeyColumnFlags" argument.
* See SearchForeignKeyColumnFlags enum definition for usage.
*/
static List *
GetForeignKeyIdsForColumn(char *columnName, Oid relationId,
int searchForeignKeyColumnFlags)
{
bool searchReferencing = searchForeignKeyColumnFlags & SEARCH_REFERENCING_RELATION;
bool searchReferenced = searchForeignKeyColumnFlags & SEARCH_REFERENCED_RELATION;
/* at least one of them should be true */
Assert(searchReferencing || searchReferenced);
List *foreignKeyIdsColumnAppeared = NIL;
ScanKeyData scanKey[1];
int scanKeyCount = 1;
bool foreignKeyToReferenceTableIncludesGivenColumn = false;
Relation pgConstraint = table_open(ConstraintRelationId, AccessShareLock);
@ -511,11 +564,11 @@ ColumnAppearsInForeignKeyToReferenceTable(char *columnName, Oid relationId)
Oid referencedTableId = constraintForm->confrelid;
Oid referencingTableId = constraintForm->conrelid;
if (referencedTableId == relationId)
if (referencedTableId == relationId && searchReferenced)
{
pgConstraintKey = Anum_pg_constraint_confkey;
}
else if (referencingTableId == relationId)
else if (referencingTableId == relationId && searchReferencing)
{
pgConstraintKey = Anum_pg_constraint_conkey;
}
@ -529,22 +582,12 @@ ColumnAppearsInForeignKeyToReferenceTable(char *columnName, Oid relationId)
continue;
}
/*
* We check if the referenced table is a reference table. There cannot be
* any foreign constraint from a distributed table to a local table.
*/
Assert(IsCitusTable(referencedTableId));
if (!IsCitusTableType(referencedTableId, REFERENCE_TABLE))
{
heapTuple = systable_getnext(scanDescriptor);
continue;
}
if (HeapTupleOfForeignConstraintIncludesColumn(heapTuple, relationId,
pgConstraintKey, columnName))
{
foreignKeyToReferenceTableIncludesGivenColumn = true;
break;
Oid foreignKeyOid = get_relation_constraint_oid_compat(heapTuple);
foreignKeyIdsColumnAppeared = lappend_oid(foreignKeyIdsColumnAppeared,
foreignKeyOid);
}
heapTuple = systable_getnext(scanDescriptor);
@ -554,7 +597,7 @@ ColumnAppearsInForeignKeyToReferenceTable(char *columnName, Oid relationId)
systable_endscan(scanDescriptor);
table_close(pgConstraint, NoLock);
return foreignKeyToReferenceTableIncludesGivenColumn;
return foreignKeyIdsColumnAppeared;
}
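The flags make the pg_constraint scan above reusable for one-sided checks as well; below is a minimal sketch of such a caller, using only names already shown in this file (the helper name itself is hypothetical and not part of this change):

static bool
ColumnAppearsInReferencingForeignKey(char *columnName, Oid relationId)
{
	/* restrict the search to foreign keys defined on relationId itself */
	int searchForeignKeyColumnFlags = SEARCH_REFERENCING_RELATION;
	List *foreignKeyIds =
		GetForeignKeyIdsForColumn(columnName, relationId,
								  searchForeignKeyColumnFlags);

	return list_length(foreignKeyIds) > 0;
}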
@ -773,31 +816,70 @@ TableReferencing(Oid relationId)
/*
* ConstraintIsAForeignKey is a wrapper around GetForeignKeyOidByName that
* returns true if the given constraint name identifies a foreign key
* constraint defined on relation with relationId.
* ConstraintIsAUniquenessConstraint is a wrapper around ConstraintWithNameIsOfType that returns true
* if given constraint name identifies a uniqueness constraint, i.e:
* - primary key constraint, or
* - unique constraint
*/
bool
ConstraintIsAForeignKey(char *inputConstaintName, Oid relationId)
ConstraintIsAUniquenessConstraint(char *inputConstaintName, Oid relationId)
{
Oid foreignKeyId = GetForeignKeyOidByName(inputConstaintName, relationId);
return OidIsValid(foreignKeyId);
bool isUniqueConstraint = ConstraintWithNameIsOfType(inputConstaintName, relationId,
CONSTRAINT_UNIQUE);
bool isPrimaryConstraint = ConstraintWithNameIsOfType(inputConstaintName, relationId,
CONSTRAINT_PRIMARY);
return isUniqueConstraint || isPrimaryConstraint;
}
/*
* GetForeignKeyOidByName returns OID of the foreign key with name and defined
* on relation with relationId. If there is no such foreign key constraint, then
* this function returns InvalidOid.
* ConstraintIsAForeignKey is a wrapper around ConstraintWithNameIsOfType that returns true
* if given constraint name identifies a foreign key constraint.
*/
Oid
GetForeignKeyOidByName(char *inputConstaintName, Oid relationId)
bool
ConstraintIsAForeignKey(char *inputConstaintName, Oid relationId)
{
int flags = INCLUDE_REFERENCING_CONSTRAINTS;
List *foreignKeyOids = GetForeignKeyOids(relationId, flags);
return ConstraintWithNameIsOfType(inputConstaintName, relationId, CONSTRAINT_FOREIGN);
}
Oid foreignKeyId = FindForeignKeyOidWithName(foreignKeyOids, inputConstaintName);
return foreignKeyId;
/*
* ConstraintWithNameIsOfType is a wrapper around get_relation_constraint_oid that
* returns true if given constraint name identifies a valid constraint defined
* on relation with relationId and its type matches the input constraint type.
*/
bool
ConstraintWithNameIsOfType(char *inputConstaintName, Oid relationId,
char targetConstraintType)
{
bool missingOk = true;
Oid constraintId =
get_relation_constraint_oid(relationId, inputConstaintName, missingOk);
return ConstraintWithIdIsOfType(constraintId, targetConstraintType);
}
/*
* ConstraintWithIdIsOfType returns true if constraint with constraintId exists
* and is of type targetConstraintType.
*/
bool
ConstraintWithIdIsOfType(Oid constraintId, char targetConstraintType)
{
HeapTuple heapTuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constraintId));
if (!HeapTupleIsValid(heapTuple))
{
/* no such constraint */
return false;
}
Form_pg_constraint constraintForm = (Form_pg_constraint) GETSTRUCT(heapTuple);
char constraintType = constraintForm->contype;
bool constraintTypeMatches = (constraintType == targetConstraintType);
ReleaseSysCache(heapTuple);
return constraintTypeMatches;
}

View File

@ -1138,12 +1138,16 @@ TriggerSyncMetadataToPrimaryNodes(void)
triggerMetadataSync = true;
}
else if (!workerNode->metadataSynced)
{
triggerMetadataSync = true;
}
}
/* let the maintenance daemon know about the metadata sync */
if (triggerMetadataSync)
{
TriggerMetadataSync(MyDatabaseId);
TriggerMetadataSyncOnCommit();
}
}
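TriggerMetadataSyncOnCommit itself is not part of the hunks shown here; the sketch below only illustrates the assumed behavior, namely deferring the old TriggerMetadataSync(MyDatabaseId) call to commit time via a backend-local flag (every name other than TriggerMetadataSync and MyDatabaseId is hypothetical):

static bool MetadataSyncRequestedOnCommit = false;

void
TriggerMetadataSyncOnCommit(void)
{
	/* only remember the request; the sync is kicked off when the transaction commits */
	MetadataSyncRequestedOnCommit = true;
}

/* hypothetical hook called from the transaction commit callback */
static void
RunMetadataSyncIfRequested(void)
{
	if (MetadataSyncRequestedOnCommit)
	{
		MetadataSyncRequestedOnCommit = false;
		TriggerMetadataSync(MyDatabaseId);
	}
}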

View File

@ -511,6 +511,11 @@ PreprocessDropIndexStmt(Node *node, const char *dropIndexCommand)
ErrorIfUnsupportedDropIndexStmt(dropIndexStatement);
if (AnyForeignKeyDependsOnIndex(distributedIndexId))
{
MarkInvalidateForeignKeyGraph();
}
ddlJob->targetRelationId = distributedRelationId;
ddlJob->concurrentIndexCmd = dropIndexStatement->concurrent;

View File

@ -261,7 +261,8 @@ static CopyShardState * GetShardState(uint64 shardId, HTAB *shardStateHash,
copyOutState, bool isCopyToIntermediateFile);
static MultiConnection * CopyGetPlacementConnection(HTAB *connectionStateHash,
ShardPlacement *placement,
bool stopOnFailure);
bool stopOnFailure,
bool colocatedIntermediateResult);
static bool HasReachedAdaptiveExecutorPoolSize(List *connectionStateHash);
static MultiConnection * GetLeastUtilisedCopyConnection(List *connectionStateList,
char *nodeName, int nodePort);
@ -2253,8 +2254,9 @@ CitusCopyDestReceiverStartup(DestReceiver *dest, int operation,
/* define the template for the COPY statement that is sent to workers */
CopyStmt *copyStatement = makeNode(CopyStmt);
if (copyDest->intermediateResultIdPrefix != NULL)
bool colocatedIntermediateResults =
copyDest->intermediateResultIdPrefix != NULL;
if (colocatedIntermediateResults)
{
copyStatement->relation = makeRangeVar(NULL, copyDest->intermediateResultIdPrefix,
-1);
@ -2291,13 +2293,21 @@ CitusCopyDestReceiverStartup(DestReceiver *dest, int operation,
RecordRelationAccessIfNonDistTable(tableId, PLACEMENT_ACCESS_DML);
/*
* For all the primary (e.g., writable) nodes, reserve a shared connection.
* We do this upfront because we cannot know which nodes are going to be
* accessed. Since the order of the reservation is important, we need to
* do it right here. For the details on why the order is important, see
* the function.
* Colocated intermediate results do not honor citus.max_shared_pool_size,
* so we don't need to reserve any connections. Each result file is sent
* over a single connection.
*/
EnsureConnectionPossibilityForPrimaryNodes();
if (!colocatedIntermediateResults)
{
/*
* For all the primary (e.g., writable) nodes, reserve a shared connection.
* We do this upfront because we cannot know which nodes are going to be
* accessed. Since the order of the reservation is important, we need to
* do it right here. For the details on why the order is important, see
* the function.
*/
EnsureConnectionPossibilityForPrimaryNodes();
}
}
@ -3438,7 +3448,8 @@ InitializeCopyShardState(CopyShardState *shardState,
}
MultiConnection *connection =
CopyGetPlacementConnection(connectionStateHash, placement, stopOnFailure);
CopyGetPlacementConnection(connectionStateHash, placement, stopOnFailure,
isCopyToIntermediateFile);
if (connection == NULL)
{
failedPlacementCount++;
@ -3534,11 +3545,40 @@ LogLocalCopyExecution(uint64 shardId)
* then it reuses the connection. Otherwise, it requests a connection for placement.
*/
static MultiConnection *
CopyGetPlacementConnection(HTAB *connectionStateHash, ShardPlacement *placement, bool
stopOnFailure)
CopyGetPlacementConnection(HTAB *connectionStateHash, ShardPlacement *placement,
bool stopOnFailure, bool colocatedIntermediateResult)
{
uint32 connectionFlags = FOR_DML;
char *nodeUser = CurrentUserName();
if (colocatedIntermediateResult)
{
/*
* Colocated intermediate results are just files and are not required to use
* the same connections as their co-located shards. So, we are free to
* use any connection we can get.
*
* Also, the current connection re-use logic does not know how to handle
* intermediate results, as the intermediate results always truncate the
* existing files. That's why we use one connection per intermediate
* result.
*
* Also note that we are breaking the guarantees of citus.max_shared_pool_size
* as we cannot rely on optional connections.
*/
uint32 connectionFlagsForIntermediateResult = 0;
MultiConnection *connection =
GetNodeConnection(connectionFlagsForIntermediateResult, placement->nodeName,
placement->nodePort);
/*
* As noted above, we want each intermediate file to go over
* a separate connection.
*/
ClaimConnectionExclusively(connection);
/* and, we cannot afford to handle failures when anything goes wrong */
MarkRemoteTransactionCritical(connection);
return connection;
}
/*
* Determine whether the task has to be assigned to a particular connection
@ -3546,6 +3586,7 @@ CopyGetPlacementConnection(HTAB *connectionStateHash, ShardPlacement *placement,
*/
ShardPlacementAccess *placementAccess = CreatePlacementAccess(placement,
PLACEMENT_ACCESS_DML);
uint32 connectionFlags = FOR_DML;
MultiConnection *connection =
GetConnectionIfPlacementAccessedInXact(connectionFlags,
list_make1(placementAccess), NULL);
@ -3583,6 +3624,12 @@ CopyGetPlacementConnection(HTAB *connectionStateHash, ShardPlacement *placement,
nodeName,
nodePort);
/*
* Make sure that the connection management remembers that Citus
* accesses this placement over the connection.
*/
AssignPlacementListToConnection(list_make1(placementAccess), connection);
return connection;
}
@ -3628,6 +3675,7 @@ CopyGetPlacementConnection(HTAB *connectionStateHash, ShardPlacement *placement,
connectionFlags |= REQUIRE_CLEAN_CONNECTION;
}
char *nodeUser = CurrentUserName();
connection = GetPlacementConnection(connectionFlags, placement, nodeUser);
if (connection == NULL)
{
@ -3643,6 +3691,12 @@ CopyGetPlacementConnection(HTAB *connectionStateHash, ShardPlacement *placement,
connection =
GetLeastUtilisedCopyConnection(copyConnectionStateList, nodeName,
nodePort);
/*
* Make sure that the connection management remembers that Citus
* accesses this placement over the connection.
*/
AssignPlacementListToConnection(list_make1(placementAccess), connection);
}
else
{

View File

@ -17,6 +17,8 @@
#include "access/xact.h"
#include "catalog/index.h"
#include "catalog/pg_class.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_depend.h"
#include "commands/tablecmds.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/colocation_utils.h"
@ -24,9 +26,11 @@
#include "distributed/commands/utility_hook.h"
#include "distributed/deparser.h"
#include "distributed/deparse_shard_query.h"
#include "distributed/distribution_column.h"
#include "distributed/listutils.h"
#include "distributed/coordinator_protocol.h"
#include "distributed/metadata_sync.h"
#include "distributed/metadata/dependency.h"
#include "distributed/multi_executor.h"
#include "distributed/multi_partitioning_utils.h"
#include "distributed/reference_table_utils.h"
@ -49,6 +53,7 @@ static void ErrorIfAlterTableDefinesFKeyFromPostgresToCitusLocalTable(
static List * GetAlterTableStmtFKeyConstraintList(AlterTableStmt *alterTableStatement);
static List * GetAlterTableCommandFKeyConstraintList(AlterTableCmd *command);
static bool AlterTableCommandTypeIsTrigger(AlterTableType alterTableType);
static bool AlterTableDropsForeignKey(AlterTableStmt *alterTableStatement);
static void ErrorIfUnsupportedAlterTableStmt(AlterTableStmt *alterTableStatement);
static void ErrorIfCitusLocalTablePartitionCommand(AlterTableCmd *alterTableCmd,
Oid parentRelationId);
@ -384,6 +389,18 @@ PreprocessAlterTableStmt(Node *node, const char *alterTableCommand)
*/
ErrorIfAlterTableDefinesFKeyFromPostgresToCitusLocalTable(alterTableStatement);
if (AlterTableDropsForeignKey(alterTableStatement))
{
/*
* The foreign key graph keeps track of the foreign keys, including those on local tables.
* So, even if a foreign key on a local table is dropped, we should invalidate
* the graph so that subsequent commands see an up-to-date graph.
* We are aware that the utility hook would still invalidate the foreign key graph
* even when the command fails, but currently we are ok with that.
*/
MarkInvalidateForeignKeyGraph();
}
bool referencingIsLocalTable = !IsCitusTable(leftRelationId);
if (referencingIsLocalTable)
{
@ -461,7 +478,9 @@ PreprocessAlterTableStmt(Node *node, const char *alterTableCommand)
*/
Assert(list_length(commandList) == 1);
Oid foreignKeyId = GetForeignKeyOidByName(constraintName, leftRelationId);
bool missingOk = false;
Oid foreignKeyId = get_relation_constraint_oid(leftRelationId,
constraintName, missingOk);
rightRelationId = GetReferencedTableId(foreignKeyId);
}
}
@ -714,6 +733,99 @@ AlterTableCommandTypeIsTrigger(AlterTableType alterTableType)
}
/*
* AlterTableDropsForeignKey returns true if the given AlterTableStmt drops
* a foreign key. False otherwise.
*/
static bool
AlterTableDropsForeignKey(AlterTableStmt *alterTableStatement)
{
LOCKMODE lockmode = AlterTableGetLockLevel(alterTableStatement->cmds);
Oid relationId = AlterTableLookupRelation(alterTableStatement, lockmode);
AlterTableCmd *command = NULL;
foreach_ptr(command, alterTableStatement->cmds)
{
AlterTableType alterTableType = command->subtype;
if (alterTableType == AT_DropColumn)
{
char *columnName = command->name;
if (ColumnAppearsInForeignKey(columnName, relationId))
{
/* dropping a column on either side of the fkey will drop the fkey */
return true;
}
}
/*
* Other than DROP COLUMN, the only command that can drop a foreign key
* is DROP CONSTRAINT.
*/
if (alterTableType != AT_DropConstraint)
{
continue;
}
char *constraintName = command->name;
if (ConstraintIsAForeignKey(constraintName, relationId))
{
return true;
}
else if (ConstraintIsAUniquenessConstraint(constraintName, relationId))
{
/*
* If the uniqueness constraint of the column that the foreign key depends on
* is getting dropped, then the foreign key will also be dropped.
*/
bool missingOk = false;
Oid uniquenessConstraintId =
get_relation_constraint_oid(relationId, constraintName, missingOk);
Oid indexId = get_constraint_index(uniquenessConstraintId);
if (AnyForeignKeyDependsOnIndex(indexId))
{
return true;
}
}
}
return false;
}
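For illustration, written as a comment in the surrounding C style, the uniqueness-constraint branch above covers DDL like the following (all object names are made up):

/*
 * Illustration only, object names are hypothetical:
 *
 *   CREATE TABLE referenced (a int UNIQUE);
 *   CREATE TABLE referencing (a int REFERENCES referenced (a));
 *
 *   -- dropping the unique constraint cascades to the foreign key on
 *   -- "referencing", so the foreign key graph must be invalidated
 *   ALTER TABLE referenced DROP CONSTRAINT referenced_a_key CASCADE;
 */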
/*
* AnyForeignKeyDependsOnIndex scans pg_depend and returns true if given index
* is valid and any foreign key depends on it.
*/
bool
AnyForeignKeyDependsOnIndex(Oid indexId)
{
Oid dependentObjectClassId = RelationRelationId;
Oid dependentObjectId = indexId;
List *dependencyTupleList =
GetPgDependTuplesForDependingObjects(dependentObjectClassId, dependentObjectId);
HeapTuple dependencyTuple = NULL;
foreach_ptr(dependencyTuple, dependencyTupleList)
{
Form_pg_depend dependencyForm = (Form_pg_depend) GETSTRUCT(dependencyTuple);
Oid dependingClassId = dependencyForm->classid;
if (dependingClassId != ConstraintRelationId)
{
continue;
}
Oid dependingObjectId = dependencyForm->objid;
if (ConstraintWithIdIsOfType(dependingObjectId, CONSTRAINT_FOREIGN))
{
return true;
}
}
return false;
}
/*
* PreprocessAlterTableStmt issues a warning.
* ALTER TABLE ALL IN TABLESPACE statements have their node type as
@ -1339,21 +1451,6 @@ ErrorIfUnsupportedAlterTableStmt(AlterTableStmt *alterTableStatement)
break;
}
case AT_DropConstraint:
{
if (!OidIsValid(relationId))
{
return;
}
if (ConstraintIsAForeignKey(command->name, relationId))
{
MarkInvalidateForeignKeyGraph();
}
break;
}
case AT_EnableTrig:
case AT_EnableAlwaysTrig:
case AT_EnableReplicaTrig:
@ -1383,6 +1480,7 @@ ErrorIfUnsupportedAlterTableStmt(AlterTableStmt *alterTableStatement)
case AT_SetNotNull:
case AT_ReplicaIdentity:
case AT_ValidateConstraint:
case AT_DropConstraint: /* we do the check for invalidation in AlterTableDropsForeignKey */
{
/*
* We will not perform any special check for:

View File

@ -161,6 +161,12 @@ AfterXactConnectionHandling(bool isCommit)
hash_seq_init(&status, ConnectionHash);
while ((entry = (ConnectionHashEntry *) hash_seq_search(&status)) != 0)
{
if (!entry->isValid)
{
/* skip invalid connection hash entries */
continue;
}
AfterXactHostConnectionHandling(entry, isCommit);
/*
@ -289,11 +295,24 @@ StartNodeUserDatabaseConnection(uint32 flags, const char *hostname, int32 port,
*/
ConnectionHashEntry *entry = hash_search(ConnectionHash, &key, HASH_ENTER, &found);
if (!found)
if (!found || !entry->isValid)
{
/*
* Either we are just building the hash entry, or it was previously left in an
* invalid state because we could not allocate memory for it.
* So initialize the entry->connections list here.
*/
entry->isValid = false;
entry->connections = MemoryContextAlloc(ConnectionContext,
sizeof(dlist_head));
dlist_init(entry->connections);
/*
* If MemoryContextAlloc errors out (e.g., during an OOM), entry->connections
* stays NULL. That is why entry->isValid is set to true only right after we
* initialize entry->connections properly.
*/
entry->isValid = true;
}
/* if desired, check whether there's a usable connection */
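The isValid checks added throughout this file assume an entry layout roughly like the sketch below; the actual struct lives in Citus' connection management headers and may contain additional members:

/*
 * Rough sketch of the fields used by the isValid checks; not the
 * authoritative definition.
 */
typedef struct ConnectionHashEntry
{
	ConnectionHashKey key;      /* hostname, port, user, database */
	bool isValid;               /* set only after connections is allocated */
	dlist_head *connections;    /* allocated in ConnectionContext */
} ConnectionHashEntry;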
@ -449,6 +468,12 @@ CloseAllConnectionsAfterTransaction(void)
hash_seq_init(&status, ConnectionHash);
while ((entry = (ConnectionHashEntry *) hash_seq_search(&status)) != 0)
{
if (!entry->isValid)
{
/* skip invalid connection hash entries */
continue;
}
dlist_iter iter;
dlist_head *connections = entry->connections;
@ -483,7 +508,7 @@ ConnectionAvailableToNode(char *hostName, int nodePort, const char *userName,
ConnectionHashEntry *entry =
(ConnectionHashEntry *) hash_search(ConnectionHash, &key, HASH_FIND, &found);
if (!found)
if (!found || !entry->isValid)
{
return false;
}
@ -509,6 +534,12 @@ CloseNodeConnectionsAfterTransaction(char *nodeName, int nodePort)
hash_seq_init(&status, ConnectionHash);
while ((entry = (ConnectionHashEntry *) hash_seq_search(&status)) != 0)
{
if (!entry->isValid)
{
/* skip invalid connection hash entries */
continue;
}
dlist_iter iter;
if (strcmp(entry->key.hostname, nodeName) != 0 || entry->key.port != nodePort)
@ -584,6 +615,12 @@ ShutdownAllConnections(void)
hash_seq_init(&status, ConnectionHash);
while ((entry = (ConnectionHashEntry *) hash_seq_search(&status)) != NULL)
{
if (!entry->isValid)
{
/* skip invalid connection hash entries */
continue;
}
dlist_iter iter;
dlist_foreach(iter, entry->connections)
@ -1194,6 +1231,12 @@ FreeConnParamsHashEntryFields(ConnParamsHashEntry *entry)
static void
AfterXactHostConnectionHandling(ConnectionHashEntry *entry, bool isCommit)
{
if (!entry || !entry->isValid)
{
/* callers only pass valid hash entries but let's be on the safe side */
ereport(ERROR, (errmsg("connection hash entry is NULL or invalid")));
}
dlist_mutable_iter iter;
int cachedConnectionCount = 0;

View File

@ -174,6 +174,8 @@
#include "utils/timestamp.h"
#define SLOW_START_DISABLED 0
#define WAIT_EVENT_SET_INDEX_NOT_INITIALIZED -1
#define WAIT_EVENT_SET_INDEX_FAILED -2
/*
@ -611,6 +613,10 @@ static int UsableConnectionCount(WorkerPool *workerPool);
static long NextEventTimeout(DistributedExecution *execution);
static WaitEventSet * BuildWaitEventSet(List *sessionList);
static void RebuildWaitEventSetFlags(WaitEventSet *waitEventSet, List *sessionList);
static int CitusAddWaitEventSetToSet(WaitEventSet *set, uint32 events, pgsocket fd,
Latch *latch, void *user_data);
static bool CitusModifyWaitEvent(WaitEventSet *set, int pos, uint32 events,
Latch *latch);
static TaskPlacementExecution * PopPlacementExecution(WorkerSession *session);
static TaskPlacementExecution * PopAssignedPlacementExecution(WorkerSession *session);
static TaskPlacementExecution * PopUnassignedPlacementExecution(WorkerPool *workerPool);
@ -642,6 +648,8 @@ static void ExtractParametersForRemoteExecution(ParamListInfo paramListInfo,
Oid **parameterTypes,
const char ***parameterValues);
static int GetEventSetSize(List *sessionList);
static bool ProcessSessionsWithFailedWaitEventSetOperations(
DistributedExecution *execution);
static int RebuildWaitEventSet(DistributedExecution *execution);
static void ProcessWaitEvents(DistributedExecution *execution, WaitEvent *events, int
eventCount, bool *cancellationReceived);
@ -660,6 +668,16 @@ static void SetAttributeInputMetadata(DistributedExecution *execution,
void
AdaptiveExecutorPreExecutorRun(CitusScanState *scanState)
{
if (scanState->finishedPreScan)
{
/*
* Cursors (and hence RETURN QUERY syntax in pl/pgsql functions)
* may trigger AdaptiveExecutorPreExecutorRun() on every fetch
* operation. However, we should only execute PreScan once.
*/
return;
}
DistributedPlan *distributedPlan = scanState->distributedPlan;
/*
@ -670,6 +688,8 @@ AdaptiveExecutorPreExecutorRun(CitusScanState *scanState)
LockPartitionsForDistributedPlan(distributedPlan);
ExecuteSubPlans(distributedPlan);
scanState->finishedPreScan = true;
}
@ -2084,6 +2104,7 @@ FindOrCreateWorkerSession(WorkerPool *workerPool, MultiConnection *connection)
session->connection = connection;
session->workerPool = workerPool;
session->commandsSent = 0;
session->waitEventSetIndex = WAIT_EVENT_SET_INDEX_NOT_INITIALIZED;
dlist_init(&session->pendingTaskQueue);
dlist_init(&session->readyTaskQueue);
@ -2231,6 +2252,8 @@ RunDistributedExecution(DistributedExecution *execution)
ManageWorkerPool(workerPool);
}
bool skipWaitEvents = false;
if (execution->rebuildWaitEventSet)
{
if (events != NULL)
@ -2245,11 +2268,28 @@ RunDistributedExecution(DistributedExecution *execution)
eventSetSize = RebuildWaitEventSet(execution);
events = palloc0(eventSetSize * sizeof(WaitEvent));
skipWaitEvents =
ProcessSessionsWithFailedWaitEventSetOperations(execution);
}
else if (execution->waitFlagsChanged)
{
RebuildWaitEventSetFlags(execution->waitEventSet, execution->sessionList);
execution->waitFlagsChanged = false;
skipWaitEvents =
ProcessSessionsWithFailedWaitEventSetOperations(execution);
}
if (skipWaitEvents)
{
/*
* Some operation on the wait event set has failed; retry,
* as we have already removed the problematic connections.
*/
execution->rebuildWaitEventSet = true;
continue;
}
/* wait for I/O events */
@ -2297,6 +2337,51 @@ RunDistributedExecution(DistributedExecution *execution)
}
/*
* ProcessSessionsWithFailedWaitEventSetOperations goes over the session list and
* processes sessions with failed wait event set operations.
*
* Failed sessions are not going to generate any further events, so it is our
* only chance to process the failure by calling into `ConnectionStateMachine`.
*
* The function returns true if any session failed.
*/
static bool
ProcessSessionsWithFailedWaitEventSetOperations(DistributedExecution *execution)
{
bool foundFailedSession = false;
WorkerSession *session = NULL;
foreach_ptr(session, execution->sessionList)
{
if (session->waitEventSetIndex == WAIT_EVENT_SET_INDEX_FAILED)
{
/*
* We can only lose connections that were already connected;
* others are regular connection failures.
*/
MultiConnection *connection = session->connection;
if (connection->connectionState == MULTI_CONNECTION_CONNECTED)
{
connection->connectionState = MULTI_CONNECTION_LOST;
}
else
{
connection->connectionState = MULTI_CONNECTION_FAILED;
}
ConnectionStateMachine(session);
session->waitEventSetIndex = WAIT_EVENT_SET_INDEX_NOT_INITIALIZED;
foundFailedSession = true;
}
}
return foundFailedSession;
}
/*
* RebuildWaitEventSet updates the waitEventSet for the distributed execution.
* This happens when the connection set for the distributed execution is changed,
@ -3285,6 +3370,25 @@ TransactionStateMachine(WorkerSession *session)
case REMOTE_TRANS_SENT_COMMAND:
{
TaskPlacementExecution *placementExecution = session->currentTask;
if (placementExecution == NULL)
{
/*
* We have seen accounts in production where the placementExecution
* could inadvertently be left unset. Investigation documented on
* https://github.com/citusdata/citus-enterprise/issues/493
* (due to sensitive data in the initial report it is not discussed
* in our community repository)
*
* Currently we don't have a reliable way of reproducing this issue.
* Erroring here seems to be a more desirable approach compared to a
* SEGFAULT on the dereference of placementExecution, with a possible
* crash recovery as a result.
*/
ereport(ERROR, (errmsg(
"unable to recover from inconsistent state in "
"the connection state machine on coordinator")));
}
ShardCommandExecution *shardCommandExecution =
placementExecution->shardCommandExecution;
Task *task = shardCommandExecution->task;
@ -4455,18 +4559,79 @@ BuildWaitEventSet(List *sessionList)
continue;
}
int waitEventSetIndex = AddWaitEventToSet(waitEventSet, connection->waitFlags,
sock, NULL, (void *) session);
int waitEventSetIndex =
CitusAddWaitEventSetToSet(waitEventSet, connection->waitFlags, sock,
NULL, (void *) session);
session->waitEventSetIndex = waitEventSetIndex;
}
AddWaitEventToSet(waitEventSet, WL_POSTMASTER_DEATH, PGINVALID_SOCKET, NULL, NULL);
AddWaitEventToSet(waitEventSet, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
CitusAddWaitEventSetToSet(waitEventSet, WL_POSTMASTER_DEATH, PGINVALID_SOCKET, NULL,
NULL);
CitusAddWaitEventSetToSet(waitEventSet, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch,
NULL);
return waitEventSet;
}
/*
* CitusAddWaitEventSetToSet is a wrapper around Postgres' AddWaitEventToSet().
*
* AddWaitEventToSet() may throw hard errors. For example, when the
* underlying socket for a connection has been closed by the remote server
* and the OS already reflects that, but Citus hasn't had a chance
* to get this information. In that case, if the replication factor is >1,
* Citus can failover to other nodes for executing the query. Even if
* replication factor = 1, Citus can give much nicer errors.
*
* So CitusAddWaitEventSetToSet simply puts AddWaitEventToSet() into a
* PG_TRY/PG_CATCH block in order to catch any hard errors, and
* returns this information to the caller.
*/
static int
CitusAddWaitEventSetToSet(WaitEventSet *set, uint32 events, pgsocket fd,
Latch *latch, void *user_data)
{
volatile int waitEventSetIndex = WAIT_EVENT_SET_INDEX_NOT_INITIALIZED;
MemoryContext savedContext = CurrentMemoryContext;
PG_TRY();
{
waitEventSetIndex =
AddWaitEventToSet(set, events, fd, latch, (void *) user_data);
}
PG_CATCH();
{
/*
* We might be in an arbitrary memory context when the
* error is thrown and we should get back to one we had
* at PG_TRY() time, especially because we are not
* re-throwing the error.
*/
MemoryContextSwitchTo(savedContext);
FlushErrorState();
if (user_data != NULL)
{
WorkerSession *workerSession = (WorkerSession *) user_data;
ereport(DEBUG1, (errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("Adding wait event for node %s:%d failed. "
"The socket was: %d",
workerSession->workerPool->nodeName,
workerSession->workerPool->nodePort, fd)));
}
/* let the callers know about the failure */
waitEventSetIndex = WAIT_EVENT_SET_INDEX_FAILED;
}
PG_END_TRY();
return waitEventSetIndex;
}
/*
* GetEventSetSize returns the event set size for a list of sessions.
*/
@ -4510,11 +4675,68 @@ RebuildWaitEventSetFlags(WaitEventSet *waitEventSet, List *sessionList)
continue;
}
ModifyWaitEvent(waitEventSet, waitEventSetIndex, connection->waitFlags, NULL);
bool success =
CitusModifyWaitEvent(waitEventSet, waitEventSetIndex,
connection->waitFlags, NULL);
if (!success)
{
ereport(DEBUG1, (errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("Modifying wait event for node %s:%d failed. "
"The wait event index was: %d",
connection->hostname, connection->port,
waitEventSetIndex)));
session->waitEventSetIndex = WAIT_EVENT_SET_INDEX_FAILED;
}
}
}
/*
* CitusModifyWaitEvent is a wrapper around Postgres' ModifyWaitEvent().
*
* ModifyWaitEvent may throw hard errors. For example, when the underlying
* socket for a connection has been closed by the remote server and the OS
* already reflects that, but Citus hasn't had a chance to get this
* information. In that case, if the replication factor is >1, Citus can
* failover to other nodes for executing the query. Even if replication
* factor = 1, Citus can give much nicer errors.
*
* So CitusModifyWaitEvent simply puts ModifyWaitEvent into a PG_TRY/PG_CATCH
* block in order to catch any hard errors, and returns this information to the
* caller.
*/
static bool
CitusModifyWaitEvent(WaitEventSet *set, int pos, uint32 events, Latch *latch)
{
volatile bool success = true;
MemoryContext savedContext = CurrentMemoryContext;
PG_TRY();
{
ModifyWaitEvent(set, pos, events, latch);
}
PG_CATCH();
{
/*
* We might be in an arbitrary memory context when the
* error is thrown and we should get back to one we had
* at PG_TRY() time, especially because we are not
* re-throwing the error.
*/
MemoryContextSwitchTo(savedContext);
FlushErrorState();
/* let the callers know about the failure */
success = false;
}
PG_END_TRY();
return success;
}
/*
* SetLocalForceMaxQueryParallelization is simply a C interface for setting
* the following:

View File

@ -558,6 +558,9 @@ AdaptiveExecutorCreateScan(CustomScan *scan)
scanState->customScanState.methods = &AdaptiveExecutorCustomExecMethods;
scanState->PreExecScan = &CitusPreExecScan;
scanState->finishedPreScan = false;
scanState->finishedRemoteScan = false;
return (Node *) scanState;
}
@ -578,6 +581,9 @@ NonPushableInsertSelectCreateScan(CustomScan *scan)
scanState->customScanState.methods =
&NonPushableInsertSelectCustomExecMethods;
scanState->finishedPreScan = false;
scanState->finishedRemoteScan = false;
return (Node *) scanState;
}

View File

@ -409,7 +409,8 @@ LocallyExecuteUtilityTask(const char *localTaskQueryCommand)
* process utility.
*/
CitusProcessUtility(localTaskRawParseTree, localTaskQueryCommand,
PROCESS_UTILITY_TOPLEVEL, NULL, None_Receiver, NULL);
PROCESS_UTILITY_QUERY, NULL, None_Receiver,
NULL);
}
}
}

View File

@ -1033,25 +1033,13 @@ BuildViewDependencyGraph(Oid relationId, HTAB *nodeMap)
node->remainingDependencyCount = 0;
node->dependingNodes = NIL;
ObjectAddress target = { 0 };
ObjectAddressSet(target, RelationRelationId, relationId);
Oid targetObjectClassId = RelationRelationId;
Oid targetObjectId = relationId;
List *dependencyTupleList = GetPgDependTuplesForDependingObjects(targetObjectClassId,
targetObjectId);
ScanKeyData key[2];
HeapTuple depTup = NULL;
/*
* iterate the actual pg_depend catalog
*/
Relation depRel = table_open(DependRelationId, AccessShareLock);
ScanKeyInit(&key[0], Anum_pg_depend_refclassid, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(target.classId));
ScanKeyInit(&key[1], Anum_pg_depend_refobjid, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(target.objectId));
SysScanDesc depScan = systable_beginscan(depRel, DependReferenceIndexId,
true, NULL, 2, key);
while (HeapTupleIsValid(depTup = systable_getnext(depScan)))
foreach_ptr(depTup, dependencyTupleList)
{
Form_pg_depend pg_depend = (Form_pg_depend) GETSTRUCT(depTup);
@ -1066,13 +1054,48 @@ BuildViewDependencyGraph(Oid relationId, HTAB *nodeMap)
}
}
systable_endscan(depScan);
relation_close(depRel, AccessShareLock);
return node;
}
/*
* GetPgDependTuplesForDependingObjects scans pg_depend for given object and
* returns a list of heap tuples for the objects depending on it.
*/
List *
GetPgDependTuplesForDependingObjects(Oid targetObjectClassId, Oid targetObjectId)
{
List *dependencyTupleList = NIL;
Relation pgDepend = table_open(DependRelationId, AccessShareLock);
ScanKeyData key[2];
int scanKeyCount = 2;
ScanKeyInit(&key[0], Anum_pg_depend_refclassid, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(targetObjectClassId));
ScanKeyInit(&key[1], Anum_pg_depend_refobjid, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(targetObjectId));
bool useIndex = true;
SysScanDesc depScan = systable_beginscan(pgDepend, DependReferenceIndexId,
useIndex, NULL, scanKeyCount, key);
HeapTuple dependencyTuple = NULL;
while (HeapTupleIsValid(dependencyTuple = systable_getnext(depScan)))
{
/* copy the tuple first */
dependencyTuple = heap_copytuple(dependencyTuple);
dependencyTupleList = lappend(dependencyTupleList, dependencyTuple);
}
systable_endscan(depScan);
relation_close(pgDepend, AccessShareLock);
return dependencyTupleList;
}
/*
* GetDependingViews takes a relation id, finds the views that depend on the relation
* and returns list of the oids of those views. It recurses on the pg_depend table to

View File

@ -254,8 +254,8 @@ static void InvalidateCitusTableCacheEntrySlot(CitusTableCacheEntrySlot *cacheSl
static void InvalidateDistTableCache(void);
static void InvalidateDistObjectCache(void);
static void InitializeTableCacheEntry(int64 shardId);
static bool IsCitusTableTypeInternal(CitusTableCacheEntry *tableEntry, CitusTableType
tableType);
static bool IsCitusTableTypeInternal(char partitionMethod, char replicationModel,
CitusTableType tableType);
static bool RefreshTableCacheEntryIfInvalid(ShardIdCacheEntry *shardEntry);
@ -309,7 +309,7 @@ IsCitusTableType(Oid relationId, CitusTableType tableType)
{
return false;
}
return IsCitusTableTypeInternal(tableEntry, tableType);
return IsCitusTableTypeCacheEntry(tableEntry, tableType);
}
@ -320,7 +320,8 @@ IsCitusTableType(Oid relationId, CitusTableType tableType)
bool
IsCitusTableTypeCacheEntry(CitusTableCacheEntry *tableEntry, CitusTableType tableType)
{
return IsCitusTableTypeInternal(tableEntry, tableType);
return IsCitusTableTypeInternal(tableEntry->partitionMethod,
tableEntry->replicationModel, tableType);
}
@ -329,47 +330,48 @@ IsCitusTableTypeCacheEntry(CitusTableCacheEntry *tableEntry, CitusTableType tabl
* the given table type group. For definition of table types, see CitusTableType.
*/
static bool
IsCitusTableTypeInternal(CitusTableCacheEntry *tableEntry, CitusTableType tableType)
IsCitusTableTypeInternal(char partitionMethod, char replicationModel,
CitusTableType tableType)
{
switch (tableType)
{
case HASH_DISTRIBUTED:
{
return tableEntry->partitionMethod == DISTRIBUTE_BY_HASH;
return partitionMethod == DISTRIBUTE_BY_HASH;
}
case APPEND_DISTRIBUTED:
{
return tableEntry->partitionMethod == DISTRIBUTE_BY_APPEND;
return partitionMethod == DISTRIBUTE_BY_APPEND;
}
case RANGE_DISTRIBUTED:
{
return tableEntry->partitionMethod == DISTRIBUTE_BY_RANGE;
return partitionMethod == DISTRIBUTE_BY_RANGE;
}
case DISTRIBUTED_TABLE:
{
return tableEntry->partitionMethod == DISTRIBUTE_BY_HASH ||
tableEntry->partitionMethod == DISTRIBUTE_BY_RANGE ||
tableEntry->partitionMethod == DISTRIBUTE_BY_APPEND;
return partitionMethod == DISTRIBUTE_BY_HASH ||
partitionMethod == DISTRIBUTE_BY_RANGE ||
partitionMethod == DISTRIBUTE_BY_APPEND;
}
case REFERENCE_TABLE:
{
return tableEntry->partitionMethod == DISTRIBUTE_BY_NONE &&
tableEntry->replicationModel == REPLICATION_MODEL_2PC;
return partitionMethod == DISTRIBUTE_BY_NONE &&
replicationModel == REPLICATION_MODEL_2PC;
}
case CITUS_LOCAL_TABLE:
{
return tableEntry->partitionMethod == DISTRIBUTE_BY_NONE &&
tableEntry->replicationModel != REPLICATION_MODEL_2PC;
return partitionMethod == DISTRIBUTE_BY_NONE &&
replicationModel != REPLICATION_MODEL_2PC;
}
case CITUS_TABLE_WITH_NO_DIST_KEY:
{
return tableEntry->partitionMethod == DISTRIBUTE_BY_NONE;
return partitionMethod == DISTRIBUTE_BY_NONE;
}
case ANY_CITUS_TABLE_TYPE:
@ -3706,12 +3708,25 @@ CitusTableTypeIdList(CitusTableType citusTableType)
while (HeapTupleIsValid(heapTuple))
{
bool isNull = false;
Datum relationIdDatum = heap_getattr(heapTuple,
Anum_pg_dist_partition_logicalrelid,
tupleDescriptor, &isNull);
Oid relationId = DatumGetObjectId(relationIdDatum);
if (IsCitusTableType(relationId, citusTableType))
Datum partMethodDatum =
heap_getattr(heapTuple, Anum_pg_dist_partition_partmethod,
tupleDescriptor, &isNull);
Datum replicationModelDatum =
heap_getattr(heapTuple, Anum_pg_dist_partition_repmodel,
tupleDescriptor, &isNull);
char partitionMethod = DatumGetChar(partMethodDatum);
char replicationModel = DatumGetChar(replicationModelDatum);
if (IsCitusTableTypeInternal(partitionMethod, replicationModel, citusTableType))
{
Datum relationIdDatum = heap_getattr(heapTuple,
Anum_pg_dist_partition_logicalrelid,
tupleDescriptor, &isNull);
Oid relationId = DatumGetObjectId(relationIdDatum);
relationIdList = lappend_oid(relationIdList, relationId);
}

View File

@ -14,6 +14,7 @@
#include "postgres.h"
#include "miscadmin.h"
#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>
@ -28,6 +29,7 @@
#include "catalog/pg_foreign_server.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_type.h"
#include "commands/async.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/commands.h"
#include "distributed/deparser.h"
@ -35,6 +37,7 @@
#include "distributed/listutils.h"
#include "distributed/metadata_utility.h"
#include "distributed/coordinator_protocol.h"
#include "distributed/maintenanced.h"
#include "distributed/metadata_cache.h"
#include "distributed/metadata_sync.h"
#include "distributed/metadata/distobject.h"
@ -48,11 +51,15 @@
#include "foreign/foreign.h"
#include "miscadmin.h"
#include "nodes/pg_list.h"
#include "pgstat.h"
#include "postmaster/bgworker.h"
#include "postmaster/postmaster.h"
#include "storage/lmgr.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
@ -76,11 +83,18 @@ static GrantStmt * GenerateGrantOnSchemaStmtForRights(Oid roleOid,
char *permission,
bool withGrantOption);
static char * GenerateSetRoleQuery(Oid roleOid);
static void MetadataSyncSigTermHandler(SIGNAL_ARGS);
static void MetadataSyncSigAlrmHandler(SIGNAL_ARGS);
PG_FUNCTION_INFO_V1(start_metadata_sync_to_node);
PG_FUNCTION_INFO_V1(stop_metadata_sync_to_node);
PG_FUNCTION_INFO_V1(worker_record_sequence_dependency);
static bool got_SIGTERM = false;
static bool got_SIGALRM = false;
#define METADATA_SYNC_APP_NAME "Citus Metadata Sync Daemon"
/*
* start_metadata_sync_to_node function sets hasmetadata column of the given
@ -1481,7 +1495,7 @@ DetachPartitionCommandList(void)
* metadata workers that are out of sync. Returns the result of
* synchronization.
*/
MetadataSyncResult
static MetadataSyncResult
SyncMetadataToNodes(void)
{
MetadataSyncResult result = METADATA_SYNC_SUCCESS;
@ -1511,6 +1525,9 @@ SyncMetadataToNodes(void)
if (!SyncMetadataSnapshotToNode(workerNode, raiseInterrupts))
{
ereport(WARNING, (errmsg("failed to sync metadata to %s:%d",
workerNode->workerName,
workerNode->workerPort)));
result = METADATA_SYNC_FAILED_SYNC;
}
else
@ -1523,3 +1540,244 @@ SyncMetadataToNodes(void)
return result;
}
/*
* SyncMetadataToNodesMain is the main function for syncing metadata to
* MX nodes. It retries until success and then exits.
*/
void
SyncMetadataToNodesMain(Datum main_arg)
{
Oid databaseOid = DatumGetObjectId(main_arg);
/* extension owner is passed via bgw_extra */
Oid extensionOwner = InvalidOid;
memcpy_s(&extensionOwner, sizeof(extensionOwner),
MyBgworkerEntry->bgw_extra, sizeof(Oid));
pqsignal(SIGTERM, MetadataSyncSigTermHandler);
pqsignal(SIGALRM, MetadataSyncSigAlrmHandler);
BackgroundWorkerUnblockSignals();
/* connect to database, after that we can actually access catalogs */
BackgroundWorkerInitializeConnectionByOid(databaseOid, extensionOwner, 0);
/* make worker recognizable in pg_stat_activity */
pgstat_report_appname(METADATA_SYNC_APP_NAME);
bool syncedAllNodes = false;
while (!syncedAllNodes)
{
InvalidateMetadataSystemCache();
StartTransactionCommand();
/*
* Some functions in ruleutils.c, which we use to get the DDL for
* metadata propagation, require an active snapshot.
*/
PushActiveSnapshot(GetTransactionSnapshot());
if (!LockCitusExtension())
{
ereport(DEBUG1, (errmsg("could not lock the citus extension, "
"skipping metadata sync")));
}
else if (CheckCitusVersion(DEBUG1) && CitusHasBeenLoaded())
{
UseCoordinatedTransaction();
MetadataSyncResult result = SyncMetadataToNodes();
syncedAllNodes = (result == METADATA_SYNC_SUCCESS);
/* we use LISTEN/NOTIFY to wait for metadata syncing in tests */
if (result != METADATA_SYNC_FAILED_LOCK)
{
Async_Notify(METADATA_SYNC_CHANNEL, NULL);
}
}
PopActiveSnapshot();
CommitTransactionCommand();
ProcessCompletedNotifies();
if (syncedAllNodes)
{
break;
}
/*
* If the backend is cancelled (e.g. because of a distributed deadlock),
* CHECK_FOR_INTERRUPTS() will raise a cancellation error which will
* result in exit(1).
*/
CHECK_FOR_INTERRUPTS();
/*
* SIGTERM is used when the maintenance daemon tries to clean up
* metadata sync daemons spawned by terminated maintenance daemons.
*/
if (got_SIGTERM)
{
exit(0);
}
/*
* SIGALRM is used for testing purposes and simulates an error in the metadata
* sync daemon.
*/
if (got_SIGALRM)
{
elog(ERROR, "Error in metadata sync daemon");
}
pg_usleep(MetadataSyncRetryInterval * 1000);
}
}
/*
* MetadataSyncSigTermHandler sets a flag to request termination of the metadata
* sync daemon.
*/
static void
MetadataSyncSigTermHandler(SIGNAL_ARGS)
{
int save_errno = errno;
got_SIGTERM = true;
if (MyProc != NULL)
{
SetLatch(&MyProc->procLatch);
}
errno = save_errno;
}
/*
* MetadataSyncSigAlrmHandler sets a flag to request an error in the metadata
* sync daemon. This is used for testing purposes.
*/
static void
MetadataSyncSigAlrmHandler(SIGNAL_ARGS)
{
int save_errno = errno;
got_SIGALRM = true;
if (MyProc != NULL)
{
SetLatch(&MyProc->procLatch);
}
errno = save_errno;
}
/*
* SpawnSyncMetadataToNodes starts a background worker which runs metadata
* sync. On success it returns the worker's handle. Otherwise it returns NULL.
*/
BackgroundWorkerHandle *
SpawnSyncMetadataToNodes(Oid database, Oid extensionOwner)
{
BackgroundWorker worker;
BackgroundWorkerHandle *handle = NULL;
/* Configure a worker. */
memset(&worker, 0, sizeof(worker));
SafeSnprintf(worker.bgw_name, BGW_MAXLEN,
"Citus Metadata Sync: %u/%u",
database, extensionOwner);
worker.bgw_flags =
BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
worker.bgw_start_time = BgWorkerStart_ConsistentState;
/* don't restart, we manage restarts from maintenance daemon */
worker.bgw_restart_time = BGW_NEVER_RESTART;
strcpy_s(worker.bgw_library_name, sizeof(worker.bgw_library_name), "citus");
strcpy_s(worker.bgw_function_name, sizeof(worker.bgw_function_name),
"SyncMetadataToNodesMain");
worker.bgw_main_arg = ObjectIdGetDatum(MyDatabaseId);
memcpy_s(worker.bgw_extra, sizeof(worker.bgw_extra), &extensionOwner,
sizeof(Oid));
worker.bgw_notify_pid = MyProcPid;
if (!RegisterDynamicBackgroundWorker(&worker, &handle))
{
return NULL;
}
pid_t pid;
WaitForBackgroundWorkerStartup(handle, &pid);
return handle;
}
/*
* SignalMetadataSyncDaemon signals metadata sync daemons belonging to
* the given database.
*/
void
SignalMetadataSyncDaemon(Oid database, int sig)
{
int backendCount = pgstat_fetch_stat_numbackends();
for (int backend = 1; backend <= backendCount; backend++)
{
LocalPgBackendStatus *localBeEntry = pgstat_fetch_stat_local_beentry(backend);
if (!localBeEntry)
{
continue;
}
PgBackendStatus *beStatus = &localBeEntry->backendStatus;
if (beStatus->st_databaseid == database &&
strncmp(beStatus->st_appname, METADATA_SYNC_APP_NAME, BGW_MAXLEN) == 0)
{
kill(beStatus->st_procpid, sig);
}
}
}
/*
* ShouldInitiateMetadataSync returns whether the metadata sync daemon should be initiated.
* It sets lockFailure to true if pg_dist_node lock couldn't be acquired for the
* check.
*/
bool
ShouldInitiateMetadataSync(bool *lockFailure)
{
if (!IsCoordinator())
{
*lockFailure = false;
return false;
}
Oid distNodeOid = DistNodeRelationId();
if (!ConditionalLockRelationOid(distNodeOid, AccessShareLock))
{
*lockFailure = true;
return false;
}
bool shouldSyncMetadata = false;
List *workerList = ActivePrimaryNonCoordinatorNodeList(NoLock);
WorkerNode *workerNode = NULL;
foreach_ptr(workerNode, workerList)
{
if (workerNode->hasMetadata && !workerNode->metadataSynced)
{
shouldSyncMetadata = true;
break;
}
}
UnlockRelationOid(distNodeOid, AccessShareLock);
*lockFailure = false;
return shouldSyncMetadata;
}
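A sketch of how the maintenance daemon (whose loop is not part of this diff) is expected to combine these helpers; the variable names and the warning message below are illustrative only:

bool lockFailure = false;
if (ShouldInitiateMetadataSync(&lockFailure))
{
	/* spawn the metadata sync background worker for this database */
	BackgroundWorkerHandle *metadataSyncBgwHandle =
		SpawnSyncMetadataToNodes(MyDatabaseId, extensionOwner);

	if (metadataSyncBgwHandle == NULL)
	{
		ereport(WARNING, (errmsg("could not start metadata sync worker")));
	}
}
else if (lockFailure)
{
	/* pg_dist_node was locked, retry on the next maintenance cycle */
}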

View File

@ -35,6 +35,7 @@
#include "distributed/citus_nodes.h"
#include "distributed/citus_safe_lib.h"
#include "distributed/listutils.h"
#include "distributed/lock_graph.h"
#include "distributed/metadata_utility.h"
#include "distributed/coordinator_protocol.h"
#include "distributed/metadata_cache.h"
@ -50,6 +51,7 @@
#include "distributed/relay_utility.h"
#include "distributed/resource_lock.h"
#include "distributed/remote_commands.h"
#include "distributed/tuplestore.h"
#include "distributed/worker_manager.h"
#include "distributed/worker_protocol.h"
#include "distributed/version_compat.h"
@ -75,9 +77,23 @@ static uint64 DistributedTableSize(Oid relationId, char *sizeQuery);
static uint64 DistributedTableSizeOnWorker(WorkerNode *workerNode, Oid relationId,
char *sizeQuery);
static List * ShardIntervalsOnWorkerGroup(WorkerNode *workerNode, Oid relationId);
static char * GenerateShardStatisticsQueryForShardList(List *shardIntervalList, bool
useShardMinMaxQuery);
static char * GenerateAllShardStatisticsQueryForNode(WorkerNode *workerNode,
List *citusTableIds, bool
useShardMinMaxQuery);
static List * GenerateShardStatisticsQueryList(List *workerNodeList, List *citusTableIds,
bool useShardMinMaxQuery);
static void ErrorIfNotSuitableToGetSize(Oid relationId);
static ShardPlacement * ShardPlacementOnGroup(uint64 shardId, int groupId);
static List * OpenConnectionToNodes(List *workerNodeList);
static void AppendShardSizeMinMaxQuery(StringInfo selectQuery, uint64 shardId,
ShardInterval *
shardInterval, char *shardName,
char *quotedShardName);
static void AppendShardSizeQuery(StringInfo selectQuery, ShardInterval *shardInterval,
char *quotedShardName);
/* exports for SQL callable functions */
PG_FUNCTION_INFO_V1(citus_table_size);
@ -154,6 +170,106 @@ citus_relation_size(PG_FUNCTION_ARGS)
}
/*
* SendShardStatisticsQueriesInParallel generates query lists for obtaining shard
* statistics and then sends the commands in parallel by opening connections
* to available nodes. It returns the connection list.
*/
List *
SendShardStatisticsQueriesInParallel(List *citusTableIds, bool useDistributedTransaction,
bool
useShardMinMaxQuery)
{
List *workerNodeList = ActivePrimaryNodeList(NoLock);
List *shardSizesQueryList = GenerateShardStatisticsQueryList(workerNodeList,
citusTableIds,
useShardMinMaxQuery);
List *connectionList = OpenConnectionToNodes(workerNodeList);
FinishConnectionListEstablishment(connectionList);
if (useDistributedTransaction)
{
/*
* For now, in the case we want to include shard min and max values, we also
* want to update the entries in pg_dist_placement and pg_dist_shard with the
* latest statistics. In order to detect distributed deadlocks, we assign a
* distributed transaction ID to the current transaction
*/
UseCoordinatedTransaction();
}
/* send commands in parallel */
for (int i = 0; i < list_length(connectionList); i++)
{
MultiConnection *connection = (MultiConnection *) list_nth(connectionList, i);
char *shardSizesQuery = (char *) list_nth(shardSizesQueryList, i);
if (useDistributedTransaction)
{
/* run the size query in a distributed transaction */
RemoteTransactionBeginIfNecessary(connection);
}
int querySent = SendRemoteCommand(connection, shardSizesQuery);
if (querySent == 0)
{
ReportConnectionError(connection, WARNING);
}
}
return connectionList;
}
/*
* OpenConnectionToNodes opens a single connection per node
* for the given workerNodeList.
*/
static List *
OpenConnectionToNodes(List *workerNodeList)
{
List *connectionList = NIL;
WorkerNode *workerNode = NULL;
foreach_ptr(workerNode, workerNodeList)
{
const char *nodeName = workerNode->workerName;
int nodePort = workerNode->workerPort;
int connectionFlags = 0;
MultiConnection *connection = StartNodeConnection(connectionFlags, nodeName,
nodePort);
connectionList = lappend(connectionList, connection);
}
return connectionList;
}
/*
* GenerateShardStatisticsQueryList generates a query per node that will return:
* - all shard_name, shard_size pairs from the node (if useShardMinMaxQuery is false)
* - all shard_id, shard_minvalue, shard_maxvalue, shard_size quadruples from the node (if true)
*/
static List *
GenerateShardStatisticsQueryList(List *workerNodeList, List *citusTableIds, bool
useShardMinMaxQuery)
{
List *shardStatisticsQueryList = NIL;
WorkerNode *workerNode = NULL;
foreach_ptr(workerNode, workerNodeList)
{
char *shardStatisticsQuery = GenerateAllShardStatisticsQueryForNode(workerNode,
citusTableIds,
useShardMinMaxQuery);
shardStatisticsQueryList = lappend(shardStatisticsQueryList,
shardStatisticsQuery);
}
return shardStatisticsQueryList;
}
/*
* DistributedTableSize is a helper function for each kind of citus size function.
* It first checks whether the table is distributed and size query can be run on
@ -352,6 +468,130 @@ GenerateSizeQueryOnMultiplePlacements(List *shardIntervalList, char *sizeQuery)
}
/*
* GenerateAllShardStatisticsQueryForNode generates a query that returns:
* - all shard_name, shard_size pairs for the given node (if useShardMinMaxQuery is false)
* - all shard_id, shard_minvalue, shard_maxvalue, shard_size quadruples (if true)
*/
static char *
GenerateAllShardStatisticsQueryForNode(WorkerNode *workerNode, List *citusTableIds, bool
useShardMinMaxQuery)
{
StringInfo allShardStatisticsQuery = makeStringInfo();
Oid relationId = InvalidOid;
foreach_oid(relationId, citusTableIds)
{
List *shardIntervalsOnNode = ShardIntervalsOnWorkerGroup(workerNode, relationId);
char *shardStatisticsQuery =
GenerateShardStatisticsQueryForShardList(shardIntervalsOnNode,
useShardMinMaxQuery);
appendStringInfoString(allShardStatisticsQuery, shardStatisticsQuery);
}
/* Add a dummy entry so that UNION ALL doesn't complain */
if (useShardMinMaxQuery)
{
/* 0 for shard_id, NULL for min, NULL for max, 0 for shard_size */
appendStringInfo(allShardStatisticsQuery,
"SELECT 0::bigint, NULL::text, NULL::text, 0::bigint;");
}
else
{
/* NULL for shard_name, 0 for shard_size */
appendStringInfo(allShardStatisticsQuery, "SELECT NULL::text, 0::bigint;");
}
return allShardStatisticsQuery->data;
}
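For illustration, the per-node query assembled above for an append-distributed shard looks roughly like the comment below; the shard id and relation names are made up:

/*
 * Illustration only, shard id and relation names are hypothetical:
 *
 *   SELECT 102008 AS shard_id, min(a)::text AS shard_minvalue,
 *          max(a)::text AS shard_maxvalue,
 *          pg_relation_size('public.events_102008') AS shard_size
 *   FROM events_102008 UNION ALL
 *   SELECT 0::bigint, NULL::text, NULL::text, 0::bigint;
 *
 * The trailing dummy row keeps the statement valid even when the node
 * hosts no shards of the requested tables.
 */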
/*
* GenerateShardStatisticsQueryForShardList generates one of the two types of queries:
* - SELECT shard_name, shard_size (if useShardMinMaxQuery is false)
* - SELECT shard_id, shard_minvalue, shard_maxvalue, shard_size (if true)
*/
static char *
GenerateShardStatisticsQueryForShardList(List *shardIntervalList, bool
useShardMinMaxQuery)
{
StringInfo selectQuery = makeStringInfo();
ShardInterval *shardInterval = NULL;
foreach_ptr(shardInterval, shardIntervalList)
{
uint64 shardId = shardInterval->shardId;
Oid schemaId = get_rel_namespace(shardInterval->relationId);
char *schemaName = get_namespace_name(schemaId);
char *shardName = get_rel_name(shardInterval->relationId);
AppendShardIdToName(&shardName, shardId);
char *shardQualifiedName = quote_qualified_identifier(schemaName, shardName);
char *quotedShardName = quote_literal_cstr(shardQualifiedName);
if (useShardMinMaxQuery)
{
AppendShardSizeMinMaxQuery(selectQuery, shardId, shardInterval, shardName,
quotedShardName);
}
else
{
AppendShardSizeQuery(selectQuery, shardInterval, quotedShardName);
}
appendStringInfo(selectQuery, " UNION ALL ");
}
return selectQuery->data;
}
/*
* AppendShardSizeMinMaxQuery appends a query in the following form to selectQuery
* SELECT shard_id, shard_minvalue, shard_maxvalue, shard_size
*/
static void
AppendShardSizeMinMaxQuery(StringInfo selectQuery, uint64 shardId,
ShardInterval *shardInterval, char *shardName,
char *quotedShardName)
{
if (PartitionMethod(shardInterval->relationId) == DISTRIBUTE_BY_APPEND)
{
/* fill in the partition column name */
const uint32 unusedTableId = 1;
Var *partitionColumn = PartitionColumn(shardInterval->relationId,
unusedTableId);
char *partitionColumnName = get_attname(shardInterval->relationId,
partitionColumn->varattno, false);
appendStringInfo(selectQuery,
"SELECT " UINT64_FORMAT
" AS shard_id, min(%s)::text AS shard_minvalue, max(%s)::text AS shard_maxvalue, pg_relation_size(%s) AS shard_size FROM %s ",
shardId, partitionColumnName,
partitionColumnName,
quotedShardName, shardName);
}
else
{
/* we don't need to update min/max for non-append distributed tables because they don't change */
appendStringInfo(selectQuery,
"SELECT " UINT64_FORMAT
" AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size(%s) AS shard_size ",
shardId, quotedShardName);
}
}
/*
* AppendShardSizeQuery appends a query in the following form to selectQuery
* SELECT shard_name, shard_size
*/
static void
AppendShardSizeQuery(StringInfo selectQuery, ShardInterval *shardInterval,
char *quotedShardName)
{
appendStringInfo(selectQuery, "SELECT %s AS shard_name, ", quotedShardName);
appendStringInfo(selectQuery, PG_RELATION_SIZE_FUNCTION, quotedShardName);
}
/*
* ErrorIfNotSuitableToGetSize determines whether the table is suitable to find
* its size with internal functions.

View File

@ -443,7 +443,7 @@ SetUpDistributedTableDependencies(WorkerNode *newWorkerNode)
{
MarkNodeHasMetadata(newWorkerNode->workerName, newWorkerNode->workerPort,
true);
TriggerMetadataSync(MyDatabaseId);
TriggerMetadataSyncOnCommit();
}
}
}
@ -809,7 +809,7 @@ master_update_node(PG_FUNCTION_ARGS)
*/
if (UnsetMetadataSyncedForAll())
{
TriggerMetadataSync(MyDatabaseId);
TriggerMetadataSyncOnCommit();
}
if (handle != NULL)

View File

@ -32,7 +32,9 @@
#include "distributed/connection_management.h"
#include "distributed/deparse_shard_query.h"
#include "distributed/distributed_planner.h"
#include "distributed/foreign_key_relationship.h"
#include "distributed/listutils.h"
#include "distributed/lock_graph.h"
#include "distributed/multi_client_executor.h"
#include "distributed/multi_executor.h"
#include "distributed/metadata_utility.h"
@ -65,11 +67,21 @@ static List * RelationShardListForShardCreate(ShardInterval *shardInterval);
static bool WorkerShardStats(ShardPlacement *placement, Oid relationId,
const char *shardName, uint64 *shardSize,
text **shardMinValue, text **shardMaxValue);
static void UpdateTableStatistics(Oid relationId);
static void ReceiveAndUpdateShardsSizeAndMinMax(List *connectionList);
static void UpdateShardSizeAndMinMax(uint64 shardId, ShardInterval *shardInterval, Oid
relationId, List *shardPlacementList, uint64
shardSize, text *shardMinValue,
text *shardMaxValue);
static bool ProcessShardStatisticsRow(PGresult *result, int64 rowIndex, uint64 *shardId,
text **shardMinValue, text **shardMaxValue,
uint64 *shardSize);
/* exports for SQL callable functions */
PG_FUNCTION_INFO_V1(master_create_empty_shard);
PG_FUNCTION_INFO_V1(master_append_table_to_shard);
PG_FUNCTION_INFO_V1(master_update_shard_statistics);
PG_FUNCTION_INFO_V1(citus_update_table_statistics);
/*
@ -361,6 +373,23 @@ master_update_shard_statistics(PG_FUNCTION_ARGS)
}
/*
* citus_update_table_statistics updates metadata (shard size and shard min/max
* values) of the shards of the given table
*/
Datum
citus_update_table_statistics(PG_FUNCTION_ARGS)
{
Oid distributedTableId = PG_GETARG_OID(0);
CheckCitusVersion(ERROR);
UpdateTableStatistics(distributedTableId);
PG_RETURN_VOID();
}
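A brief usage note, assuming the UDF is exposed to SQL under the same name (the table name below is a placeholder):

/*
 * Usage sketch:
 *
 *   SELECT citus_update_table_statistics('events'::regclass);
 *
 * This refreshes shard sizes (and, for append-distributed tables,
 * shard min/max values) for all shards of the given table.
 */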
/*
* CheckDistributedTable checks if the given relationId corresponds to a
* distributed table. If it does not, the function errors out.
@ -776,7 +805,6 @@ UpdateShardStatistics(int64 shardId)
{
ShardInterval *shardInterval = LoadShardInterval(shardId);
Oid relationId = shardInterval->relationId;
char storageType = shardInterval->storageType;
bool statsOK = false;
uint64 shardSize = 0;
text *minValue = NULL;
@ -819,36 +847,176 @@ UpdateShardStatistics(int64 shardId)
errdetail("Setting shard statistics to NULL")));
}
/* make sure we don't process cancel signals */
HOLD_INTERRUPTS();
UpdateShardSizeAndMinMax(shardId, shardInterval, relationId, shardPlacementList,
shardSize, minValue, maxValue);
return shardSize;
}
/* update metadata for each shard placement we appended to */
/*
* UpdateTableStatistics updates metadata (shard size and shard min/max values)
* of the shards of the given table. Follows a logic similar to the citus_shard_sizes function.
*/
static void
UpdateTableStatistics(Oid relationId)
{
List *citusTableIds = NIL;
citusTableIds = lappend_oid(citusTableIds, relationId);
/* we want to use a distributed transaction here to detect distributed deadlocks */
bool useDistributedTransaction = true;
/* we also want shard min/max values for append distributed tables */
bool useShardMinMaxQuery = true;
List *connectionList = SendShardStatisticsQueriesInParallel(citusTableIds,
useDistributedTransaction,
useShardMinMaxQuery);
ReceiveAndUpdateShardsSizeAndMinMax(connectionList);
}
/*
* ReceiveAndUpdateShardsSizeAndMinMax receives shard id, size,
* and min/max results from the given connection list, and updates
* the respective entries in pg_dist_placement and pg_dist_shard.
*/
static void
ReceiveAndUpdateShardsSizeAndMinMax(List *connectionList)
{
/*
* The connection list gives us results per placement, not per shard.
* We use a hash table to remember already-visited shard ids,
* since we update all the different placements of a shard id at once.
*/
HTAB *alreadyVisitedShardPlacements = CreateOidVisitedHashSet();
MultiConnection *connection = NULL;
foreach_ptr(connection, connectionList)
{
if (PQstatus(connection->pgConn) != CONNECTION_OK)
{
continue;
}
bool raiseInterrupts = true;
PGresult *result = GetRemoteCommandResult(connection, raiseInterrupts);
if (!IsResponseOK(result))
{
ReportResultError(connection, result, WARNING);
continue;
}
int64 rowCount = PQntuples(result);
int64 colCount = PQnfields(result);
/* this is not expected, but guard against malformed results anyway */
if (colCount != UPDATE_SHARD_STATISTICS_COLUMN_COUNT)
{
ereport(WARNING, (errmsg("unexpected number of columns from "
"master_update_table_statistics")));
continue;
}
for (int64 rowIndex = 0; rowIndex < rowCount; rowIndex++)
{
uint64 shardId = 0;
text *shardMinValue = NULL;
text *shardMaxValue = NULL;
uint64 shardSize = 0;
if (!ProcessShardStatisticsRow(result, rowIndex, &shardId, &shardMinValue,
&shardMaxValue, &shardSize))
{
/* this row has no valid shard statistics */
continue;
}
if (OidVisited(alreadyVisitedShardPlacements, shardId))
{
/* We have already updated this placement list */
continue;
}
VisitOid(alreadyVisitedShardPlacements, shardId);
ShardInterval *shardInterval = LoadShardInterval(shardId);
Oid relationId = shardInterval->relationId;
List *shardPlacementList = ActiveShardPlacementList(shardId);
UpdateShardSizeAndMinMax(shardId, shardInterval, relationId,
shardPlacementList, shardSize, shardMinValue,
shardMaxValue);
}
PQclear(result);
ForgetResults(connection);
}
hash_destroy(alreadyVisitedShardPlacements);
}
/*
* ProcessShardStatisticsRow processes a row of shard statistics of the input PGresult
* - it returns true if this row belongs to a valid shard
* - it returns false if this row has no valid shard statistics (shardId = INVALID_SHARD_ID)
*/
static bool
ProcessShardStatisticsRow(PGresult *result, int64 rowIndex, uint64 *shardId,
text **shardMinValue, text **shardMaxValue, uint64 *shardSize)
{
*shardId = ParseIntField(result, rowIndex, 0);
/* check for the dummy entries we put so that UNION ALL wouldn't complain */
if (*shardId == INVALID_SHARD_ID)
{
/* this row has no valid shard statistics */
return false;
}
char *minValueResult = PQgetvalue(result, rowIndex, 1);
char *maxValueResult = PQgetvalue(result, rowIndex, 2);
*shardMinValue = cstring_to_text(minValueResult);
*shardMaxValue = cstring_to_text(maxValueResult);
*shardSize = ParseIntField(result, rowIndex, 3);
return true;
}
/*
* UpdateShardSizeAndMinMax updates the shardlength (shard size) of the given
* shard and its placements in pg_dist_placement, and updates the shard min value
* and shard max value of the given shard in pg_dist_shard if the relationId belongs
* to an append-distributed table
*/
static void
UpdateShardSizeAndMinMax(uint64 shardId, ShardInterval *shardInterval, Oid relationId,
List *shardPlacementList, uint64 shardSize, text *shardMinValue,
text *shardMaxValue)
{
char storageType = shardInterval->storageType;
ShardPlacement *placement = NULL;
/* update metadata for each shard placement */
foreach_ptr(placement, shardPlacementList)
{
uint64 placementId = placement->placementId;
int32 groupId = placement->groupId;
DeleteShardPlacementRow(placementId);
InsertShardPlacementRow(shardId, placementId, SHARD_STATE_ACTIVE, shardSize,
InsertShardPlacementRow(shardId, placementId, SHARD_STATE_ACTIVE,
shardSize,
groupId);
}
/* only update shard min/max values for append-partitioned tables */
if (IsCitusTableType(relationId, APPEND_DISTRIBUTED))
if (PartitionMethod(relationId) == DISTRIBUTE_BY_APPEND)
{
DeleteShardRow(shardId);
InsertShardRow(relationId, shardId, storageType, minValue, maxValue);
InsertShardRow(relationId, shardId, storageType, shardMinValue,
shardMaxValue);
}
if (QueryCancelPending)
{
ereport(WARNING, (errmsg("cancel requests are ignored during metadata update")));
QueryCancelPending = false;
}
RESUME_INTERRUPTS();
return shardSize;
}
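As a quick end-to-end illustration of the reworked statistics update above, here is a minimal SQL sketch; the table name dist_table is hypothetical, while master_update_table_statistics, pg_dist_shard and pg_dist_placement come from this changeset:

-- a hypothetical distributed table with some data in it
CREATE TABLE dist_table (key int, value text);
SELECT create_distributed_table('dist_table', 'key');
INSERT INTO dist_table SELECT i % 10, i::text FROM generate_series(1, 1000) i;

-- the shard statistics queries are now sent in parallel inside a distributed
-- transaction, so distributed deadlocks can be detected (see UpdateTableStatistics above)
SELECT master_update_table_statistics('dist_table');

-- shardlength in pg_dist_placement should now reflect the loaded data
SELECT shardid, shardlength
FROM pg_dist_placement
WHERE shardid IN (SELECT shardid FROM pg_dist_shard
                  WHERE logicalrelid = 'dist_table'::regclass);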

View File

@ -1063,7 +1063,7 @@ worker_save_query_explain_analyze(PG_FUNCTION_ARGS)
INSTR_TIME_SET_CURRENT(planStart);
PlannedStmt *plan = pg_plan_query_compat(query, NULL, 0, NULL);
PlannedStmt *plan = pg_plan_query_compat(query, NULL, CURSOR_OPT_PARALLEL_OK, NULL);
INSTR_TIME_SET_CURRENT(planDuration);
INSTR_TIME_SUBTRACT(planDuration, planStart);

View File

@ -1576,6 +1576,22 @@ LowerShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
/* setup partitionColumnValue argument once */
fcSetArg(compareFunction, 0, partitionColumnValue);
/*
* Now we test partitionColumnValue used in where clause such as
* partCol > partitionColumnValue (or partCol >= partitionColumnValue)
* against four possibilities, these are:
* 1) partitionColumnValue falls into a specific shard, such that:
* partitionColumnValue >= shard[x].min, and
* partitionColumnValue < shard[x].max (or partitionColumnValue <= shard[x].max).
* 2) partitionColumnValue < shard[x].min for all the shards
* 3) partitionColumnValue > shard[x].max for all the shards
* 4) partitionColumnValue falls in between two shards, such that:
* partitionColumnValue > shard[x].max and
* partitionColumnValue < shard[x+1].min
*
* For 1), we find that shard in below loop using binary search and
* return the index of it. For the others, see the end of this function.
*/
while (lowerBoundIndex < upperBoundIndex)
{
int middleIndex = lowerBoundIndex + ((upperBoundIndex - lowerBoundIndex) / 2);
@ -1608,7 +1624,7 @@ LowerShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
continue;
}
/* found interval containing partitionValue */
/* partitionColumnValue falls into a specific shard, possibility 1) */
return middleIndex;
}
@ -1619,20 +1635,30 @@ LowerShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
* (we'd have hit the return middleIndex; case otherwise). Figure out
* whether there's possibly any interval containing a value that's bigger
* than the partition key one.
*
* Also note that we initialized lowerBoundIndex with 0. Similarly,
* we always set it to the index of the shard that we consider as our
* lower boundary during binary search.
*/
if (lowerBoundIndex == 0)
if (lowerBoundIndex == shardCount)
{
/* all intervals are bigger, thus return 0 */
return 0;
}
else if (lowerBoundIndex == shardCount)
{
/* partition value is bigger than all partition values */
/*
* Since lowerBoundIndex is an inclusive index, being equal to shardCount
* means all the shards have smaller values than partitionColumnValue,
* which corresponds to possibility 3).
* In that case, since we can't have a lower bound shard, we return
* INVALID_SHARD_INDEX here.
*/
return INVALID_SHARD_INDEX;
}
/* value falls in between intervals */
return lowerBoundIndex + 1;
/*
* partitionColumnValue is either smaller than all the shards or falls in
* between two shards, which corresponds to possibility 2) or 4).
* Knowing that lowerBoundIndex is an inclusive index, we directly return
* it as the index for the lower bound shard here.
*/
return lowerBoundIndex;
}
@ -1652,6 +1678,23 @@ UpperShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
/* setup partitionColumnValue argument once */
fcSetArg(compareFunction, 0, partitionColumnValue);
/*
* Now we test partitionColumnValue used in where clause such as
* partCol < partitionColumnValue (or partCol <= partitionColumnValue)
* against four possibilities, these are:
* 1) partitionColumnValue falls into a specific shard, such that:
* partitionColumnValue <= shard[x].max, and
* partitionColumnValue > shard[x].min (or partitionColumnValue >= shard[x].min).
* 2) partitionColumnValue > shard[x].max for all the shards
* 3) partitionColumnValue < shard[x].min for all the shards
* 4) partitionColumnValue falls in between two shards, such that:
* partitionColumnValue > shard[x].max and
* partitionColumnValue < shard[x+1].min
*
* For 1), we find that shard in below loop using binary search and
* return the index of it. For the others, see the end of this function.
*/
while (lowerBoundIndex < upperBoundIndex)
{
int middleIndex = lowerBoundIndex + ((upperBoundIndex - lowerBoundIndex) / 2);
@ -1684,7 +1727,7 @@ UpperShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
continue;
}
/* found interval containing partitionValue */
/* partitionColumnValue falls into a specific shard, possibility 1) */
return middleIndex;
}
@ -1695,19 +1738,29 @@ UpperShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
* (we'd have hit the return middleIndex; case otherwise). Figure out
* whether there's possibly any interval containing a value that's smaller
* than the partition key one.
*
* Also note that we initialized upperBoundIndex with shardCount. Similarly,
* we always set it to the index of the next shard that we consider as our
* upper boundary during binary search.
*/
if (upperBoundIndex == shardCount)
if (upperBoundIndex == 0)
{
/* all intervals are smaller, thus return 0 */
return shardCount - 1;
}
else if (upperBoundIndex == 0)
{
/* partition value is smaller than all partition values */
/*
* Since upperBoundIndex is an exclusive index, being equal to 0 means
* all the shards have greater values than partitionColumnValue, which
* corresponds to possibility 3).
* In that case, since we can't have an upper bound shard, we return
* INVALID_SHARD_INDEX here.
*/
return INVALID_SHARD_INDEX;
}
/* value falls in between intervals, return the interval one smaller as bound */
/*
* partitionColumnValue is either greater than all the shards or falls in
* between two shards, which corresponds to possibility 2) or 4).
* Knowing that upperBoundIndex is an exclusive index, we return the index
* for the previous shard here.
*/
return upperBoundIndex - 1;
}
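To make the four cases enumerated in the pruning comments above concrete, here is a hedged SQL sketch; the table and its shard ranges are invented, only the comparison semantics come from LowerShardBoundary and UpperShardBoundary:

-- suppose an append-distributed table events(ev_time int) has two shards:
--   shard A: shardminvalue = 1,  shardmaxvalue = 10
--   shard B: shardminvalue = 21, shardmaxvalue = 30
SELECT count(*) FROM events WHERE ev_time >= 15;
-- 15 falls in between the shards (possibility 4), so shard B becomes the lower bound shard
SELECT count(*) FROM events WHERE ev_time >= 35;
-- 35 is greater than every shardmaxvalue (possibility 3), so there is no lower
-- bound shard and all shards can be pruned for this filter
SELECT count(*) FROM events WHERE ev_time <= 15;
-- on the upper bound side, 15 again falls in between the shards (possibility 4),
-- so shard A becomes the upper bound shard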

View File

@ -0,0 +1,3 @@
-- 9.4-1--9.4-2 was added later as a patch to fix a bug in our PG upgrade functions
#include "udfs/citus_prepare_pg_upgrade/9.4-2.sql"
#include "udfs/citus_finish_pg_upgrade/9.4-2.sql"

View File

@ -0,0 +1,9 @@
--
-- 9.4-1--9.4-2 was added later as a patch to fix a bug in our PG upgrade functions
--
-- This script brings users who installed the released patch back onto the 9.4-1
-- upgrade path. We do this via a semantic downgrade, since new changes have already
-- been introduced in the schema between 9.4-1 and 9.5-1. To make sure we include all
-- changes made during that version bump, we reuse the existing upgrade path from
-- our later introduced 9.4-2 version.
--

View File

@ -0,0 +1,7 @@
-- 9.4-2--9.4-3 was added later as a patch to improve master_update_table_statistics
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_update_table_statistics$$;
COMMENT ON FUNCTION pg_catalog.master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table';

View File

@ -0,0 +1,22 @@
-- citus--9.4-3--9.4-2
-- This is a downgrade path that will revert the changes made in citus--9.4-2--9.4-3.sql
-- 9.4-2--9.4-3 was added later as a patch to improve master_update_table_statistics.
-- We have this downgrade script so that we can continue from the main upgrade path
-- when upgrading to later versions.
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID AS $$
DECLARE
colocated_tables regclass[];
BEGIN
SELECT get_colocated_table_array(relation) INTO colocated_tables;
PERFORM
master_update_shard_statistics(shardid)
FROM
pg_dist_shard
WHERE
logicalrelid = ANY (colocated_tables);
END;
$$ LANGUAGE 'plpgsql';
COMMENT ON FUNCTION master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table and its colocated tables';

View File

@ -0,0 +1,3 @@
-- 9.5-1--9.5-2 was added later as a patch to fix a bug in our PG upgrade functions
#include "udfs/citus_prepare_pg_upgrade/9.5-2.sql"
#include "udfs/citus_finish_pg_upgrade/9.5-2.sql"

View File

@ -0,0 +1,9 @@
--
-- 9.5-1--9.5-2 was added later as a patch to fix a bug in our PG upgrade functions
--
-- This script brings users who installed the released patch back onto the 9.5-1
-- upgrade path. We do this via a semantic downgrade, since new changes have already
-- been introduced in the schema between 9.5-1 and 10.0-1. To make sure we include all
-- changes made during that version bump, we reuse the existing upgrade path from
-- our later introduced 9.5-2 version.
--

View File

@ -0,0 +1,7 @@
-- 9.5-2--9.5-3 was added later as a patch to improve master_update_table_statistics
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_update_table_statistics$$;
COMMENT ON FUNCTION pg_catalog.master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table';

View File

@ -0,0 +1,22 @@
-- citus--9.5-3--9.5-2
-- This is a downgrade path that will revert the changes made in citus--9.5-2--9.5-3.sql
-- 9.5-2--9.5-3 was added later as a patch to improve master_update_table_statistics.
-- We have this downgrade script so that we can continue from the main upgrade path
-- when upgrading to later versions.
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID AS $$
DECLARE
colocated_tables regclass[];
BEGIN
SELECT get_colocated_table_array(relation) INTO colocated_tables;
PERFORM
master_update_shard_statistics(shardid)
FROM
pg_dist_shard
WHERE
logicalrelid = ANY (colocated_tables);
END;
$$ LANGUAGE 'plpgsql';
COMMENT ON FUNCTION master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table and its colocated tables';

View File

@ -0,0 +1,105 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_finish_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
DECLARE
table_name regclass;
command text;
trigger_name text;
BEGIN
--
-- restore citus catalog tables
--
INSERT INTO pg_catalog.pg_dist_partition SELECT * FROM public.pg_dist_partition;
INSERT INTO pg_catalog.pg_dist_shard SELECT * FROM public.pg_dist_shard;
INSERT INTO pg_catalog.pg_dist_placement SELECT * FROM public.pg_dist_placement;
INSERT INTO pg_catalog.pg_dist_node_metadata SELECT * FROM public.pg_dist_node_metadata;
INSERT INTO pg_catalog.pg_dist_node SELECT * FROM public.pg_dist_node;
INSERT INTO pg_catalog.pg_dist_local_group SELECT * FROM public.pg_dist_local_group;
INSERT INTO pg_catalog.pg_dist_transaction SELECT * FROM public.pg_dist_transaction;
INSERT INTO pg_catalog.pg_dist_colocation SELECT * FROM public.pg_dist_colocation;
-- enterprise catalog tables
INSERT INTO pg_catalog.pg_dist_authinfo SELECT * FROM public.pg_dist_authinfo;
INSERT INTO pg_catalog.pg_dist_poolinfo SELECT * FROM public.pg_dist_poolinfo;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy DISABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
INSERT INTO pg_catalog.pg_dist_rebalance_strategy SELECT
name,
default_strategy,
shard_cost_function::regprocedure::regproc,
node_capacity_function::regprocedure::regproc,
shard_allowed_on_node_function::regprocedure::regproc,
default_threshold,
minimum_threshold
FROM public.pg_dist_rebalance_strategy;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy ENABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
--
-- drop backup tables
--
DROP TABLE public.pg_dist_authinfo;
DROP TABLE public.pg_dist_colocation;
DROP TABLE public.pg_dist_local_group;
DROP TABLE public.pg_dist_node;
DROP TABLE public.pg_dist_node_metadata;
DROP TABLE public.pg_dist_partition;
DROP TABLE public.pg_dist_placement;
DROP TABLE public.pg_dist_poolinfo;
DROP TABLE public.pg_dist_shard;
DROP TABLE public.pg_dist_transaction;
--
-- reset sequences
--
PERFORM setval('pg_catalog.pg_dist_shardid_seq', (SELECT MAX(shardid)+1 AS max_shard_id FROM pg_dist_shard), false);
PERFORM setval('pg_catalog.pg_dist_placement_placementid_seq', (SELECT MAX(placementid)+1 AS max_placement_id FROM pg_dist_placement), false);
PERFORM setval('pg_catalog.pg_dist_groupid_seq', (SELECT MAX(groupid)+1 AS max_group_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_node_nodeid_seq', (SELECT MAX(nodeid)+1 AS max_node_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_colocationid_seq', (SELECT MAX(colocationid)+1 AS max_colocation_id FROM pg_dist_colocation), false);
--
-- register triggers
--
FOR table_name IN SELECT logicalrelid FROM pg_catalog.pg_dist_partition
LOOP
trigger_name := 'truncate_trigger_' || table_name::oid;
command := 'create trigger ' || trigger_name || ' after truncate on ' || table_name || ' execute procedure pg_catalog.citus_truncate_trigger()';
EXECUTE command;
command := 'update pg_trigger set tgisinternal = true where tgname = ' || quote_literal(trigger_name);
EXECUTE command;
END LOOP;
--
-- set dependencies
--
INSERT INTO pg_depend
SELECT
'pg_class'::regclass::oid as classid,
p.logicalrelid::regclass::oid as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'n' as deptype
FROM pg_catalog.pg_dist_partition p;
-- restore pg_dist_object from the stable identifiers
TRUNCATE citus.pg_dist_object;
INSERT INTO citus.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
address.objid,
address.objsubid,
naming.distribution_argument_index,
naming.colocationid
FROM
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_finish_pg_upgrade()
IS 'perform tasks to restore citus settings from a location that has been prepared before pg_upgrade';
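Taken together with citus_prepare_pg_upgrade, and going by the function comments only, the intended flow is roughly the following sketch:

-- on the old cluster, before running pg_upgrade: copy the Citus metadata into
-- backup tables under the public schema
SELECT citus_prepare_pg_upgrade();
-- run pg_upgrade between the clusters, then on the upgraded cluster restore the
-- metadata from those backup tables (and drop them) with
SELECT citus_finish_pg_upgrade();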

View File

@ -0,0 +1,106 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_finish_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
DECLARE
table_name regclass;
command text;
trigger_name text;
BEGIN
--
-- restore citus catalog tables
--
INSERT INTO pg_catalog.pg_dist_partition SELECT * FROM public.pg_dist_partition;
INSERT INTO pg_catalog.pg_dist_shard SELECT * FROM public.pg_dist_shard;
INSERT INTO pg_catalog.pg_dist_placement SELECT * FROM public.pg_dist_placement;
INSERT INTO pg_catalog.pg_dist_node_metadata SELECT * FROM public.pg_dist_node_metadata;
INSERT INTO pg_catalog.pg_dist_node SELECT * FROM public.pg_dist_node;
INSERT INTO pg_catalog.pg_dist_local_group SELECT * FROM public.pg_dist_local_group;
INSERT INTO pg_catalog.pg_dist_transaction SELECT * FROM public.pg_dist_transaction;
INSERT INTO pg_catalog.pg_dist_colocation SELECT * FROM public.pg_dist_colocation;
-- enterprise catalog tables
INSERT INTO pg_catalog.pg_dist_authinfo SELECT * FROM public.pg_dist_authinfo;
INSERT INTO pg_catalog.pg_dist_poolinfo SELECT * FROM public.pg_dist_poolinfo;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy DISABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
INSERT INTO pg_catalog.pg_dist_rebalance_strategy SELECT
name,
default_strategy,
shard_cost_function::regprocedure::regproc,
node_capacity_function::regprocedure::regproc,
shard_allowed_on_node_function::regprocedure::regproc,
default_threshold,
minimum_threshold
FROM public.pg_dist_rebalance_strategy;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy ENABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
--
-- drop backup tables
--
DROP TABLE public.pg_dist_authinfo;
DROP TABLE public.pg_dist_colocation;
DROP TABLE public.pg_dist_local_group;
DROP TABLE public.pg_dist_node;
DROP TABLE public.pg_dist_node_metadata;
DROP TABLE public.pg_dist_partition;
DROP TABLE public.pg_dist_placement;
DROP TABLE public.pg_dist_poolinfo;
DROP TABLE public.pg_dist_shard;
DROP TABLE public.pg_dist_transaction;
DROP TABLE public.pg_dist_rebalance_strategy;
--
-- reset sequences
--
PERFORM setval('pg_catalog.pg_dist_shardid_seq', (SELECT MAX(shardid)+1 AS max_shard_id FROM pg_dist_shard), false);
PERFORM setval('pg_catalog.pg_dist_placement_placementid_seq', (SELECT MAX(placementid)+1 AS max_placement_id FROM pg_dist_placement), false);
PERFORM setval('pg_catalog.pg_dist_groupid_seq', (SELECT MAX(groupid)+1 AS max_group_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_node_nodeid_seq', (SELECT MAX(nodeid)+1 AS max_node_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_colocationid_seq', (SELECT MAX(colocationid)+1 AS max_colocation_id FROM pg_dist_colocation), false);
--
-- register triggers
--
FOR table_name IN SELECT logicalrelid FROM pg_catalog.pg_dist_partition
LOOP
trigger_name := 'truncate_trigger_' || table_name::oid;
command := 'create trigger ' || trigger_name || ' after truncate on ' || table_name || ' execute procedure pg_catalog.citus_truncate_trigger()';
EXECUTE command;
command := 'update pg_trigger set tgisinternal = true where tgname = ' || quote_literal(trigger_name);
EXECUTE command;
END LOOP;
--
-- set dependencies
--
INSERT INTO pg_depend
SELECT
'pg_class'::regclass::oid as classid,
p.logicalrelid::regclass::oid as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'n' as deptype
FROM pg_catalog.pg_dist_partition p;
-- restore pg_dist_object from the stable identifiers
TRUNCATE citus.pg_dist_object;
INSERT INTO citus.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
address.objid,
address.objsubid,
naming.distribution_argument_index,
naming.colocationid
FROM
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_finish_pg_upgrade()
IS 'perform tasks to restore citus settings from a location that has been prepared before pg_upgrade';

View File

@ -86,17 +86,7 @@ BEGIN
FROM pg_catalog.pg_dist_partition p;
-- restore pg_dist_object from the stable identifiers
-- DELETE/INSERT to avoid primary key violations
WITH old_records AS (
DELETE FROM
citus.pg_dist_object
RETURNING
type,
object_names,
object_args,
distribution_argument_index,
colocationid
)
TRUNCATE citus.pg_dist_object;
INSERT INTO citus.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
@ -105,8 +95,10 @@ BEGIN
naming.distribution_argument_index,
naming.colocationid
FROM
old_records naming,
pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
END;
$cppu$;

View File

@ -0,0 +1,44 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_prepare_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
BEGIN
--
-- backup citus catalog tables
--
CREATE TABLE public.pg_dist_partition AS SELECT * FROM pg_catalog.pg_dist_partition;
CREATE TABLE public.pg_dist_shard AS SELECT * FROM pg_catalog.pg_dist_shard;
CREATE TABLE public.pg_dist_placement AS SELECT * FROM pg_catalog.pg_dist_placement;
CREATE TABLE public.pg_dist_node_metadata AS SELECT * FROM pg_catalog.pg_dist_node_metadata;
CREATE TABLE public.pg_dist_node AS SELECT * FROM pg_catalog.pg_dist_node;
CREATE TABLE public.pg_dist_local_group AS SELECT * FROM pg_catalog.pg_dist_local_group;
CREATE TABLE public.pg_dist_transaction AS SELECT * FROM pg_catalog.pg_dist_transaction;
CREATE TABLE public.pg_dist_colocation AS SELECT * FROM pg_catalog.pg_dist_colocation;
-- enterprise catalog tables
CREATE TABLE public.pg_dist_authinfo AS SELECT * FROM pg_catalog.pg_dist_authinfo;
CREATE TABLE public.pg_dist_poolinfo AS SELECT * FROM pg_catalog.pg_dist_poolinfo;
CREATE TABLE public.pg_dist_rebalance_strategy AS SELECT
name,
default_strategy,
shard_cost_function::regprocedure::text,
node_capacity_function::regprocedure::text,
shard_allowed_on_node_function::regprocedure::text,
default_threshold,
minimum_threshold
FROM pg_catalog.pg_dist_rebalance_strategy;
-- store upgrade stable identifiers on pg_dist_object catalog
CREATE TABLE public.pg_dist_object AS SELECT
address.type,
address.object_names,
address.object_args,
objects.distribution_argument_index,
objects.colocationid
FROM citus.pg_dist_object objects,
pg_catalog.pg_identify_object_as_address(objects.classid, objects.objid, objects.objsubid) address;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_prepare_pg_upgrade()
IS 'perform tasks to copy citus settings to a location that could later be restored after pg_upgrade is done';

View File

@ -0,0 +1,60 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_prepare_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
BEGIN
--
-- Drop existing backup tables
--
DROP TABLE IF EXISTS public.pg_dist_partition;
DROP TABLE IF EXISTS public.pg_dist_shard;
DROP TABLE IF EXISTS public.pg_dist_placement;
DROP TABLE IF EXISTS public.pg_dist_node_metadata;
DROP TABLE IF EXISTS public.pg_dist_node;
DROP TABLE IF EXISTS public.pg_dist_local_group;
DROP TABLE IF EXISTS public.pg_dist_transaction;
DROP TABLE IF EXISTS public.pg_dist_colocation;
DROP TABLE IF EXISTS public.pg_dist_authinfo;
DROP TABLE IF EXISTS public.pg_dist_poolinfo;
DROP TABLE IF EXISTS public.pg_dist_rebalance_strategy;
DROP TABLE IF EXISTS public.pg_dist_object;
--
-- backup citus catalog tables
--
CREATE TABLE public.pg_dist_partition AS SELECT * FROM pg_catalog.pg_dist_partition;
CREATE TABLE public.pg_dist_shard AS SELECT * FROM pg_catalog.pg_dist_shard;
CREATE TABLE public.pg_dist_placement AS SELECT * FROM pg_catalog.pg_dist_placement;
CREATE TABLE public.pg_dist_node_metadata AS SELECT * FROM pg_catalog.pg_dist_node_metadata;
CREATE TABLE public.pg_dist_node AS SELECT * FROM pg_catalog.pg_dist_node;
CREATE TABLE public.pg_dist_local_group AS SELECT * FROM pg_catalog.pg_dist_local_group;
CREATE TABLE public.pg_dist_transaction AS SELECT * FROM pg_catalog.pg_dist_transaction;
CREATE TABLE public.pg_dist_colocation AS SELECT * FROM pg_catalog.pg_dist_colocation;
-- enterprise catalog tables
CREATE TABLE public.pg_dist_authinfo AS SELECT * FROM pg_catalog.pg_dist_authinfo;
CREATE TABLE public.pg_dist_poolinfo AS SELECT * FROM pg_catalog.pg_dist_poolinfo;
CREATE TABLE public.pg_dist_rebalance_strategy AS SELECT
name,
default_strategy,
shard_cost_function::regprocedure::text,
node_capacity_function::regprocedure::text,
shard_allowed_on_node_function::regprocedure::text,
default_threshold,
minimum_threshold
FROM pg_catalog.pg_dist_rebalance_strategy;
-- store upgrade stable identifiers on pg_dist_object catalog
CREATE TABLE public.pg_dist_object AS SELECT
address.type,
address.object_names,
address.object_args,
objects.distribution_argument_index,
objects.colocationid
FROM citus.pg_dist_object objects,
pg_catalog.pg_identify_object_as_address(objects.classid, objects.objid, objects.objsubid) address;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_prepare_pg_upgrade()
IS 'perform tasks to copy citus settings to a location that could later be restored after pg_upgrade is done';
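Because this 9.5-2 version drops any leftover backup tables before recreating them, repeated invocations no longer fail on the CREATE TABLE statements; a minimal sketch:

-- after an aborted upgrade attempt the public.pg_dist_* backup tables may still
-- exist; the DROP TABLE IF EXISTS block above makes a repeated call succeed
SELECT citus_prepare_pg_upgrade();
SELECT citus_prepare_pg_upgrade();  -- recreates the backup tables instead of erroring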

View File

@ -18,6 +18,7 @@ BEGIN
DROP TABLE IF EXISTS public.pg_dist_authinfo;
DROP TABLE IF EXISTS public.pg_dist_poolinfo;
DROP TABLE IF EXISTS public.pg_dist_rebalance_strategy;
DROP TABLE IF EXISTS public.pg_dist_object;
--
-- backup citus catalog tables
@ -44,8 +45,14 @@ BEGIN
FROM pg_catalog.pg_dist_rebalance_strategy;
-- store upgrade stable identifiers on pg_dist_object catalog
UPDATE citus.pg_dist_object
SET (type, object_names, object_args) = (SELECT * FROM pg_identify_object_as_address(classid, objid, objsubid));
CREATE TABLE public.pg_dist_object AS SELECT
address.type,
address.object_names,
address.object_args,
objects.distribution_argument_index,
objects.colocationid
FROM citus.pg_dist_object objects,
pg_catalog.pg_identify_object_as_address(objects.classid, objects.objid, objects.objsubid) address;
END;
$cppu$;

View File

@ -16,6 +16,7 @@
#include "catalog/pg_type.h"
#include "distributed/connection_management.h"
#include "distributed/listutils.h"
#include "distributed/maintenanced.h"
#include "distributed/metadata_sync.h"
#include "distributed/remote_commands.h"
#include "postmaster/postmaster.h"
@ -28,6 +29,8 @@
/* declarations for dynamic loading */
PG_FUNCTION_INFO_V1(master_metadata_snapshot);
PG_FUNCTION_INFO_V1(wait_until_metadata_sync);
PG_FUNCTION_INFO_V1(trigger_metadata_sync);
PG_FUNCTION_INFO_V1(raise_error_in_metadata_sync);
/*
@ -124,3 +127,26 @@ wait_until_metadata_sync(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
/*
* trigger_metadata_sync triggers metadata sync for testing.
*/
Datum
trigger_metadata_sync(PG_FUNCTION_ARGS)
{
TriggerMetadataSyncOnCommit();
PG_RETURN_VOID();
}
/*
* raise_error_in_metadata_sync causes metadata sync to raise an error.
*/
Datum
raise_error_in_metadata_sync(PG_FUNCTION_ARGS)
{
/* metadata sync uses SIGALRM to test errors */
SignalMetadataSyncDaemon(MyDatabaseId, SIGALRM);
PG_RETURN_VOID();
}
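The SQL-level definitions of these two test helpers are not part of this hunk; the following is a hypothetical sketch of how a regression test could expose and call them (the AS 'citus' library reference and the zero-argument signatures are assumptions based on the C entry points above):

CREATE OR REPLACE FUNCTION trigger_metadata_sync()
    RETURNS void
    LANGUAGE C STRICT
    AS 'citus', $$trigger_metadata_sync$$;
CREATE OR REPLACE FUNCTION raise_error_in_metadata_sync()
    RETURNS void
    LANGUAGE C STRICT
    AS 'citus', $$raise_error_in_metadata_sync$$;
-- asks for metadata sync to be triggered once the current transaction commits
SELECT trigger_metadata_sync();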

View File

@ -28,6 +28,7 @@
#include "distributed/listutils.h"
#include "distributed/local_executor.h"
#include "distributed/locally_reserved_shared_connections.h"
#include "distributed/maintenanced.h"
#include "distributed/multi_executor.h"
#include "distributed/multi_explain.h"
#include "distributed/repartition_join_execution.h"
@ -102,6 +103,9 @@ bool CoordinatedTransactionUses2PC = false;
/* if disabled, distributed statements in a function may run as separate transactions */
bool FunctionOpensTransactionBlock = true;
/* if true, we should trigger metadata sync on commit */
bool MetadataSyncOnCommit = false;
/* transaction management functions */
static void CoordinatedTransactionCallback(XactEvent event, void *arg);
@ -262,6 +266,15 @@ CoordinatedTransactionCallback(XactEvent event, void *arg)
AfterXactConnectionHandling(true);
}
/*
* Changes to catalog tables are now visible to the metadata sync
* daemon, so we can trigger metadata sync if necessary.
*/
if (MetadataSyncOnCommit)
{
TriggerMetadataSync(MyDatabaseId);
}
ResetGlobalVariables();
/*
@ -474,6 +487,7 @@ ResetGlobalVariables()
activeSetStmts = NULL;
CoordinatedTransactionUses2PC = false;
TransactionModifiedNodeMetadata = false;
MetadataSyncOnCommit = false;
ResetWorkerErrorIndication();
}
@ -728,3 +742,15 @@ MaybeExecutingUDF(void)
{
return ExecutorLevel > 1 || (ExecutorLevel == 1 && PlannerLevel > 0);
}
/*
* TriggerMetadataSyncOnCommit sets a flag to do metadata sync on commit.
* This is because new metadata only becomes visible to the metadata sync
* daemon after commit happens.
*/
void
TriggerMetadataSyncOnCommit(void)
{
MetadataSyncOnCommit = true;
}

View File

@ -18,6 +18,7 @@
#include "access/htup_details.h"
#include "distributed/distribution_column.h"
#include "distributed/metadata_cache.h"
#include "distributed/multi_partitioning_utils.h"
#include "distributed/version_compat.h"
#include "nodes/makefuncs.h"
#include "nodes/nodes.h"
@ -115,6 +116,53 @@ column_to_column_name(PG_FUNCTION_ARGS)
}
/*
* FindColumnWithNameOnTargetRelation gets a source table and
* column name. The function returns the column with the
* same name on the target table.
*
* Note that due to dropping columns, the parent's distribution key may not
* match the partition's distribution key. See issue #5123.
*
* The function throws an error if the input or output is not valid or does
* not exist.
*/
Var *
FindColumnWithNameOnTargetRelation(Oid sourceRelationId, char *sourceColumnName,
Oid targetRelationId)
{
if (sourceColumnName == NULL || sourceColumnName[0] == '\0')
{
ereport(ERROR, (errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("cannot find the given column on table \"%s\"",
generate_qualified_relation_name(sourceRelationId))));
}
AttrNumber attributeNumberOnTarget = get_attnum(targetRelationId, sourceColumnName);
if (attributeNumberOnTarget == InvalidAttrNumber)
{
ereport(ERROR, (errmsg("Column \"%s\" does not exist on "
"relation \"%s\"", sourceColumnName,
get_rel_name(targetRelationId))));
}
Index varNo = 1;
Oid targetTypeId = InvalidOid;
int32 targetTypMod = 0;
Oid targetCollation = InvalidOid;
Index varlevelsup = 0;
/* this function throws an error in case anything goes wrong */
get_atttypetypmodcoll(targetRelationId, attributeNumberOnTarget,
&targetTypeId, &targetTypMod, &targetCollation);
Var *targetColumn =
makeVar(varNo, attributeNumberOnTarget, targetTypeId, targetTypMod,
targetCollation, varlevelsup);
return targetColumn;
}
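The dropped-column scenario referenced above (issue #5123) can be reproduced with a short, hedged SQL sketch; the table names are invented:

-- the parent loses a column that sits before the distribution key, so dist_key
-- stays at attribute number 2 on the parent while a freshly created partition
-- only has the live columns and gets dist_key at attribute number 1; matching
-- the column by name rather than by attribute number keeps the partition's
-- distribution column correct
CREATE TABLE parent (dropped_col int, dist_key int, payload text)
    PARTITION BY RANGE (dist_key);
ALTER TABLE parent DROP COLUMN dropped_col;
SELECT create_distributed_table('parent', 'dist_key');
CREATE TABLE parent_p0 PARTITION OF parent FOR VALUES FROM (0) TO (100);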
/*
* BuildDistributionKeyFromColumnName builds a simple distribution key consisting
* only out of a reference to the column of name columnName. Errors out if the

View File

@ -58,7 +58,6 @@ typedef struct ForeignConstraintRelationshipGraph
typedef struct ForeignConstraintRelationshipNode
{
Oid relationId;
bool visited;
List *adjacencyList;
List *backAdjacencyList;
}ForeignConstraintRelationshipNode;
@ -78,16 +77,21 @@ typedef struct ForeignConstraintRelationshipEdge
static ForeignConstraintRelationshipGraph *fConstraintRelationshipGraph = NULL;
static ForeignConstraintRelationshipNode * GetRelationshipNodeForRelationId(Oid
relationId,
bool *isFound);
static void CreateForeignConstraintRelationshipGraph(void);
static List * GetNeighbourList(ForeignConstraintRelationshipNode *relationshipNode,
bool isReferencing);
static List * GetRelationIdsFromRelationshipNodeList(List *fKeyRelationshipNodeList);
static void PopulateAdjacencyLists(void);
static int CompareForeignConstraintRelationshipEdges(const void *leftElement,
const void *rightElement);
static void AddForeignConstraintRelationshipEdge(Oid referencingOid, Oid referencedOid);
static ForeignConstraintRelationshipNode * CreateOrFindNode(HTAB *adjacencyLists, Oid
relid);
static void GetConnectedListHelper(ForeignConstraintRelationshipNode *node,
List **adjacentNodeList, bool
isReferencing);
static List * GetConnectedListHelper(ForeignConstraintRelationshipNode *node,
bool isReferencing);
static List * GetForeignConstraintRelationshipHelper(Oid relationId, bool isReferencing);
@ -108,7 +112,7 @@ ReferencedRelationIdList(Oid relationId)
/*
* ReferencingRelationIdList is a wrapper function around GetForeignConstraintRelationshipHelper
* to get list of relation IDs which are referencing by the given relation id.
* to get the list of relation IDs which are referencing the given relation id.
*
* Note that, if relation A is referenced by relation B and relation B is referenced
* by relation C, then the result list for relation C consists of the relation
@ -129,16 +133,9 @@ ReferencingRelationIdList(Oid relationId)
static List *
GetForeignConstraintRelationshipHelper(Oid relationId, bool isReferencing)
{
List *foreignConstraintList = NIL;
List *foreignNodeList = NIL;
bool isFound = false;
CreateForeignConstraintRelationshipGraph();
ForeignConstraintRelationshipNode *relationNode =
(ForeignConstraintRelationshipNode *) hash_search(
fConstraintRelationshipGraph->nodeMap, &relationId,
HASH_FIND, &isFound);
ForeignConstraintRelationshipNode *relationshipNode =
GetRelationshipNodeForRelationId(relationId, &isFound);
if (!isFound)
{
@ -149,24 +146,31 @@ GetForeignConstraintRelationshipHelper(Oid relationId, bool isReferencing)
return NIL;
}
GetConnectedListHelper(relationNode, &foreignNodeList, isReferencing);
List *connectedNodeList = GetConnectedListHelper(relationshipNode, isReferencing);
List *relationIdList = GetRelationIdsFromRelationshipNodeList(connectedNodeList);
return relationIdList;
}
/*
* We need only their OIDs, we get back node list to make their visited
* variable to false for using them iteratively.
*/
ForeignConstraintRelationshipNode *currentNode = NULL;
foreach_ptr(currentNode, foreignNodeList)
{
foreignConstraintList = lappend_oid(foreignConstraintList,
currentNode->relationId);
currentNode->visited = false;
}
/* set to false separately, since we don't add itself to foreign node list */
relationNode->visited = false;
/*
* GetRelationshipNodeForRelationId searches foreign key graph for relation
* with relationId and returns ForeignConstraintRelationshipNode object for
* relation if it exists in graph. Otherwise, sets isFound to false.
*
* Also before searching foreign key graph, this function implicitly builds
* foreign key graph if it's invalid or not built yet.
*/
static ForeignConstraintRelationshipNode *
GetRelationshipNodeForRelationId(Oid relationId, bool *isFound)
{
CreateForeignConstraintRelationshipGraph();
return foreignConstraintList;
ForeignConstraintRelationshipNode *relationshipNode =
(ForeignConstraintRelationshipNode *) hash_search(
fConstraintRelationshipGraph->nodeMap, &relationId,
HASH_FIND, isFound);
return relationshipNode;
}
@ -249,38 +253,142 @@ SetForeignConstraintRelationshipGraphInvalid()
/*
* GetConnectedListHelper is the function for getting nodes connected (or connecting) to
* the given relation. adjacentNodeList holds the result for recursive calls and
* by changing isReferencing caller function can select connected or connecting
* adjacency list.
* GetConnectedListHelper returns list of ForeignConstraintRelationshipNode
* objects for relations referenced by or referencing the given relation,
* according to the isReferencing flag.
*
*/
static void
GetConnectedListHelper(ForeignConstraintRelationshipNode *node, List **adjacentNodeList,
bool isReferencing)
static List *
GetConnectedListHelper(ForeignConstraintRelationshipNode *node, bool isReferencing)
{
List *neighbourList = NIL;
HTAB *oidVisitedMap = CreateOidVisitedHashSet();
node->visited = true;
List *connectedNodeList = NIL;
List *relationshipNodeStack = list_make1(node);
while (list_length(relationshipNodeStack) != 0)
{
/*
* Note that this loop considers leftmost element of
* relationshipNodeStack as top of the stack.
*/
/* pop top element from stack */
ForeignConstraintRelationshipNode *currentNode = linitial(relationshipNodeStack);
relationshipNodeStack = list_delete_first(relationshipNodeStack);
Oid currentRelationId = currentNode->relationId;
if (!OidVisited(oidVisitedMap, currentRelationId))
{
connectedNodeList = lappend(connectedNodeList, currentNode);
VisitOid(oidVisitedMap, currentRelationId);
}
List *neighbourList = GetNeighbourList(currentNode, isReferencing);
ForeignConstraintRelationshipNode *neighbourNode = NULL;
foreach_ptr(neighbourNode, neighbourList)
{
Oid neighbourRelationId = neighbourNode->relationId;
if (!OidVisited(oidVisitedMap, neighbourRelationId))
{
/* push to stack */
relationshipNodeStack = lcons(neighbourNode, relationshipNodeStack);
}
}
}
hash_destroy(oidVisitedMap);
/* finally remove yourself from list */
connectedNodeList = list_delete_first(connectedNodeList);
return connectedNodeList;
}
/*
* CreateOidVisitedHashSet creates and returns a hash-set object in
* CurrentMemoryContext to store visited oids.
* As hash_create allocates memory on the heap, callers are responsible for
* calling hash_destroy when appropriate.
*/
HTAB *
CreateOidVisitedHashSet(void)
{
HASHCTL info = { 0 };
info.keysize = sizeof(Oid);
info.hash = oid_hash;
info.hcxt = CurrentMemoryContext;
/* we don't have value field as it's a set */
info.entrysize = info.keysize;
uint32 hashFlags = (HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
HTAB *oidVisitedMap = hash_create("oid visited hash map", 32, &info, hashFlags);
return oidVisitedMap;
}
/*
* OidVisited returns true if given oid is visited according to given oid hash-set.
*/
bool
OidVisited(HTAB *oidVisitedMap, Oid oid)
{
bool found = false;
hash_search(oidVisitedMap, &oid, HASH_FIND, &found);
return found;
}
/*
* VisitOid sets given oid as visited in given hash-set.
*/
void
VisitOid(HTAB *oidVisitedMap, Oid oid)
{
bool found = false;
hash_search(oidVisitedMap, &oid, HASH_ENTER, &found);
}
/*
* GetNeighbourList returns the relevant adjacency list of the given
* ForeignConstraintRelationshipNode object, depending on the isReferencing
* flag.
*/
static List *
GetNeighbourList(ForeignConstraintRelationshipNode *relationshipNode, bool isReferencing)
{
if (isReferencing)
{
neighbourList = node->backAdjacencyList;
return relationshipNode->backAdjacencyList;
}
else
{
neighbourList = node->adjacencyList;
return relationshipNode->adjacencyList;
}
}
/*
* GetRelationIdsFromRelationshipNodeList returns list of relationId's for
* given ForeignConstraintRelationshipNode object list.
*/
static List *
GetRelationIdsFromRelationshipNodeList(List *fKeyRelationshipNodeList)
{
List *relationIdList = NIL;
ForeignConstraintRelationshipNode *fKeyRelationshipNode = NULL;
foreach_ptr(fKeyRelationshipNode, fKeyRelationshipNodeList)
{
Oid relationId = fKeyRelationshipNode->relationId;
relationIdList = lappend_oid(relationIdList, relationId);
}
ForeignConstraintRelationshipNode *neighborNode = NULL;
foreach_ptr(neighborNode, neighbourList)
{
if (neighborNode->visited == false)
{
*adjacentNodeList = lappend(*adjacentNodeList, neighborNode);
GetConnectedListHelper(neighborNode, adjacentNodeList, isReferencing);
}
}
return relationIdList;
}
@ -415,7 +523,6 @@ CreateOrFindNode(HTAB *adjacencyLists, Oid relid)
{
node->adjacencyList = NIL;
node->backAdjacencyList = NIL;
node->visited = false;
}
return node;

View File

@ -118,7 +118,6 @@ static size_t MaintenanceDaemonShmemSize(void);
static void MaintenanceDaemonShmemInit(void);
static void MaintenanceDaemonShmemExit(int code, Datum arg);
static void MaintenanceDaemonErrorContext(void *arg);
static bool LockCitusExtension(void);
static bool MetadataSyncTriggeredCheckAndReset(MaintenanceDaemonDBData *dbData);
static void WarnMaintenanceDaemonNotStarted(void);
@ -291,6 +290,13 @@ CitusMaintenanceDaemonMain(Datum main_arg)
TimestampTz lastRecoveryTime = 0;
TimestampTz nextMetadataSyncTime = 0;
/*
* We do metadata sync in a separate background worker. We need its
* handle to be able to check its status.
*/
BackgroundWorkerHandle *metadataSyncBgwHandle = NULL;
/*
* Look up this worker's configuration.
*/
@ -371,6 +377,12 @@ CitusMaintenanceDaemonMain(Datum main_arg)
/* make worker recognizable in pg_stat_activity */
pgstat_report_appname("Citus Maintenance Daemon");
/*
* Terminate orphaned metadata sync daemons spawned from previously terminated
* or crashed maintenanced instances.
*/
SignalMetadataSyncDaemon(databaseOid, SIGTERM);
/* enter main loop */
while (!got_SIGTERM)
{
@ -450,21 +462,42 @@ CitusMaintenanceDaemonMain(Datum main_arg)
}
#endif
if (!RecoveryInProgress() &&
(MetadataSyncTriggeredCheckAndReset(myDbData) ||
GetCurrentTimestamp() >= nextMetadataSyncTime))
pid_t metadataSyncBgwPid = 0;
BgwHandleStatus metadataSyncStatus =
metadataSyncBgwHandle != NULL ?
GetBackgroundWorkerPid(metadataSyncBgwHandle, &metadataSyncBgwPid) :
BGWH_STOPPED;
if (metadataSyncStatus != BGWH_STOPPED &&
GetCurrentTimestamp() >= nextMetadataSyncTime)
{
bool metadataSyncFailed = false;
/*
* Metadata sync is still running, recheck in a short while.
*/
int nextTimeout = MetadataSyncRetryInterval;
nextMetadataSyncTime =
TimestampTzPlusMilliseconds(GetCurrentTimestamp(), nextTimeout);
timeout = Min(timeout, nextTimeout);
}
else if (!RecoveryInProgress() &&
metadataSyncStatus == BGWH_STOPPED &&
(MetadataSyncTriggeredCheckAndReset(myDbData) ||
GetCurrentTimestamp() >= nextMetadataSyncTime))
{
if (metadataSyncBgwHandle)
{
TerminateBackgroundWorker(metadataSyncBgwHandle);
pfree(metadataSyncBgwHandle);
metadataSyncBgwHandle = NULL;
}
InvalidateMetadataSystemCache();
StartTransactionCommand();
/*
* Some functions in ruleutils.c, which we use to get the DDL for
* metadata propagation, require an active snapshot.
*/
PushActiveSnapshot(GetTransactionSnapshot());
int nextTimeout = MetadataSyncRetryInterval;
bool syncMetadata = false;
if (!LockCitusExtension())
{
ereport(DEBUG1, (errmsg("could not lock the citus extension, "
@ -472,25 +505,28 @@ CitusMaintenanceDaemonMain(Datum main_arg)
}
else if (CheckCitusVersion(DEBUG1) && CitusHasBeenLoaded())
{
MetadataSyncResult result = SyncMetadataToNodes();
metadataSyncFailed = (result != METADATA_SYNC_SUCCESS);
bool lockFailure = false;
syncMetadata = ShouldInitiateMetadataSync(&lockFailure);
/*
* Notification means we had an attempt on synchronization
* without being blocked for pg_dist_node access.
* If lock fails, we need to recheck in a short while. If we are
* going to sync metadata, we should recheck in a short while to
* see if it failed. Otherwise, we can wait longer.
*/
if (result != METADATA_SYNC_FAILED_LOCK)
{
Async_Notify(METADATA_SYNC_CHANNEL, NULL);
}
nextTimeout = (lockFailure || syncMetadata) ?
MetadataSyncRetryInterval :
MetadataSyncInterval;
}
PopActiveSnapshot();
CommitTransactionCommand();
ProcessCompletedNotifies();
int64 nextTimeout = metadataSyncFailed ? MetadataSyncRetryInterval :
MetadataSyncInterval;
if (syncMetadata)
{
metadataSyncBgwHandle =
SpawnSyncMetadataToNodes(MyDatabaseId, myDbData->userOid);
}
nextMetadataSyncTime =
TimestampTzPlusMilliseconds(GetCurrentTimestamp(), nextTimeout);
timeout = Min(timeout, nextTimeout);
@ -626,6 +662,11 @@ CitusMaintenanceDaemonMain(Datum main_arg)
ProcessConfigFile(PGC_SIGHUP);
}
}
if (metadataSyncBgwHandle)
{
TerminateBackgroundWorker(metadataSyncBgwHandle);
}
}
@ -786,7 +827,7 @@ MaintenanceDaemonErrorContext(void *arg)
* LockCitusExtension acquires a lock on the Citus extension or returns
* false if the extension does not exist or is being dropped.
*/
static bool
bool
LockCitusExtension(void)
{
Oid extensionOid = get_extension_oid("citus", true);

View File

@ -41,7 +41,7 @@ alter_role_if_exists(PG_FUNCTION_ARGS)
Node *parseTree = ParseTreeNode(utilityQuery);
CitusProcessUtility(parseTree, utilityQuery, PROCESS_UTILITY_TOPLEVEL, NULL,
CitusProcessUtility(parseTree, utilityQuery, PROCESS_UTILITY_QUERY, NULL,
None_Receiver, NULL);
PG_RETURN_BOOL(true);
@ -98,7 +98,7 @@ worker_create_or_alter_role(PG_FUNCTION_ARGS)
CitusProcessUtility(parseTree,
createRoleUtilityQuery,
PROCESS_UTILITY_TOPLEVEL,
PROCESS_UTILITY_QUERY,
NULL,
None_Receiver, NULL);
@ -126,7 +126,7 @@ worker_create_or_alter_role(PG_FUNCTION_ARGS)
CitusProcessUtility(parseTree,
alterRoleUtilityQuery,
PROCESS_UTILITY_TOPLEVEL,
PROCESS_UTILITY_QUERY,
NULL,
None_Receiver, NULL);

View File

@ -465,7 +465,7 @@ SingleReplicatedTable(Oid relationId)
List *shardPlacementList = NIL;
/* we could have append/range distributed tables without shards */
if (list_length(shardList) <= 1)
if (list_length(shardList) == 0)
{
return false;
}

View File

@ -111,12 +111,12 @@ worker_create_or_replace_object(PG_FUNCTION_ARGS)
RenameStmt *renameStmt = CreateRenameStatement(&address, newName);
const char *sqlRenameStmt = DeparseTreeNode((Node *) renameStmt);
CitusProcessUtility((Node *) renameStmt, sqlRenameStmt,
PROCESS_UTILITY_TOPLEVEL,
PROCESS_UTILITY_QUERY,
NULL, None_Receiver, NULL);
}
/* apply create statement locally */
CitusProcessUtility(parseTree, sqlStatement, PROCESS_UTILITY_TOPLEVEL, NULL,
CitusProcessUtility(parseTree, sqlStatement, PROCESS_UTILITY_QUERY, NULL,
None_Receiver, NULL);
/* type has been created */

View File

@ -396,7 +396,7 @@ worker_apply_shard_ddl_command(PG_FUNCTION_ARGS)
/* extend names in ddl command and apply extended command */
RelayEventExtendNames(ddlCommandNode, schemaName, shardId);
CitusProcessUtility(ddlCommandNode, ddlCommand, PROCESS_UTILITY_TOPLEVEL, NULL,
CitusProcessUtility(ddlCommandNode, ddlCommand, PROCESS_UTILITY_QUERY, NULL,
None_Receiver, NULL);
PG_RETURN_VOID();
@ -428,7 +428,7 @@ worker_apply_inter_shard_ddl_command(PG_FUNCTION_ARGS)
RelayEventExtendNamesForInterShardCommands(ddlCommandNode, leftShardId,
leftShardSchemaName, rightShardId,
rightShardSchemaName);
CitusProcessUtility(ddlCommandNode, ddlCommand, PROCESS_UTILITY_TOPLEVEL, NULL,
CitusProcessUtility(ddlCommandNode, ddlCommand, PROCESS_UTILITY_QUERY, NULL,
None_Receiver, NULL);
PG_RETURN_VOID();
@ -461,7 +461,7 @@ worker_apply_sequence_command(PG_FUNCTION_ARGS)
}
/* run the CREATE SEQUENCE command */
CitusProcessUtility(commandNode, commandString, PROCESS_UTILITY_TOPLEVEL, NULL,
CitusProcessUtility(commandNode, commandString, PROCESS_UTILITY_QUERY, NULL,
None_Receiver, NULL);
CommandCounterIncrement();
@ -669,7 +669,7 @@ worker_append_table_to_shard(PG_FUNCTION_ARGS)
SetUserIdAndSecContext(CitusExtensionOwner(), SECURITY_LOCAL_USERID_CHANGE);
CitusProcessUtility((Node *) localCopyCommand, queryString->data,
PROCESS_UTILITY_TOPLEVEL, NULL, None_Receiver, NULL);
PROCESS_UTILITY_QUERY, NULL, None_Receiver, NULL);
SetUserIdAndSecContext(savedUserId, savedSecurityContext);
@ -782,7 +782,7 @@ AlterSequenceMinMax(Oid sequenceId, char *schemaName, char *sequenceName,
/* since the command is an AlterSeqStmt, a dummy command string works fine */
CitusProcessUtility((Node *) alterSequenceStatement, dummyString,
PROCESS_UTILITY_TOPLEVEL, NULL, None_Receiver, NULL);
PROCESS_UTILITY_QUERY, NULL, None_Receiver, NULL);
}
}

View File

@ -20,6 +20,7 @@ typedef struct CitusScanState
CustomScanState customScanState; /* underlying custom scan node */
/* function that gets called before postgres starts its execution */
bool finishedPreScan; /* flag to check if the pre scan is finished */
void (*PreExecScan)(struct CitusScanState *scanState);
DistributedPlan *distributedPlan; /* distributed execution plan */

View File

@ -64,6 +64,26 @@ typedef enum ExtractForeignKeyConstraintsMode
EXCLUDE_SELF_REFERENCES = 1 << 2
} ExtractForeignKeyConstraintMode;
/*
* Flags that can be passed to GetForeignKeyIdsForColumn to
* indicate whether relationId argument should match:
* - the referencing relation,
* - the referenced relation,
* or both sides.
*/
typedef enum SearchForeignKeyColumnFlags
{
/* relationId argument should match referencing relation */
SEARCH_REFERENCING_RELATION = 1 << 0,
/* relationId argument should match referenced relation */
SEARCH_REFERENCED_RELATION = 1 << 1,
/* callers can also pass union of above flags */
} SearchForeignKeyColumnFlags;
/* cluster.c - forward declarations */
extern List * PreprocessClusterStmt(Node *node, const char *clusterCommand);
@ -119,15 +139,21 @@ extern void ErrorIfUnsupportedForeignConstraintExists(Relation relation,
Var *distributionColumn,
uint32 colocationId);
extern void ErrorOutForFKeyBetweenPostgresAndCitusLocalTable(Oid localTableId);
extern bool ColumnReferencedByAnyForeignKey(char *columnName, Oid relationId);
extern bool ColumnAppearsInForeignKey(char *columnName, Oid relationId);
extern bool ColumnAppearsInForeignKeyToReferenceTable(char *columnName, Oid
relationId);
extern List * GetReferencingForeignConstaintCommands(Oid relationOid);
extern bool AnyForeignKeyDependsOnIndex(Oid indexId);
extern bool HasForeignKeyToCitusLocalTable(Oid relationId);
extern bool HasForeignKeyToReferenceTable(Oid relationOid);
extern bool TableReferenced(Oid relationOid);
extern bool TableReferencing(Oid relationOid);
extern bool ConstraintIsAUniquenessConstraint(char *inputConstaintName, Oid relationId);
extern bool ConstraintIsAForeignKey(char *inputConstaintName, Oid relationOid);
extern Oid GetForeignKeyOidByName(char *inputConstaintName, Oid relationId);
extern bool ConstraintWithNameIsOfType(char *inputConstaintName, Oid relationId,
char targetConstraintType);
extern bool ConstraintWithIdIsOfType(Oid constraintId, char targetConstraintType);
extern void ErrorIfTableHasExternalForeignKeys(Oid relationId);
extern List * GetForeignKeyOids(Oid relationId, int flags);
extern Oid GetReferencedTableId(Oid foreignKeyId);

View File

@ -178,6 +178,9 @@ typedef struct ConnectionHashEntry
{
ConnectionHashKey key;
dlist_head *connections;
/* connections list is valid or not */
bool isValid;
} ConnectionHashEntry;
/* hash entry for cached connection parameters */

View File

@ -19,6 +19,9 @@
/* Remaining metadata utility functions */
extern Var * FindColumnWithNameOnTargetRelation(Oid sourceRelationId,
char *sourceColumnName,
Oid targetRelationId);
extern Var * BuildDistributionKeyFromColumnName(Relation distributedRelation,
char *columnName);
extern char * ColumnToColumnName(Oid relationId, char *columnNodeString);

View File

@ -20,5 +20,8 @@ extern List * ReferencingRelationIdList(Oid relationId);
extern void SetForeignConstraintRelationshipGraphInvalid(void);
extern bool IsForeignConstraintRelationshipGraphValid(void);
extern void ClearForeignConstraintRelationshipGraphContext(void);
extern HTAB * CreateOidVisitedHashSet(void);
extern bool OidVisited(HTAB *oidVisitedMap, Oid oid);
extern void VisitOid(HTAB *oidVisitedMap, Oid oid);
#endif

View File

@ -25,6 +25,7 @@ extern void StopMaintenanceDaemon(Oid databaseId);
extern void TriggerMetadataSync(Oid databaseId);
extern void InitializeMaintenanceDaemon(void);
extern void InitializeMaintenanceDaemonBackend(void);
extern bool LockCitusExtension(void);
extern void CitusMaintenanceDaemonMain(Datum main_arg);

View File

@ -21,6 +21,8 @@ extern List * GetUniqueDependenciesList(List *objectAddressesList);
extern List * GetDependenciesForObject(const ObjectAddress *target);
extern List * OrderObjectAddressListInDependencyOrder(List *objectAddressList);
extern bool SupportedDependencyByCitus(const ObjectAddress *address);
extern List * GetPgDependTuplesForDependingObjects(Oid targetObjectClassId,
Oid targetObjectId);
extern List * GetDependingViews(Oid relationId);
#endif /* CITUS_DEPENDENCY_H */

View File

@ -50,11 +50,14 @@ extern char * PlacementUpsertCommand(uint64 shardId, uint64 placementId, int sha
extern void CreateTableMetadataOnWorkers(Oid relationId);
extern void MarkNodeHasMetadata(const char *nodeName, int32 nodePort, bool hasMetadata);
extern void MarkNodeMetadataSynced(const char *nodeName, int32 nodePort, bool synced);
extern MetadataSyncResult SyncMetadataToNodes(void);
extern BackgroundWorkerHandle * SpawnSyncMetadataToNodes(Oid database, Oid owner);
extern bool SendOptionalCommandListToWorkerInTransaction(const char *nodeName, int32
nodePort,
const char *nodeUser,
List *commandList);
extern void SyncMetadataToNodesMain(Datum main_arg);
extern void SignalMetadataSyncDaemon(Oid database, int sig);
extern bool ShouldInitiateMetadataSync(bool *lockFailure);
#define DELETE_ALL_NODES "TRUNCATE pg_dist_node CASCADE"
#define REMOVE_ALL_CLUSTERED_TABLES_COMMAND \

View File

@ -35,6 +35,8 @@
#define PG_TOTAL_RELATION_SIZE_FUNCTION "pg_total_relation_size(%s)"
#define CSTORE_TABLE_SIZE_FUNCTION "cstore_table_size(%s)"
#define UPDATE_SHARD_STATISTICS_COLUMN_COUNT 4
/* In-memory representation of a typed tuple in pg_dist_shard. */
typedef struct ShardInterval
{
@ -169,5 +171,8 @@ extern ShardInterval * DeformedDistShardTupleToShardInterval(Datum *datumArray,
int32 intervalTypeMod);
extern void GetIntervalTypeInfo(char partitionMethod, Var *partitionColumn,
Oid *intervalTypeId, int32 *intervalTypeMod);
extern List * SendShardStatisticsQueriesInParallel(List *citusTableIds, bool
useDistributedTransaction, bool
useShardMinMaxQuery);
#endif /* METADATA_UTILITY_H */

View File

@ -121,6 +121,7 @@ extern void InitializeTransactionManagement(void);
/* other functions */
extern List * ActiveSubXactContexts(void);
extern StringInfo BeginAndSetDistributedTransactionIdCommand(void);
extern void TriggerMetadataSyncOnCommit(void);
#endif /* TRANSACTION_MANAGMENT_H */

View File

@ -181,3 +181,6 @@ s/wrong data type: [0-9]+, expected [0-9]+/wrong data type: XXXX, expected XXXX/
# Errors with relation OID does not exist
s/relation with OID [0-9]+ does not exist/relation with OID XXXX does not exist/g
# ignore DEBUG1 messages that Postgres generates
/^DEBUG: rehashing catalog cache id [0-9]+$/d

View File

@ -0,0 +1,218 @@
CREATE SCHEMA cursors;
SET search_path TO cursors;
CREATE TABLE distributed_table (key int, value text);
SELECT create_distributed_table('distributed_table', 'key');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- load some data, but not very small amounts because RETURN QUERY in plpgsql
-- hard codes the cursor fetch to 50 rows on PG 12, though they might increase
-- it sometime in the future, so be mindful
INSERT INTO distributed_table SELECT i % 10, i::text FROM generate_series(0, 1000) i;
CREATE OR REPLACE FUNCTION simple_cursor_on_dist_table(cursor_name refcursor) RETURNS refcursor AS '
BEGIN
OPEN $1 FOR SELECT DISTINCT key FROM distributed_table ORDER BY 1;
RETURN $1;
END;
' LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION cursor_with_intermediate_result_on_dist_table(cursor_name refcursor) RETURNS refcursor AS '
BEGIN
OPEN $1 FOR
WITH cte_1 AS (SELECT * FROM distributed_table OFFSET 0)
SELECT DISTINCT key FROM distributed_table WHERE value in (SELECT value FROM cte_1) ORDER BY 1;
RETURN $1;
END;
' LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION cursor_with_intermediate_result_on_dist_table_with_param(cursor_name refcursor, filter text) RETURNS refcursor AS '
BEGIN
OPEN $1 FOR
WITH cte_1 AS (SELECT * FROM distributed_table WHERE value < $2 OFFSET 0)
SELECT DISTINCT key FROM distributed_table WHERE value in (SELECT value FROM cte_1) ORDER BY 1;
RETURN $1;
END;
' LANGUAGE plpgsql;
-- pretty basic query with cursors
-- Citus should plan/execute once and pull
-- the results to coordinator, then serve it
-- from the coordinator
BEGIN;
SELECT simple_cursor_on_dist_table('cursor_1');
simple_cursor_on_dist_table
---------------------------------------------------------------------
cursor_1
(1 row)
SET LOCAL citus.log_intermediate_results TO ON;
SET LOCAL client_min_messages TO DEBUG1;
FETCH 5 IN cursor_1;
key
---------------------------------------------------------------------
0
1
2
3
4
(5 rows)
FETCH 50 IN cursor_1;
key
---------------------------------------------------------------------
5
6
7
8
9
(5 rows)
FETCH ALL IN cursor_1;
key
---------------------------------------------------------------------
(0 rows)
COMMIT;
BEGIN;
SELECT cursor_with_intermediate_result_on_dist_table('cursor_1');
cursor_with_intermediate_result_on_dist_table
---------------------------------------------------------------------
cursor_1
(1 row)
-- multiple FETCH commands should not trigger re-running the subplans
SET LOCAL citus.log_intermediate_results TO ON;
SET LOCAL client_min_messages TO DEBUG1;
FETCH 5 IN cursor_1;
DEBUG: Subplan XXX_1 will be sent to localhost:xxxxx
DEBUG: Subplan XXX_1 will be sent to localhost:xxxxx
key
---------------------------------------------------------------------
0
1
2
3
4
(5 rows)
FETCH 1 IN cursor_1;
key
---------------------------------------------------------------------
5
(1 row)
FETCH ALL IN cursor_1;
key
---------------------------------------------------------------------
6
7
8
9
(4 rows)
FETCH 5 IN cursor_1;
key
---------------------------------------------------------------------
(0 rows)
COMMIT;
BEGIN;
SELECT cursor_with_intermediate_result_on_dist_table_with_param('cursor_1', '600');
cursor_with_intermediate_result_on_dist_table_with_param
---------------------------------------------------------------------
cursor_1
(1 row)
-- multiple FETCH commands should not trigger re-running the subplans
-- also test with parameters
SET LOCAL citus.log_intermediate_results TO ON;
SET LOCAL client_min_messages TO DEBUG1;
FETCH 1 IN cursor_1;
DEBUG: Subplan XXX_1 will be sent to localhost:xxxxx
DEBUG: Subplan XXX_1 will be sent to localhost:xxxxx
key
---------------------------------------------------------------------
0
(1 row)
FETCH 1 IN cursor_1;
key
---------------------------------------------------------------------
1
(1 row)
FETCH 1 IN cursor_1;
key
---------------------------------------------------------------------
2
(1 row)
FETCH 1 IN cursor_1;
key
---------------------------------------------------------------------
3
(1 row)
FETCH 1 IN cursor_1;
key
---------------------------------------------------------------------
4
(1 row)
FETCH 1 IN cursor_1;
key
---------------------------------------------------------------------
5
(1 row)
FETCH ALL IN cursor_1;
key
---------------------------------------------------------------------
6
7
8
9
(4 rows)
COMMIT;
CREATE OR REPLACE FUNCTION value_counter() RETURNS TABLE(counter text) LANGUAGE PLPGSQL AS $function$
BEGIN
return query
WITH cte AS
(SELECT dt.value
FROM distributed_table dt
WHERE dt.value in
(SELECT value
FROM distributed_table p
GROUP BY p.value
HAVING count(*) > 0))
SELECT * FROM cte;
END;
$function$ ;
SET citus.log_intermediate_results TO ON;
SET client_min_messages TO DEBUG1;
\set VERBOSITY terse
SELECT count(*) from (SELECT value_counter()) as foo;
DEBUG: CTE cte is going to be inlined via distributed planning
DEBUG: generating subplan XXX_1 for subquery SELECT value FROM cursors.distributed_table p GROUP BY value HAVING (count(*) OPERATOR(pg_catalog.>) 0)
DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT value FROM (SELECT dt.value FROM cursors.distributed_table dt WHERE (dt.value OPERATOR(pg_catalog.=) ANY (SELECT intermediate_result.value FROM read_intermediate_result('XXX_1'::text, 'binary'::citus_copy_format) intermediate_result(value text)))) cte
DEBUG: Subplan XXX_1 will be sent to localhost:xxxxx
DEBUG: Subplan XXX_1 will be sent to localhost:xxxxx
count
---------------------------------------------------------------------
1001
(1 row)
BEGIN;
SELECT count(*) from (SELECT value_counter()) as foo;
DEBUG: Subplan XXX_1 will be sent to localhost:xxxxx
DEBUG: Subplan XXX_1 will be sent to localhost:xxxxx
count
---------------------------------------------------------------------
1001
(1 row)
COMMIT;
-- suppress NOTICEs
SET client_min_messages TO ERROR;
DROP SCHEMA cursors CASCADE;
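Outside the regression harness the same behavior can be reproduced interactively; a minimal sketch reusing the function defined above (run it before the DROP SCHEMA, or recreate the schema first; the cursor name is arbitrary):

BEGIN;
SELECT cursor_with_intermediate_result_on_dist_table('interactive_cursor');
-- the subplans run once, at OPEN time; later FETCHes read the result that is
-- already materialized on the coordinator, as the DEBUG output above shows
FETCH 3 IN interactive_cursor;
FETCH ALL IN interactive_cursor;
COMMIT;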


@@ -0,0 +1,366 @@
CREATE SCHEMA drop_column_partitioned_table;
SET search_path TO drop_column_partitioned_table;
SET citus.shard_replication_factor TO 1;
SET citus.next_shard_id TO 2580000;
-- create a partitioned table with some columns that
-- are going to be dropped within the tests
CREATE TABLE sensors(
col_to_drop_0 text,
col_to_drop_1 text,
col_to_drop_2 date,
col_to_drop_3 inet,
col_to_drop_4 date,
measureid integer,
eventdatetime date,
measure_data jsonb)
PARTITION BY RANGE(eventdatetime);
-- drop column even before attaching any partitions
ALTER TABLE sensors DROP COLUMN col_to_drop_1;
-- now attach the first partition and create the distributed table
CREATE TABLE sensors_2000 PARTITION OF sensors FOR VALUES FROM ('2000-01-01') TO ('2001-01-01');
SELECT create_distributed_table('sensors', 'measureid');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- prepared statements should work fine even after columns are dropped
PREPARE drop_col_prepare_insert(int, date, jsonb) AS INSERT INTO sensors (measureid, eventdatetime, measure_data) VALUES ($1, $2, $3);
PREPARE drop_col_prepare_select(int, date) AS SELECT count(*) FROM sensors WHERE measureid = $1 AND eventdatetime = $2;
-- execute 7 times to make sure it is cached
EXECUTE drop_col_prepare_insert(1, '2000-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-02', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-03', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-04', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-05', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-06', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-07', row_to_json(row(1)));
EXECUTE drop_col_prepare_select(1, '2000-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-02');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-03');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-04');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-05');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-06');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-07');
count
---------------------------------------------------------------------
1
(1 row)
-- drop another column before attaching another partition
-- with .. PARTITION OF .. syntax
ALTER TABLE sensors DROP COLUMN col_to_drop_0;
CREATE TABLE sensors_2001 PARTITION OF sensors FOR VALUES FROM ('2001-01-01') TO ('2002-01-01');
-- drop another column before attaching another partition
-- with ALTER TABLE .. ATTACH PARTITION
ALTER TABLE sensors DROP COLUMN col_to_drop_2;
CREATE TABLE sensors_2002(
col_to_drop_4 date, col_to_drop_3 inet, measureid integer, eventdatetime date, measure_data jsonb,
PRIMARY KEY (measureid, eventdatetime, measure_data));
ALTER TABLE sensors ATTACH PARTITION sensors_2002 FOR VALUES FROM ('2002-01-01') TO ('2003-01-01');
-- drop another column before attaching another partition
-- that is already distributed
ALTER TABLE sensors DROP COLUMN col_to_drop_3;
CREATE TABLE sensors_2003(
col_to_drop_4 date, measureid integer, eventdatetime date, measure_data jsonb,
PRIMARY KEY (measureid, eventdatetime, measure_data));
SELECT create_distributed_table('sensors_2003', 'measureid');
create_distributed_table
---------------------------------------------------------------------
(1 row)
ALTER TABLE sensors ATTACH PARTITION sensors_2003 FOR VALUES FROM ('2003-01-01') TO ('2004-01-01');
CREATE TABLE sensors_2004(
col_to_drop_4 date, measureid integer NOT NULL, eventdatetime date NOT NULL, measure_data jsonb NOT NULL);
ALTER TABLE sensors ATTACH PARTITION sensors_2004 FOR VALUES FROM ('2004-01-01') TO ('2005-01-01');
ALTER TABLE sensors DROP COLUMN col_to_drop_4;
-- show that all partitions have the same distribution key
SELECT
p.logicalrelid::regclass, column_to_column_name(p.logicalrelid, p.partkey)
FROM
pg_dist_partition p
WHERE
logicalrelid IN ('sensors'::regclass, 'sensors_2000'::regclass,
'sensors_2001'::regclass, 'sensors_2002'::regclass,
'sensors_2003'::regclass, 'sensors_2004'::regclass);
logicalrelid | column_to_column_name
---------------------------------------------------------------------
sensors | measureid
sensors_2000 | measureid
sensors_2001 | measureid
sensors_2002 | measureid
sensors_2003 | measureid
sensors_2004 | measureid
(6 rows)
-- show that all the tables prune to the same shard for the same distribution key
WITH
sensors_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors', 3)),
sensors_2000_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2000', 3)),
sensors_2001_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2001', 3)),
sensors_2002_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2002', 3)),
sensors_2003_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2003', 3)),
sensors_2004_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2004', 3)),
all_shardids AS (SELECT * FROM sensors_shardid UNION SELECT * FROM sensors_2000_shardid UNION
SELECT * FROM sensors_2001_shardid UNION SELECT * FROM sensors_2002_shardid
UNION SELECT * FROM sensors_2003_shardid UNION SELECT * FROM sensors_2004_shardid)
SELECT logicalrelid, shardid, shardminvalue, shardmaxvalue FROM pg_dist_shard WHERE shardid IN (SELECT * FROM all_shardids);
logicalrelid | shardid | shardminvalue | shardmaxvalue
---------------------------------------------------------------------
sensors | 2580001 | -1073741824 | -1
sensors_2000 | 2580005 | -1073741824 | -1
sensors_2001 | 2580009 | -1073741824 | -1
sensors_2002 | 2580013 | -1073741824 | -1
sensors_2003 | 2580017 | -1073741824 | -1
sensors_2004 | 2580021 | -1073741824 | -1
(6 rows)
VACUUM ANALYZE sensors, sensors_2000, sensors_2001, sensors_2002, sensors_2003;
-- show that both INSERT and SELECT can route to a single node when distribution
-- key is provided in the query
EXPLAIN (COSTS FALSE) INSERT INTO sensors VALUES (3, '2000-02-02', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2580001
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2000 VALUES (3, '2000-01-01', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2000_2580005
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2001 VALUES (3, '2001-01-01', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2001_2580009
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2002 VALUES (3, '2002-01-01', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2002_2580013
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2003 VALUES (3, '2003-01-01', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2003_2580017
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2000 WHERE measureid = 3;
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Aggregate
-> Seq Scan on sensors_2000_2580005 sensors_2000
Filter: (measureid = 3)
(8 rows)
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2001 WHERE measureid = 3;
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Aggregate
-> Seq Scan on sensors_2001_2580009 sensors_2001
Filter: (measureid = 3)
(8 rows)
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2002 WHERE measureid = 3;
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Aggregate
-> Bitmap Heap Scan on sensors_2002_2580013 sensors_2002
Recheck Cond: (measureid = 3)
-> Bitmap Index Scan on sensors_2002_pkey_2580013
Index Cond: (measureid = 3)
(10 rows)
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2003 WHERE measureid = 3;
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Aggregate
-> Bitmap Heap Scan on sensors_2003_2580017 sensors_2003
Recheck Cond: (measureid = 3)
-> Bitmap Index Scan on sensors_2003_pkey_2580017
Index Cond: (measureid = 3)
(10 rows)
-- execute 7 times to make sure it is re-cached
EXECUTE drop_col_prepare_insert(3, '2000-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2001-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2002-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2003-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2003-10-02', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(4, '2003-10-03', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(5, '2003-10-04', row_to_json(row(1)));
EXECUTE drop_col_prepare_select(3, '2000-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(3, '2001-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(3, '2002-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(3, '2003-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(3, '2003-10-02');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(4, '2003-10-03');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(5, '2003-10-04');
count
---------------------------------------------------------------------
1
(1 row)
-- non-fast router planner queries should also work
-- so we switched to DEBUG2 to show the dist. key value
-- and that the query is router planned
SET client_min_messages TO DEBUG2;
SELECT count(*) FROM (
SELECT * FROM sensors WHERE measureid = 3
UNION
SELECT * FROM sensors_2000 WHERE measureid = 3
UNION
SELECT * FROM sensors_2001 WHERE measureid = 3
UNION
SELECT * FROM sensors_2002 WHERE measureid = 3
UNION
SELECT * FROM sensors_2003 WHERE measureid = 3
UNION
SELECT * FROM sensors_2004 WHERE measureid = 3
) as foo;
DEBUG: Creating router plan
DEBUG: query has a single distribution column value: 3
count
---------------------------------------------------------------------
5
(1 row)
RESET client_min_messages;
-- show that all partitions have the same distribution key
-- even after alter_distributed_table changes the shard count
-- remove this comment once https://github.com/citusdata/citus/issues/5137 is fixed
--SELECT alter_distributed_table('sensors', shard_count:='3');
SELECT
p.logicalrelid::regclass, column_to_column_name(p.logicalrelid, p.partkey)
FROM
pg_dist_partition p
WHERE
logicalrelid IN ('sensors'::regclass, 'sensors_2000'::regclass,
'sensors_2001'::regclass, 'sensors_2002'::regclass,
'sensors_2003'::regclass, 'sensors_2004'::regclass);
logicalrelid | column_to_column_name
---------------------------------------------------------------------
sensors | measureid
sensors_2000 | measureid
sensors_2001 | measureid
sensors_2002 | measureid
sensors_2003 | measureid
sensors_2004 | measureid
(6 rows)
SET client_min_messages TO WARNING;
DROP SCHEMA drop_column_partitioned_table CASCADE;
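A condensed version of the two checks above, handy for verifying by hand that a partition created or attached after a DROP COLUMN still uses the parent's distribution column (table names refer to the ones created earlier in this file, before the schema is dropped):

-- the partition should report the same distribution column as the parent ...
SELECT column_to_column_name(logicalrelid, partkey) AS dist_column
FROM pg_dist_partition
WHERE logicalrelid = 'sensors_2004'::regclass;
-- ... and the same distribution key value should prune to colocated shards
SELECT get_shard_id_for_distribution_column('sensors', 3) AS parent_shard,
       get_shard_id_for_distribution_column('sensors_2004', 3) AS partition_shard;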


@@ -233,7 +233,7 @@ SELECT * FROM test WHERE x = 1;
ERROR: node group 0 does not have a secondary node
-- add the follower as secondary nodes and try again, the SELECT statement
-- should work this time
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
SET search_path TO single_node;
SELECT 1 FROM master_add_node('localhost', :follower_master_port, groupid => 0, noderole => 'secondary');
?column?
@@ -350,7 +350,7 @@ SELECT count(*) FROM test WHERE false GROUP BY GROUPING SETS (x,y);
RESET citus.task_assignment_policy;
-- Cleanup
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
SET search_path TO single_node;
SET client_min_messages TO WARNING;
DROP SCHEMA single_node CASCADE;


@@ -1216,6 +1216,7 @@ ON CONFLICT(c1, c2, c3, c4, c5, c6)
DO UPDATE SET
cardinality = enriched.cardinality + excluded.cardinality,
sum = enriched.sum + excluded.sum;
DEBUG: rehashing catalog cache id 14 for pg_opclass; 17 tups, 8 buckets at character 224
DEBUG: INSERT target table and the source relation of the SELECT partition column value must be colocated in distributed INSERT ... SELECT
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: performing repartitioned INSERT ... SELECT


@@ -28,11 +28,11 @@ step detector-dump-wait-edges:
waiting_transaction_numblocking_transaction_numblocking_transaction_waiting
390 389 f
395 394 f
transactionnumberwaitingtransactionnumbers
389
390 389
394
395 394
step s1-abort:
ABORT;
@@ -75,14 +75,14 @@ step detector-dump-wait-edges:
waiting_transaction_numblocking_transaction_numblocking_transaction_waiting
394 393 f
395 393 f
395 394 t
399 398 f
400 398 f
400 399 t
transactionnumberwaitingtransactionnumbers
393
394 393
395 393,394
398
399 398
400 398,399
step s1-abort:
ABORT;
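The column headers above (waiting_transaction_num, blocking_transaction_num, blocking_transaction_waiting) come from Citus' wait-edge dump; a hedged sketch for looking at the same edges interactively while transactions are blocked (the function name is an assumption inferred from those headers, not something this diff shows):

-- list wait edges between distributed transactions, as the deadlock detector sees them
SELECT waiting_transaction_num,
       blocking_transaction_num,
       blocking_transaction_waiting
FROM dump_global_wait_edges();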


@@ -0,0 +1,204 @@
Parsed test spec with 3 sessions
starting permutation: enable-deadlock-detection reload-conf s2-start-session-level-connection s1-begin s1-update-1 s2-begin-on-worker s2-update-2-on-worker s2-truncate-on-worker s3-invalidate-metadata s3-resync s3-wait s2-update-1-on-worker s1-update-2 s1-commit s2-commit-on-worker disable-deadlock-detection reload-conf s2-stop-connection
create_distributed_table
step enable-deadlock-detection:
ALTER SYSTEM SET citus.distributed_deadlock_detection_factor TO 1.1;
step reload-conf:
SELECT pg_reload_conf();
pg_reload_conf
t
step s2-start-session-level-connection:
SELECT start_session_level_connection_to_node('localhost', 57638);
start_session_level_connection_to_node
step s1-begin:
BEGIN;
step s1-update-1:
UPDATE deadlock_detection_test SET some_val = 1 WHERE user_id = 1;
step s2-begin-on-worker:
SELECT run_commands_on_session_level_connection_to_node('BEGIN');
run_commands_on_session_level_connection_to_node
step s2-update-2-on-worker:
SELECT run_commands_on_session_level_connection_to_node('UPDATE deadlock_detection_test SET some_val = 2 WHERE user_id = 2');
run_commands_on_session_level_connection_to_node
step s2-truncate-on-worker:
SELECT run_commands_on_session_level_connection_to_node('TRUNCATE t2');
run_commands_on_session_level_connection_to_node
step s3-invalidate-metadata:
update pg_dist_node SET metadatasynced = false;
step s3-resync:
SELECT trigger_metadata_sync();
trigger_metadata_sync
step s3-wait:
SELECT pg_sleep(2);
pg_sleep
step s2-update-1-on-worker:
SELECT run_commands_on_session_level_connection_to_node('UPDATE deadlock_detection_test SET some_val = 2 WHERE user_id = 1');
<waiting ...>
step s1-update-2:
UPDATE deadlock_detection_test SET some_val = 1 WHERE user_id = 2;
<waiting ...>
step s1-update-2: <... completed>
step s2-update-1-on-worker: <... completed>
run_commands_on_session_level_connection_to_node
error in steps s1-update-2 s2-update-1-on-worker: ERROR: canceling the transaction since it was involved in a distributed deadlock
step s1-commit:
COMMIT;
step s2-commit-on-worker:
SELECT run_commands_on_session_level_connection_to_node('COMMIT');
run_commands_on_session_level_connection_to_node
step disable-deadlock-detection:
ALTER SYSTEM SET citus.distributed_deadlock_detection_factor TO -1;
step reload-conf:
SELECT pg_reload_conf();
pg_reload_conf
t
step s2-stop-connection:
SELECT stop_session_level_connection_to_node();
stop_session_level_connection_to_node
restore_isolation_tester_func
starting permutation: increase-retry-interval reload-conf s2-start-session-level-connection s2-begin-on-worker s2-truncate-on-worker s3-invalidate-metadata s3-resync s3-wait s1-count-daemons s1-cancel-metadata-sync s1-count-daemons reset-retry-interval reload-conf s2-commit-on-worker s2-stop-connection s3-resync s3-wait
create_distributed_table
step increase-retry-interval:
ALTER SYSTEM SET citus.metadata_sync_retry_interval TO 20000;
step reload-conf:
SELECT pg_reload_conf();
pg_reload_conf
t
step s2-start-session-level-connection:
SELECT start_session_level_connection_to_node('localhost', 57638);
start_session_level_connection_to_node
step s2-begin-on-worker:
SELECT run_commands_on_session_level_connection_to_node('BEGIN');
run_commands_on_session_level_connection_to_node
step s2-truncate-on-worker:
SELECT run_commands_on_session_level_connection_to_node('TRUNCATE t2');
run_commands_on_session_level_connection_to_node
step s3-invalidate-metadata:
update pg_dist_node SET metadatasynced = false;
step s3-resync:
SELECT trigger_metadata_sync();
trigger_metadata_sync
step s3-wait:
SELECT pg_sleep(2);
pg_sleep
step s1-count-daemons:
SELECT count(*) FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
count
1
step s1-cancel-metadata-sync:
SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
SELECT pg_sleep(2);
pg_cancel_backend
t
pg_sleep
step s1-count-daemons:
SELECT count(*) FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
count
0
step reset-retry-interval:
ALTER SYSTEM RESET citus.metadata_sync_retry_interval;
step reload-conf:
SELECT pg_reload_conf();
pg_reload_conf
t
step s2-commit-on-worker:
SELECT run_commands_on_session_level_connection_to_node('COMMIT');
run_commands_on_session_level_connection_to_node
step s2-stop-connection:
SELECT stop_session_level_connection_to_node();
stop_session_level_connection_to_node
step s3-resync:
SELECT trigger_metadata_sync();
trigger_metadata_sync
step s3-wait:
SELECT pg_sleep(2);
pg_sleep
restore_isolation_tester_func


@@ -0,0 +1,190 @@
--
-- master_update_table_statistics.sql
--
-- Test master_update_table_statistics function on both
-- hash and append distributed tables
-- This function updates shardlength, shardminvalue and shardmaxvalue
--
SET citus.next_shard_id TO 981000;
SET citus.next_placement_id TO 982000;
SET citus.shard_count TO 8;
SET citus.shard_replication_factor TO 2;
-- test with a hash-distributed table
-- here we update only shardlength, not shardminvalue and shardmaxvalue
CREATE TABLE test_table_statistics_hash (id int);
SELECT create_distributed_table('test_table_statistics_hash', 'id');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- populate table
INSERT INTO test_table_statistics_hash SELECT i FROM generate_series(0, 10000)i;
-- originally shardlength (size of the shard) is zero
SELECT
ds.logicalrelid::regclass::text AS tablename,
ds.shardid AS shardid,
dsp.placementid AS placementid,
shard_name(ds.logicalrelid, ds.shardid) AS shardname,
ds.shardminvalue AS shardminvalue,
ds.shardmaxvalue AS shardmaxvalue
FROM pg_dist_shard ds JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid::regclass::text in ('test_table_statistics_hash') AND dsp.shardlength = 0
ORDER BY 2, 3;
tablename | shardid | placementid | shardname | shardminvalue | shardmaxvalue
---------------------------------------------------------------------
test_table_statistics_hash | 981000 | 982000 | test_table_statistics_hash_981000 | -2147483648 | -1610612737
test_table_statistics_hash | 981000 | 982001 | test_table_statistics_hash_981000 | -2147483648 | -1610612737
test_table_statistics_hash | 981001 | 982002 | test_table_statistics_hash_981001 | -1610612736 | -1073741825
test_table_statistics_hash | 981001 | 982003 | test_table_statistics_hash_981001 | -1610612736 | -1073741825
test_table_statistics_hash | 981002 | 982004 | test_table_statistics_hash_981002 | -1073741824 | -536870913
test_table_statistics_hash | 981002 | 982005 | test_table_statistics_hash_981002 | -1073741824 | -536870913
test_table_statistics_hash | 981003 | 982006 | test_table_statistics_hash_981003 | -536870912 | -1
test_table_statistics_hash | 981003 | 982007 | test_table_statistics_hash_981003 | -536870912 | -1
test_table_statistics_hash | 981004 | 982008 | test_table_statistics_hash_981004 | 0 | 536870911
test_table_statistics_hash | 981004 | 982009 | test_table_statistics_hash_981004 | 0 | 536870911
test_table_statistics_hash | 981005 | 982010 | test_table_statistics_hash_981005 | 536870912 | 1073741823
test_table_statistics_hash | 981005 | 982011 | test_table_statistics_hash_981005 | 536870912 | 1073741823
test_table_statistics_hash | 981006 | 982012 | test_table_statistics_hash_981006 | 1073741824 | 1610612735
test_table_statistics_hash | 981006 | 982013 | test_table_statistics_hash_981006 | 1073741824 | 1610612735
test_table_statistics_hash | 981007 | 982014 | test_table_statistics_hash_981007 | 1610612736 | 2147483647
test_table_statistics_hash | 981007 | 982015 | test_table_statistics_hash_981007 | 1610612736 | 2147483647
(16 rows)
-- setting this to on in order to verify that we use a distributed transaction id
-- to run the size queries from different connections
-- this is going to help detect deadlocks
SET citus.log_remote_commands TO ON;
-- setting this to sequential in order to have a deterministic order
-- in the output of citus.log_remote_commands
SET citus.multi_shard_modify_mode TO sequential;
-- update table statistics and then check that shardlength has changed
-- but shardminvalue and shardmaxvalue stay the same because this is
-- a hash distributed table
SELECT master_update_table_statistics('test_table_statistics_hash');
NOTICE: issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(xx, xx, 'xxxxxxx');
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing SELECT 981000 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981000') AS shard_size UNION ALL SELECT 981001 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981001') AS shard_size UNION ALL SELECT 981002 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981002') AS shard_size UNION ALL SELECT 981003 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981003') AS shard_size UNION ALL SELECT 981004 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981004') AS shard_size UNION ALL SELECT 981005 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981005') AS shard_size UNION ALL SELECT 981006 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981006') AS shard_size UNION ALL SELECT 981007 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981007') AS shard_size UNION ALL SELECT 0::bigint, NULL::text, NULL::text, 0::bigint;
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(xx, xx, 'xxxxxxx');
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing SELECT 981000 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981000') AS shard_size UNION ALL SELECT 981001 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981001') AS shard_size UNION ALL SELECT 981002 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981002') AS shard_size UNION ALL SELECT 981003 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981003') AS shard_size UNION ALL SELECT 981004 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981004') AS shard_size UNION ALL SELECT 981005 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981005') AS shard_size UNION ALL SELECT 981006 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981006') AS shard_size UNION ALL SELECT 981007 AS shard_id, NULL::text AS shard_minvalue, NULL::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_hash_981007') AS shard_size UNION ALL SELECT 0::bigint, NULL::text, NULL::text, 0::bigint;
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing COMMIT
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing COMMIT
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
master_update_table_statistics
---------------------------------------------------------------------
(1 row)
RESET citus.log_remote_commands;
RESET citus.multi_shard_modify_mode;
SELECT
ds.logicalrelid::regclass::text AS tablename,
ds.shardid AS shardid,
dsp.placementid AS placementid,
shard_name(ds.logicalrelid, ds.shardid) AS shardname,
ds.shardminvalue as shardminvalue,
ds.shardmaxvalue as shardmaxvalue
FROM pg_dist_shard ds JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid::regclass::text in ('test_table_statistics_hash') AND dsp.shardlength > 0
ORDER BY 2, 3;
tablename | shardid | placementid | shardname | shardminvalue | shardmaxvalue
---------------------------------------------------------------------
test_table_statistics_hash | 981000 | 982000 | test_table_statistics_hash_981000 | -2147483648 | -1610612737
test_table_statistics_hash | 981000 | 982001 | test_table_statistics_hash_981000 | -2147483648 | -1610612737
test_table_statistics_hash | 981001 | 982002 | test_table_statistics_hash_981001 | -1610612736 | -1073741825
test_table_statistics_hash | 981001 | 982003 | test_table_statistics_hash_981001 | -1610612736 | -1073741825
test_table_statistics_hash | 981002 | 982004 | test_table_statistics_hash_981002 | -1073741824 | -536870913
test_table_statistics_hash | 981002 | 982005 | test_table_statistics_hash_981002 | -1073741824 | -536870913
test_table_statistics_hash | 981003 | 982006 | test_table_statistics_hash_981003 | -536870912 | -1
test_table_statistics_hash | 981003 | 982007 | test_table_statistics_hash_981003 | -536870912 | -1
test_table_statistics_hash | 981004 | 982008 | test_table_statistics_hash_981004 | 0 | 536870911
test_table_statistics_hash | 981004 | 982009 | test_table_statistics_hash_981004 | 0 | 536870911
test_table_statistics_hash | 981005 | 982010 | test_table_statistics_hash_981005 | 536870912 | 1073741823
test_table_statistics_hash | 981005 | 982011 | test_table_statistics_hash_981005 | 536870912 | 1073741823
test_table_statistics_hash | 981006 | 982012 | test_table_statistics_hash_981006 | 1073741824 | 1610612735
test_table_statistics_hash | 981006 | 982013 | test_table_statistics_hash_981006 | 1073741824 | 1610612735
test_table_statistics_hash | 981007 | 982014 | test_table_statistics_hash_981007 | 1610612736 | 2147483647
test_table_statistics_hash | 981007 | 982015 | test_table_statistics_hash_981007 | 1610612736 | 2147483647
(16 rows)
-- check with an append-distributed table
-- here we update shardlength, shardminvalue and shardmaxvalue
CREATE TABLE test_table_statistics_append (id int);
SELECT create_distributed_table('test_table_statistics_append', 'id', 'append');
create_distributed_table
---------------------------------------------------------------------
(1 row)
COPY test_table_statistics_append FROM PROGRAM 'echo 0 && echo 1 && echo 2 && echo 3' WITH CSV;
COPY test_table_statistics_append FROM PROGRAM 'echo 4 && echo 5 && echo 6 && echo 7' WITH CSV;
-- originally shardminvalue and shardmaxvalue will be 0,3 and 4, 7
SELECT
ds.logicalrelid::regclass::text AS tablename,
ds.shardid AS shardid,
dsp.placementid AS placementid,
shard_name(ds.logicalrelid, ds.shardid) AS shardname,
ds.shardminvalue as shardminvalue,
ds.shardmaxvalue as shardmaxvalue
FROM pg_dist_shard ds JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid::regclass::text in ('test_table_statistics_append')
ORDER BY 2, 3;
tablename | shardid | placementid | shardname | shardminvalue | shardmaxvalue
---------------------------------------------------------------------
test_table_statistics_append | 981008 | 982016 | test_table_statistics_append_981008 | 0 | 3
test_table_statistics_append | 981008 | 982017 | test_table_statistics_append_981008 | 0 | 3
test_table_statistics_append | 981009 | 982018 | test_table_statistics_append_981009 | 4 | 7
test_table_statistics_append | 981009 | 982019 | test_table_statistics_append_981009 | 4 | 7
(4 rows)
-- delete some data to change the shardminvalues of the shards
DELETE FROM test_table_statistics_append WHERE id = 0 OR id = 4;
SET citus.log_remote_commands TO ON;
SET citus.multi_shard_modify_mode TO sequential;
-- update table statistics and then check that shardminvalue has changed
-- shardlength (shard size) is still 8192 since there is very little data
SELECT master_update_table_statistics('test_table_statistics_append');
NOTICE: issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(xx, xx, 'xxxxxxx');
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing SELECT 981008 AS shard_id, min(id)::text AS shard_minvalue, max(id)::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_append_981008') AS shard_size FROM test_table_statistics_append_981008 UNION ALL SELECT 981009 AS shard_id, min(id)::text AS shard_minvalue, max(id)::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_append_981009') AS shard_size FROM test_table_statistics_append_981009 UNION ALL SELECT 0::bigint, NULL::text, NULL::text, 0::bigint;
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(xx, xx, 'xxxxxxx');
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing SELECT 981008 AS shard_id, min(id)::text AS shard_minvalue, max(id)::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_append_981008') AS shard_size FROM test_table_statistics_append_981008 UNION ALL SELECT 981009 AS shard_id, min(id)::text AS shard_minvalue, max(id)::text AS shard_maxvalue, pg_relation_size('public.test_table_statistics_append_981009') AS shard_size FROM test_table_statistics_append_981009 UNION ALL SELECT 0::bigint, NULL::text, NULL::text, 0::bigint;
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing COMMIT
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing COMMIT
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
master_update_table_statistics
---------------------------------------------------------------------
(1 row)
RESET citus.log_remote_commands;
RESET citus.multi_shard_modify_mode;
SELECT
ds.logicalrelid::regclass::text AS tablename,
ds.shardid AS shardid,
dsp.placementid AS placementid,
shard_name(ds.logicalrelid, ds.shardid) AS shardname,
ds.shardminvalue as shardminvalue,
ds.shardmaxvalue as shardmaxvalue
FROM pg_dist_shard ds JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid::regclass::text in ('test_table_statistics_append')
ORDER BY 2, 3;
tablename | shardid | placementid | shardname | shardminvalue | shardmaxvalue
---------------------------------------------------------------------
test_table_statistics_append | 981008 | 982016 | test_table_statistics_append_981008 | 1 | 3
test_table_statistics_append | 981008 | 982017 | test_table_statistics_append_981008 | 1 | 3
test_table_statistics_append | 981009 | 982018 | test_table_statistics_append_981009 | 5 | 7
test_table_statistics_append | 981009 | 982019 | test_table_statistics_append_981009 | 5 | 7
(4 rows)
DROP TABLE test_table_statistics_hash, test_table_statistics_append;
ALTER SYSTEM RESET citus.shard_count;
ALTER SYSTEM RESET citus.shard_replication_factor;
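The verification queries above filter on dsp.shardlength but never display it; a slightly more direct variant for eyeballing the sizes that master_update_table_statistics just wrote (same catalogs as above, run before the DROP TABLE):

SELECT ds.logicalrelid::regclass AS tablename, ds.shardid, dsp.placementid, dsp.shardlength
FROM pg_dist_shard ds JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid = 'test_table_statistics_hash'::regclass
ORDER BY 2, 3;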


@@ -406,6 +406,105 @@ SELECT * FROM print_extension_changes();
| function worker_save_query_explain_analyze(text,jsonb)
(2 rows)
-- Test upgrade paths for backported citus_pg_upgrade functions
ALTER EXTENSION citus UPDATE TO '9.4-2';
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- Should be an empty result: even though the downgrade doesn't undo the upgrade, the
-- function signature doesn't change, which is reflected here.
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
ALTER EXTENSION citus UPDATE TO '9.4-2';
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Test upgrade paths for backported improvement of master_update_table_statistics function
ALTER EXTENSION citus UPDATE TO '9.4-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
citus_update_table_statistics
(1 row)
ALTER EXTENSION citus UPDATE TO '9.4-2';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
+
DECLARE +
colocated_tables regclass[]; +
BEGIN +
SELECT get_colocated_table_array(relation) INTO colocated_tables;+
PERFORM +
master_update_shard_statistics(shardid) +
FROM +
pg_dist_shard +
WHERE +
logicalrelid = ANY (colocated_tables); +
END; +
(1 row)
-- Should be empty result
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
ALTER EXTENSION citus UPDATE TO '9.4-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
citus_update_table_statistics
(1 row)
-- Should be empty result
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
+
DECLARE +
colocated_tables regclass[]; +
BEGIN +
SELECT get_colocated_table_array(relation) INTO colocated_tables;+
PERFORM +
master_update_shard_statistics(shardid) +
FROM +
pg_dist_shard +
WHERE +
logicalrelid = ANY (colocated_tables); +
END; +
(1 row)
-- Should be empty result
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Test downgrade to 9.4-1 from 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
BEGIN;
@@ -453,12 +552,111 @@ SELECT * FROM print_extension_changes();
| function worker_record_sequence_dependency(regclass,regclass,name)
(10 rows)
-- Test upgrade paths for backported citus_pg_upgrade functions
ALTER EXTENSION citus UPDATE TO '9.5-2';
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- Should be an empty result: even though the downgrade doesn't undo the upgrade, the
-- function signature doesn't change, which is reflected here.
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
ALTER EXTENSION citus UPDATE TO '9.5-2';
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Test upgrade paths for backported improvement of master_update_table_statistics function
ALTER EXTENSION citus UPDATE TO '9.5-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
citus_update_table_statistics
(1 row)
ALTER EXTENSION citus UPDATE TO '9.5-2';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
+
DECLARE +
colocated_tables regclass[]; +
BEGIN +
SELECT get_colocated_table_array(relation) INTO colocated_tables;+
PERFORM +
master_update_shard_statistics(shardid) +
FROM +
pg_dist_shard +
WHERE +
logicalrelid = ANY (colocated_tables); +
END; +
(1 row)
-- Should be empty result
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
ALTER EXTENSION citus UPDATE TO '9.5-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
citus_update_table_statistics
(1 row)
-- Should be empty result
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
+
DECLARE +
colocated_tables regclass[]; +
BEGIN +
SELECT get_colocated_table_array(relation) INTO colocated_tables;+
PERFORM +
master_update_shard_statistics(shardid) +
FROM +
pg_dist_shard +
WHERE +
logicalrelid = ANY (colocated_tables); +
END; +
(1 row)
-- Should be empty result
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
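When replaying these upgrade/downgrade paths by hand, the installed version and the available script chain can be checked with stock PostgreSQL catalogs; a small sketch (nothing beyond the extension name is assumed):

SELECT extversion FROM pg_extension WHERE extname = 'citus';
SELECT source, target, path
FROM pg_extension_update_paths('citus')
WHERE source = '9.5-1' AND target IN ('9.5-2', '9.5-3');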
DROP TABLE prev_objects, extension_diff;
-- show running version
SHOW citus.version;
citus.version
---------------------------------------------------------------------
9.5devel
9.5.7
(1 row)
-- ensure no objects were created outside pg_catalog


@@ -269,7 +269,7 @@ ERROR: writing to worker nodes is not currently allowed
DETAIL: citus.use_secondary_nodes is set to 'always'
SELECT * FROM citus_local_table ORDER BY a;
ERROR: there is a shard placement in node group 0 but there are no nodes in that group
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
DROP TABLE the_table;
DROP TABLE reference_table;
DROP TABLE citus_local_table;


@@ -77,7 +77,7 @@ order by s_i_id;
SELECT * FROM the_table;
ERROR: node group does not have a secondary node
-- add the secondary nodes and try again, the SELECT statement should work this time
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
SELECT 1 FROM master_add_node('localhost', :follower_worker_1_port,
groupid => (SELECT groupid FROM pg_dist_node WHERE nodeport = :worker_1_port),
noderole => 'secondary');
@@ -149,7 +149,7 @@ order by s_i_id;
ERROR: there is a shard placement in node group but there are no nodes in that group
-- now move the secondary nodes into the new cluster and see that the follower, finally
-- correctly configured, can run select queries involving them
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
UPDATE pg_dist_node SET nodecluster = 'second-cluster' WHERE noderole = 'secondary';
\c "port=9070 dbname=regression options='-c\ citus.use_secondary_nodes=always\ -c\ citus.cluster_name=second-cluster'"
SELECT * FROM the_table;
@@ -160,6 +160,6 @@ SELECT * FROM the_table;
(2 rows)
-- clean up after ourselves
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
DROP TABLE the_table;
DROP TABLE stock;
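The repeated \c changes in this file matter because, with the positional form, psql's \connect reuses the previous connection's parameters by default; once a test connects with a conninfo string that injects citus.use_secondary_nodes=always, a bare \c - - - :master_port would quietly keep that option. A minimal sketch of the pattern (port and option values are taken from the test above):

\c "port=9070 dbname=regression options='-c\ citus.use_secondary_nodes=always\ -c\ citus.cluster_name=second-cluster'"
-- ... read-only queries routed to secondary nodes ...
\c -reuse-previous=off regression - - :master_port
-- back on the coordinator with a fresh connection; no options carried over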


@@ -938,5 +938,239 @@ SELECT create_reference_table('self_referencing_reference_table');
(1 row)
ALTER TABLE self_referencing_reference_table ADD CONSTRAINT fk FOREIGN KEY(id, other_column_ref) REFERENCES self_referencing_reference_table(id, other_column);
-- make sure that if the fkey is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (x) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
ALTER TABLE dropfkeytest2 DROP CONSTRAINT fkey1;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
create_distributed_table
---------------------------------------------------------------------
(1 row)
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that if a column that is in a fkey is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
ALTER TABLE dropfkeytest2 DROP COLUMN y;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
create_distributed_table
---------------------------------------------------------------------
(1 row)
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a column that is in a multi-column index is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int, y int);
CREATE UNIQUE INDEX indd ON dropfkeytest1(x, y);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (x, y) REFERENCES dropfkeytest1(x, y);
SELECT create_distributed_table ('dropfkeytest1', 'x');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
ALTER TABLE dropfkeytest2 DROP COLUMN y CASCADE;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
create_distributed_table
---------------------------------------------------------------------
(1 row)
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a column that is in a multi-column fkey is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int, y int);
CREATE UNIQUE INDEX indd ON dropfkeytest1(x, y);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (x, y) REFERENCES dropfkeytest1(x, y);
SELECT create_distributed_table ('dropfkeytest1', 'x');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
ALTER TABLE dropfkeytest1 DROP COLUMN y CASCADE;
NOTICE: drop cascades to constraint fkey1 on table dropfkeytest2
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
create_distributed_table
---------------------------------------------------------------------
(1 row)
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if an index which the fkey relies on is dropped
-- Citus can see an up-to-date fkey graph
-- also, irrelevant index drops don't affect this
CREATE TABLE dropfkeytest1 (x int);
CREATE UNIQUE INDEX i1 ON dropfkeytest1(x);
CREATE UNIQUE INDEX unrelated_idx ON dropfkeytest1(x);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
DROP INDEX unrelated_idx CASCADE;
-- should still error out since we didn't drop the index that foreign key depends on
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
DROP INDEX i1 CASCADE;
NOTICE: drop cascades to constraint fkey1 on table dropfkeytest2
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
create_distributed_table
---------------------------------------------------------------------
(1 row)
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a uniqueness constraint which the fkey depends on is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
ALTER TABLE dropfkeytest1 DROP CONSTRAINT dropfkeytest1_x_key CASCADE;
NOTICE: drop cascades to constraint fkey1 on table dropfkeytest2
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
create_distributed_table
---------------------------------------------------------------------
(1 row)
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a primary key which the fkey depends on is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int primary key);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
ALTER TABLE dropfkeytest1 DROP CONSTRAINT dropfkeytest1_pkey CASCADE;
NOTICE: drop cascades to constraint fkey1 on table dropfkeytest2
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
create_distributed_table
---------------------------------------------------------------------
(1 row)
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a schema which the fkey depends on is dropped
-- Citus can see an up-to-date fkey graph
CREATE SCHEMA fkeytestsc;
CREATE TABLE fkeytestsc.dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES fkeytestsc.dropfkeytest1(x);
SELECT create_distributed_table ('fkeytestsc.dropfkeytest1', 'x');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
DROP SCHEMA fkeytestsc CASCADE;
NOTICE: drop cascades to 2 other objects
DETAIL: drop cascades to table fkeytestsc.dropfkeytest1
drop cascades to constraint fkey1 on table dropfkeytest2
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
create_distributed_table
---------------------------------------------------------------------
(1 row)
DROP TABLE dropfkeytest2 CASCADE;
-- make sure that even if a table which the fkey depends on is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
DROP TABLE dropfkeytest1 CASCADE;
NOTICE: drop cascades to constraint fkey1 on table dropfkeytest2
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- we no longer need those tables
DROP TABLE referenced_by_reference_table, references_to_reference_table, reference_table, reference_table_second, referenced_local_table, self_referencing_reference_table;
DROP TABLE referenced_by_reference_table, references_to_reference_table, reference_table, reference_table_second, referenced_local_table, self_referencing_reference_table, dropfkeytest2;
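The pattern in the tests above is consistent: create_distributed_table() is rejected while a foreign key still ties dropfkeytest2 to a non-colocated relation, and succeeds as soon as the object the constraint depends on (constraint, primary key, schema, or table) is gone. As an illustrative aside, not part of the regression file, the foreign keys Citus will still see before retrying can be listed straight from the catalog:

SELECT conname,
       conrelid::regclass  AS referencing_table,
       confrelid::regclass AS referenced_table
FROM pg_constraint
WHERE contype = 'f'
  AND conrelid = 'dropfkeytest2'::regclass;
-- an empty result means no remaining fkey blocks distribution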

View File

@ -38,14 +38,48 @@ DEBUG: Router planner does not support append-partitioned tables.
-- Partition pruning left three shards for the lineitem and one shard for the
-- orders table. These shard sets don't overlap, so join pruning should prune
-- out all the shards, and leave us with an empty task list.
select * from pg_dist_shard
where logicalrelid='lineitem'::regclass or
logicalrelid='orders'::regclass
order by shardid;
logicalrelid | shardid | shardstorage | shardminvalue | shardmaxvalue
---------------------------------------------------------------------
lineitem | 290000 | t | 1 | 5986
lineitem | 290001 | t | 8997 | 14947
orders | 290002 | t | 1 | 5986
orders | 290003 | t | 8997 | 14947
(4 rows)
set citus.explain_distributed_queries to on;
-- explain the query before actually executing it
EXPLAIN SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND l_orderkey > 6000 AND o_orderkey < 6000;
DEBUG: Router planner does not support append-partitioned tables.
DEBUG: join prunable for intervals [8997,14947] and [1,5986]
QUERY PLAN
---------------------------------------------------------------------
Aggregate (cost=750.01..750.02 rows=1 width=40)
-> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=100000 width=24)
Task Count: 0
Tasks Shown: All
(4 rows)
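Task Count: 0 above follows from the comment at the top of the hunk: after single-table pruning, the surviving lineitem and orders shard intervals do not overlap, so every join task is pruned away. The same interval reasoning can be spelled out by hand against pg_dist_shard; this is an illustrative sketch of the logic, not the planner's actual code path:

SELECT l.shardid AS lineitem_shard, o.shardid AS orders_shard
FROM pg_dist_shard l
JOIN pg_dist_shard o
  ON l.logicalrelid = 'lineitem'::regclass
 AND o.logicalrelid = 'orders'::regclass
 -- shards surviving the per-table filters ...
 AND l.shardmaxvalue::bigint > 6000      -- l_orderkey > 6000
 AND o.shardminvalue::bigint < 6000      -- o_orderkey < 6000
 -- ... whose intervals overlap, i.e. could still produce a join task
 AND l.shardminvalue::bigint <= o.shardmaxvalue::bigint
 AND o.shardminvalue::bigint <= l.shardmaxvalue::bigint;
-- returns no rows for the shard intervals listed above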
set citus.explain_distributed_queries to off;
set client_min_messages to debug3;
SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND l_orderkey > 6000 AND o_orderkey < 6000;
DEBUG: Router planner does not support append-partitioned tables.
DEBUG: constraint (gt) value: '6000'::bigint
DEBUG: shard count: 1
DEBUG: constraint (lt) value: '6000'::bigint
DEBUG: shard count: 1
DEBUG: join prunable for intervals [8997,14947] and [1,5986]
sum | avg
---------------------------------------------------------------------
|
(1 row)
set client_min_messages to debug2;
-- Make sure that we can handle filters without a column
SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND false;

View File

@ -21,6 +21,27 @@ CREATE FUNCTION mark_node_readonly(hostname TEXT, port INTEGER, isreadonly BOOLE
master_run_on_worker(ARRAY[hostname], ARRAY[port],
ARRAY['SELECT pg_reload_conf()'], false);
$$;
CREATE OR REPLACE FUNCTION trigger_metadata_sync()
RETURNS void
LANGUAGE C STRICT
AS 'citus';
CREATE OR REPLACE FUNCTION raise_error_in_metadata_sync()
RETURNS void
LANGUAGE C STRICT
AS 'citus';
CREATE PROCEDURE wait_until_process_count(appname text, target_count int) AS $$
declare
counter integer := -1;
begin
while counter != target_count loop
-- pg_stat_activity is cached at xact level and there is no easy way to clear it.
-- Look it up in a new connection to get latest updates.
SELECT result::int into counter FROM
master_run_on_worker(ARRAY['localhost'], ARRAY[57636], ARRAY[
'SELECT count(*) FROM pg_stat_activity WHERE application_name = ' || quote_literal(appname) || ';'], false);
PERFORM pg_sleep(0.1);
end loop;
end$$ LANGUAGE plpgsql;
-- add a node to the cluster
SELECT master_add_node('localhost', :worker_1_port) As nodeid_1 \gset
SELECT nodeid, nodename, nodeport, hasmetadata, metadatasynced FROM pg_dist_node;
@ -152,6 +173,142 @@ SELECT nodeid, hasmetadata, metadatasynced FROM pg_dist_node;
2 | t | f
(1 row)
-- verify that metadata sync daemon has started
SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Metadata Sync Daemon';
count
---------------------------------------------------------------------
1
(1 row)
--
-- terminate maintenance daemon, and verify that we don't spawn multiple
-- metadata sync daemons
--
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE application_name = 'Citus Maintenance Daemon';
pg_terminate_backend
---------------------------------------------------------------------
t
(1 row)
CALL wait_until_process_count('Citus Maintenance Daemon', 1);
select trigger_metadata_sync();
trigger_metadata_sync
---------------------------------------------------------------------
(1 row)
select wait_until_metadata_sync();
wait_until_metadata_sync
---------------------------------------------------------------------
(1 row)
SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Metadata Sync Daemon';
count
---------------------------------------------------------------------
1
(1 row)
--
-- cancel metadata sync daemon, and verify that it exits and restarts.
--
select pid as pid_before_cancel from pg_stat_activity where application_name like 'Citus Met%' \gset
select pg_cancel_backend(pid) from pg_stat_activity where application_name = 'Citus Metadata Sync Daemon';
pg_cancel_backend
---------------------------------------------------------------------
t
(1 row)
select wait_until_metadata_sync();
wait_until_metadata_sync
---------------------------------------------------------------------
(1 row)
select pid as pid_after_cancel from pg_stat_activity where application_name like 'Citus Met%' \gset
select :pid_before_cancel != :pid_after_cancel AS metadata_sync_restarted;
metadata_sync_restarted
---------------------------------------------------------------------
t
(1 row)
--
-- cancel metadata sync daemon so it exits and restarts, but at the
-- same time tell maintenanced to trigger a new metadata sync. One
-- of these should exit to avoid multiple metadata syncs.
--
select pg_cancel_backend(pid) from pg_stat_activity where application_name = 'Citus Metadata Sync Daemon';
pg_cancel_backend
---------------------------------------------------------------------
t
(1 row)
select trigger_metadata_sync();
trigger_metadata_sync
---------------------------------------------------------------------
(1 row)
select wait_until_metadata_sync();
wait_until_metadata_sync
---------------------------------------------------------------------
(1 row)
-- we assume citus.metadata_sync_retry_interval is 500ms; if that changes, adjust the sleep below to its ceiling + 0.2 seconds.
select pg_sleep(1.2);
pg_sleep
---------------------------------------------------------------------
(1 row)
SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Metadata Sync Daemon';
count
---------------------------------------------------------------------
1
(1 row)
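The 1.2 second sleep above is keyed to citus.metadata_sync_retry_interval; the comment asks for the sleep to be bumped if that interval ever changes. When reproducing this by hand, the setting can be read and adjusted like any other GUC; a sketch, assuming superuser access and a millisecond-valued setting:

SHOW citus.metadata_sync_retry_interval;
ALTER SYSTEM SET citus.metadata_sync_retry_interval TO 500;
SELECT pg_reload_conf();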
--
-- error in metadata sync daemon, and verify it exits and restarts.
--
select pid as pid_before_error from pg_stat_activity where application_name like 'Citus Met%' \gset
select raise_error_in_metadata_sync();
raise_error_in_metadata_sync
---------------------------------------------------------------------
(1 row)
select wait_until_metadata_sync(30000);
wait_until_metadata_sync
---------------------------------------------------------------------
(1 row)
select pid as pid_after_error from pg_stat_activity where application_name like 'Citus Met%' \gset
select :pid_before_error != :pid_after_error AS metadata_sync_restarted;
metadata_sync_restarted
---------------------------------------------------------------------
t
(1 row)
SELECT trigger_metadata_sync();
trigger_metadata_sync
---------------------------------------------------------------------
(1 row)
SELECT wait_until_metadata_sync(30000);
wait_until_metadata_sync
---------------------------------------------------------------------
(1 row)
SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Metadata Sync Daemon';
count
---------------------------------------------------------------------
1
(1 row)
-- update it back to :worker_1_port, now metadata should be synced
SELECT 1 FROM master_update_node(:nodeid_1, 'localhost', :worker_1_port);
?column?
@ -594,6 +751,59 @@ SELECT verify_metadata('localhost', :worker_1_port);
t
(1 row)
-- verify that metadata sync daemon exits
call wait_until_process_count('Citus Metadata Sync Daemon', 0);
-- verify that DROP DATABASE terminates metadata sync
SELECT current_database() datname \gset
CREATE DATABASE db_to_drop;
NOTICE: Citus partially supports CREATE DATABASE for distributed databases
SELECT run_command_on_workers('CREATE DATABASE db_to_drop');
run_command_on_workers
---------------------------------------------------------------------
(localhost,57637,t,"CREATE DATABASE")
(localhost,57638,t,"CREATE DATABASE")
(2 rows)
\c db_to_drop - - :worker_1_port
CREATE EXTENSION citus;
\c db_to_drop - - :master_port
CREATE EXTENSION citus;
SELECT master_add_node('localhost', :worker_1_port);
master_add_node
---------------------------------------------------------------------
1
(1 row)
UPDATE pg_dist_node SET hasmetadata = true;
SELECT master_update_node(nodeid, 'localhost', 12345) FROM pg_dist_node;
master_update_node
---------------------------------------------------------------------
(1 row)
CREATE OR REPLACE FUNCTION trigger_metadata_sync()
RETURNS void
LANGUAGE C STRICT
AS 'citus';
SELECT trigger_metadata_sync();
trigger_metadata_sync
---------------------------------------------------------------------
(1 row)
\c :datname - - :master_port
SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
datname
---------------------------------------------------------------------
db_to_drop
(1 row)
DROP DATABASE db_to_drop;
SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
datname
---------------------------------------------------------------------
(0 rows)
-- cleanup
DROP TABLE ref_table;
TRUNCATE pg_dist_colocation;

View File

@ -84,7 +84,7 @@ SELECT prune_using_both_values('pruning', 'tomato', 'rose');
-- unit test of the equality expression generation code
SELECT debug_equality_expression('pruning');
debug_equality_expression
debug_equality_expression
---------------------------------------------------------------------
{OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 100 :args ({VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varcollid 100 :varlevelsup 0 :varnoold 1 :varoattno 1 :location -1} {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull true :location -1 :constvalue <>}) :location -1}
(1 row)
@ -543,6 +543,154 @@ SELECT * FROM numeric_test WHERE id = 21.1::numeric;
21.1 | 87
(1 row)
CREATE TABLE range_dist_table_1 (dist_col BIGINT);
SELECT create_distributed_table('range_dist_table_1', 'dist_col', 'range');
create_distributed_table
---------------------------------------------------------------------
(1 row)
CALL public.create_range_partitioned_shards('range_dist_table_1', '{1000,3000,6000}', '{2000,4000,7000}');
INSERT INTO range_dist_table_1 VALUES (1001);
INSERT INTO range_dist_table_1 VALUES (3800);
INSERT INTO range_dist_table_1 VALUES (6500);
-- all were returning false before fixing #5077
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col >= 2999;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 2999;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col >= 2500;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 2000;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 1001;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col >= 1001;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col > 1000;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col >= 1000;
?column?
---------------------------------------------------------------------
t
(1 row)
-- we didn't have such an off-by-one error in upper bound
-- calculation, but let's test such cases too
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col <= 4001;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col < 4001;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col <= 4500;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col < 6000;
?column?
---------------------------------------------------------------------
t
(1 row)
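All of the lower-bound comparisons above prune correctly after the #5077 fix to the lower boundary calculation. The shard intervals the pruner works against are plain catalog rows; a quick illustrative look:

SELECT shardid, shardminvalue, shardmaxvalue
FROM pg_dist_shard
WHERE logicalrelid = 'range_dist_table_1'::regclass
ORDER BY shardminvalue::bigint;
-- per create_range_partitioned_shards above, the intervals are
-- [1000,2000], [3000,4000] and [6000,7000]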
-- now test with composite type and more shards
CREATE TYPE comp_type AS (
int_field_1 BIGINT,
int_field_2 BIGINT
);
CREATE TYPE comp_type_range AS RANGE (
subtype = comp_type);
CREATE TABLE range_dist_table_2 (dist_col comp_type);
SELECT create_distributed_table('range_dist_table_2', 'dist_col', 'range');
create_distributed_table
---------------------------------------------------------------------
(1 row)
CALL public.create_range_partitioned_shards(
'range_dist_table_2',
'{"(10,24)","(10,58)",
"(10,90)","(20,100)"}',
'{"(10,25)","(10,65)",
"(10,99)","(20,100)"}');
INSERT INTO range_dist_table_2 VALUES ((10, 24));
INSERT INTO range_dist_table_2 VALUES ((10, 60));
INSERT INTO range_dist_table_2 VALUES ((10, 91));
INSERT INTO range_dist_table_2 VALUES ((20, 100));
SELECT dist_col='(10, 60)'::comp_type FROM range_dist_table_2
WHERE dist_col >= '(10,26)'::comp_type AND
dist_col <= '(10,75)'::comp_type;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT * FROM range_dist_table_2
WHERE dist_col >= '(10,57)'::comp_type AND
dist_col <= '(10,95)'::comp_type
ORDER BY dist_col;
dist_col
---------------------------------------------------------------------
(10,60)
(10,91)
(2 rows)
SELECT * FROM range_dist_table_2
WHERE dist_col >= '(10,57)'::comp_type
ORDER BY dist_col;
dist_col
---------------------------------------------------------------------
(10,60)
(10,91)
(20,100)
(3 rows)
SELECT dist_col='(20,100)'::comp_type FROM range_dist_table_2
WHERE dist_col > '(20,99)'::comp_type;
?column?
---------------------------------------------------------------------
t
(1 row)
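Range distribution over a composite type works because PostgreSQL orders composite values field by field, which is what makes shard bounds such as '(10,24)'..'(10,25)' meaningful. A standalone illustration of that ordering, independent of the test:

SELECT '(10,60)'::comp_type > '(10,25)'::comp_type AS second_field_decides,
       '(20,1)'::comp_type  > '(10,99)'::comp_type AS first_field_decides;
-- both comparisons return true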
DROP TABLE range_dist_table_1, range_dist_table_2;
DROP TYPE comp_type CASCADE;
NOTICE: drop cascades to type comp_type_range
SET search_path TO public;
DROP SCHEMA prune_shard_list CASCADE;
NOTICE: drop cascades to 10 other objects

View File

@ -26,7 +26,7 @@ WITH dist_node_summary AS (
ARRAY[dist_node_summary.query, dist_node_summary.query],
false)
), dist_placement_summary AS (
SELECT 'SELECT jsonb_agg(pg_dist_placement ORDER BY shardid) FROM pg_dist_placement)' AS query
SELECT 'SELECT jsonb_agg(pg_dist_placement ORDER BY shardid) FROM pg_dist_placement' AS query
), dist_placement_check AS (
SELECT count(distinct result) = 1 AS matches
FROM dist_placement_summary CROSS JOIN LATERAL

View File

@ -226,10 +226,11 @@ COMMIT;
-- now, some of the optional connections would be skipped,
-- and only 5 connections are used per node
BEGIN;
SELECT count(*), pg_sleep(0.1) FROM test;
count | pg_sleep
SET LOCAL citus.max_adaptive_executor_pool_size TO 16;
with cte_1 as (select pg_sleep(0.1) is null, a from test) SELECT a from cte_1 ORDER By 1 LIMIT 1;
a
---------------------------------------------------------------------
101 |
0
(1 row)
SELECT
@ -493,7 +494,7 @@ BEGIN;
(2 rows)
ROLLBACK;
-- INSERT SELECT with RETURNING/ON CONFLICT clauses should honor shared_pool_size
-- INSERT SELECT with RETURNING/ON CONFLICT clauses does not honor shared_pool_size
-- in underlying COPY commands
BEGIN;
SELECT pg_sleep(0.1);
@ -502,7 +503,9 @@ BEGIN;
(1 row)
INSERT INTO test SELECT i FROM generate_series(0,10) i RETURNING *;
-- make sure that we hit at least 4 shards per node, where 20 rows
-- is enough
INSERT INTO test SELECT i FROM generate_series(0,20) i RETURNING *;
a
---------------------------------------------------------------------
0
@ -516,10 +519,20 @@ BEGIN;
8
9
10
(11 rows)
11
12
13
14
15
16
17
18
19
20
(21 rows)
SELECT
connection_count_to_node
connection_count_to_node > current_setting('citus.max_shared_pool_size')::int
FROM
citus_remote_connection_stats()
WHERE
@ -527,10 +540,10 @@ BEGIN;
database_name = 'regression'
ORDER BY
hostname, port;
connection_count_to_node
?column?
---------------------------------------------------------------------
3
3
t
t
(2 rows)
ROLLBACK;
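The rewritten assertion above no longer pins an exact connection count; it checks that the count exceeds citus.max_shared_pool_size, matching the reworded comment that the COPY underlying INSERT ... SELECT with RETURNING does not reserve its connections through the shared pool. Outside the test, the same counters can be read directly; a small sketch:

SHOW citus.max_shared_pool_size;
SELECT hostname, port, database_name, connection_count_to_node
FROM citus_remote_connection_stats()
ORDER BY hostname, port;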

View File

@ -119,16 +119,6 @@ EXECUTE subquery_prepare_without_param;
(5,4)
(5 rows)
EXECUTE subquery_prepare_without_param;
values_of_subquery
---------------------------------------------------------------------
(6,4)
(6,3)
(6,2)
(6,1)
(5,4)
(5 rows)
EXECUTE subquery_prepare_param_on_partkey(1);
DEBUG: push down of limit count: 5
DEBUG: generating subplan XXX_1 for subquery SELECT DISTINCT ROW(users_table.user_id, events_table.event_type)::subquery_prepared_statements.xy AS values_of_subquery FROM public.users_table, public.events_table WHERE ((users_table.user_id OPERATOR(pg_catalog.=) events_table.user_id) AND ((users_table.user_id OPERATOR(pg_catalog.=) 1) OR (users_table.user_id OPERATOR(pg_catalog.=) 2)) AND (events_table.event_type OPERATOR(pg_catalog.=) ANY (ARRAY[1, 2, 3, 4]))) ORDER BY ROW(users_table.user_id, events_table.event_type)::subquery_prepared_statements.xy DESC LIMIT 5

View File

@ -80,3 +80,4 @@ test: isolation_insert_select_vs_all_on_mx
test: isolation_ref_select_for_update_vs_all_on_mx
test: isolation_ref_update_delete_upsert_vs_all_on_mx
test: isolation_dis2ref_foreign_keys_on_mx
test: isolation_metadata_sync_deadlock

View File

@ -86,7 +86,12 @@ test: set_operation_and_local_tables
test: subqueries_deep subquery_view subquery_partitioning subquery_complex_target_list subqueries_not_supported subquery_in_where
test: non_colocated_leaf_subquery_joins non_colocated_subquery_joins non_colocated_join_order
test: subquery_prepared_statements pg12 cte_inline pg13 recursive_view_local_table
test: tableam
test: tableam drop_column_partitioned_table
# ----------
# Test for updating table statistics
# ----------
test: master_update_table_statistics
# ----------
# Miscellaneous tests to check our query planning behavior
@ -96,7 +101,9 @@ test: multi_explain hyperscale_tutorial partitioned_intermediate_results distrib
test: multi_basic_queries cross_join multi_complex_expressions multi_subquery multi_subquery_complex_queries multi_subquery_behavioral_analytics
test: multi_subquery_complex_reference_clause multi_subquery_window_functions multi_view multi_sql_function multi_prepare_sql
test: sql_procedure multi_function_in_join row_types materialized_view undistribute_table
test: multi_subquery_in_where_reference_clause join geqo adaptive_executor propagate_set_commands
test: multi_subquery_in_where_reference_clause adaptive_executor propagate_set_commands geqo
# this should be run alone as it gets too many clients
test: join_pushdown
test: multi_subquery_union multi_subquery_in_where_clause multi_subquery_misc statement_cancel_error_message
test: multi_agg_distinct multi_agg_approximate_distinct multi_limit_clause_approximate multi_outer_join_reference multi_single_relation_subquery multi_prepare_plsql set_role_in_transaction
test: multi_reference_table multi_select_for_update relation_access_tracking pg13_with_ties
@ -108,7 +115,7 @@ test: ch_bench_subquery_repartition
test: multi_agg_type_conversion multi_count_type_conversion
test: multi_partition_pruning single_hash_repartition_join
test: multi_join_pruning multi_hash_pruning intermediate_result_pruning
test: multi_null_minmax_value_pruning
test: multi_null_minmax_value_pruning cursors
test: multi_query_directory_cleanup
test: multi_task_assignment_policy multi_cross_shard
test: multi_utility_statements

View File

@ -418,6 +418,15 @@ push(@pgOptions, "max_parallel_workers_per_gather=0");
# Allow CREATE SUBSCRIPTION to work
push(@pgOptions, "wal_level='logical'");
# Faster logical replication status update so tests with logical replication
# run faster
push(@pgOptions, "wal_receiver_status_interval=1");
# Faster logical replication apply worker launch so tests with logical
# replication run faster. This is used in ApplyLauncherMain in
# src/backend/replication/logical/launcher.c.
push(@pgOptions, "wal_retrieve_retry_interval=1000");
# Citus options set for the tests
push(@pgOptions, "citus.shard_count=4");
push(@pgOptions, "citus.shard_max_size=1500kB");

View File

@ -0,0 +1,153 @@
#include "isolation_mx_common.include.spec"
setup
{
CREATE OR REPLACE FUNCTION trigger_metadata_sync()
RETURNS void
LANGUAGE C STRICT
AS 'citus';
CREATE OR REPLACE FUNCTION wait_until_metadata_sync(timeout INTEGER DEFAULT 15000)
RETURNS void
LANGUAGE C STRICT
AS 'citus';
CREATE TABLE deadlock_detection_test (user_id int UNIQUE, some_val int);
INSERT INTO deadlock_detection_test SELECT i, i FROM generate_series(1,7) i;
SELECT create_distributed_table('deadlock_detection_test', 'user_id');
CREATE TABLE t2(a int);
SELECT create_distributed_table('t2', 'a');
}
teardown
{
DROP FUNCTION trigger_metadata_sync();
DROP TABLE deadlock_detection_test;
DROP TABLE t2;
SET citus.shard_replication_factor = 1;
SELECT citus_internal.restore_isolation_tester_func();
}
session "s1"
step "increase-retry-interval"
{
ALTER SYSTEM SET citus.metadata_sync_retry_interval TO 20000;
}
step "reset-retry-interval"
{
ALTER SYSTEM RESET citus.metadata_sync_retry_interval;
}
step "enable-deadlock-detection"
{
ALTER SYSTEM SET citus.distributed_deadlock_detection_factor TO 1.1;
}
step "disable-deadlock-detection"
{
ALTER SYSTEM SET citus.distributed_deadlock_detection_factor TO -1;
}
step "reload-conf"
{
SELECT pg_reload_conf();
}
step "s1-begin"
{
BEGIN;
}
step "s1-update-1"
{
UPDATE deadlock_detection_test SET some_val = 1 WHERE user_id = 1;
}
step "s1-update-2"
{
UPDATE deadlock_detection_test SET some_val = 1 WHERE user_id = 2;
}
step "s1-commit"
{
COMMIT;
}
step "s1-count-daemons"
{
SELECT count(*) FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
}
step "s1-cancel-metadata-sync"
{
SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
SELECT pg_sleep(2);
}
session "s2"
step "s2-start-session-level-connection"
{
SELECT start_session_level_connection_to_node('localhost', 57638);
}
step "s2-stop-connection"
{
SELECT stop_session_level_connection_to_node();
}
step "s2-begin-on-worker"
{
SELECT run_commands_on_session_level_connection_to_node('BEGIN');
}
step "s2-update-1-on-worker"
{
SELECT run_commands_on_session_level_connection_to_node('UPDATE deadlock_detection_test SET some_val = 2 WHERE user_id = 1');
}
step "s2-update-2-on-worker"
{
SELECT run_commands_on_session_level_connection_to_node('UPDATE deadlock_detection_test SET some_val = 2 WHERE user_id = 2');
}
step "s2-truncate-on-worker"
{
SELECT run_commands_on_session_level_connection_to_node('TRUNCATE t2');
}
step "s2-commit-on-worker"
{
SELECT run_commands_on_session_level_connection_to_node('COMMIT');
}
session "s3"
step "s3-invalidate-metadata"
{
update pg_dist_node SET metadatasynced = false;
}
step "s3-resync"
{
SELECT trigger_metadata_sync();
}
step "s3-wait"
{
SELECT pg_sleep(2);
}
// Backends can block metadata sync. The following test verifies that if this happens,
// we still do distributed deadlock detection. In the following, s2-truncate-on-worker
// causes the concurrent metadata sync to be blocked. But s2 and s1 are themselves
// involved in a distributed deadlock.
// See https://github.com/citusdata/citus/issues/4393 for more details.
permutation "enable-deadlock-detection" "reload-conf" "s2-start-session-level-connection" "s1-begin" "s1-update-1" "s2-begin-on-worker" "s2-update-2-on-worker" "s2-truncate-on-worker" "s3-invalidate-metadata" "s3-resync" "s3-wait" "s2-update-1-on-worker" "s1-update-2" "s1-commit" "s2-commit-on-worker" "disable-deadlock-detection" "reload-conf" "s2-stop-connection"
// Test that when metadata sync is waiting for locks, cancelling it terminates it.
// This is important in cases where the metadata sync daemon itself is involved in a deadlock.
permutation "increase-retry-interval" "reload-conf" "s2-start-session-level-connection" "s2-begin-on-worker" "s2-truncate-on-worker" "s3-invalidate-metadata" "s3-resync" "s3-wait" "s1-count-daemons" "s1-cancel-metadata-sync" "s1-count-daemons" "reset-retry-interval" "reload-conf" "s2-commit-on-worker" "s2-stop-connection" "s3-resync" "s3-wait"

View File

@ -0,0 +1,112 @@
CREATE SCHEMA cursors;
SET search_path TO cursors;
CREATE TABLE distributed_table (key int, value text);
SELECT create_distributed_table('distributed_table', 'key');
-- load some data, but not a very small amount, because RETURN QUERY in plpgsql
-- hard-codes the cursor fetch to 50 rows on PG 12, though that might increase
-- in a future release, so be mindful
INSERT INTO distributed_table SELECT i % 10, i::text FROM generate_series(0, 1000) i;
CREATE OR REPLACE FUNCTION simple_cursor_on_dist_table(cursor_name refcursor) RETURNS refcursor AS '
BEGIN
OPEN $1 FOR SELECT DISTINCT key FROM distributed_table ORDER BY 1;
RETURN $1;
END;
' LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION cursor_with_intermediate_result_on_dist_table(cursor_name refcursor) RETURNS refcursor AS '
BEGIN
OPEN $1 FOR
WITH cte_1 AS (SELECT * FROM distributed_table OFFSET 0)
SELECT DISTINCT key FROM distributed_table WHERE value in (SELECT value FROM cte_1) ORDER BY 1;
RETURN $1;
END;
' LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION cursor_with_intermediate_result_on_dist_table_with_param(cursor_name refcursor, filter text) RETURNS refcursor AS '
BEGIN
OPEN $1 FOR
WITH cte_1 AS (SELECT * FROM distributed_table WHERE value < $2 OFFSET 0)
SELECT DISTINCT key FROM distributed_table WHERE value in (SELECT value FROM cte_1) ORDER BY 1;
RETURN $1;
END;
' LANGUAGE plpgsql;
-- pretty basic query with cursors
-- Citus should plan/execute once and pull
-- the results to coordinator, then serve it
-- from the coordinator
BEGIN;
SELECT simple_cursor_on_dist_table('cursor_1');
SET LOCAL citus.log_intermediate_results TO ON;
SET LOCAL client_min_messages TO DEBUG1;
FETCH 5 IN cursor_1;
FETCH 50 IN cursor_1;
FETCH ALL IN cursor_1;
COMMIT;
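The function above wraps the cursor in plpgsql, but the behaviour under test, planning the distributed query once and then serving FETCHes incrementally, can also be exercised with a plain SQL-level cursor; an illustrative sketch only, not part of the test file:

BEGIN;
DECLARE cursor_2 CURSOR FOR SELECT DISTINCT key FROM distributed_table ORDER BY 1;
FETCH 5 FROM cursor_2;
FETCH ALL FROM cursor_2;
COMMIT;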
BEGIN;
SELECT cursor_with_intermediate_result_on_dist_table('cursor_1');
-- multiple FETCH commands should not trigger re-running the subplans
SET LOCAL citus.log_intermediate_results TO ON;
SET LOCAL client_min_messages TO DEBUG1;
FETCH 5 IN cursor_1;
FETCH 1 IN cursor_1;
FETCH ALL IN cursor_1;
FETCH 5 IN cursor_1;
COMMIT;
BEGIN;
SELECT cursor_with_intermediate_result_on_dist_table_with_param('cursor_1', '600');
-- multiple FETCH commands should not trigger re-running the subplans
-- also test with parameters
SET LOCAL citus.log_intermediate_results TO ON;
SET LOCAL client_min_messages TO DEBUG1;
FETCH 1 IN cursor_1;
FETCH 1 IN cursor_1;
FETCH 1 IN cursor_1;
FETCH 1 IN cursor_1;
FETCH 1 IN cursor_1;
FETCH 1 IN cursor_1;
FETCH ALL IN cursor_1;
COMMIT;
CREATE OR REPLACE FUNCTION value_counter() RETURNS TABLE(counter text) LANGUAGE PLPGSQL AS $function$
BEGIN
return query
WITH cte AS
(SELECT dt.value
FROM distributed_table dt
WHERE dt.value in
(SELECT value
FROM distributed_table p
GROUP BY p.value
HAVING count(*) > 0))
SELECT * FROM cte;
END;
$function$ ;
SET citus.log_intermediate_results TO ON;
SET client_min_messages TO DEBUG1;
\set VERBOSITY terse
SELECT count(*) from (SELECT value_counter()) as foo;
BEGIN;
SELECT count(*) from (SELECT value_counter()) as foo;
COMMIT;
-- suppress NOTICEs
SET client_min_messages TO ERROR;
DROP SCHEMA cursors CASCADE;

View File

@ -0,0 +1,167 @@
CREATE SCHEMA drop_column_partitioned_table;
SET search_path TO drop_column_partitioned_table;
SET citus.shard_replication_factor TO 1;
SET citus.next_shard_id TO 2580000;
-- create a partitioned table with some columns that
-- are going to be dropped within the tests
CREATE TABLE sensors(
col_to_drop_0 text,
col_to_drop_1 text,
col_to_drop_2 date,
col_to_drop_3 inet,
col_to_drop_4 date,
measureid integer,
eventdatetime date,
measure_data jsonb)
PARTITION BY RANGE(eventdatetime);
-- drop column even before attaching any partitions
ALTER TABLE sensors DROP COLUMN col_to_drop_1;
-- now attach the first partition and create the distributed table
CREATE TABLE sensors_2000 PARTITION OF sensors FOR VALUES FROM ('2000-01-01') TO ('2001-01-01');
SELECT create_distributed_table('sensors', 'measureid');
-- prepared statements should work fine even after columns are dropped
PREPARE drop_col_prepare_insert(int, date, jsonb) AS INSERT INTO sensors (measureid, eventdatetime, measure_data) VALUES ($1, $2, $3);
PREPARE drop_col_prepare_select(int, date) AS SELECT count(*) FROM sensors WHERE measureid = $1 AND eventdatetime = $2;
-- execute 7 times to make sure it is cached
EXECUTE drop_col_prepare_insert(1, '2000-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-02', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-03', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-04', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-05', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-06', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-07', row_to_json(row(1)));
EXECUTE drop_col_prepare_select(1, '2000-10-01');
EXECUTE drop_col_prepare_select(1, '2000-10-02');
EXECUTE drop_col_prepare_select(1, '2000-10-03');
EXECUTE drop_col_prepare_select(1, '2000-10-04');
EXECUTE drop_col_prepare_select(1, '2000-10-05');
EXECUTE drop_col_prepare_select(1, '2000-10-06');
EXECUTE drop_col_prepare_select(1, '2000-10-07');
-- drop another column before attaching another partition
-- with .. PARTITION OF .. syntax
ALTER TABLE sensors DROP COLUMN col_to_drop_0;
CREATE TABLE sensors_2001 PARTITION OF sensors FOR VALUES FROM ('2001-01-01') TO ('2002-01-01');
-- drop another column before attaching another partition
-- with ALTER TABLE .. ATTACH PARTITION
ALTER TABLE sensors DROP COLUMN col_to_drop_2;
CREATE TABLE sensors_2002(
col_to_drop_4 date, col_to_drop_3 inet, measureid integer, eventdatetime date, measure_data jsonb,
PRIMARY KEY (measureid, eventdatetime, measure_data));
ALTER TABLE sensors ATTACH PARTITION sensors_2002 FOR VALUES FROM ('2002-01-01') TO ('2003-01-01');
-- drop another column before attaching another partition
-- that is already distributed
ALTER TABLE sensors DROP COLUMN col_to_drop_3;
CREATE TABLE sensors_2003(
col_to_drop_4 date, measureid integer, eventdatetime date, measure_data jsonb,
PRIMARY KEY (measureid, eventdatetime, measure_data));
SELECT create_distributed_table('sensors_2003', 'measureid');
ALTER TABLE sensors ATTACH PARTITION sensors_2003 FOR VALUES FROM ('2003-01-01') TO ('2004-01-01');
CREATE TABLE sensors_2004(
col_to_drop_4 date, measureid integer NOT NULL, eventdatetime date NOT NULL, measure_data jsonb NOT NULL);
ALTER TABLE sensors ATTACH PARTITION sensors_2004 FOR VALUES FROM ('2004-01-01') TO ('2005-01-01');
ALTER TABLE sensors DROP COLUMN col_to_drop_4;
-- show that all partitions have the same distribution key
SELECT
p.logicalrelid::regclass, column_to_column_name(p.logicalrelid, p.partkey)
FROM
pg_dist_partition p
WHERE
logicalrelid IN ('sensors'::regclass, 'sensors_2000'::regclass,
'sensors_2001'::regclass, 'sensors_2002'::regclass,
'sensors_2003'::regclass, 'sensors_2004'::regclass);
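Dropped columns stay behind in the parent's tuple descriptor as attisdropped placeholders, so a partition created or attached later can end up with different physical attribute numbers for the same logical column; that is the situation these checks guard against. How each relation looks after the drops can be inspected in pg_attribute; a sketch for illustration:

SELECT attrelid::regclass AS relation, attnum, attname, attisdropped
FROM pg_attribute
WHERE attrelid IN ('sensors'::regclass, 'sensors_2002'::regclass)
  AND attnum > 0
ORDER BY attrelid, attnum;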
-- show that all the tables prune to the same shard for the same distribution key
WITH
sensors_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors', 3)),
sensors_2000_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2000', 3)),
sensors_2001_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2001', 3)),
sensors_2002_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2002', 3)),
sensors_2003_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2003', 3)),
sensors_2004_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2004', 3)),
all_shardids AS (SELECT * FROM sensors_shardid UNION SELECT * FROM sensors_2000_shardid UNION
SELECT * FROM sensors_2001_shardid UNION SELECT * FROM sensors_2002_shardid
UNION SELECT * FROM sensors_2003_shardid UNION SELECT * FROM sensors_2004_shardid)
SELECT logicalrelid, shardid, shardminvalue, shardmaxvalue FROM pg_dist_shard WHERE shardid IN (SELECT * FROM all_shardids);
VACUUM ANALYZE sensors, sensors_2000, sensors_2001, sensors_2002, sensors_2003;
-- show that both INSERT and SELECT can route to a single node when distribution
-- key is provided in the query
EXPLAIN (COSTS FALSE) INSERT INTO sensors VALUES (3, '2000-02-02', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2000 VALUES (3, '2000-01-01', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2001 VALUES (3, '2001-01-01', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2002 VALUES (3, '2002-01-01', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2003 VALUES (3, '2003-01-01', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2000 WHERE measureid = 3;
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2001 WHERE measureid = 3;
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2002 WHERE measureid = 3;
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2003 WHERE measureid = 3;
-- execute 7 times to make sure it is re-cached
EXECUTE drop_col_prepare_insert(3, '2000-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2001-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2002-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2003-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2003-10-02', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(4, '2003-10-03', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(5, '2003-10-04', row_to_json(row(1)));
EXECUTE drop_col_prepare_select(3, '2000-10-01');
EXECUTE drop_col_prepare_select(3, '2001-10-01');
EXECUTE drop_col_prepare_select(3, '2002-10-01');
EXECUTE drop_col_prepare_select(3, '2003-10-01');
EXECUTE drop_col_prepare_select(3, '2003-10-02');
EXECUTE drop_col_prepare_select(4, '2003-10-03');
EXECUTE drop_col_prepare_select(5, '2003-10-04');
-- non-fast-path router planner queries should also work,
-- so we switch to DEBUG2 to show the dist. key
-- and that the query is router planned
SET client_min_messages TO DEBUG2;
SELECT count(*) FROM (
SELECT * FROM sensors WHERE measureid = 3
UNION
SELECT * FROM sensors_2000 WHERE measureid = 3
UNION
SELECT * FROM sensors_2001 WHERE measureid = 3
UNION
SELECT * FROM sensors_2002 WHERE measureid = 3
UNION
SELECT * FROM sensors_2003 WHERE measureid = 3
UNION
SELECT * FROM sensors_2004 WHERE measureid = 3
) as foo;
RESET client_min_messages;
-- show that all partitions have the same distribution key
-- even after alter_distributed_table changes the shard count
-- remove this comment once https://github.com/citusdata/citus/issues/5137 is fixed
--SELECT alter_distributed_table('sensors', shard_count:='3');
SELECT
p.logicalrelid::regclass, column_to_column_name(p.logicalrelid, p.partkey)
FROM
pg_dist_partition p
WHERE
logicalrelid IN ('sensors'::regclass, 'sensors_2000'::regclass,
'sensors_2001'::regclass, 'sensors_2002'::regclass,
'sensors_2003'::regclass, 'sensors_2004'::regclass);
SET client_min_messages TO WARNING;
DROP SCHEMA drop_column_partitioned_table CASCADE;

View File

@ -100,7 +100,7 @@ SELECT * FROM test WHERE x = 1;
-- add the follower as secondary nodes and try again; the SELECT statement
-- should work this time
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
SET search_path TO single_node;
SELECT 1 FROM master_add_node('localhost', :follower_master_port, groupid => 0, noderole => 'secondary');
@ -138,7 +138,7 @@ SELECT count(*) FROM test WHERE false GROUP BY GROUPING SETS (x,y);
RESET citus.task_assignment_policy;
-- Cleanup
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
SET search_path TO single_node;
SET client_min_messages TO WARNING;
DROP SCHEMA single_node CASCADE;

View File

@ -0,0 +1,107 @@
--
-- master_update_table_statistics.sql
--
-- Test master_update_table_statistics function on both
-- hash and append distributed tables
-- This function updates shardlength, shardminvalue and shardmaxvalue
--
SET citus.next_shard_id TO 981000;
SET citus.next_placement_id TO 982000;
SET citus.shard_count TO 8;
SET citus.shard_replication_factor TO 2;
-- test with a hash-distributed table
-- here we update only shardlength, not shardminvalue and shardmaxvalue
CREATE TABLE test_table_statistics_hash (id int);
SELECT create_distributed_table('test_table_statistics_hash', 'id');
-- populate table
INSERT INTO test_table_statistics_hash SELECT i FROM generate_series(0, 10000)i;
-- originally shardlength (size of the shard) is zero
SELECT
ds.logicalrelid::regclass::text AS tablename,
ds.shardid AS shardid,
dsp.placementid AS placementid,
shard_name(ds.logicalrelid, ds.shardid) AS shardname,
ds.shardminvalue AS shardminvalue,
ds.shardmaxvalue AS shardmaxvalue
FROM pg_dist_shard ds JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid::regclass::text in ('test_table_statistics_hash') AND dsp.shardlength = 0
ORDER BY 2, 3;
-- setting this to on in order to verify that we use a distributed transaction id
-- to run the size queries from different connections
-- this is going to help detect deadlocks
SET citus.log_remote_commands TO ON;
-- setting this to sequential in order to have a deterministic order
-- in the output of citus.log_remote_commands
SET citus.multi_shard_modify_mode TO sequential;
-- update table statistics and then check that shardlength has changed
-- but shardminvalue and shardmaxvalue stay the same because this is
-- a hash distributed table
SELECT master_update_table_statistics('test_table_statistics_hash');
RESET citus.log_remote_commands;
RESET citus.multi_shard_modify_mode;
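With citus.log_remote_commands enabled and sequential mode forced, the worker-bound statements print in a deterministic order, which is how the comments above describe verifying that the size queries run under one distributed transaction id. A quick way to eyeball the net effect afterwards is to aggregate the refreshed placement sizes; a sketch only (with replication factor 2 every shard contributes two placements, so the total double-counts):

SELECT ds.logicalrelid::regclass AS tablename,
       pg_size_pretty(sum(dsp.shardlength)) AS total_placement_size
FROM pg_dist_shard ds
JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid = 'test_table_statistics_hash'::regclass
GROUP BY ds.logicalrelid;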
SELECT
ds.logicalrelid::regclass::text AS tablename,
ds.shardid AS shardid,
dsp.placementid AS placementid,
shard_name(ds.logicalrelid, ds.shardid) AS shardname,
ds.shardminvalue as shardminvalue,
ds.shardmaxvalue as shardmaxvalue
FROM pg_dist_shard ds JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid::regclass::text in ('test_table_statistics_hash') AND dsp.shardlength > 0
ORDER BY 2, 3;
-- check with an append-distributed table
-- here we update shardlength, shardminvalue and shardmaxvalue
CREATE TABLE test_table_statistics_append (id int);
SELECT create_distributed_table('test_table_statistics_append', 'id', 'append');
COPY test_table_statistics_append FROM PROGRAM 'echo 0 && echo 1 && echo 2 && echo 3' WITH CSV;
COPY test_table_statistics_append FROM PROGRAM 'echo 4 && echo 5 && echo 6 && echo 7' WITH CSV;
-- originally the (shardminvalue, shardmaxvalue) pairs will be (0, 3) and (4, 7)
SELECT
ds.logicalrelid::regclass::text AS tablename,
ds.shardid AS shardid,
dsp.placementid AS placementid,
shard_name(ds.logicalrelid, ds.shardid) AS shardname,
ds.shardminvalue as shardminvalue,
ds.shardmaxvalue as shardmaxvalue
FROM pg_dist_shard ds JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid::regclass::text in ('test_table_statistics_append')
ORDER BY 2, 3;
-- delete some data to change the shardminvalue of some shards
DELETE FROM test_table_statistics_append WHERE id = 0 OR id = 4;
SET citus.log_remote_commands TO ON;
SET citus.multi_shard_modify_mode TO sequential;
-- update table statistics and then check that shardminvalue has changed
-- shardlength (shard size) is still 8192 since there is very little data
SELECT master_update_table_statistics('test_table_statistics_append');
RESET citus.log_remote_commands;
RESET citus.multi_shard_modify_mode;
SELECT
ds.logicalrelid::regclass::text AS tablename,
ds.shardid AS shardid,
dsp.placementid AS placementid,
shard_name(ds.logicalrelid, ds.shardid) AS shardname,
ds.shardminvalue as shardminvalue,
ds.shardmaxvalue as shardmaxvalue
FROM pg_dist_shard ds JOIN pg_dist_shard_placement dsp USING (shardid)
WHERE ds.logicalrelid::regclass::text in ('test_table_statistics_append')
ORDER BY 2, 3;
DROP TABLE test_table_statistics_hash, test_table_statistics_append;
ALTER SYSTEM RESET citus.shard_count;
ALTER SYSTEM RESET citus.shard_replication_factor;

View File

@ -193,6 +193,44 @@ SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.4-1';
SELECT * FROM print_extension_changes();
-- Test upgrade paths for backported citus_pg_upgrade functions
ALTER EXTENSION citus UPDATE TO '9.4-2';
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- Should be an empty result; even though the downgrade doesn't undo the upgrade, the
-- function signature doesn't change, which is reflected here.
SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.4-2';
SELECT * FROM print_extension_changes();
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
SELECT * FROM print_extension_changes();
-- Test upgrade paths for backported improvement of master_update_table_statistics function
ALTER EXTENSION citus UPDATE TO '9.4-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
ALTER EXTENSION citus UPDATE TO '9.4-2';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.4-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM print_extension_changes();
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM print_extension_changes();
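print_extension_changes() reports what changed in the extension's objects between calls; the script version that is currently installed can always be read from pg_extension, which is handy when stepping through these upgrade/downgrade paths interactively. A minimal sketch:

SELECT extname, extversion FROM pg_extension WHERE extname = 'citus';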
-- Test downgrade to 9.4-1 from 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
@ -215,6 +253,44 @@ SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.5-1';
SELECT * FROM print_extension_changes();
-- Test upgrade paths for backported citus_pg_upgrade functions
ALTER EXTENSION citus UPDATE TO '9.5-2';
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- Should be an empty result; even though the downgrade doesn't undo the upgrade, the
-- function signature doesn't change, which is reflected here.
SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.5-2';
SELECT * FROM print_extension_changes();
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
SELECT * FROM print_extension_changes();
-- Test upgrade paths for backported improvement of master_update_table_statistics function
ALTER EXTENSION citus UPDATE TO '9.5-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
ALTER EXTENSION citus UPDATE TO '9.5-2';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.5-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM print_extension_changes();
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM print_extension_changes();
DROP TABLE prev_objects, extension_diff;
-- show running version

View File

@ -151,7 +151,7 @@ SELECT * FROM reference_table ORDER BY a;
INSERT INTO citus_local_table (a, b, z) VALUES (1, 2, 3);
SELECT * FROM citus_local_table ORDER BY a;
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
DROP TABLE the_table;
DROP TABLE reference_table;
DROP TABLE citus_local_table;

View File

@ -54,7 +54,7 @@ SELECT * FROM the_table;
-- add the secondary nodes and try again, the SELECT statement should work this time
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
SELECT 1 FROM master_add_node('localhost', :follower_worker_1_port,
groupid => (SELECT groupid FROM pg_dist_node WHERE nodeport = :worker_1_port),
@ -106,12 +106,12 @@ order by s_i_id;
-- now move the secondary nodes into the new cluster and see that the follower, finally
-- correctly configured, can run select queries involving them
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
UPDATE pg_dist_node SET nodecluster = 'second-cluster' WHERE noderole = 'secondary';
\c "port=9070 dbname=regression options='-c\ citus.use_secondary_nodes=always\ -c\ citus.cluster_name=second-cluster'"
SELECT * FROM the_table;
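Whether the follower may use a worker node depends on the noderole and nodecluster values that the UPDATE above just changed, together with the citus.use_secondary_nodes and citus.cluster_name settings in the connection string. They can be inspected directly before and after; a short illustrative query:

SELECT nodename, nodeport, noderole, nodecluster
FROM pg_dist_node
ORDER BY nodeport;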
-- clean up after ourselves
\c - - - :master_port
\c -reuse-previous=off regression - - :master_port
DROP TABLE the_table;
DROP TABLE stock;

View File

@ -563,5 +563,138 @@ CREATE TABLE self_referencing_reference_table(
SELECT create_reference_table('self_referencing_reference_table');
ALTER TABLE self_referencing_reference_table ADD CONSTRAINT fk FOREIGN KEY(id, other_column_ref) REFERENCES self_referencing_reference_table(id, other_column);
-- make sure that if the fkey is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (x) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
ALTER TABLE dropfkeytest2 DROP CONSTRAINT fkey1;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that if a column that is in a fkey is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ALTER TABLE dropfkeytest2 DROP COLUMN y;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a column that is in a multi-column index is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int, y int);
CREATE UNIQUE INDEX indd ON dropfkeytest1(x, y);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (x, y) REFERENCES dropfkeytest1(x, y);
SELECT create_distributed_table ('dropfkeytest1', 'x');
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ALTER TABLE dropfkeytest2 DROP COLUMN y CASCADE;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a column that is in a multi-column fkey is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int, y int);
CREATE UNIQUE INDEX indd ON dropfkeytest1(x, y);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (x, y) REFERENCES dropfkeytest1(x, y);
SELECT create_distributed_table ('dropfkeytest1', 'x');
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ALTER TABLE dropfkeytest1 DROP COLUMN y CASCADE;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if an index which the fkey relies on is dropped
-- Citus can see an up-to-date fkey graph
-- also, irrelevant index drops don't affect this
CREATE TABLE dropfkeytest1 (x int);
CREATE UNIQUE INDEX i1 ON dropfkeytest1(x);
CREATE UNIQUE INDEX unrelated_idx ON dropfkeytest1(x);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
DROP INDEX unrelated_idx CASCADE;
-- should still error out since we didn't drop the index that foreign key depends on
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
DROP INDEX i1 CASCADE;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a uniqueness constraint which the fkey depends on is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ALTER TABLE dropfkeytest1 DROP CONSTRAINT dropfkeytest1_x_key CASCADE;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a primary key which the fkey depends on is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int primary key);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
ALTER TABLE dropfkeytest1 DROP CONSTRAINT dropfkeytest1_pkey CASCADE;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
DROP TABLE dropfkeytest1, dropfkeytest2 CASCADE;
-- make sure that even if a schema which the fkey depends on is dropped
-- Citus can see an up-to-date fkey graph
CREATE SCHEMA fkeytestsc;
CREATE TABLE fkeytestsc.dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES fkeytestsc.dropfkeytest1(x);
SELECT create_distributed_table ('fkeytestsc.dropfkeytest1', 'x');
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
DROP SCHEMA fkeytestsc CASCADE;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
DROP TABLE dropfkeytest2 CASCADE;
-- make sure that even if a table which the fkey depends on is dropped
-- Citus can see an up-to-date fkey graph
CREATE TABLE dropfkeytest1 (x int unique);
CREATE TABLE dropfkeytest2 (x int8, y int8);
ALTER TABLE dropfkeytest2 ADD CONSTRAINT fkey1 FOREIGN KEY (y) REFERENCES dropfkeytest1(x);
SELECT create_distributed_table ('dropfkeytest1', 'x');
-- this should error out
SELECT create_distributed_table ('dropfkeytest2', 'y', colocate_with:='none');
DROP TABLE dropfkeytest1 CASCADE;
-- this should work
SELECT create_distributed_table ('dropfkeytest2', 'x', colocate_with:='none');
-- we no longer need those tables
DROP TABLE referenced_by_reference_table, references_to_reference_table, reference_table, reference_table_second, referenced_local_table, self_referencing_reference_table;
DROP TABLE referenced_by_reference_table, references_to_reference_table, reference_table, reference_table_second, referenced_local_table, self_referencing_reference_table, dropfkeytest2;

View File

@ -27,8 +27,21 @@ SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
-- orders table. These shard sets don't overlap, so join pruning should prune
-- out all the shards, and leave us with an empty task list.
select * from pg_dist_shard
where logicalrelid='lineitem'::regclass or
logicalrelid='orders'::regclass
order by shardid;
set citus.explain_distributed_queries to on;
-- explain the query before actually executing it
EXPLAIN SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND l_orderkey > 6000 AND o_orderkey < 6000;
set citus.explain_distributed_queries to off;
set client_min_messages to debug3;
SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND l_orderkey > 6000 AND o_orderkey < 6000;
set client_min_messages to debug2;
-- Make sure that we can handle filters without a column
SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders

View File

@ -27,6 +27,30 @@ CREATE FUNCTION mark_node_readonly(hostname TEXT, port INTEGER, isreadonly BOOLE
ARRAY['SELECT pg_reload_conf()'], false);
$$;
CREATE OR REPLACE FUNCTION trigger_metadata_sync()
RETURNS void
LANGUAGE C STRICT
AS 'citus';
CREATE OR REPLACE FUNCTION raise_error_in_metadata_sync()
RETURNS void
LANGUAGE C STRICT
AS 'citus';
CREATE PROCEDURE wait_until_process_count(appname text, target_count int) AS $$
declare
counter integer := -1;
begin
while counter != target_count loop
-- pg_stat_activity is cached at xact level and there is no easy way to clear it.
-- Look it up in a new connection to get latest updates.
SELECT result::int into counter FROM
master_run_on_worker(ARRAY['localhost'], ARRAY[57636], ARRAY[
'SELECT count(*) FROM pg_stat_activity WHERE application_name = ' || quote_literal(appname) || ';'], false);
PERFORM pg_sleep(0.1);
end loop;
end$$ LANGUAGE plpgsql;
-- add a node to the cluster
SELECT master_add_node('localhost', :worker_1_port) As nodeid_1 \gset
SELECT nodeid, nodename, nodeport, hasmetadata, metadatasynced FROM pg_dist_node;
@ -79,6 +103,54 @@ END;
SELECT wait_until_metadata_sync(30000);
SELECT nodeid, hasmetadata, metadatasynced FROM pg_dist_node;
-- verify that metadata sync daemon has started
SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Metadata Sync Daemon';
--
-- terminate maintenance daemon, and verify that we don't spawn multiple
-- metadata sync daemons
--
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE application_name = 'Citus Maintenance Daemon';
CALL wait_until_process_count('Citus Maintenance Daemon', 1);
select trigger_metadata_sync();
select wait_until_metadata_sync();
SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Metadata Sync Daemon';
--
-- cancel metadata sync daemon, and verify that it exits and restarts.
--
select pid as pid_before_cancel from pg_stat_activity where application_name like 'Citus Met%' \gset
select pg_cancel_backend(pid) from pg_stat_activity where application_name = 'Citus Metadata Sync Daemon';
select wait_until_metadata_sync();
select pid as pid_after_cancel from pg_stat_activity where application_name like 'Citus Met%' \gset
select :pid_before_cancel != :pid_after_cancel AS metadata_sync_restarted;
--
-- cancel metadata sync daemon so it exits and restarts, but at the
-- same time tell maintenanced to trigger a new metadata sync. One
-- of these should exit to avoid multiple metadata syncs.
--
select pg_cancel_backend(pid) from pg_stat_activity where application_name = 'Citus Metadata Sync Daemon';
select trigger_metadata_sync();
select wait_until_metadata_sync();
-- we assume citus.metadata_sync_retry_interval is 500ms; if that changes, update the sleep below to ceil(interval) + 0.2 seconds.
select pg_sleep(1.2);
SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Metadata Sync Daemon';
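-- If the retry interval is ever in doubt, the GUC that the 1.2s sleep above
-- is derived from (ceil(0.5s) + 0.2s of slack) can be read back directly;
-- a minimal sketch:
SHOW citus.metadata_sync_retry_interval;
SELECT current_setting('citus.metadata_sync_retry_interval') AS retry_interval;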
--
-- error in metadata sync daemon, and verify it exits and restarts.
--
select pid as pid_before_error from pg_stat_activity where application_name like 'Citus Met%' \gset
select raise_error_in_metadata_sync();
select wait_until_metadata_sync(30000);
select pid as pid_after_error from pg_stat_activity where application_name like 'Citus Met%' \gset
select :pid_before_error != :pid_after_error AS metadata_sync_restarted;
SELECT trigger_metadata_sync();
SELECT wait_until_metadata_sync(30000);
SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Metadata Sync Daemon';
-- update it back to :worker_1_port, now metadata should be synced
SELECT 1 FROM master_update_node(:nodeid_1, 'localhost', :worker_1_port);
SELECT wait_until_metadata_sync(30000);
@@ -249,6 +321,39 @@ SELECT 1 FROM master_activate_node('localhost', :worker_2_port);
SELECT verify_metadata('localhost', :worker_1_port);
-- verify that metadata sync daemon exits
call wait_until_process_count('Citus Metadata Sync Daemon', 0);
-- verify that DROP DATABASE terminates metadata sync
SELECT current_database() datname \gset
CREATE DATABASE db_to_drop;
SELECT run_command_on_workers('CREATE DATABASE db_to_drop');
\c db_to_drop - - :worker_1_port
CREATE EXTENSION citus;
\c db_to_drop - - :master_port
CREATE EXTENSION citus;
SELECT master_add_node('localhost', :worker_1_port);
UPDATE pg_dist_node SET hasmetadata = true;
SELECT master_update_node(nodeid, 'localhost', 12345) FROM pg_dist_node;
CREATE OR REPLACE FUNCTION trigger_metadata_sync()
RETURNS void
LANGUAGE C STRICT
AS 'citus';
SELECT trigger_metadata_sync();
\c :datname - - :master_port
SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
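-- A complementary check before the DROP DATABASE below: count the backends
-- still attached to the doomed database, since the drop can only succeed once
-- the metadata sync daemon's connection is gone; a minimal sketch, not part
-- of the original test:
SELECT count(*) AS remaining_backends
FROM pg_stat_activity
WHERE datname = 'db_to_drop';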
DROP DATABASE db_to_drop;
SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
-- cleanup
DROP TABLE ref_table;
TRUNCATE pg_dist_colocation;

View File

@@ -218,5 +218,75 @@ SELECT * FROM numeric_test WHERE id = 21;
SELECT * FROM numeric_test WHERE id = 21::numeric;
SELECT * FROM numeric_test WHERE id = 21.1::numeric;
CREATE TABLE range_dist_table_1 (dist_col BIGINT);
SELECT create_distributed_table('range_dist_table_1', 'dist_col', 'range');
CALL public.create_range_partitioned_shards('range_dist_table_1', '{1000,3000,6000}', '{2000,4000,7000}');
INSERT INTO range_dist_table_1 VALUES (1001);
INSERT INTO range_dist_table_1 VALUES (3800);
INSERT INTO range_dist_table_1 VALUES (6500);
-- all of the following were returning false before the fix for #5077
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col >= 2999;
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 2999;
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col >= 2500;
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 2000;
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 1001;
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col >= 1001;
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col > 1000;
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col >= 1000;
-- we didn't have such an off-by-one error in upper bound
-- calculation, but let's test such cases too
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col <= 4001;
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col < 4001;
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col <= 4500;
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col < 6000;
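-- The boundary checks above can also be cross-checked through the planner,
-- since the number of surviving shards shows up as the task count of the
-- distributed plan; a minimal sketch reusing citus.explain_distributed_queries
-- as earlier in this changeset:
SET citus.explain_distributed_queries TO on;
-- shard ranges are [1000,2000], [3000,4000] and [6000,7000], so a
-- dist_col >= 2999 filter must keep two shards (two tasks), not zero
EXPLAIN (COSTS OFF)
  SELECT sum(dist_col) FROM range_dist_table_1 WHERE dist_col >= 2999;
RESET citus.explain_distributed_queries;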
-- now test with composite type and more shards
CREATE TYPE comp_type AS (
int_field_1 BIGINT,
int_field_2 BIGINT
);
CREATE TYPE comp_type_range AS RANGE (
subtype = comp_type);
CREATE TABLE range_dist_table_2 (dist_col comp_type);
SELECT create_distributed_table('range_dist_table_2', 'dist_col', 'range');
CALL public.create_range_partitioned_shards(
'range_dist_table_2',
'{"(10,24)","(10,58)",
"(10,90)","(20,100)"}',
'{"(10,25)","(10,65)",
"(10,99)","(20,100)"}');
INSERT INTO range_dist_table_2 VALUES ((10, 24));
INSERT INTO range_dist_table_2 VALUES ((10, 60));
INSERT INTO range_dist_table_2 VALUES ((10, 91));
INSERT INTO range_dist_table_2 VALUES ((20, 100));
SELECT dist_col='(10, 60)'::comp_type FROM range_dist_table_2
WHERE dist_col >= '(10,26)'::comp_type AND
dist_col <= '(10,75)'::comp_type;
SELECT * FROM range_dist_table_2
WHERE dist_col >= '(10,57)'::comp_type AND
dist_col <= '(10,95)'::comp_type
ORDER BY dist_col;
SELECT * FROM range_dist_table_2
WHERE dist_col >= '(10,57)'::comp_type
ORDER BY dist_col;
SELECT dist_col='(20,100)'::comp_type FROM range_dist_table_2
WHERE dist_col > '(20,99)'::comp_type;
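-- The expectations above rely on composite values comparing field by field,
-- left to right; a small standalone illustration (not part of the original
-- test):
SELECT '(10,58)'::comp_type < '(10,60)'::comp_type AS lt_within_first_field,
       '(10,60)'::comp_type < '(20,1)'::comp_type  AS lt_across_first_field;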
DROP TABLE range_dist_table_1, range_dist_table_2;
DROP TYPE comp_type CASCADE;
SET search_path TO public;
DROP SCHEMA prune_shard_list CASCADE;

View File

@@ -23,7 +23,7 @@ WITH dist_node_summary AS (
ARRAY[dist_node_summary.query, dist_node_summary.query],
false)
), dist_placement_summary AS (
SELECT 'SELECT jsonb_agg(pg_dist_placement ORDER BY shardid) FROM pg_dist_placement)' AS query
SELECT 'SELECT jsonb_agg(pg_dist_placement ORDER BY shardid) FROM pg_dist_placement' AS query
), dist_placement_check AS (
SELECT count(distinct result) = 1 AS matches
FROM dist_placement_summary CROSS JOIN LATERAL

View File

@@ -148,7 +148,8 @@ COMMIT;
-- now, some of the optional connections would be skipped,
-- and only 5 connections are used per node
BEGIN;
SELECT count(*), pg_sleep(0.1) FROM test;
SET LOCAL citus.max_adaptive_executor_pool_size TO 16;
with cte_1 as (select pg_sleep(0.1) is null, a from test) SELECT a from cte_1 ORDER By 1 LIMIT 1;
SELECT
connection_count_to_node
FROM
@@ -338,14 +339,16 @@ BEGIN;
hostname, port;
ROLLBACK;
-- INSERT SELECT with RETURNING/ON CONFLICT clauses should honor shared_pool_size
-- INSERT SELECT with RETURNING/ON CONFLICT clauses does not honor shared_pool_size
-- in underlying COPY commands
BEGIN;
SELECT pg_sleep(0.1);
INSERT INTO test SELECT i FROM generate_series(0,10) i RETURNING *;
-- make sure that we hit at least 4 shards per node; for that, 20 rows
-- are enough
INSERT INTO test SELECT i FROM generate_series(0,20) i RETURNING *;
SELECT
connection_count_to_node
connection_count_to_node > current_setting('citus.max_shared_pool_size')::int
FROM
citus_remote_connection_stats()
WHERE

View File

@@ -64,7 +64,6 @@ EXECUTE subquery_prepare_without_param;
EXECUTE subquery_prepare_without_param;
EXECUTE subquery_prepare_without_param;
EXECUTE subquery_prepare_without_param;
EXECUTE subquery_prepare_without_param;
EXECUTE subquery_prepare_param_on_partkey(1);

View File

@@ -66,6 +66,8 @@ def main(config):
common.run_pg_regress(config.old_bindir, config.pg_srcdir,
NODE_PORTS[COORDINATOR_NAME], AFTER_PG_UPGRADE_SCHEDULE)
citus_prepare_pg_upgrade(config.old_bindir)
# prepare should be idempotent; calling it a second time should never fail.
citus_prepare_pg_upgrade(config.old_bindir)
common.stop_databases(config.old_bindir, config.old_datadir)
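
At the SQL level, the idempotency being exercised here means the prepare step must tolerate a repeated call on the same coordinator; a minimal sketch of that expectation, assuming the Python helper above ultimately runs the citus_prepare_pg_upgrade() UDF:

-- Sketch: the second call must succeed even though the first one already
-- copied the Citus catalog into the pg_upgrade backup tables.
SELECT citus_prepare_pg_upgrade();
SELECT citus_prepare_pg_upgrade();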