Compare commits


33 Commits

Author SHA1 Message Date
Hanefi Onaldi 280bc704d0
Bump Citus version to 10.1.2 2021-08-16 17:26:07 +03:00
Hanefi Onaldi fb0bb40225
Add changelog entries for 10.1.2
(cherry picked from commit da29a57837)
2021-08-16 17:24:45 +03:00
Onder Kalaci 9f4b6a6cb9 Guard against hard WaitEventSet errors
In short, add wrappers around Postgres' AddWaitEventToSet() and
ModifyWaitEvent().

AddWaitEventToSet()/ModifyWaitEvent*() may throw hard errors. For
example, when the underlying socket for a connection has been closed by
the remote server and the OS already reflects that, but Citus has not
yet had a chance to learn about it. In that case, if the replication
factor is >1, Citus can fail over to other nodes for executing the
query. Even if the replication factor is 1, Citus can give much nicer
errors.

So CitusAddWaitEventSetToSet()/CitusModifyWaitEvent() simply put
AddWaitEventToSet()/ModifyWaitEvent() into a PG_TRY/PG_CATCH block in
order to catch any hard errors, and return this information to the
caller.
2021-08-10 09:36:11 +02:00
Onder Kalaci 3c5ea1b1f2 Adjust the tests to earlier versions
- Drop PRIMARY KEY for Citus 10 compatibility
- Drop columnar for PG 12
- Do not start/stop metadata sync as stop is not implemented in 10.1
2021-08-06 15:56:17 +02:00
Onder Kalaci 0eb5c144ed Dropped columns do not diverge distribution column for partitioned tables
Before this commit, creating a partition after a DROP COLUMN on the
parent (when the dropped column's position was before the distribution
key) was leading the partition to have the wrong distribution column.
2021-08-06 13:42:40 +02:00
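A minimal SQL sketch of the scenario described in the commit above; the table and column names are illustrative and not taken from the Citus test suite:

```sql
-- Parent has a column dropped before the distribution key, so the
-- distribution column's attribute number differs between the parent
-- and a freshly created partition.
CREATE TABLE events (removed int, tenant_id int, event_time date)
    PARTITION BY RANGE (event_time);
ALTER TABLE events DROP COLUMN removed;
SELECT create_distributed_table('events', 'tenant_id');

-- Before this fix, the new partition could be registered with the
-- wrong distribution column in pg_dist_partition.
CREATE TABLE events_2021_08 PARTITION OF events
    FOR VALUES FROM ('2021-08-01') TO ('2021-09-01');
```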
Hanefi Onaldi b3947510b9
Bump Citus version to 10.1.1 2021-08-05 20:50:42 +03:00
Hanefi Onaldi e64e627e31
Add changelog entries for 10.1.1
(cherry picked from commit bc5553b5d1)
2021-08-05 20:49:43 +03:00
naisila c456a933f0 Fix master_update_table_statistics scripts for 9.5 2021-08-03 16:46:05 +03:00
naisila 86f1e181c4 Fix master_update_table_statistics scripts for 9.4 2021-08-03 16:46:05 +03:00
Jelte Fennema 84410da2ba Fix showing target shard size in the rebalance progress monitor (#5136)
The progress monitor wouldn't actually update the size of the shard on
the target node when using "block_writes" as the `shard_transfer_mode`.
The reason for this is that the CREATE TABLE part of the shard creation
would only be committed once all data was moved as well. This caused
our size calculation to always return 0, since the table did not exist
yet in the session that the progress monitor used.

This is fixed by first committing creation of the table, and only then
starting the actual data copy.

The test output changes slightly. Apparently splitting this up into two
transactions instead of one increases the table size after the copy by
about 40kB. The additional size does not grow with the amount of data in
the table (it stays ~40kB per shard), so this small change in test
output is not considered an actual problem.

(cherry picked from commit 2aa67421a7)
2021-07-23 16:49:39 +02:00
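For illustration, a hedged sketch of how this surfaces to users; 'dist_table' is a placeholder relation:

```sql
-- Session 1: rebalance using block_writes as the transfer mode.
SELECT rebalance_table_shards('dist_table',
                              shard_transfer_mode := 'block_writes');

-- Session 2: while data is being copied, the monitor now reports the
-- growing size of each shard on its target node instead of 0.
SELECT * FROM get_rebalance_progress();
```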
Önder Kalacı 106d68fd61
CLUSTER ON deparser should consider schemas (#5122)
(cherry picked from commit 87a51ae552)
2021-07-16 19:16:37 +03:00
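A small sketch of the kind of definition affected; the schema, table, and index names are illustrative:

```sql
CREATE SCHEMA app;
CREATE TABLE app.items (id int PRIMARY KEY, val text);
CREATE INDEX items_val_idx ON app.items (val);
ALTER TABLE app.items CLUSTER ON items_val_idx;

-- When the table is distributed, Citus regenerates its DDL for each
-- shard; the deparsed "ALTER TABLE ... CLUSTER ON" command is now
-- schema qualified.
SELECT create_distributed_table('app.items', 'id');
```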
Hanefi Onaldi f571abcca6
Add changelog entries for 10.1.0
This patch also moves the section to the top of the changelog

(cherry picked from commit 6b4996f47e)
2021-07-16 18:09:17 +03:00
Sait Talha Nisanci 6fee3068e3
Do not include to-be-deleted shards while finding shard placements
Ignore orphaned shards in more places

Only use active shard placements in RouterInsertTaskList

Use IncludingOrphanedPlacements in some more places

Fix comment

Add tests

(cherry picked from commit e7ed16c296)

Conflicts:
	src/backend/distributed/planner/multi_router_planner.c

Quite trivial conflict that was easy to resolve
2021-07-14 19:28:32 +03:00
Jelte Fennema 6f400dab58
Fix check to always allow foreign keys to reference tables (#5073)
With the previous version of this check we would disallow distributed
tables that did not have a colocation id from having a foreign key to a
reference table. This fixes that, since there is no reason to disallow
it.

(cherry picked from commit e9bfb8eddd)
2021-07-14 19:09:58 +03:00
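A sketch of the general shape of what is being allowed again; the names are illustrative, and the specific regression involved distributed tables without a colocation id:

```sql
CREATE TABLE countries (code text PRIMARY KEY);
SELECT create_reference_table('countries');

CREATE TABLE users (user_id bigint PRIMARY KEY, country_code text);
SELECT create_distributed_table('users', 'user_id');

-- Foreign keys from a distributed table to a reference table should
-- always be allowed, regardless of the distributed table's colocation id.
ALTER TABLE users ADD CONSTRAINT users_country_fk
    FOREIGN KEY (country_code) REFERENCES countries (code);
```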
Jelte Fennema 90da684f56
Only allow moves of shards of distributed tables (#5072)
Moving shards of reference tables was possible in at least one case:
```sql
select citus_disable_node('localhost', 9702);
create table r(x int);
select create_reference_table('r');
set citus.replicate_reference_tables_on_activate = off;
select citus_activate_node('localhost', 9702);
select citus_move_shard_placement(102008, 'localhost', 9701, 'localhost', 9702);
```

This would then remove the reference table shard on the source, causing
all kinds of issues. This fixes that by disallowing all shard moves
except for shards of distributed tables.

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
(cherry picked from commit d1d386a904)
2021-07-14 19:07:45 +03:00
Jelte Fennema 6986ac2f17
Avoid two race conditions in the rebalance progress monitor (#5050)
The first and main issue was that we were putting absolute pointers into
shared memory for the `steps` field of the `ProgressMonitorData`. This
pointer was being overwritten every time a process requested the monitor
steps, which is the only reason why this even worked in the first place.

To quote a part of a relevant stack overflow answer:

> First of all, putting absolute pointers in shared memory segments is
> a terrible, terrible idea - those pointers would only be valid in the
> process that filled in their values. Shared memory segments are not
> guaranteed to attach at the same virtual address in every process.
> On the contrary - they attach where the system deems it possible when
> `shmaddr == NULL` is specified on call to `shmat()`

Source: https://stackoverflow.com/a/10781921/2570866

In this case a race condition occurred when a second process overwrote
the pointer between the first process's write and read of the steps
field.

This issue is fixed by not storing the pointer in shared memory anymore.
Instead we now calculate its position every time we need it.

The second race condition I have not been able to trigger, but I found
it while investigating this. The issue was that we published the handle
of the shared memory segment before we initialized the data in the
steps. This means that during initialization of the data, a call to
`get_rebalance_progress()` could read partial data in an unsynchronized
manner.

(cherry picked from commit ca00b63272)
2021-07-14 19:06:32 +03:00
Marco Slot 998b044fdc
Fix a bug that causes worker_create_or_alter_role to crash with NULL input
(cherry picked from commit a7e4d6c94a)
2021-07-14 13:56:43 +03:00
Naisila Puka 1507f32282
Fix nextval('seq_name'::text) bug, and schema for seq tests (#5046)
(cherry picked from commit e26b29d3bb)
2021-07-14 13:55:48 +03:00
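A minimal sketch of the affected default, mirroring the case described later in the diff (EnsureSequenceTypeSupported); object names are illustrative:

```sql
CREATE SEQUENCE order_id_seq;

-- With the ::text cast PostgreSQL records no dependency between the
-- column default and the sequence, which is the case this fix handles.
CREATE TABLE orders (
    customer_id int,
    order_id bigint DEFAULT nextval('order_id_seq'::text)
);
SELECT create_distributed_table('orders', 'customer_id');
```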
Hanefi Onaldi 20e500f96b
Remove public schema dependency for 10.1 upgrades
This commit contains a subset of the changes that should be cherry
picked to 10.1 releases.

(cherry picked from commit efc5776451)
2021-07-09 12:12:19 +03:00
Hanefi Onaldi 60424534ef
Remove public schema dependency for 10.0 upgrades
This commit contains a subset of the changes that should be cherry
picked to 10.0 releases.

(cherry picked from commit 8e9cc229ff)
2021-07-09 12:12:01 +03:00
Nils Dijk fefaed37e7 fix 10.1-1 upgrade script to adhere to idempotency 2021-07-08 12:25:32 +02:00
Nils Dijk 5adc151e7c fix 9.5-2 upgrade script to adhere to idempotency 2021-07-08 12:25:32 +02:00
Nils Dijk 6192dc2bff Add test for idempotency of citus_prepare_pg_upgrade 2021-07-08 12:25:32 +02:00
Onur Tirtir 3f6e903722 Fix lower boundary calculation when pruning range dist table shards (#5082)
This happens only when we have a "<" or "<=" filter on the distribution
column of a range distributed table and that filter falls in between
two shards.

When the filter falls in between two shards:

  If the filter is ">" or ">=", then UpperShardBoundary was
  returning "upperBoundIndex - 1", where upperBoundIndex is the
  exclusive shard index used during the binary search.
  This is expected since upperBoundIndex is an exclusive
  index.

  If the filter is "<" or "<=", then LowerShardBoundary was
  returning "lowerBoundIndex + 1", where lowerBoundIndex is the
  inclusive shard index used during the binary search.
  However, since lowerBoundIndex is an inclusive index, we
  should just return lowerBoundIndex instead of doing "+ 1".
  Before this commit, we were missing the leftmost shard in
  such queries.

* Remove useless conditional branches

The branch that we delete from UpperShardBoundary was obviously useless.

The other one, in LowerShardBoundary, becomes useless after we remove
"+ 1" from there.

This is further evidence of what and how we are fixing with this PR.

* Improve comments and add more

* Add some tests for upper bound calculation too

(cherry picked from commit b118d4188e)
2021-07-07 13:14:27 +03:00
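A hedged illustration of the pruning behavior; assume a range-distributed table 'range_events' whose two shards cover [1, 10] and [21, 30] on the distribution column 'id' (shard setup omitted):

```sql
-- The filter value 15 falls in the gap between the two shard ranges.
-- Before this fix, "<" / "<=" filters falling between shards also pruned
-- away the leftmost shard, so rows from the [1, 10] shard were missed.
SELECT count(*) FROM range_events WHERE id <= 15;

-- ">" / ">=" filters in the gap were already pruned correctly and only
-- hit the [21, 30] shard.
SELECT count(*) FROM range_events WHERE id >= 15;
```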
Marco Slot 9efd8e05d6
Fix PG upgrade scripts for 10.1 2021-07-06 16:08:47 +02:00
Marco Slot 210bcdcc08
Fix PG upgrade scripts for 10.0 2021-07-06 16:08:46 +02:00
Marco Slot d3417a5e34
Fix PG upgrade scripts for 9.5 2021-07-06 16:08:46 +02:00
Marco Slot e2330e8f87
Fix PG upgrade scripts for 9.4 2021-07-06 16:08:46 +02:00
Onder Kalaci 690dab316a fix regression tests to avoid any conflicts in enterprise 2021-06-22 08:49:48 +03:00
Onder Kalaci be6e372b27 Deparse/parse the local cached queries
With local query caching, we try to avoid the deparse/parse stages, as
the operation is too costly.

However, we can do the deparse/parse operations once per cached query,
right before we put the plan into the cache. With that, we avoid edge
cases like (4239) or (5038).

In a sense, we make local plan caching behave similarly to non-cached
local/remote queries, by forcing the query to be deparsed once.

(cherry picked from commit 69ca943e58)
2021-06-22 08:24:15 +03:00
Onder Kalaci 3d6bc315ab Get ready for Improve index backed constraint creation for online rebalancer
See:
https://github.com/citusdata/citus-enterprise/issues/616
(cherry picked from commit bc09288651)
2021-06-22 08:23:57 +03:00
Ahmet Gedemenli 4a904e070d
Set table size to zero if no size is read (#5049) (#5056)
* Set table size to zero if no size is read

* Add comment to relation size bug fix

(cherry picked from commit 5115100db0)
2021-06-21 14:36:13 +03:00
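For context, a sketch of the call that could hit this; 'dist_table' is a placeholder:

```sql
-- If a shard is moved or dropped while this runs, the worker can return
-- an empty size string for it; that is now treated as zero instead of
-- failing the whole query.
SELECT pg_size_pretty(citus_total_relation_size('dist_table'));
```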
Halil Ozan Akgul f8e06fb1ed Bump citus version to 10.1.0 2021-06-15 18:50:09 +03:00
93 changed files with 4092 additions and 588 deletions

View File

@ -1,4 +1,18 @@
### citus v10.1.0 (June 15, 2021) ###
### citus v10.1.2 (August 16, 2021) ###
* Allows more graceful failovers when replication factor > 1
* Fixes a bug that causes partitions to have wrong distribution key after
`DROP COLUMN`
### citus v10.1.1 (August 5, 2021) ###
* Improves citus_update_table_statistics and provides distributed deadlock
detection
* Fixes showing target shard size in the rebalance progress monitor
### citus v10.1.0 (July 14, 2021) ###
* Drops support for PostgreSQL 11
@ -53,16 +67,35 @@
* Removes length limits around partition names
* Removes dependencies on the existence of public schema
* Executor avoids opening extra connections
* Excludes orphaned shards while finding shard placements
* Preserves access method of materialized views when undistributing
or altering distributed tables
* Fixes a bug that allowed moving of shards belonging to a reference table
* Fixes a bug that can cause a crash when DEBUG4 logging is enabled
* Fixes a bug that causes pruning incorrect shard of a range distributed table
* Fixes a bug that causes worker_create_or_alter_role to crash with NULL input
* Fixes a bug where foreign key to reference table was disallowed
* Fixes a bug with local cached plans on tables with dropped columns
* Fixes data race in `get_rebalance_progress`
* Fixes error message for local table joins
* Fixes `FROM ONLY` queries on partitioned tables
* Fixes an issue that could cause citus_finish_pg_upgrade to fail
* Fixes error message for local table joins
* Fixes issues caused by omitting public schema in queries
* Fixes nested `SELECT` query with `UNION` bug
@ -77,10 +110,60 @@
* Fixes stale hostnames bug in prepared statements after `master_update_node`
* Fixes the relation size bug during rebalancing
* Fixes two race conditions in the get_rebalance_progress
* Fixes using 2PC when it might be necessary
* Preserves access method of materialized views when undistributing
or altering distributed tables
### citus v10.0.4 (July 14, 2021) ###
* Introduces `citus.local_hostname` GUC for connections to the current node
* Removes dependencies on the existence of public schema
* Removes limits around long partition names
* Fixes a bug that can cause a crash when DEBUG4 logging is enabled
* Fixes a bug that causes pruning incorrect shard of a range distributed table
* Fixes an issue that could cause citus_finish_pg_upgrade to fail
* Fixes FROM ONLY queries on partitioned tables
* Fixes issues caused by public schema being omitted in queries
* Fixes problems with concurrent calls of DropMarkedShards
* Fixes relname null bug when using parallel execution
* Fixes two race conditions in the get_rebalance_progress
### citus v9.5.6 (July 8, 2021) ###
* Fixes minor bug in `citus_prepare_pg_upgrade` that caused it to lose its
idempotency
### citus v9.5.5 (July 7, 2021) ###
* Adds a configure flag to enforce security
* Fixes a bug that causes pruning incorrect shard of a range distributed table
* Fixes an issue that could cause citus_finish_pg_upgrade to fail
### citus v9.4.5 (July 7, 2021) ###
* Adds a configure flag to enforce security
* Avoids re-using connections for intermediate results
* Fixes a bug that causes pruning incorrect shard of a range distributed table
* Fixes a bug that might cause self-deadlocks when COPY used in TX block
* Fixes an issue that could cause citus_finish_pg_upgrade to fail
### citus v8.3.3 (March 23, 2021) ###

configure vendored
View File

@ -1,6 +1,6 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for Citus 10.1devel.
# Generated by GNU Autoconf 2.69 for Citus 10.1.2.
#
#
# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
@ -579,8 +579,8 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='Citus'
PACKAGE_TARNAME='citus'
PACKAGE_VERSION='10.1devel'
PACKAGE_STRING='Citus 10.1devel'
PACKAGE_VERSION='10.1.2'
PACKAGE_STRING='Citus 10.1.2'
PACKAGE_BUGREPORT=''
PACKAGE_URL=''
@ -1260,7 +1260,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures Citus 10.1devel to adapt to many kinds of systems.
\`configure' configures Citus 10.1.2 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@ -1322,7 +1322,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of Citus 10.1devel:";;
short | recursive ) echo "Configuration of Citus 10.1.2:";;
esac
cat <<\_ACEOF
@ -1425,7 +1425,7 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
Citus configure 10.1devel
Citus configure 10.1.2
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
@ -1908,7 +1908,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by Citus $as_me 10.1devel, which was
It was created by Citus $as_me 10.1.2, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
@ -5356,7 +5356,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by Citus $as_me 10.1devel, which was
This file was extended by Citus $as_me 10.1.2, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@ -5418,7 +5418,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
Citus config.status 10.1devel
Citus config.status 10.1.2
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"

View File

@ -5,7 +5,7 @@
# everyone needing autoconf installed, the resulting files are checked
# into the SCM.
AC_INIT([Citus], [10.1devel])
AC_INIT([Citus], [10.1.2])
AC_COPYRIGHT([Copyright (c) Citus Data, Inc.])
# we'll need sed and awk for some of the version commands

View File

@ -1071,6 +1071,30 @@ CreateDistributedTableLike(TableConversionState *con)
{
newShardCount = con->shardCount;
}
Oid originalRelationId = con->relationId;
if (con->originalDistributionKey != NULL && PartitionTable(originalRelationId))
{
/*
* Due to dropped columns, the partition tables might have different
* distribution keys than their parents; see issue #5123 for details.
*
* At this point, we get the partitioning information from the
* originalRelationId, but we get the distribution key for newRelationId.
*
* We have to do this, because the newRelationId is just a placeholder
* at this moment, but that's going to be the table in pg_dist_partition.
*/
Oid parentRelationId = PartitionParentOid(originalRelationId);
Var *parentDistKey = DistPartitionKey(parentRelationId);
char *parentDistKeyColumnName =
ColumnToColumnName(parentRelationId, nodeToString(parentDistKey));
newDistributionKey =
FindColumnWithNameOnTargetRelation(parentRelationId, parentDistKeyColumnName,
con->newRelationId);
}
char partitionMethod = PartitionMethod(con->relationId);
CreateDistributedTable(con->newRelationId, newDistributionKey, partitionMethod,
newShardCount, true, newColocateWith, false);

View File

@ -438,6 +438,24 @@ CreateDistributedTable(Oid relationId, Var *distributionColumn, char distributio
colocateWithTableName,
viaDeprecatedAPI);
/*
* Due to dropping columns, the parent's distribution key may not match the
* partition's distribution key. The input distributionColumn belongs to
* the parent. That's why we override the distribution column of partitions
* here. See issue #5123 for details.
*/
if (PartitionTable(relationId))
{
Oid parentRelationId = PartitionParentOid(relationId);
char *distributionColumnName =
ColumnToColumnName(parentRelationId, nodeToString(distributionColumn));
distributionColumn =
FindColumnWithNameOnTargetRelation(parentRelationId, distributionColumnName,
relationId);
}
/*
* ColocationIdForNewTable assumes caller acquires lock on relationId. In our case,
* our caller already acquired lock on relationId.

View File

@ -234,9 +234,9 @@ ErrorIfUnsupportedForeignConstraintExists(Relation relation, char referencingDis
*/
bool referencedIsReferenceTable =
(referencedReplicationModel == REPLICATION_MODEL_2PC);
if (referencingColocationId == INVALID_COLOCATION_ID ||
(referencingColocationId != referencedColocationId &&
!referencedIsReferenceTable))
if (!referencedIsReferenceTable && (
referencingColocationId == INVALID_COLOCATION_ID ||
referencingColocationId != referencedColocationId))
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot create foreign key constraint since "

View File

@ -2139,7 +2139,7 @@ ShardIntervalListHasLocalPlacements(List *shardIntervalList)
ShardInterval *shardInterval = NULL;
foreach_ptr(shardInterval, shardIntervalList)
{
if (FindShardPlacementOnGroup(localGroupId, shardInterval->shardId) != NULL)
if (ActiveShardPlacementOnGroup(localGroupId, shardInterval->shardId) != NULL)
{
return true;
}

View File

@ -25,6 +25,7 @@
#include "distributed/commands/utility_hook.h"
#include "distributed/deparser.h"
#include "distributed/deparse_shard_query.h"
#include "distributed/distribution_column.h"
#include "distributed/listutils.h"
#include "distributed/coordinator_protocol.h"
#include "distributed/metadata_sync.h"

View File

@ -377,7 +377,7 @@ EnsureConnectionPossibilityForNodeList(List *nodeList)
/*
* EnsureConnectionPossibilityForNode reserves a shared connection
* counter per node in the nodeList unless:
* - Reservation is possible/allowed (see IsReservationPossible())
* - Reservation is not possible/allowed (see IsReservationPossible())
* - there is at least one connection to the node so that we are guaranteed
* to get a connection
* - An earlier call already reserved a connection (e.g., we allow only a

View File

@ -515,9 +515,21 @@ EnsureSequenceTypeSupported(Oid relationId, AttrNumber attnum, Oid seqTypId)
/* retrieve the sequence id of the sequence found in nextval('seq') */
List *sequencesFromAttrDef = GetSequencesFromAttrDef(attrdefOid);
/* to simplify and eliminate cases like "DEFAULT nextval('..') - nextval('..')" */
if (list_length(sequencesFromAttrDef) == 0)
{
/*
We need this check because sometimes there are cases where the
dependency between the table and the sequence is not formed.
One example is when the default is defined by
DEFAULT nextval('seq_name'::text) (not by DEFAULT nextval('seq_name')).
In these cases, sequencesFromAttrDef will be empty.
*/
return;
}
if (list_length(sequencesFromAttrDef) > 1)
{
/* to simplify and eliminate cases like "DEFAULT nextval('..') - nextval('..')" */
ereport(ERROR, (errmsg(
"More than one sequence in a column default"
" is not supported for distribution")));
@ -1013,12 +1025,13 @@ pg_get_indexclusterdef_string(Oid indexRelationId)
/* check if the table is clustered on this index */
if (indexForm->indisclustered)
{
char *tableName = generate_relation_name(tableRelationId, NIL);
char *qualifiedRelationName =
generate_qualified_relation_name(tableRelationId);
char *indexName = get_rel_name(indexRelationId); /* needs to be quoted */
initStringInfo(&buffer);
appendStringInfo(&buffer, "ALTER TABLE %s CLUSTER ON %s",
tableName, quote_identifier(indexName));
qualifiedRelationName, quote_identifier(indexName));
}
ReleaseSysCache(indexTuple);

View File

@ -176,6 +176,8 @@
#include "utils/timestamp.h"
#define SLOW_START_DISABLED 0
#define WAIT_EVENT_SET_INDEX_NOT_INITIALIZED -1
#define WAIT_EVENT_SET_INDEX_FAILED -2
/*
@ -656,6 +658,10 @@ static int UsableConnectionCount(WorkerPool *workerPool);
static long NextEventTimeout(DistributedExecution *execution);
static WaitEventSet * BuildWaitEventSet(List *sessionList);
static void RebuildWaitEventSetFlags(WaitEventSet *waitEventSet, List *sessionList);
static int CitusAddWaitEventSetToSet(WaitEventSet *set, uint32 events, pgsocket fd,
Latch *latch, void *user_data);
static bool CitusModifyWaitEvent(WaitEventSet *set, int pos, uint32 events,
Latch *latch);
static TaskPlacementExecution * PopPlacementExecution(WorkerSession *session);
static TaskPlacementExecution * PopAssignedPlacementExecution(WorkerSession *session);
static TaskPlacementExecution * PopUnassignedPlacementExecution(WorkerPool *workerPool);
@ -690,6 +696,8 @@ static void ExtractParametersForRemoteExecution(ParamListInfo paramListInfo,
Oid **parameterTypes,
const char ***parameterValues);
static int GetEventSetSize(List *sessionList);
static bool ProcessSessionsWithFailedWaitEventSetOperations(
DistributedExecution *execution);
static bool HasIncompleteConnectionEstablishment(DistributedExecution *execution);
static int RebuildWaitEventSet(DistributedExecution *execution);
static void ProcessWaitEvents(DistributedExecution *execution, WaitEvent *events, int
@ -2155,6 +2163,7 @@ FindOrCreateWorkerSession(WorkerPool *workerPool, MultiConnection *connection)
session->connection = connection;
session->workerPool = workerPool;
session->commandsSent = 0;
session->waitEventSetIndex = WAIT_EVENT_SET_INDEX_NOT_INITIALIZED;
dlist_init(&session->pendingTaskQueue);
dlist_init(&session->readyTaskQueue);
@ -2318,6 +2327,7 @@ RunDistributedExecution(DistributedExecution *execution)
ManageWorkerPool(workerPool);
}
bool skipWaitEvents = false;
if (execution->remoteTaskList == NIL)
{
/*
@ -2339,11 +2349,28 @@ RunDistributedExecution(DistributedExecution *execution)
}
eventSetSize = RebuildWaitEventSet(execution);
events = palloc0(eventSetSize * sizeof(WaitEvent));
skipWaitEvents =
ProcessSessionsWithFailedWaitEventSetOperations(execution);
}
else if (execution->waitFlagsChanged)
{
RebuildWaitEventSetFlags(execution->waitEventSet, execution->sessionList);
execution->waitFlagsChanged = false;
skipWaitEvents =
ProcessSessionsWithFailedWaitEventSetOperations(execution);
}
if (skipWaitEvents)
{
/*
* Some operation on the wait event set failed; retry, as we have
* already removed the problematic connections.
*/
execution->rebuildWaitEventSet = true;
continue;
}
/* wait for I/O events */
@ -2392,6 +2419,51 @@ RunDistributedExecution(DistributedExecution *execution)
}
/*
* ProcessSessionsWithFailedWaitEventSetOperations goes over the session list and
* processes sessions with failed wait event set operations.
*
* Failed sessions are not going to generate any further events, so it is our
* only chance to process the failure by calling into `ConnectionStateMachine`.
*
* The function returns true if any session failed.
*/
static bool
ProcessSessionsWithFailedWaitEventSetOperations(DistributedExecution *execution)
{
bool foundFailedSession = false;
WorkerSession *session = NULL;
foreach_ptr(session, execution->sessionList)
{
if (session->waitEventSetIndex == WAIT_EVENT_SET_INDEX_FAILED)
{
/*
* We can only lose connections that were already connected;
* others are regular connection failures.
*/
MultiConnection *connection = session->connection;
if (connection->connectionState == MULTI_CONNECTION_CONNECTED)
{
connection->connectionState = MULTI_CONNECTION_LOST;
}
else
{
connection->connectionState = MULTI_CONNECTION_FAILED;
}
ConnectionStateMachine(session);
session->waitEventSetIndex = WAIT_EVENT_SET_INDEX_NOT_INITIALIZED;
foundFailedSession = true;
}
}
return foundFailedSession;
}
/*
* HasIncompleteConnectionEstablishment returns true if any of the connections
* that were initiated by the executor is in the initialization stage.
@ -5066,18 +5138,79 @@ BuildWaitEventSet(List *sessionList)
continue;
}
int waitEventSetIndex = AddWaitEventToSet(waitEventSet, connection->waitFlags,
sock, NULL, (void *) session);
int waitEventSetIndex =
CitusAddWaitEventSetToSet(waitEventSet, connection->waitFlags, sock,
NULL, (void *) session);
session->waitEventSetIndex = waitEventSetIndex;
}
AddWaitEventToSet(waitEventSet, WL_POSTMASTER_DEATH, PGINVALID_SOCKET, NULL, NULL);
AddWaitEventToSet(waitEventSet, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch, NULL);
CitusAddWaitEventSetToSet(waitEventSet, WL_POSTMASTER_DEATH, PGINVALID_SOCKET, NULL,
NULL);
CitusAddWaitEventSetToSet(waitEventSet, WL_LATCH_SET, PGINVALID_SOCKET, MyLatch,
NULL);
return waitEventSet;
}
/*
* CitusAddWaitEventSetToSet is a wrapper around Postgres' AddWaitEventToSet().
*
* AddWaitEventToSet() may throw hard errors. For example, when the
* underlying socket for a connection has been closed by the remote
* server and the OS already reflects that, but Citus has not yet had a
* chance to learn about it. In that case, if the replication factor is
* >1, Citus can fail over to other nodes for executing the query. Even
* if the replication factor is 1, Citus can give much nicer errors.
*
* So CitusAddWaitEventSetToSet simply puts AddWaitEventToSet into a
* PG_TRY/PG_CATCH block in order to catch any hard errors, and
* returns this information to the caller.
*/
static int
CitusAddWaitEventSetToSet(WaitEventSet *set, uint32 events, pgsocket fd,
Latch *latch, void *user_data)
{
volatile int waitEventSetIndex = WAIT_EVENT_SET_INDEX_NOT_INITIALIZED;
MemoryContext savedContext = CurrentMemoryContext;
PG_TRY();
{
waitEventSetIndex =
AddWaitEventToSet(set, events, fd, latch, (void *) user_data);
}
PG_CATCH();
{
/*
* We might be in an arbitrary memory context when the
* error is thrown and we should get back to the one we had
* at PG_TRY() time, especially because we are not
* re-throwing the error.
*/
MemoryContextSwitchTo(savedContext);
FlushErrorState();
if (user_data != NULL)
{
WorkerSession *workerSession = (WorkerSession *) user_data;
ereport(DEBUG1, (errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("Adding wait event for node %s:%d failed. "
"The socket was: %d",
workerSession->workerPool->nodeName,
workerSession->workerPool->nodePort, fd)));
}
/* let the callers know about the failure */
waitEventSetIndex = WAIT_EVENT_SET_INDEX_FAILED;
}
PG_END_TRY();
return waitEventSetIndex;
}
/*
* GetEventSetSize returns the event set size for a list of sessions.
*/
@ -5121,11 +5254,68 @@ RebuildWaitEventSetFlags(WaitEventSet *waitEventSet, List *sessionList)
continue;
}
ModifyWaitEvent(waitEventSet, waitEventSetIndex, connection->waitFlags, NULL);
bool success =
CitusModifyWaitEvent(waitEventSet, waitEventSetIndex,
connection->waitFlags, NULL);
if (!success)
{
ereport(DEBUG1, (errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("Modifying wait event for node %s:%d failed. "
"The wait event index was: %d",
connection->hostname, connection->port,
waitEventSetIndex)));
session->waitEventSetIndex = WAIT_EVENT_SET_INDEX_FAILED;
}
}
}
/*
* CitusModifyWaitEvent is a wrapper around Postgres' ModifyWaitEvent().
*
* ModifyWaitEvent may throw hard errors. For example, when the underlying
* socket for a connection has been closed by the remote server and the OS
* already reflects that, but Citus has not yet had a chance to learn about
* it. In that case, if the replication factor is >1, Citus can fail over
* to other nodes for executing the query. Even if the replication factor
* is 1, Citus can give much nicer errors.
*
* So CitusModifyWaitEvent simply puts ModifyWaitEvent into a PG_TRY/PG_CATCH
* block in order to catch any hard errors, and returns this information to the
* caller.
*/
static bool
CitusModifyWaitEvent(WaitEventSet *set, int pos, uint32 events, Latch *latch)
{
volatile bool success = true;
MemoryContext savedContext = CurrentMemoryContext;
PG_TRY();
{
ModifyWaitEvent(set, pos, events, latch);
}
PG_CATCH();
{
/*
* We might be in an arbitrary memory context when the
* error is thrown and we should get back to the one we had
* at PG_TRY() time, especially because we are not
* re-throwing the error.
*/
MemoryContextSwitchTo(savedContext);
FlushErrorState();
/* let the callers know about the failure */
success = false;
}
PG_END_TRY();
return success;
}
/*
* SetLocalForceMaxQueryParallelization is simply a C interface for setting
* the following:

View File

@ -300,7 +300,8 @@ CitusBeginReadOnlyScan(CustomScanState *node, EState *estate, int eflags)
* The plan will be cached across executions when originalDistributedPlan
* represents a prepared statement.
*/
CacheLocalPlanForShardQuery(task, originalDistributedPlan);
CacheLocalPlanForShardQuery(task, originalDistributedPlan,
estate->es_param_list_info);
}
}
@ -414,7 +415,8 @@ CitusBeginModifyScan(CustomScanState *node, EState *estate, int eflags)
* The plan will be cached across executions when originalDistributedPlan
* represents a prepared statement.
*/
CacheLocalPlanForShardQuery(task, originalDistributedPlan);
CacheLocalPlanForShardQuery(task, originalDistributedPlan,
estate->es_param_list_info);
}
MemoryContextSwitchTo(oldContext);

View File

@ -128,9 +128,6 @@ static void LogLocalCommand(Task *task);
static uint64 LocallyPlanAndExecuteMultipleQueries(List *queryStrings,
TupleDestination *tupleDest,
Task *task);
static void ExtractParametersForLocalExecution(ParamListInfo paramListInfo,
Oid **parameterTypes,
const char ***parameterValues);
static void ExecuteUdfTaskQuery(Query *localUdfCommandQuery);
static void EnsureTransitionPossible(LocalExecutionStatus from,
LocalExecutionStatus to);
@ -438,7 +435,7 @@ LocallyPlanAndExecuteMultipleQueries(List *queryStrings, TupleDestination *tuple
* value arrays. It does not change the oid of custom types, because the
* query will be run locally.
*/
static void
void
ExtractParametersForLocalExecution(ParamListInfo paramListInfo, Oid **parameterTypes,
const char ***parameterValues)
{

View File

@ -128,8 +128,8 @@ BuildPlacementAccessList(int32 groupId, List *relationShardList,
RelationShard *relationShard = NULL;
foreach_ptr(relationShard, relationShardList)
{
ShardPlacement *placement = FindShardPlacementOnGroup(groupId,
relationShard->shardId);
ShardPlacement *placement = ActiveShardPlacementOnGroup(groupId,
relationShard->shardId);
if (placement == NULL)
{
continue;

View File

@ -594,12 +594,14 @@ LoadShardPlacement(uint64 shardId, uint64 placementId)
/*
* FindShardPlacementOnGroup returns the shard placement for the given shard
* on the given group, or returns NULL if no placement for the shard exists
* on the group.
* ShardPlacementOnGroupIncludingOrphanedPlacements returns the shard placement
* for the given shard on the given group, or returns NULL if no placement for
* the shard exists on the group.
*
* NOTE: This can return inactive or orphaned placements.
*/
ShardPlacement *
FindShardPlacementOnGroup(int32 groupId, uint64 shardId)
ShardPlacementOnGroupIncludingOrphanedPlacements(int32 groupId, uint64 shardId)
{
ShardPlacement *placementOnNode = NULL;
@ -614,7 +616,6 @@ FindShardPlacementOnGroup(int32 groupId, uint64 shardId)
for (int placementIndex = 0; placementIndex < numberOfPlacements; placementIndex++)
{
GroupShardPlacement *placement = &placementArray[placementIndex];
if (placement->groupId == groupId)
{
placementOnNode = ResolveGroupShardPlacement(placement, tableEntry,
@ -627,6 +628,28 @@ FindShardPlacementOnGroup(int32 groupId, uint64 shardId)
}
/*
* ActiveShardPlacementOnGroup returns the active shard placement for the
* given shard on the given group, or returns NULL if no active placement for
* the shard exists on the group.
*/
ShardPlacement *
ActiveShardPlacementOnGroup(int32 groupId, uint64 shardId)
{
ShardPlacement *placement =
ShardPlacementOnGroupIncludingOrphanedPlacements(groupId, shardId);
if (placement == NULL)
{
return NULL;
}
if (placement->shardState != SHARD_STATE_ACTIVE)
{
return NULL;
}
return placement;
}
/*
* ResolveGroupShardPlacement takes a GroupShardPlacement and adds additional data to it,
* such as the node we should consider it to be on.
@ -791,13 +814,14 @@ LookupNodeForGroup(int32 groupId)
/*
* ShardPlacementList returns the list of placements for the given shard from
* the cache.
* the cache. This list includes placements that are orphaned, because
* their deletion is postponed to a later point (shardstate = 4).
*
* The returned list is deep copied from the cache and thus can be modified
* and pfree()d freely.
*/
List *
ShardPlacementList(uint64 shardId)
ShardPlacementListIncludingOrphanedPlacements(uint64 shardId)
{
List *placementList = NIL;

View File

@ -644,7 +644,19 @@ DistributedTableSizeOnWorker(WorkerNode *workerNode, Oid relationId,
StringInfo tableSizeStringInfo = (StringInfo) linitial(sizeList);
char *tableSizeString = tableSizeStringInfo->data;
*tableSize = SafeStringToUint64(tableSizeString);
if (strlen(tableSizeString) > 0)
{
*tableSize = SafeStringToUint64(tableSizeString);
}
else
{
/*
* This means the shard was moved or dropped while citus_total_relation_size
* was being executed. In that case we get an empty string as the table size,
* which we can take as zero to prevent any unnecessary errors.
*/
*tableSize = 0;
}
PQclear(result);
ClearResults(connection, failOnError);
@ -1080,7 +1092,7 @@ TableShardReplicationFactor(Oid relationId)
{
uint64 shardId = shardInterval->shardId;
List *shardPlacementList = ShardPlacementList(shardId);
List *shardPlacementList = ShardPlacementListWithoutOrphanedPlacements(shardId);
uint32 shardPlacementCount = list_length(shardPlacementList);
/*
@ -1380,7 +1392,8 @@ List *
ActiveShardPlacementList(uint64 shardId)
{
List *activePlacementList = NIL;
List *shardPlacementList = ShardPlacementList(shardId);
List *shardPlacementList =
ShardPlacementListIncludingOrphanedPlacements(shardId);
ShardPlacement *shardPlacement = NULL;
foreach_ptr(shardPlacement, shardPlacementList)
@ -1395,6 +1408,31 @@ ActiveShardPlacementList(uint64 shardId)
}
/*
* ShardPlacementListWithoutOrphanedPlacements returns shard placements excluding
* the ones that are orphaned, because they are marked to be deleted at a later
* point (shardstate = 4).
*/
List *
ShardPlacementListWithoutOrphanedPlacements(uint64 shardId)
{
List *activePlacementList = NIL;
List *shardPlacementList =
ShardPlacementListIncludingOrphanedPlacements(shardId);
ShardPlacement *shardPlacement = NULL;
foreach_ptr(shardPlacement, shardPlacementList)
{
if (shardPlacement->shardState != SHARD_STATE_TO_DELETE)
{
activePlacementList = lappend(activePlacementList, shardPlacement);
}
}
return SortList(activePlacementList, CompareShardPlacementsByWorker);
}
/*
* ActiveShardPlacement finds a shard placement for the given shardId from
* system catalog, chooses a placement that is in active state and returns
@ -1932,7 +1970,8 @@ UpdatePartitionShardPlacementStates(ShardPlacement *parentShardPlacement, char s
ColocatedShardIdInRelation(partitionOid, parentShardInterval->shardIndex);
ShardPlacement *partitionPlacement =
ShardPlacementOnGroup(partitionShardId, parentShardPlacement->groupId);
ShardPlacementOnGroupIncludingOrphanedPlacements(
parentShardPlacement->groupId, partitionShardId);
/* the partition should have a placement with the same group */
Assert(partitionPlacement != NULL);
@ -1942,28 +1981,6 @@ UpdatePartitionShardPlacementStates(ShardPlacement *parentShardPlacement, char s
}
/*
* ShardPlacementOnGroup gets a shardInterval and a groupId, returns a placement
* of the shard on the given group. If no such placement exists, the function
* return NULL.
*/
ShardPlacement *
ShardPlacementOnGroup(uint64 shardId, int groupId)
{
List *placementList = ShardPlacementList(shardId);
ShardPlacement *placement = NULL;
foreach_ptr(placement, placementList)
{
if (placement->groupId == groupId)
{
return placement;
}
}
return NULL;
}
/*
* MarkShardPlacementInactive is a wrapper around UpdateShardPlacementState where
* the state is set to SHARD_STATE_INACTIVE. It also marks partitions of the

View File

@ -287,7 +287,8 @@ CreateColocatedShards(Oid targetRelationId, Oid sourceRelationId, bool
int32 shardMaxValue = DatumGetInt32(sourceShardInterval->maxValue);
text *shardMinValueText = IntegerToText(shardMinValue);
text *shardMaxValueText = IntegerToText(shardMaxValue);
List *sourceShardPlacementList = ShardPlacementList(sourceShardId);
List *sourceShardPlacementList = ShardPlacementListWithoutOrphanedPlacements(
sourceShardId);
InsertShardRow(targetRelationId, newShardId, targetShardStorageType,
shardMinValueText, shardMaxValueText);
@ -295,11 +296,6 @@ CreateColocatedShards(Oid targetRelationId, Oid sourceRelationId, bool
ShardPlacement *sourcePlacement = NULL;
foreach_ptr(sourcePlacement, sourceShardPlacementList)
{
if (sourcePlacement->shardState == SHARD_STATE_TO_DELETE)
{
continue;
}
int32 groupId = sourcePlacement->groupId;
const ShardState shardState = SHARD_STATE_ACTIVE;
const uint64 shardSize = 0;

View File

@ -450,7 +450,8 @@ DropTaskList(Oid relationId, char *schemaName, char *relationName,
task->dependentTaskList = NULL;
task->replicationModel = REPLICATION_MODEL_INVALID;
task->anchorShardId = shardId;
task->taskPlacementList = ShardPlacementList(shardId);
task->taskPlacementList =
ShardPlacementListIncludingOrphanedPlacements(shardId);
taskList = lappend(taskList, task);
}

View File

@ -799,28 +799,32 @@ GatherIndexAndConstraintDefinitionList(Form_pg_index indexForm, List **indexDDLE
int indexFlags)
{
Oid indexId = indexForm->indexrelid;
char *statementDef = NULL;
bool indexImpliedByConstraint = IndexImpliedByAConstraint(indexForm);
/* get the corresponding constraint or index statement */
if (indexImpliedByConstraint)
{
Oid constraintId = get_index_constraint(indexId);
Assert(constraintId != InvalidOid);
if (indexFlags & INCLUDE_CREATE_CONSTRAINT_STATEMENTS)
{
Oid constraintId = get_index_constraint(indexId);
Assert(constraintId != InvalidOid);
statementDef = pg_get_constraintdef_command(constraintId);
/* include constraints backed by indexes only when explicitly asked */
char *statementDef = pg_get_constraintdef_command(constraintId);
*indexDDLEventList =
lappend(*indexDDLEventList,
makeTableDDLCommandString(statementDef));
}
}
else
else if (indexFlags & INCLUDE_CREATE_INDEX_STATEMENTS)
{
statementDef = pg_get_indexdef_string(indexId);
}
/* append found constraint or index definition to the list */
if (indexFlags & INCLUDE_CREATE_INDEX_STATEMENTS)
{
*indexDDLEventList = lappend(*indexDDLEventList, makeTableDDLCommandString(
statementDef));
/*
* Include indexes that are not backing constraints only when
* explicitly asked.
*/
char *statementDef = pg_get_indexdef_string(indexId);
*indexDDLEventList = lappend(*indexDDLEventList,
makeTableDDLCommandString(statementDef));
}
/* if table is clustered on this index, append definition to the list */

View File

@ -95,6 +95,15 @@ static void EnsureEnoughDiskSpaceForShardMove(List *colocatedShardList,
char *sourceNodeName, uint32 sourceNodePort,
char *targetNodeName, uint32
targetNodePort);
static List * RecreateShardDDLCommandList(ShardInterval *shardInterval,
const char *sourceNodeName,
int32 sourceNodePort);
static List * CopyShardContentsCommandList(ShardInterval *shardInterval,
const char *sourceNodeName,
int32 sourceNodePort);
static List * PostLoadShardCreationCommandList(ShardInterval *shardInterval,
const char *sourceNodeName,
int32 sourceNodePort);
/* declarations for dynamic loading */
@ -298,7 +307,7 @@ citus_move_shard_placement(PG_FUNCTION_ARGS)
ListCell *colocatedShardCell = NULL;
Oid relationId = RelationIdForShard(shardId);
ErrorIfMoveCitusLocalTable(relationId);
ErrorIfMoveUnsupportedTableType(relationId);
ErrorIfTargetNodeIsNotSafeToMove(targetNodeName, targetNodePort);
ShardInterval *shardInterval = LoadShardInterval(shardId);
@ -478,23 +487,40 @@ master_move_shard_placement(PG_FUNCTION_ARGS)
/*
* ErrorIfMoveCitusLocalTable is a helper function for rebalance_table_shards
* ErrorIfMoveUnsupportedTableType is a helper function for rebalance_table_shards
* and citus_move_shard_placement UDFs to error out if the relation with relationId
* is a citus local table.
* is not a distributed table.
*/
void
ErrorIfMoveCitusLocalTable(Oid relationId)
ErrorIfMoveUnsupportedTableType(Oid relationId)
{
if (!IsCitusTableType(relationId, CITUS_LOCAL_TABLE))
if (IsCitusTableType(relationId, DISTRIBUTED_TABLE))
{
return;
}
char *qualifiedRelationName = generate_qualified_relation_name(relationId);
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("table %s is a local table, moving shard of "
"a local table added to metadata is currently "
"not supported", qualifiedRelationName)));
if (!IsCitusTable(relationId))
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("table %s is a regular postgres table, you can "
"only move shards of a citus table",
qualifiedRelationName)));
}
else if (IsCitusTableType(relationId, CITUS_LOCAL_TABLE))
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("table %s is a local table, moving shard of "
"a local table added to metadata is currently "
"not supported", qualifiedRelationName)));
}
else if (IsCitusTableType(relationId, REFERENCE_TABLE))
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("table %s is a reference table, moving shard of "
"a reference table is not supported",
qualifiedRelationName)));
}
}
@ -731,7 +757,7 @@ RepairShardPlacement(int64 shardId, const char *sourceNodeName, int32 sourceNode
ddlCommandList);
/* after successful repair, we update shard state as healthy*/
List *placementList = ShardPlacementList(shardId);
List *placementList = ShardPlacementListWithoutOrphanedPlacements(shardId);
ShardPlacement *placement = SearchShardPlacementInListOrError(placementList,
targetNodeName,
targetNodePort);
@ -915,12 +941,38 @@ CopyShardTablesViaBlockWrites(List *shardIntervalList, char *sourceNodeName,
ShardInterval *shardInterval = NULL;
foreach_ptr(shardInterval, shardIntervalList)
{
bool includeDataCopy = !PartitionedTable(shardInterval->relationId);
List *ddlCommandList = CopyShardCommandList(shardInterval, sourceNodeName,
sourceNodePort, includeDataCopy);
/*
* For each shard we first create the shard table in a separate
* transaction and then we copy the data and create the indexes in a
* second separate transaction. The reason we don't do both in a single
* transaction is so we can see the size of the new shard growing
* during the copy when we run get_rebalance_progress in another
* session. If we didn't split these two phases up, the table
* wouldn't be visible in the session that get_rebalance_progress uses.
* So get_rebalance_progress would always report its size as 0.
*/
List *ddlCommandList = RecreateShardDDLCommandList(shardInterval, sourceNodeName,
sourceNodePort);
char *tableOwner = TableOwner(shardInterval->relationId);
SendCommandListToWorkerInSingleTransaction(targetNodeName, targetNodePort,
tableOwner, ddlCommandList);
ddlCommandList = NIL;
/*
* Skip copying data for partitioned tables, because they contain no
* data themselves. Their partitions do contain data, but those are
* different colocated shards that will be copied separately.
*/
if (!PartitionedTable(shardInterval->relationId))
{
ddlCommandList = CopyShardContentsCommandList(shardInterval, sourceNodeName,
sourceNodePort);
}
ddlCommandList = list_concat(
ddlCommandList,
PostLoadShardCreationCommandList(shardInterval, sourceNodeName,
sourceNodePort));
SendCommandListToWorkerInSingleTransaction(targetNodeName, targetNodePort,
tableOwner, ddlCommandList);
@ -1012,7 +1064,8 @@ static void
EnsureShardCanBeRepaired(int64 shardId, const char *sourceNodeName, int32 sourceNodePort,
const char *targetNodeName, int32 targetNodePort)
{
List *shardPlacementList = ShardPlacementList(shardId);
List *shardPlacementList =
ShardPlacementListIncludingOrphanedPlacements(shardId);
ShardPlacement *sourcePlacement = SearchShardPlacementInListOrError(
shardPlacementList,
@ -1044,7 +1097,7 @@ static void
EnsureShardCanBeCopied(int64 shardId, const char *sourceNodeName, int32 sourceNodePort,
const char *targetNodeName, int32 targetNodePort)
{
List *shardPlacementList = ShardPlacementList(shardId);
List *shardPlacementList = ShardPlacementListIncludingOrphanedPlacements(shardId);
ShardPlacement *sourcePlacement = SearchShardPlacementInListOrError(
shardPlacementList,
@ -1068,7 +1121,7 @@ EnsureShardCanBeCopied(int64 shardId, const char *sourceNodeName, int32 sourceNo
* the shard.
*/
DropOrphanedShardsInSeparateTransaction();
shardPlacementList = ShardPlacementList(shardId);
shardPlacementList = ShardPlacementListIncludingOrphanedPlacements(shardId);
targetPlacement = SearchShardPlacementInList(shardPlacementList,
targetNodeName,
targetNodePort);
@ -1147,47 +1200,82 @@ SearchShardPlacementInListOrError(List *shardPlacementList, const char *nodeName
/*
* CopyShardCommandList generates command list to copy the given shard placement
* from the source node to the target node. Caller could optionally skip copying
* the data by the flag includeDataCopy.
* from the source node to the target node. To do this it recreates the shard
* on the target, and then copies the data. Caller could optionally skip
* copying the data by the flag includeDataCopy.
*/
List *
CopyShardCommandList(ShardInterval *shardInterval, const char *sourceNodeName,
int32 sourceNodePort, bool includeDataCopy)
{
List *copyShardToNodeCommandsList = RecreateShardDDLCommandList(
shardInterval, sourceNodeName, sourceNodePort);
if (includeDataCopy)
{
copyShardToNodeCommandsList = list_concat(
copyShardToNodeCommandsList,
CopyShardContentsCommandList(shardInterval, sourceNodeName,
sourceNodePort));
}
return list_concat(copyShardToNodeCommandsList,
PostLoadShardCreationCommandList(shardInterval, sourceNodeName,
sourceNodePort));
}
/*
* RecreateShardDDLCommandList generates a command list to recreate a shard,
* but without any data init and without the post-load table creation commands.
*/
static List *
RecreateShardDDLCommandList(ShardInterval *shardInterval, const char *sourceNodeName,
int32 sourceNodePort)
{
int64 shardId = shardInterval->shardId;
char *shardName = ConstructQualifiedShardName(shardInterval);
List *copyShardToNodeCommandsList = NIL;
StringInfo copyShardDataCommand = makeStringInfo();
Oid relationId = shardInterval->relationId;
List *tableRecreationCommandList = RecreateTableDDLCommandList(relationId);
tableRecreationCommandList =
WorkerApplyShardDDLCommandList(tableRecreationCommandList, shardId);
return WorkerApplyShardDDLCommandList(tableRecreationCommandList, shardId);
}
copyShardToNodeCommandsList = list_concat(copyShardToNodeCommandsList,
tableRecreationCommandList);
if (includeDataCopy)
{
appendStringInfo(copyShardDataCommand, WORKER_APPEND_TABLE_TO_SHARD,
quote_literal_cstr(shardName), /* table to append */
quote_literal_cstr(shardName), /* remote table name */
quote_literal_cstr(sourceNodeName), /* remote host */
sourceNodePort); /* remote port */
/*
* CopyShardContentsCommandList generates a command list to copy the data of the
* given shard placement from the source node to the target node. This copying
* requires that the table for the shard has already been created on the
* target node (using RecreateShardDDLCommandList).
*/
static List *
CopyShardContentsCommandList(ShardInterval *shardInterval, const char *sourceNodeName,
int32 sourceNodePort)
{
char *shardName = ConstructQualifiedShardName(shardInterval);
StringInfo copyShardDataCommand = makeStringInfo();
appendStringInfo(copyShardDataCommand, WORKER_APPEND_TABLE_TO_SHARD,
quote_literal_cstr(shardName), /* table to append */
quote_literal_cstr(shardName), /* remote table name */
quote_literal_cstr(sourceNodeName), /* remote host */
sourceNodePort); /* remote port */
copyShardToNodeCommandsList = lappend(copyShardToNodeCommandsList,
copyShardDataCommand->data);
}
return list_make1(copyShardDataCommand->data);
}
/*
* PostLoadShardCreationCommandList generates a command list to finalize the
* creation of a shard after the data has been loaded. This creates things
* such as the indexes on the table.
*/
static List *
PostLoadShardCreationCommandList(ShardInterval *shardInterval, const char *sourceNodeName,
int32 sourceNodePort)
{
int64 shardId = shardInterval->shardId;
Oid relationId = shardInterval->relationId;
bool includeReplicaIdentity = true;
List *indexCommandList =
GetPostLoadTableCreationCommands(relationId, true, includeReplicaIdentity);
indexCommandList = WorkerApplyShardDDLCommandList(indexCommandList, shardId);
copyShardToNodeCommandsList = list_concat(copyShardToNodeCommandsList,
indexCommandList);
return copyShardToNodeCommandsList;
return WorkerApplyShardDDLCommandList(indexCommandList, shardId);
}
@ -1412,7 +1500,8 @@ DropColocatedShardPlacement(ShardInterval *shardInterval, char *nodeName, int32
char *qualifiedTableName = ConstructQualifiedShardName(colocatedShard);
StringInfo dropQuery = makeStringInfo();
uint64 shardId = colocatedShard->shardId;
List *shardPlacementList = ShardPlacementList(shardId);
List *shardPlacementList =
ShardPlacementListIncludingOrphanedPlacements(shardId);
ShardPlacement *placement =
SearchShardPlacementInListOrError(shardPlacementList, nodeName, nodePort);
@ -1425,9 +1514,9 @@ DropColocatedShardPlacement(ShardInterval *shardInterval, char *nodeName, int32
/*
* MarkForDropColocatedShardPlacement marks the shard placement metadata for the given
* shard placement to be deleted in pg_dist_placement. The function does this for all
* colocated placements.
* MarkForDropColocatedShardPlacement marks the shard placement metadata for
* the given shard placement to be deleted in pg_dist_placement. The function
* does this for all colocated placements.
*/
static void
MarkForDropColocatedShardPlacement(ShardInterval *shardInterval, char *nodeName, int32
@ -1440,7 +1529,8 @@ MarkForDropColocatedShardPlacement(ShardInterval *shardInterval, char *nodeName,
{
ShardInterval *colocatedShard = (ShardInterval *) lfirst(colocatedShardCell);
uint64 shardId = colocatedShard->shardId;
List *shardPlacementList = ShardPlacementList(shardId);
List *shardPlacementList =
ShardPlacementListIncludingOrphanedPlacements(shardId);
ShardPlacement *placement =
SearchShardPlacementInListOrError(shardPlacementList, nodeName, nodePort);

View File

@ -744,12 +744,12 @@ SetupRebalanceMonitor(List *placementUpdateList, Oid relationId)
List *colocatedUpdateList = GetColocatedRebalanceSteps(placementUpdateList);
ListCell *colocatedUpdateCell = NULL;
ProgressMonitorData *monitor = CreateProgressMonitor(REBALANCE_ACTIVITY_MAGIC_NUMBER,
list_length(colocatedUpdateList),
sizeof(
PlacementUpdateEventProgress),
relationId);
PlacementUpdateEventProgress *rebalanceSteps = monitor->steps;
dsm_handle dsmHandle;
ProgressMonitorData *monitor = CreateProgressMonitor(
list_length(colocatedUpdateList),
sizeof(PlacementUpdateEventProgress),
&dsmHandle);
PlacementUpdateEventProgress *rebalanceSteps = ProgressMonitorSteps(monitor);
int32 eventIndex = 0;
foreach(colocatedUpdateCell, colocatedUpdateList)
@ -767,6 +767,7 @@ SetupRebalanceMonitor(List *placementUpdateList, Oid relationId)
eventIndex++;
}
RegisterProgressMonitor(REBALANCE_ACTIVITY_MAGIC_NUMBER, relationId, dsmHandle);
}
@ -793,7 +794,7 @@ rebalance_table_shards(PG_FUNCTION_ARGS)
if (!PG_ARGISNULL(0))
{
Oid relationId = PG_GETARG_OID(0);
ErrorIfMoveCitusLocalTable(relationId);
ErrorIfMoveUnsupportedTableType(relationId);
relationIdList = list_make1_oid(relationId);
}
@ -996,7 +997,7 @@ get_rebalance_table_shards_plan(PG_FUNCTION_ARGS)
if (!PG_ARGISNULL(0))
{
Oid relationId = PG_GETARG_OID(0);
ErrorIfMoveCitusLocalTable(relationId);
ErrorIfMoveUnsupportedTableType(relationId);
relationIdList = list_make1_oid(relationId);
}
@ -1075,7 +1076,6 @@ get_rebalance_progress(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
List *segmentList = NIL;
ListCell *rebalanceMonitorCell = NULL;
TupleDesc tupdesc;
Tuplestorestate *tupstore = SetupTuplestore(fcinfo, &tupdesc);
@ -1083,12 +1083,12 @@ get_rebalance_progress(PG_FUNCTION_ARGS)
List *rebalanceMonitorList = ProgressMonitorList(REBALANCE_ACTIVITY_MAGIC_NUMBER,
&segmentList);
foreach(rebalanceMonitorCell, rebalanceMonitorList)
ProgressMonitorData *monitor = NULL;
foreach_ptr(monitor, rebalanceMonitorList)
{
ProgressMonitorData *monitor = lfirst(rebalanceMonitorCell);
PlacementUpdateEventProgress *placementUpdateEvents = monitor->steps;
HTAB *shardStatistics = BuildWorkerShardStatisticsHash(monitor->steps,
PlacementUpdateEventProgress *placementUpdateEvents = ProgressMonitorSteps(
monitor);
HTAB *shardStatistics = BuildWorkerShardStatisticsHash(placementUpdateEvents,
monitor->stepCount);
HTAB *shardSizes = BuildShardSizesHash(monitor, shardStatistics);
for (int eventIndex = 0; eventIndex < monitor->stepCount; eventIndex++)
@ -1158,10 +1158,12 @@ BuildShardSizesHash(ProgressMonitorData *monitor, HTAB *shardStatistics)
HTAB *shardSizes = hash_create(
"ShardSizeHash", 32, &info,
HASH_ELEM | HASH_CONTEXT | HASH_BLOBS);
PlacementUpdateEventProgress *placementUpdateEvents = monitor->steps;
PlacementUpdateEventProgress *placementUpdateEvents = ProgressMonitorSteps(monitor);
for (int eventIndex = 0; eventIndex < monitor->stepCount; eventIndex++)
{
PlacementUpdateEventProgress *step = placementUpdateEvents + eventIndex;
uint64 shardId = step->shardId;
uint64 shardSize = 0;
uint64 backupShardSize = 0;
@ -2769,9 +2771,9 @@ UpdateColocatedShardPlacementProgress(uint64 shardId, char *sourceName, int sour
{
ProgressMonitorData *header = GetCurrentProgressMonitor();
if (header != NULL && header->steps != NULL)
if (header != NULL)
{
PlacementUpdateEventProgress *steps = header->steps;
PlacementUpdateEventProgress *steps = ProgressMonitorSteps(header);
ListCell *colocatedShardIntervalCell = NULL;
ShardInterval *shardInterval = LoadShardInterval(shardId);

View File

@ -41,8 +41,6 @@
static void AddInsertAliasIfNeeded(Query *query);
static void UpdateTaskQueryString(Query *query, Task *task);
static bool ReplaceRelationConstraintByShardConstraint(List *relationShardList,
OnConflictExpr *onConflict);
static RelationShard * FindRelationShard(Oid inputRelationId, List *relationShardList);
static void ConvertRteToSubqueryWithEmptyResult(RangeTblEntry *rte);
static bool ShouldLazyDeparseQuery(Task *task);
@ -269,124 +267,6 @@ UpdateRelationToShardNames(Node *node, List *relationShardList)
}
/*
* UpdateRelationsToLocalShardTables walks over the query tree and appends shard ids to
* relations. The caller is responsible for ensuring that the resulting Query can
* be executed locally.
*/
bool
UpdateRelationsToLocalShardTables(Node *node, List *relationShardList)
{
if (node == NULL)
{
return false;
}
/* want to look at all RTEs, even in subqueries, CTEs and such */
if (IsA(node, Query))
{
return query_tree_walker((Query *) node, UpdateRelationsToLocalShardTables,
relationShardList, QTW_EXAMINE_RTES_BEFORE);
}
if (IsA(node, OnConflictExpr))
{
OnConflictExpr *onConflict = (OnConflictExpr *) node;
return ReplaceRelationConstraintByShardConstraint(relationShardList, onConflict);
}
if (!IsA(node, RangeTblEntry))
{
return expression_tree_walker(node, UpdateRelationsToLocalShardTables,
relationShardList);
}
RangeTblEntry *newRte = (RangeTblEntry *) node;
if (newRte->rtekind != RTE_RELATION)
{
return false;
}
RelationShard *relationShard = FindRelationShard(newRte->relid,
relationShardList);
/* the function should only be called with local shards */
if (relationShard == NULL)
{
return true;
}
Oid shardOid = GetTableLocalShardOid(relationShard->relationId,
relationShard->shardId);
newRte->relid = shardOid;
return false;
}
/*
* ReplaceRelationConstraintByShardConstraint replaces given OnConflictExpr's
* constraint id with constraint id of the corresponding shard.
*/
static bool
ReplaceRelationConstraintByShardConstraint(List *relationShardList,
OnConflictExpr *onConflict)
{
Oid constraintId = onConflict->constraint;
if (!OidIsValid(constraintId))
{
return false;
}
Oid constraintRelationId = InvalidOid;
HeapTuple heapTuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constraintId));
if (HeapTupleIsValid(heapTuple))
{
Form_pg_constraint contup = (Form_pg_constraint) GETSTRUCT(heapTuple);
constraintRelationId = contup->conrelid;
ReleaseSysCache(heapTuple);
}
/*
* We can return here without calling the walker function, since we know there
* will be no possible tables or constraints after this point, by the syntax.
*/
if (!OidIsValid(constraintRelationId))
{
ereport(ERROR, (errmsg("Invalid relation id (%u) for constraint: %s",
constraintRelationId, get_constraint_name(constraintId))));
}
RelationShard *relationShard = FindRelationShard(constraintRelationId,
relationShardList);
if (relationShard != NULL)
{
char *constraintName = get_constraint_name(constraintId);
AppendShardIdToName(&constraintName, relationShard->shardId);
Oid shardOid = GetTableLocalShardOid(relationShard->relationId,
relationShard->shardId);
Oid shardConstraintId = get_relation_constraint_oid(shardOid, constraintName,
false);
onConflict->constraint = shardConstraintId;
return false;
}
return true;
}
/*
* FindRelationShard finds the RelationShard for the shard relation with the
* given Oid, if it exists in the given relationShardList. Otherwise, returns NULL.

View File

@ -16,19 +16,29 @@
#include "distributed/local_plan_cache.h"
#include "distributed/deparse_shard_query.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/insert_select_planner.h"
#include "distributed/metadata_cache.h"
#include "distributed/multi_executor.h"
#include "distributed/version_compat.h"
#include "optimizer/optimizer.h"
#include "optimizer/clauses.h"
static Query * GetLocalShardQueryForCache(Query *jobQuery, Task *task,
ParamListInfo paramListInfo);
static char * DeparseLocalShardQuery(Query *jobQuery, List *relationShardList,
Oid anchorDistributedTableId, int64 anchorShardId);
static int ExtractParameterTypesForParamListInfo(ParamListInfo originalParamListInfo,
Oid **parameterTypes);
/*
* CacheLocalPlanForShardQuery replaces the relation OIDs in the job query
* with shard relation OIDs and then plans the query and caches the result
* in the originalDistributedPlan (which may be preserved across executions).
*/
void
CacheLocalPlanForShardQuery(Task *task, DistributedPlan *originalDistributedPlan)
CacheLocalPlanForShardQuery(Task *task, DistributedPlan *originalDistributedPlan,
ParamListInfo paramListInfo)
{
PlannedStmt *localPlan = GetCachedLocalPlan(task, originalDistributedPlan);
if (localPlan != NULL)
@ -54,14 +64,14 @@ CacheLocalPlanForShardQuery(Task *task, DistributedPlan *originalDistributedPlan
* We prefer to use jobQuery (over task->query) because we don't want any
* functions/params to have been evaluated in the cached plan.
*/
Query *shardQuery = copyObject(originalDistributedPlan->workerJob->jobQuery);
Query *jobQuery = copyObject(originalDistributedPlan->workerJob->jobQuery);
UpdateRelationsToLocalShardTables((Node *) shardQuery, task->relationShardList);
Query *localShardQuery = GetLocalShardQueryForCache(jobQuery, task, paramListInfo);
LOCKMODE lockMode = GetQueryLockMode(shardQuery);
LOCKMODE lockMode = GetQueryLockMode(localShardQuery);
/* fast path queries can only have a single RTE by definition */
RangeTblEntry *rangeTableEntry = (RangeTblEntry *) linitial(shardQuery->rtable);
RangeTblEntry *rangeTableEntry = (RangeTblEntry *) linitial(localShardQuery->rtable);
/*
* If the shard has been created in this transaction, we wouldn't see the relationId
@ -69,24 +79,16 @@ CacheLocalPlanForShardQuery(Task *task, DistributedPlan *originalDistributedPlan
*/
if (rangeTableEntry->relid == InvalidOid)
{
pfree(shardQuery);
pfree(jobQuery);
pfree(localShardQuery);
MemoryContextSwitchTo(oldContext);
return;
}
if (IsLoggableLevel(DEBUG5))
{
StringInfo queryString = makeStringInfo();
pg_get_query_def(shardQuery, queryString);
ereport(DEBUG5, (errmsg("caching plan for query: %s",
queryString->data)));
}
LockRelationOid(rangeTableEntry->relid, lockMode);
LocalPlannedStatement *localPlannedStatement = CitusMakeNode(LocalPlannedStatement);
localPlan = planner_compat(shardQuery, 0, NULL);
localPlan = planner_compat(localShardQuery, 0, NULL);
localPlannedStatement->localPlan = localPlan;
localPlannedStatement->shardId = task->anchorShardId;
localPlannedStatement->localGroupId = GetLocalGroupId();
@ -99,6 +101,128 @@ CacheLocalPlanForShardQuery(Task *task, DistributedPlan *originalDistributedPlan
}
/*
* GetLocalShardQueryForCache is a helper function that generates
* the local shard query based on the jobQuery. The function should
* not be used for generic purposes; it is specialized for local cached
* queries.
*
* The shards and the shell table (i.e., the distributed/reference table) are
* not guaranteed to have consistent attribute numbers due to DROP COLUMN
* commands.
*
* To avoid any edge cases due to such discrepancies, we first deparse the
* jobQuery with the tables replaced to shards, and parse the query string
* back. This is normally a very expensive operation, however we only do it
* once per cached local plan, which is acceptable.
*/
static Query *
GetLocalShardQueryForCache(Query *jobQuery, Task *task, ParamListInfo orig_paramListInfo)
{
char *shardQueryString =
DeparseLocalShardQuery(jobQuery, task->relationShardList,
task->anchorDistributedTableId,
task->anchorShardId);
ereport(DEBUG5, (errmsg("Local shard query that is going to be cached: %s",
shardQueryString)));
Oid *parameterTypes = NULL;
int numberOfParameters =
ExtractParameterTypesForParamListInfo(orig_paramListInfo, &parameterTypes);
Query *localShardQuery =
ParseQueryString(shardQueryString, parameterTypes, numberOfParameters);
return localShardQuery;
}
/*
* DeparseLocalShardQuery is a helper function to deparse the given jobQuery for the shard(s)
* identified by the relationShardList, anchorDistributedTableId and anchorShardId.
*
* For the details and comparison with TaskQueryString(), see the comments in the function.
*/
static char *
DeparseLocalShardQuery(Query *jobQuery, List *relationShardList, Oid
anchorDistributedTableId, int64 anchorShardId)
{
StringInfo queryString = makeStringInfo();
/*
* We imitate what TaskQueryString() does, but we cannot rely on that function
* as the parameters might already have been resolved on the QueryTree in the
* task. Instead, we operate on the jobQuery, where we are sure that
* coordinator evaluation has not happened.
*
* Local shard queries are only applicable for local cached query execution.
* In the local cached query execution mode, we can use a query structure
* (or query string) with unevaluated expressions as we allow function calls
* to be evaluated when the query on the shard is executed (e.g., we do not have
* coordinator evaluation; instead, we let the Postgres executor evaluate values).
*
* Additionally, we can allow them to be evaluated again because they are stable,
* and we do not cache plans / use unevaluated query strings for queries containing
* volatile functions.
*/
if (jobQuery->commandType == CMD_INSERT)
{
/*
* We currently do not support INSERT .. SELECT here. To support INSERT..SELECT
* queries, we should update the relation names to shard names in the SELECT
* clause (e.g., UpdateRelationToShardNames()).
*/
Assert(!CheckInsertSelectQuery(jobQuery));
/*
* For INSERT queries we cannot use pg_get_query_def. Mainly because we
* cannot run UpdateRelationToShardNames on an INSERT query. This is
* because the PG deparsing logic fails when trying to insert into a
* RTE_FUNCTION (which is what will happen if you call
* UpdateRelationToShardNames).
*/
deparse_shard_query(jobQuery, anchorDistributedTableId, anchorShardId,
queryString);
}
else
{
UpdateRelationToShardNames((Node *) jobQuery, relationShardList);
pg_get_query_def(jobQuery, queryString);
}
return queryString->data;
}
/*
* ExtractParameterTypesForParamListInfo is a helper function that extracts
* the parameter types of the given ParamListInfo via its second (output)
* parameter.
*
* The function also returns the number of parameters. If no parameter exists,
* the function returns 0.
*/
static int
ExtractParameterTypesForParamListInfo(ParamListInfo originalParamListInfo,
Oid **parameterTypes)
{
*parameterTypes = NULL;
int numberOfParameters = 0;
if (originalParamListInfo != NULL)
{
const char **parameterValues = NULL;
ParamListInfo paramListInfo = copyParamList(originalParamListInfo);
ExtractParametersForLocalExecution(paramListInfo, parameterTypes,
&parameterValues);
numberOfParameters = paramListInfo->numParams;
}
return numberOfParameters;
}
/*
* GetCachedLocalPlan is a helper function that returns the cached
* plan in the distributedPlan for the given task, if it exists.
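
The deparse-and-reparse path above only matters for plans that get cached for local execution. A minimal sketch of such a workload, assuming the executing node also holds the shard (e.g., a single-node Citus setup) and default local-execution settings; all object names are hypothetical:

-- hypothetical single-shard workload that can hit the local plan cache
CREATE TABLE items (key int PRIMARY KEY, value text);
SELECT create_distributed_table('items', 'key');
PREPARE item_lookup (int) AS SELECT value FROM items WHERE key = $1;
EXECUTE item_lookup(42);
EXECUTE item_lookup(42);  -- repeated executions may reuse the cached local shard plan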

View File

@ -2767,8 +2767,10 @@ CoPartitionedTables(Oid firstRelationId, Oid secondRelationId)
static bool
CoPlacedShardIntervals(ShardInterval *firstInterval, ShardInterval *secondInterval)
{
List *firstShardPlacementList = ShardPlacementList(firstInterval->shardId);
List *secondShardPlacementList = ShardPlacementList(secondInterval->shardId);
List *firstShardPlacementList = ShardPlacementListWithoutOrphanedPlacements(
firstInterval->shardId);
List *secondShardPlacementList = ShardPlacementListWithoutOrphanedPlacements(
secondInterval->shardId);
ListCell *firstShardPlacementCell = NULL;
ListCell *secondShardPlacementCell = NULL;
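
The switch to ShardPlacementListWithoutOrphanedPlacements means placements kept around only for deferred cleanup no longer count towards co-placement. A hedged way to inspect which placements that excludes, assuming the 10.1 convention that orphaned placements carry shardstate 4 in pg_dist_placement:

-- placements left behind by a shard move and scheduled for deletion (assumed shardstate 4)
SELECT placementid, shardid, groupid, shardstate
FROM pg_dist_placement
WHERE shardstate = 4;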

View File

@ -830,13 +830,8 @@ IsTableLocallyAccessible(Oid relationId)
ShardInterval *shardInterval = linitial(shardIntervalList);
uint64 shardId = shardInterval->shardId;
ShardPlacement *localShardPlacement =
ShardPlacementOnGroup(shardId, GetLocalGroupId());
if (localShardPlacement != NULL)
{
/* the table has a placement on this node */
return true;
}
return false;
ActiveShardPlacementOnGroup(GetLocalGroupId(), shardId);
return localShardPlacement != NULL;
}
@ -1667,7 +1662,8 @@ RouterInsertTaskList(Query *query, bool parametersInQueryResolved,
relationShard->relationId = distributedTableId;
modifyTask->relationShardList = list_make1(relationShard);
modifyTask->taskPlacementList = ShardPlacementList(modifyRoute->shardId);
modifyTask->taskPlacementList = ActiveShardPlacementList(
modifyRoute->shardId);
modifyTask->parametersInQueryStringResolved = parametersInQueryResolved;
insertTaskList = lappend(insertTaskList, modifyTask);

View File

@ -1571,6 +1571,22 @@ LowerShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
/* setup partitionColumnValue argument once */
fcSetArg(compareFunction, 0, partitionColumnValue);
/*
* Now we test partitionColumnValue, as used in a where clause such as
* partCol > partitionColumnValue (or partCol >= partitionColumnValue),
* against four possibilities:
* 1) partitionColumnValue falls into a specific shard, such that:
* partitionColumnValue >= shard[x].min, and
* partitionColumnValue < shard[x].max (or partitionColumnValue <= shard[x].max).
* 2) partitionColumnValue < shard[x].min for all the shards
* 3) partitionColumnValue > shard[x].max for all the shards
* 4) partitionColumnValue falls in between two shards, such that:
* partitionColumnValue > shard[x].max and
* partitionColumnValue < shard[x+1].min
*
* For 1), we find that shard in the loop below using binary search and
* return its index. For the others, see the end of this function.
*/
while (lowerBoundIndex < upperBoundIndex)
{
int middleIndex = lowerBoundIndex + ((upperBoundIndex - lowerBoundIndex) / 2);
@ -1603,7 +1619,7 @@ LowerShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
continue;
}
/* found interval containing partitionValue */
/* partitionColumnValue falls into a specific shard, possibility 1) */
return middleIndex;
}
@ -1614,20 +1630,30 @@ LowerShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
* (we'd have hit the return middleIndex; case otherwise). Figure out
* whether there's possibly any interval containing a value that's bigger
* than the partition key one.
*
* Also note that we initialized lowerBoundIndex with 0. Similarly,
* we always set it to the index of the shard that we consider as our
* lower boundary during binary search.
*/
if (lowerBoundIndex == 0)
if (lowerBoundIndex == shardCount)
{
/* all intervals are bigger, thus return 0 */
return 0;
}
else if (lowerBoundIndex == shardCount)
{
/* partition value is bigger than all partition values */
/*
* Since lowerBoundIndex is an inclusive index, being equal to shardCount
* means all the shards have smaller values than partitionColumnValue,
* which corresponds to possibility 3).
* In that case, since we can't have a lower bound shard, we return
* INVALID_SHARD_INDEX here.
*/
return INVALID_SHARD_INDEX;
}
/* value falls in between intervals */
return lowerBoundIndex + 1;
/*
* partitionColumnValue is either smaller than all the shards or falls in
* between two shards, which corresponds to possibility 2) or 4).
* Knowing that lowerBoundIndex is an inclusive index, we directly return
* it as the index for the lower bound shard here.
*/
return lowerBoundIndex;
}
@ -1647,6 +1673,23 @@ UpperShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
/* setup partitionColumnValue argument once */
fcSetArg(compareFunction, 0, partitionColumnValue);
/*
* Now we test partitionColumnValue, as used in a where clause such as
* partCol < partitionColumnValue (or partCol <= partitionColumnValue),
* against four possibilities:
* 1) partitionColumnValue falls into a specific shard, such that:
* partitionColumnValue <= shard[x].max, and
* partitionColumnValue > shard[x].min (or partitionColumnValue >= shard[x].min).
* 2) partitionColumnValue > shard[x].max for all the shards
* 3) partitionColumnValue < shard[x].min for all the shards
* 4) partitionColumnValue falls in between two shards, such that:
* partitionColumnValue > shard[x].max and
* partitionColumnValue < shard[x+1].min
*
* For 1), we find that shard in the loop below using binary search and
* return its index. For the others, see the end of this function.
*/
while (lowerBoundIndex < upperBoundIndex)
{
int middleIndex = lowerBoundIndex + ((upperBoundIndex - lowerBoundIndex) / 2);
@ -1679,7 +1722,7 @@ UpperShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
continue;
}
/* found interval containing partitionValue */
/* partitionColumnValue falls into a specific shard, possibility 1) */
return middleIndex;
}
@ -1690,19 +1733,29 @@ UpperShardBoundary(Datum partitionColumnValue, ShardInterval **shardIntervalCach
* (we'd have hit the return middleIndex; case otherwise). Figure out
* whether there's possibly any interval containing a value that's smaller
* than the partition key one.
*
* Also note that we initialized upperBoundIndex with shardCount. Similarly,
* we always set it to the index of the next shard that we consider as our
* upper boundary during binary search.
*/
if (upperBoundIndex == shardCount)
if (upperBoundIndex == 0)
{
/* all intervals are smaller, thus return 0 */
return shardCount - 1;
}
else if (upperBoundIndex == 0)
{
/* partition value is smaller than all partition values */
/*
* Since upperBoundIndex is an exclusive index, being equal to 0 means
* all the shards have greater values than partitionColumnValue, which
* corresponds to possibility 3).
* In that case, since we can't have an upper bound shard, we return
* INVALID_SHARD_INDEX here.
*/
return INVALID_SHARD_INDEX;
}
/* value falls in between intervals, return the interval one smaller as bound */
/*
* partitionColumnValue is either greater than all the shards or falls in
* between two shards, which corresponds to possibility 2) or 4).
* Knowing that upperBoundIndex is an exclusive index, we return the index
* for the previous shard here.
*/
return upperBoundIndex - 1;
}
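
Both boundary searches back shard pruning for range filters on the distribution column. A small illustration of the kind of query they serve, assuming a range-distributed table named events (hypothetical):

-- only shards whose [shardminvalue, shardmaxvalue] overlap (1500, 4200] need scanning;
-- LowerShardBoundary/UpperShardBoundary compute the first and last such shard index
SELECT count(*)
FROM events
WHERE event_id > 1500 AND event_id <= 4200;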

View File

@ -27,18 +27,16 @@ static ProgressMonitorData * MonitorDataFromDSMHandle(dsm_handle dsmHandle,
/*
* CreateProgressMonitor is used to create a place to store progress information related
* to long running processes. The function creates a dynamic shared memory segment
* consisting of a header regarding to the process and an array of "steps" that the long
* running "operations" consists of. The handle of the dynamic shared memory is stored in
* pg_stat_get_progress_info output, to be parsed by a progress retrieval command
* later on. This behavior may cause unrelated (but hopefully harmless) rows in
* pg_stat_progress_vacuum output. The caller of this function should provide a magic
* number, a unique 64 bit unsigned integer, to distinguish different types of commands.
* CreateProgressMonitor is used to create a place to store progress
* information related to long running processes. The function creates a
* dynamic shared memory segment consisting of a header describing the
* process and an array of "steps" that the long-running "operation" consists
* of. After initializing the data in the array of steps, the shared memory
* segment can be shared with other processes using RegisterProgressMonitor, by
* giving it the value that's written to the dsmHandle argument.
*/
ProgressMonitorData *
CreateProgressMonitor(uint64 progressTypeMagicNumber, int stepCount, Size stepSize,
Oid relationId)
CreateProgressMonitor(int stepCount, Size stepSize, dsm_handle *dsmHandle)
{
if (stepSize <= 0 || stepCount <= 0)
{
@ -58,20 +56,37 @@ CreateProgressMonitor(uint64 progressTypeMagicNumber, int stepCount, Size stepSi
return NULL;
}
dsm_handle dsmHandle = dsm_segment_handle(dsmSegment);
*dsmHandle = dsm_segment_handle(dsmSegment);
ProgressMonitorData *monitor = MonitorDataFromDSMHandle(dsmHandle, &dsmSegment);
ProgressMonitorData *monitor = MonitorDataFromDSMHandle(*dsmHandle, &dsmSegment);
monitor->stepCount = stepCount;
monitor->processId = MyProcPid;
return monitor;
}
/*
* RegisterProgressMonitor shares dsmHandle with other postgres processes by
* storing it in pg_stat_get_progress_info output, to be parsed by a
* progress retrieval command later on. This behavior may cause unrelated (but
* hopefully harmless) rows in pg_stat_progress_vacuum output. The caller of
* this function should provide a magic number, a unique 64 bit unsigned
* integer, to distinguish different types of commands.
*
* IMPORTANT: After registering the progress monitor, all modifications to the
* data should be done using concurrency-safe operations (i.e., locks and
* atomics).
*/
void
RegisterProgressMonitor(uint64 progressTypeMagicNumber, Oid relationId,
dsm_handle dsmHandle)
{
pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM, relationId);
pgstat_progress_update_param(1, dsmHandle);
pgstat_progress_update_param(0, progressTypeMagicNumber);
currentProgressDSMHandle = dsmHandle;
return monitor;
}
@ -204,24 +219,46 @@ ProgressMonitorData *
MonitorDataFromDSMHandle(dsm_handle dsmHandle, dsm_segment **attachedSegment)
{
dsm_segment *dsmSegment = dsm_find_mapping(dsmHandle);
ProgressMonitorData *monitor = NULL;
if (dsmSegment == NULL)
{
dsmSegment = dsm_attach(dsmHandle);
}
if (dsmSegment != NULL)
if (dsmSegment == NULL)
{
monitor = (ProgressMonitorData *) dsm_segment_address(dsmSegment);
monitor->steps = (void *) (monitor + 1);
*attachedSegment = dsmSegment;
return NULL;
}
ProgressMonitorData *monitor = (ProgressMonitorData *) dsm_segment_address(
dsmSegment);
*attachedSegment = dsmSegment;
return monitor;
}
/*
* ProgressMonitorSteps returns a pointer to the array of steps that are stored
* in a progress monitor. This is simply the data right after the header, so
* this function is trivial. The main purpose of this function is to make the
* intent clear to readers of the code.
*
* NOTE: The pointer this function returns is explicitly not stored in the
* header, because the header is shared between processes. The absolute pointer
* to the steps can have a different value between processes though, because
* the same piece of shared memory often has a different address in different
* processes. So we calculate this pointer over and over to make sure we use
* the right value for each process.
*/
void *
ProgressMonitorSteps(ProgressMonitorData *monitor)
{
return monitor + 1;
}
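
Since RegisterProgressMonitor piggybacks on the VACUUM progress command, the registered values are also visible through the generic progress function. A hedged peek, assuming the parameter positions map to the two pgstat_progress_update_param() calls above:

-- raw view of registered monitors; Citus' own progress UDFs parse these values
SELECT pid, param1 AS magic_number, param2 AS dsm_handle
FROM pg_stat_get_progress_info('VACUUM');
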
/*
* DetachFromDSMSegments ensures that the process is detached from all of the segments in
* the given list.

View File

@ -0,0 +1,20 @@
-- citus--10.0-3--10.0-4
-- This migration file aims to fix 2 issues with upgrades on clusters
-- 1. a bug in public schema dependency for citus_tables view.
--
-- Users who do not have a public schema in their clusters were unable to upgrade
-- to Citus 10.x due to the citus_tables view that used to be created in the public
-- schema.
#include "udfs/citus_tables/10.0-4.sql"
-- 2. a bug in our PG upgrade functions
--
-- Users who took the 9.5-2--10.0-1 upgrade path already have the fix, but users
-- who took the 9.5-1--10.0-1 upgrade path do not. Hence, we repeat the CREATE OR
-- REPLACE from the 9.5-2 definition for citus_prepare_pg_upgrade.
#include "udfs/citus_prepare_pg_upgrade/9.5-2.sql"
#include "udfs/citus_finish_pg_upgrade/10.0-4.sql"

View File

@ -1,4 +1,4 @@
-- citus--10.0-3--10.1-1
-- citus--10.0-4--10.1-1
-- add the current database to the distributed objects if not already in there.
-- this is to reliably propagate some of the alter database commands that might be

View File

@ -0,0 +1,3 @@
-- 9.4-1--9.4-2 was added later as a patch to fix a bug in our PG upgrade functions
#include "udfs/citus_prepare_pg_upgrade/9.4-2.sql"
#include "udfs/citus_finish_pg_upgrade/9.4-2.sql"

View File

@ -0,0 +1,9 @@
--
-- 9.4-1--9.4-2 was added later as a patch to fix a bug in our PG upgrade functions
--
-- This script brings users who installed the released patch back onto the 9.4-1
-- upgrade path. We do this via a semantic downgrade, since new changes have already
-- been introduced in the schema between 9.4-1 and 9.5-1. To make sure we include all
-- changes made during that version change, we reuse the existing upgrade path from
-- our later-introduced 9.4-2 version.
--

View File

@ -0,0 +1,7 @@
-- 9.4-2--9.4-3 was added later as a patch to improve master_update_table_statistics
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_update_table_statistics$$;
COMMENT ON FUNCTION pg_catalog.master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table';
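
A hedged usage example of the UDF defined above; the table name is hypothetical:

-- refresh shard statistics for a distributed table
SELECT master_update_table_statistics('orders'::regclass);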

View File

@ -1,10 +1,8 @@
-- citus--10.0-3--10.0-2
-- this is a downgrade path that will revert the changes made in citus--10.0-2--10.0-3.sql
DROP FUNCTION pg_catalog.citus_update_table_statistics(regclass);
#include "../udfs/citus_update_table_statistics/10.0-1.sql"
-- citus--9.4-3--9.4-2
-- This is a downgrade path that will revert the changes made in citus--9.4-2--9.4-3.sql
-- 9.4-2--9.4-3 was added later as a patch to improve master_update_table_statistics.
-- We have this downgrade script so that we can continue from the main upgrade path
-- when upgrading to later versions.
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID AS $$
DECLARE
@ -22,5 +20,3 @@ END;
$$ LANGUAGE 'plpgsql';
COMMENT ON FUNCTION master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table and its colocated tables';
DROP FUNCTION pg_catalog.citus_get_active_worker_nodes(OUT text, OUT bigint);

View File

@ -1,10 +1,16 @@
-- citus--9.5-1--10.0-1
-- citus--9.5-1--10.0-4
-- This migration file aims to fix the issues with upgrades on clusters without a public schema.
-- This file was created by the following command, plus some more changes in a separate commit:
-- cat citus--9.5-1--10.0-1.sql citus--10.0-1--10.0-2.sql citus--10.0-2--10.0-3.sql > citus--9.5-1--10.0-4.sql
-- copy of citus--9.5-1--10.0-1
DROP FUNCTION pg_catalog.upgrade_to_reference_table(regclass);
DROP FUNCTION IF EXISTS pg_catalog.citus_total_relation_size(regclass);
#include "udfs/citus_total_relation_size/10.0-1.sql"
#include "udfs/citus_tables/10.0-1.sql"
#include "udfs/citus_finish_pg_upgrade/10.0-1.sql"
#include "udfs/alter_distributed_table/10.0-1.sql"
#include "udfs/alter_table_set_access_method/10.0-1.sql"
@ -164,4 +170,48 @@ SELECT * FROM pg_catalog.citus_worker_stat_activity();
ALTER VIEW citus.citus_worker_stat_activity SET SCHEMA pg_catalog;
GRANT SELECT ON pg_catalog.citus_worker_stat_activity TO PUBLIC;
-- copy of citus--10.0-1--10.0-2
#include "../../columnar/sql/columnar--10.0-1--10.0-2.sql"
-- copy of citus--10.0-2--10.0-3
#include "udfs/citus_update_table_statistics/10.0-3.sql"
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_update_table_statistics$$;
COMMENT ON FUNCTION pg_catalog.master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table';
CREATE OR REPLACE FUNCTION pg_catalog.citus_get_active_worker_nodes(OUT node_name text, OUT node_port bigint)
RETURNS SETOF record
LANGUAGE C STRICT ROWS 100
AS 'MODULE_PATHNAME', $$citus_get_active_worker_nodes$$;
COMMENT ON FUNCTION pg_catalog.citus_get_active_worker_nodes()
IS 'fetch set of active worker nodes';
-- copy of citus--10.0-3--10.0-4
-- This migration file aims to fix 2 issues with upgrades on clusters
-- 1. a bug in public schema dependency for citus_tables view.
--
-- Users who do not have a public schema in their clusters were unable to upgrade
-- to Citus 10.x due to the citus_tables view that used to be created in the public
-- schema.
#include "udfs/citus_tables/10.0-4.sql"
-- 2. a bug in our PG upgrade functions
--
-- Users who took the 9.5-2--10.0-1 upgrade path already have the fix, but users
-- who took the 9.5-1--10.0-1 upgrade path do not. Hence, we repeat the CREATE OR
-- REPLACE from the 9.5-2 definition for citus_prepare_pg_upgrade.
#include "udfs/citus_prepare_pg_upgrade/9.5-2.sql"
#include "udfs/citus_finish_pg_upgrade/10.0-4.sql"
RESET search_path;

View File

@ -0,0 +1,3 @@
-- 9.5-1--9.5-2 was added later as a patch to fix a bug in our PG upgrade functions
#include "udfs/citus_prepare_pg_upgrade/9.5-2.sql"
#include "udfs/citus_finish_pg_upgrade/9.5-2.sql"

View File

@ -0,0 +1,9 @@
--
-- 9.5-1--9.5-2 was added later as a patch to fix a bug in our PG upgrade functions
--
-- This script brings users who installed the released patch back onto the 9.5-1
-- upgrade path. We do this via a semantic downgrade, since new changes have already
-- been introduced in the schema between 9.5-1 and 10.0-1. To make sure we include all
-- changes made during that version change, we reuse the existing upgrade path from
-- our later-introduced 9.5-2 version.
--

View File

@ -0,0 +1,7 @@
-- 9.5-2--9.5-3 was added later as a patch to improve master_update_table_statistics
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_update_table_statistics$$;
COMMENT ON FUNCTION pg_catalog.master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table';

View File

@ -0,0 +1,22 @@
-- citus--9.5-3--9.5-2
-- This is a downgrade path that will revert the changes made in citus--9.5-2--9.5-3.sql
-- 9.5-2--9.5-3 was added later as a patch to improve master_update_table_statistics.
-- We have this downgrade script so that we can continue from the main upgrade path
-- when upgrading to later versions.
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID AS $$
DECLARE
colocated_tables regclass[];
BEGIN
SELECT get_colocated_table_array(relation) INTO colocated_tables;
PERFORM
master_update_shard_statistics(shardid)
FROM
pg_dist_shard
WHERE
logicalrelid = ANY (colocated_tables);
END;
$$ LANGUAGE 'plpgsql';
COMMENT ON FUNCTION master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table and its colocated tables';

View File

@ -1,4 +0,0 @@
/* citus--10.0-2--10.0-1.sql */
#include "../../../columnar/sql/downgrades/columnar--10.0-2--10.0-1.sql"
REVOKE SELECT ON public.citus_tables FROM public;

View File

@ -1,4 +1,51 @@
-- citus--10.0-1--9.5-1
-- citus--10.0-4--9.5-1
-- This migration file aims to fix the issues with upgrades on clusters without a public schema.
-- This file was created by the following command, plus some more changes in a separate commit:
-- cat citus--10.0-3--10.0-2.sql citus--10.0-2--10.0-1.sql citus--10.0-1--9.5-1.sql > citus--10.0-4--9.5-1.sql
-- copy of citus--10.0-4--10.0-3
--
-- 10.0-3--10.0-4 was added later as a patch to fix a bug in our PG upgrade functions
--
-- The upgrade fixes a bug in citus_(prepare|finish)_pg_upgrade. Given the old versions of
-- these functions contain a bug it is better to _not_ restore the old version and keep
-- the patched version of the function.
--
-- This is inline with the downgrade scripts for earlier versions of this patch
--
-- copy of citus--10.0-3--10.0-2
-- this is a downgrade path that will revert the changes made in citus--10.0-2--10.0-3.sql
DROP FUNCTION pg_catalog.citus_update_table_statistics(regclass);
#include "../udfs/citus_update_table_statistics/10.0-1.sql"
CREATE OR REPLACE FUNCTION master_update_table_statistics(relation regclass)
RETURNS VOID AS $$
DECLARE
colocated_tables regclass[];
BEGIN
SELECT get_colocated_table_array(relation) INTO colocated_tables;
PERFORM
master_update_shard_statistics(shardid)
FROM
pg_dist_shard
WHERE
logicalrelid = ANY (colocated_tables);
END;
$$ LANGUAGE 'plpgsql';
COMMENT ON FUNCTION master_update_table_statistics(regclass)
IS 'updates shard statistics of the given table and its colocated tables';
DROP FUNCTION pg_catalog.citus_get_active_worker_nodes(OUT text, OUT bigint);
/* copy of citus--10.0-2--10.0-1.sql */
#include "../../../columnar/sql/downgrades/columnar--10.0-2--10.0-1.sql"
-- copy of citus--10.0-1--9.5-1
-- In Citus 10.0, we added another internal udf (notify_constraint_dropped)
-- to be called by citus_drop_trigger. Since this script is executed when
@ -18,7 +65,8 @@ DROP FUNCTION pg_catalog.notify_constraint_dropped();
#include "../../../columnar/sql/downgrades/columnar--10.0-1--9.5-1.sql"
DROP VIEW public.citus_tables;
DROP VIEW IF EXISTS pg_catalog.citus_tables;
DROP VIEW IF EXISTS public.citus_tables;
DROP FUNCTION pg_catalog.alter_distributed_table(regclass, text, int, text, boolean);
DROP FUNCTION pg_catalog.alter_table_set_access_method(regclass, text);
DROP FUNCTION pg_catalog.citus_total_relation_size(regclass,boolean);

View File

@ -1,4 +1,8 @@
-- citus--10.1-1--10.0-3
-- citus--10.1-1--10.0-4
-- This migration file aims to fix the issues with upgrades on clusters without a public schema.
-- copy of citus--10.1-1--10.0-3
-- remove databases as distributed objects to prevent unknown object types from being managed
-- on older versions.

View File

@ -0,0 +1,108 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_finish_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
DECLARE
table_name regclass;
command text;
trigger_name text;
BEGIN
--
-- restore citus catalog tables
--
INSERT INTO pg_catalog.pg_dist_partition SELECT * FROM public.pg_dist_partition;
INSERT INTO pg_catalog.pg_dist_shard SELECT * FROM public.pg_dist_shard;
INSERT INTO pg_catalog.pg_dist_placement SELECT * FROM public.pg_dist_placement;
INSERT INTO pg_catalog.pg_dist_node_metadata SELECT * FROM public.pg_dist_node_metadata;
INSERT INTO pg_catalog.pg_dist_node SELECT * FROM public.pg_dist_node;
INSERT INTO pg_catalog.pg_dist_local_group SELECT * FROM public.pg_dist_local_group;
INSERT INTO pg_catalog.pg_dist_transaction SELECT * FROM public.pg_dist_transaction;
INSERT INTO pg_catalog.pg_dist_colocation SELECT * FROM public.pg_dist_colocation;
-- enterprise catalog tables
INSERT INTO pg_catalog.pg_dist_authinfo SELECT * FROM public.pg_dist_authinfo;
INSERT INTO pg_catalog.pg_dist_poolinfo SELECT * FROM public.pg_dist_poolinfo;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy DISABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
INSERT INTO pg_catalog.pg_dist_rebalance_strategy SELECT
name,
default_strategy,
shard_cost_function::regprocedure::regproc,
node_capacity_function::regprocedure::regproc,
shard_allowed_on_node_function::regprocedure::regproc,
default_threshold,
minimum_threshold
FROM public.pg_dist_rebalance_strategy;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy ENABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
--
-- drop backup tables
--
DROP TABLE public.pg_dist_authinfo;
DROP TABLE public.pg_dist_colocation;
DROP TABLE public.pg_dist_local_group;
DROP TABLE public.pg_dist_node;
DROP TABLE public.pg_dist_node_metadata;
DROP TABLE public.pg_dist_partition;
DROP TABLE public.pg_dist_placement;
DROP TABLE public.pg_dist_poolinfo;
DROP TABLE public.pg_dist_shard;
DROP TABLE public.pg_dist_transaction;
DROP TABLE public.pg_dist_rebalance_strategy;
--
-- reset sequences
--
PERFORM setval('pg_catalog.pg_dist_shardid_seq', (SELECT MAX(shardid)+1 AS max_shard_id FROM pg_dist_shard), false);
PERFORM setval('pg_catalog.pg_dist_placement_placementid_seq', (SELECT MAX(placementid)+1 AS max_placement_id FROM pg_dist_placement), false);
PERFORM setval('pg_catalog.pg_dist_groupid_seq', (SELECT MAX(groupid)+1 AS max_group_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_node_nodeid_seq', (SELECT MAX(nodeid)+1 AS max_node_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_colocationid_seq', (SELECT MAX(colocationid)+1 AS max_colocation_id FROM pg_dist_colocation), false);
--
-- register triggers
--
FOR table_name IN SELECT logicalrelid FROM pg_catalog.pg_dist_partition
LOOP
trigger_name := 'truncate_trigger_' || table_name::oid;
command := 'create trigger ' || trigger_name || ' after truncate on ' || table_name || ' execute procedure pg_catalog.citus_truncate_trigger()';
EXECUTE command;
command := 'update pg_trigger set tgisinternal = true where tgname = ' || quote_literal(trigger_name);
EXECUTE command;
END LOOP;
--
-- set dependencies
--
INSERT INTO pg_depend
SELECT
'pg_class'::regclass::oid as classid,
p.logicalrelid::regclass::oid as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'n' as deptype
FROM pg_catalog.pg_dist_partition p;
-- restore pg_dist_object from the stable identifiers
TRUNCATE citus.pg_dist_object;
INSERT INTO citus.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
address.objid,
address.objsubid,
naming.distribution_argument_index,
naming.colocationid
FROM
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
PERFORM citus_internal.columnar_ensure_objects_exist();
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_finish_pg_upgrade()
IS 'perform tasks to restore citus settings from a location that has been prepared before pg_upgrade';
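
Per the COMMENT strings of the two UDFs, they bracket a pg_upgrade of a node; a sketch of the intended flow, assuming the standard Citus upgrade procedure:

-- on the old server, before stopping it for pg_upgrade
SELECT citus_prepare_pg_upgrade();
-- ... run pg_upgrade on the node ...
-- on the upgraded server, after starting it with the new Citus version
SELECT citus_finish_pg_upgrade();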

View File

@ -85,17 +85,7 @@ BEGIN
FROM pg_catalog.pg_dist_partition p;
-- restore pg_dist_object from the stable identifiers
-- DELETE/INSERT to avoid primary key violations
WITH old_records AS (
DELETE FROM
citus.pg_dist_object
RETURNING
type,
object_names,
object_args,
distribution_argument_index,
colocationid
)
TRUNCATE citus.pg_dist_object;
INSERT INTO citus.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
@ -104,8 +94,10 @@ BEGIN
naming.distribution_argument_index,
naming.colocationid
FROM
old_records naming,
pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
END;
$cppu$;

View File

@ -0,0 +1,105 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_finish_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
DECLARE
table_name regclass;
command text;
trigger_name text;
BEGIN
--
-- restore citus catalog tables
--
INSERT INTO pg_catalog.pg_dist_partition SELECT * FROM public.pg_dist_partition;
INSERT INTO pg_catalog.pg_dist_shard SELECT * FROM public.pg_dist_shard;
INSERT INTO pg_catalog.pg_dist_placement SELECT * FROM public.pg_dist_placement;
INSERT INTO pg_catalog.pg_dist_node_metadata SELECT * FROM public.pg_dist_node_metadata;
INSERT INTO pg_catalog.pg_dist_node SELECT * FROM public.pg_dist_node;
INSERT INTO pg_catalog.pg_dist_local_group SELECT * FROM public.pg_dist_local_group;
INSERT INTO pg_catalog.pg_dist_transaction SELECT * FROM public.pg_dist_transaction;
INSERT INTO pg_catalog.pg_dist_colocation SELECT * FROM public.pg_dist_colocation;
-- enterprise catalog tables
INSERT INTO pg_catalog.pg_dist_authinfo SELECT * FROM public.pg_dist_authinfo;
INSERT INTO pg_catalog.pg_dist_poolinfo SELECT * FROM public.pg_dist_poolinfo;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy DISABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
INSERT INTO pg_catalog.pg_dist_rebalance_strategy SELECT
name,
default_strategy,
shard_cost_function::regprocedure::regproc,
node_capacity_function::regprocedure::regproc,
shard_allowed_on_node_function::regprocedure::regproc,
default_threshold,
minimum_threshold
FROM public.pg_dist_rebalance_strategy;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy ENABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
--
-- drop backup tables
--
DROP TABLE public.pg_dist_authinfo;
DROP TABLE public.pg_dist_colocation;
DROP TABLE public.pg_dist_local_group;
DROP TABLE public.pg_dist_node;
DROP TABLE public.pg_dist_node_metadata;
DROP TABLE public.pg_dist_partition;
DROP TABLE public.pg_dist_placement;
DROP TABLE public.pg_dist_poolinfo;
DROP TABLE public.pg_dist_shard;
DROP TABLE public.pg_dist_transaction;
--
-- reset sequences
--
PERFORM setval('pg_catalog.pg_dist_shardid_seq', (SELECT MAX(shardid)+1 AS max_shard_id FROM pg_dist_shard), false);
PERFORM setval('pg_catalog.pg_dist_placement_placementid_seq', (SELECT MAX(placementid)+1 AS max_placement_id FROM pg_dist_placement), false);
PERFORM setval('pg_catalog.pg_dist_groupid_seq', (SELECT MAX(groupid)+1 AS max_group_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_node_nodeid_seq', (SELECT MAX(nodeid)+1 AS max_node_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_colocationid_seq', (SELECT MAX(colocationid)+1 AS max_colocation_id FROM pg_dist_colocation), false);
--
-- register triggers
--
FOR table_name IN SELECT logicalrelid FROM pg_catalog.pg_dist_partition
LOOP
trigger_name := 'truncate_trigger_' || table_name::oid;
command := 'create trigger ' || trigger_name || ' after truncate on ' || table_name || ' execute procedure pg_catalog.citus_truncate_trigger()';
EXECUTE command;
command := 'update pg_trigger set tgisinternal = true where tgname = ' || quote_literal(trigger_name);
EXECUTE command;
END LOOP;
--
-- set dependencies
--
INSERT INTO pg_depend
SELECT
'pg_class'::regclass::oid as classid,
p.logicalrelid::regclass::oid as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'n' as deptype
FROM pg_catalog.pg_dist_partition p;
-- restore pg_dist_object from the stable identifiers
TRUNCATE citus.pg_dist_object;
INSERT INTO citus.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
address.objid,
address.objsubid,
naming.distribution_argument_index,
naming.colocationid
FROM
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_finish_pg_upgrade()
IS 'perform tasks to restore citus settings from a location that has been prepared before pg_upgrade';

View File

@ -0,0 +1,106 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_finish_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
DECLARE
table_name regclass;
command text;
trigger_name text;
BEGIN
--
-- restore citus catalog tables
--
INSERT INTO pg_catalog.pg_dist_partition SELECT * FROM public.pg_dist_partition;
INSERT INTO pg_catalog.pg_dist_shard SELECT * FROM public.pg_dist_shard;
INSERT INTO pg_catalog.pg_dist_placement SELECT * FROM public.pg_dist_placement;
INSERT INTO pg_catalog.pg_dist_node_metadata SELECT * FROM public.pg_dist_node_metadata;
INSERT INTO pg_catalog.pg_dist_node SELECT * FROM public.pg_dist_node;
INSERT INTO pg_catalog.pg_dist_local_group SELECT * FROM public.pg_dist_local_group;
INSERT INTO pg_catalog.pg_dist_transaction SELECT * FROM public.pg_dist_transaction;
INSERT INTO pg_catalog.pg_dist_colocation SELECT * FROM public.pg_dist_colocation;
-- enterprise catalog tables
INSERT INTO pg_catalog.pg_dist_authinfo SELECT * FROM public.pg_dist_authinfo;
INSERT INTO pg_catalog.pg_dist_poolinfo SELECT * FROM public.pg_dist_poolinfo;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy DISABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
INSERT INTO pg_catalog.pg_dist_rebalance_strategy SELECT
name,
default_strategy,
shard_cost_function::regprocedure::regproc,
node_capacity_function::regprocedure::regproc,
shard_allowed_on_node_function::regprocedure::regproc,
default_threshold,
minimum_threshold
FROM public.pg_dist_rebalance_strategy;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy ENABLE TRIGGER pg_dist_rebalance_strategy_enterprise_check_trigger;
--
-- drop backup tables
--
DROP TABLE public.pg_dist_authinfo;
DROP TABLE public.pg_dist_colocation;
DROP TABLE public.pg_dist_local_group;
DROP TABLE public.pg_dist_node;
DROP TABLE public.pg_dist_node_metadata;
DROP TABLE public.pg_dist_partition;
DROP TABLE public.pg_dist_placement;
DROP TABLE public.pg_dist_poolinfo;
DROP TABLE public.pg_dist_shard;
DROP TABLE public.pg_dist_transaction;
DROP TABLE public.pg_dist_rebalance_strategy;
--
-- reset sequences
--
PERFORM setval('pg_catalog.pg_dist_shardid_seq', (SELECT MAX(shardid)+1 AS max_shard_id FROM pg_dist_shard), false);
PERFORM setval('pg_catalog.pg_dist_placement_placementid_seq', (SELECT MAX(placementid)+1 AS max_placement_id FROM pg_dist_placement), false);
PERFORM setval('pg_catalog.pg_dist_groupid_seq', (SELECT MAX(groupid)+1 AS max_group_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_node_nodeid_seq', (SELECT MAX(nodeid)+1 AS max_node_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_colocationid_seq', (SELECT MAX(colocationid)+1 AS max_colocation_id FROM pg_dist_colocation), false);
--
-- register triggers
--
FOR table_name IN SELECT logicalrelid FROM pg_catalog.pg_dist_partition
LOOP
trigger_name := 'truncate_trigger_' || table_name::oid;
command := 'create trigger ' || trigger_name || ' after truncate on ' || table_name || ' execute procedure pg_catalog.citus_truncate_trigger()';
EXECUTE command;
command := 'update pg_trigger set tgisinternal = true where tgname = ' || quote_literal(trigger_name);
EXECUTE command;
END LOOP;
--
-- set dependencies
--
INSERT INTO pg_depend
SELECT
'pg_class'::regclass::oid as classid,
p.logicalrelid::regclass::oid as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'n' as deptype
FROM pg_catalog.pg_dist_partition p;
-- restore pg_dist_object from the stable identifiers
TRUNCATE citus.pg_dist_object;
INSERT INTO citus.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
address.objid,
address.objsubid,
naming.distribution_argument_index,
naming.colocationid
FROM
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_finish_pg_upgrade()
IS 'perform tasks to restore citus settings from a location that has been prepared before pg_upgrade';

View File

@ -85,17 +85,7 @@ BEGIN
FROM pg_catalog.pg_dist_partition p;
-- restore pg_dist_object from the stable identifiers
-- DELETE/INSERT to avoid primary key violations
WITH old_records AS (
DELETE FROM
citus.pg_dist_object
RETURNING
type,
object_names,
object_args,
distribution_argument_index,
colocationid
)
TRUNCATE citus.pg_dist_object;
INSERT INTO citus.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
@ -104,8 +94,10 @@ BEGIN
naming.distribution_argument_index,
naming.colocationid
FROM
old_records naming,
pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
END;
$cppu$;

View File

@ -18,6 +18,7 @@ BEGIN
DROP TABLE IF EXISTS public.pg_dist_authinfo;
DROP TABLE IF EXISTS public.pg_dist_poolinfo;
DROP TABLE IF EXISTS public.pg_dist_rebalance_strategy;
DROP TABLE IF EXISTS public.pg_dist_object;
--
-- backup citus catalog tables
@ -45,8 +46,14 @@ BEGIN
FROM pg_catalog.pg_dist_rebalance_strategy;
-- store upgrade stable identifiers on pg_dist_object catalog
UPDATE citus.pg_dist_object
SET (type, object_names, object_args) = (SELECT * FROM pg_identify_object_as_address(classid, objid, objsubid));
CREATE TABLE public.pg_dist_object AS SELECT
address.type,
address.object_names,
address.object_args,
objects.distribution_argument_index,
objects.colocationid
FROM citus.pg_dist_object objects,
pg_catalog.pg_identify_object_as_address(objects.classid, objects.objid, objects.objsubid) address;
END;
$cppu$;

View File

@ -0,0 +1,44 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_prepare_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
BEGIN
--
-- backup citus catalog tables
--
CREATE TABLE public.pg_dist_partition AS SELECT * FROM pg_catalog.pg_dist_partition;
CREATE TABLE public.pg_dist_shard AS SELECT * FROM pg_catalog.pg_dist_shard;
CREATE TABLE public.pg_dist_placement AS SELECT * FROM pg_catalog.pg_dist_placement;
CREATE TABLE public.pg_dist_node_metadata AS SELECT * FROM pg_catalog.pg_dist_node_metadata;
CREATE TABLE public.pg_dist_node AS SELECT * FROM pg_catalog.pg_dist_node;
CREATE TABLE public.pg_dist_local_group AS SELECT * FROM pg_catalog.pg_dist_local_group;
CREATE TABLE public.pg_dist_transaction AS SELECT * FROM pg_catalog.pg_dist_transaction;
CREATE TABLE public.pg_dist_colocation AS SELECT * FROM pg_catalog.pg_dist_colocation;
-- enterprise catalog tables
CREATE TABLE public.pg_dist_authinfo AS SELECT * FROM pg_catalog.pg_dist_authinfo;
CREATE TABLE public.pg_dist_poolinfo AS SELECT * FROM pg_catalog.pg_dist_poolinfo;
CREATE TABLE public.pg_dist_rebalance_strategy AS SELECT
name,
default_strategy,
shard_cost_function::regprocedure::text,
node_capacity_function::regprocedure::text,
shard_allowed_on_node_function::regprocedure::text,
default_threshold,
minimum_threshold
FROM pg_catalog.pg_dist_rebalance_strategy;
-- store upgrade stable identifiers on pg_dist_object catalog
CREATE TABLE public.pg_dist_object AS SELECT
address.type,
address.object_names,
address.object_args,
objects.distribution_argument_index,
objects.colocationid
FROM citus.pg_dist_object objects,
pg_catalog.pg_identify_object_as_address(objects.classid, objects.objid, objects.objsubid) address;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_prepare_pg_upgrade()
IS 'perform tasks to copy citus settings to a location that could later be restored after pg_upgrade is done';

View File

@ -0,0 +1,60 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_prepare_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
BEGIN
--
-- Drop existing backup tables
--
DROP TABLE IF EXISTS public.pg_dist_partition;
DROP TABLE IF EXISTS public.pg_dist_shard;
DROP TABLE IF EXISTS public.pg_dist_placement;
DROP TABLE IF EXISTS public.pg_dist_node_metadata;
DROP TABLE IF EXISTS public.pg_dist_node;
DROP TABLE IF EXISTS public.pg_dist_local_group;
DROP TABLE IF EXISTS public.pg_dist_transaction;
DROP TABLE IF EXISTS public.pg_dist_colocation;
DROP TABLE IF EXISTS public.pg_dist_authinfo;
DROP TABLE IF EXISTS public.pg_dist_poolinfo;
DROP TABLE IF EXISTS public.pg_dist_rebalance_strategy;
DROP TABLE IF EXISTS public.pg_dist_object;
--
-- backup citus catalog tables
--
CREATE TABLE public.pg_dist_partition AS SELECT * FROM pg_catalog.pg_dist_partition;
CREATE TABLE public.pg_dist_shard AS SELECT * FROM pg_catalog.pg_dist_shard;
CREATE TABLE public.pg_dist_placement AS SELECT * FROM pg_catalog.pg_dist_placement;
CREATE TABLE public.pg_dist_node_metadata AS SELECT * FROM pg_catalog.pg_dist_node_metadata;
CREATE TABLE public.pg_dist_node AS SELECT * FROM pg_catalog.pg_dist_node;
CREATE TABLE public.pg_dist_local_group AS SELECT * FROM pg_catalog.pg_dist_local_group;
CREATE TABLE public.pg_dist_transaction AS SELECT * FROM pg_catalog.pg_dist_transaction;
CREATE TABLE public.pg_dist_colocation AS SELECT * FROM pg_catalog.pg_dist_colocation;
-- enterprise catalog tables
CREATE TABLE public.pg_dist_authinfo AS SELECT * FROM pg_catalog.pg_dist_authinfo;
CREATE TABLE public.pg_dist_poolinfo AS SELECT * FROM pg_catalog.pg_dist_poolinfo;
CREATE TABLE public.pg_dist_rebalance_strategy AS SELECT
name,
default_strategy,
shard_cost_function::regprocedure::text,
node_capacity_function::regprocedure::text,
shard_allowed_on_node_function::regprocedure::text,
default_threshold,
minimum_threshold
FROM pg_catalog.pg_dist_rebalance_strategy;
-- store upgrade stable identifiers on pg_dist_object catalog
CREATE TABLE public.pg_dist_object AS SELECT
address.type,
address.object_names,
address.object_args,
objects.distribution_argument_index,
objects.colocationid
FROM citus.pg_dist_object objects,
pg_catalog.pg_identify_object_as_address(objects.classid, objects.objid, objects.objsubid) address;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_prepare_pg_upgrade()
IS 'perform tasks to copy citus settings to a location that could later be restored after pg_upgrade is done';

View File

@ -18,6 +18,7 @@ BEGIN
DROP TABLE IF EXISTS public.pg_dist_authinfo;
DROP TABLE IF EXISTS public.pg_dist_poolinfo;
DROP TABLE IF EXISTS public.pg_dist_rebalance_strategy;
DROP TABLE IF EXISTS public.pg_dist_object;
--
-- backup citus catalog tables
@ -45,8 +46,14 @@ BEGIN
FROM pg_catalog.pg_dist_rebalance_strategy;
-- store upgrade stable identifiers on pg_dist_object catalog
UPDATE citus.pg_dist_object
SET (type, object_names, object_args) = (SELECT * FROM pg_identify_object_as_address(classid, objid, objsubid));
CREATE TABLE public.pg_dist_object AS SELECT
address.type,
address.object_names,
address.object_args,
objects.distribution_argument_index,
objects.colocationid
FROM citus.pg_dist_object objects,
pg_catalog.pg_identify_object_as_address(objects.classid, objects.objid, objects.objsubid) address;
END;
$cppu$;

View File

@ -0,0 +1,38 @@
DO $$
declare
citus_tables_create_query text;
BEGIN
citus_tables_create_query=$CTCQ$
CREATE OR REPLACE VIEW %I.citus_tables AS
SELECT
logicalrelid AS table_name,
CASE WHEN partkey IS NOT NULL THEN 'distributed' ELSE 'reference' END AS citus_table_type,
coalesce(column_to_column_name(logicalrelid, partkey), '<none>') AS distribution_column,
colocationid AS colocation_id,
pg_size_pretty(citus_total_relation_size(logicalrelid, fail_on_error := false)) AS table_size,
(select count(*) from pg_dist_shard where logicalrelid = p.logicalrelid) AS shard_count,
pg_get_userbyid(relowner) AS table_owner,
amname AS access_method
FROM
pg_dist_partition p
JOIN
pg_class c ON (p.logicalrelid = c.oid)
LEFT JOIN
pg_am a ON (a.oid = c.relam)
WHERE
partkey IS NOT NULL OR repmodel = 't'
ORDER BY
logicalrelid::text;
$CTCQ$;
IF EXISTS (SELECT 1 FROM pg_namespace WHERE nspname = 'public') THEN
EXECUTE format(citus_tables_create_query, 'public');
GRANT SELECT ON public.citus_tables TO public;
ELSE
EXECUTE format(citus_tables_create_query, 'citus');
ALTER VIEW citus.citus_tables SET SCHEMA pg_catalog;
GRANT SELECT ON pg_catalog.citus_tables TO public;
END IF;
END;
$$;
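
Whichever branch the DO block takes, the view is reachable unqualified in typical setups (pg_catalog is always on the search_path); a small query over the columns it defines:

SELECT table_name, citus_table_type, distribution_column, shard_count, table_size
FROM citus_tables;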

View File

@ -1,20 +1,38 @@
CREATE VIEW public.citus_tables AS
SELECT
logicalrelid AS table_name,
CASE WHEN partkey IS NOT NULL THEN 'distributed' ELSE 'reference' END AS citus_table_type,
coalesce(column_to_column_name(logicalrelid, partkey), '<none>') AS distribution_column,
colocationid AS colocation_id,
pg_size_pretty(citus_total_relation_size(logicalrelid, fail_on_error := false)) AS table_size,
(select count(*) from pg_dist_shard where logicalrelid = p.logicalrelid) AS shard_count,
pg_get_userbyid(relowner) AS table_owner,
amname AS access_method
FROM
pg_dist_partition p
JOIN
pg_class c ON (p.logicalrelid = c.oid)
LEFT JOIN
pg_am a ON (a.oid = c.relam)
WHERE
partkey IS NOT NULL OR repmodel = 't'
ORDER BY
logicalrelid::text;
DO $$
declare
citus_tables_create_query text;
BEGIN
citus_tables_create_query=$CTCQ$
CREATE OR REPLACE VIEW %I.citus_tables AS
SELECT
logicalrelid AS table_name,
CASE WHEN partkey IS NOT NULL THEN 'distributed' ELSE 'reference' END AS citus_table_type,
coalesce(column_to_column_name(logicalrelid, partkey), '<none>') AS distribution_column,
colocationid AS colocation_id,
pg_size_pretty(citus_total_relation_size(logicalrelid, fail_on_error := false)) AS table_size,
(select count(*) from pg_dist_shard where logicalrelid = p.logicalrelid) AS shard_count,
pg_get_userbyid(relowner) AS table_owner,
amname AS access_method
FROM
pg_dist_partition p
JOIN
pg_class c ON (p.logicalrelid = c.oid)
LEFT JOIN
pg_am a ON (a.oid = c.relam)
WHERE
partkey IS NOT NULL OR repmodel = 't'
ORDER BY
logicalrelid::text;
$CTCQ$;
IF EXISTS (SELECT 1 FROM pg_namespace WHERE nspname = 'public') THEN
EXECUTE format(citus_tables_create_query, 'public');
GRANT SELECT ON public.citus_tables TO public;
ELSE
EXECUTE format(citus_tables_create_query, 'citus');
ALTER VIEW citus.citus_tables SET SCHEMA pg_catalog;
GRANT SELECT ON pg_catalog.citus_tables TO public;
END IF;
END;
$$;

View File

@ -132,7 +132,7 @@ load_shard_placement_array(PG_FUNCTION_ARGS)
}
else
{
placementList = ShardPlacementList(shardId);
placementList = ShardPlacementListIncludingOrphanedPlacements(shardId);
}
placementList = SortList(placementList, CompareShardPlacementsByWorker);

View File

@ -36,12 +36,13 @@ create_progress(PG_FUNCTION_ARGS)
{
uint64 magicNumber = PG_GETARG_INT64(0);
int stepCount = PG_GETARG_INT32(1);
ProgressMonitorData *monitor = CreateProgressMonitor(magicNumber, stepCount,
sizeof(uint64), 0);
dsm_handle dsmHandle;
ProgressMonitorData *monitor = CreateProgressMonitor(stepCount,
sizeof(uint64), &dsmHandle);
if (monitor != NULL)
{
uint64 *steps = (uint64 *) monitor->steps;
uint64 *steps = (uint64 *) ProgressMonitorSteps(monitor);
int i = 0;
for (; i < stepCount; i++)
@ -50,6 +51,7 @@ create_progress(PG_FUNCTION_ARGS)
}
}
RegisterProgressMonitor(magicNumber, 0, dsmHandle);
PG_RETURN_VOID();
}
@ -64,7 +66,7 @@ update_progress(PG_FUNCTION_ARGS)
if (monitor != NULL && step < monitor->stepCount)
{
uint64 *steps = (uint64 *) monitor->steps;
uint64 *steps = (uint64 *) ProgressMonitorSteps(monitor);
steps[step] = newValue;
}
@ -93,7 +95,7 @@ show_progress(PG_FUNCTION_ARGS)
ProgressMonitorData *monitor = NULL;
foreach_ptr(monitor, monitorList)
{
uint64 *steps = monitor->steps;
uint64 *steps = ProgressMonitorSteps(monitor);
for (int stepIndex = 0; stepIndex < monitor->stepCount; stepIndex++)
{

View File

@ -350,8 +350,10 @@ ErrorIfShardPlacementsNotColocated(Oid leftRelationId, Oid rightRelationId)
leftRelationName, rightRelationName)));
}
List *leftPlacementList = ShardPlacementList(leftShardId);
List *rightPlacementList = ShardPlacementList(rightShardId);
List *leftPlacementList = ShardPlacementListWithoutOrphanedPlacements(
leftShardId);
List *rightPlacementList = ShardPlacementListWithoutOrphanedPlacements(
rightShardId);
if (list_length(leftPlacementList) != list_length(rightPlacementList))
{

View File

@ -18,6 +18,7 @@
#include "access/htup_details.h"
#include "distributed/distribution_column.h"
#include "distributed/metadata_cache.h"
#include "distributed/multi_partitioning_utils.h"
#include "distributed/version_compat.h"
#include "nodes/makefuncs.h"
#include "nodes/nodes.h"
@ -115,6 +116,53 @@ column_to_column_name(PG_FUNCTION_ARGS)
}
/*
* FindColumnWithNameOnTargetRelation takes a source table and a
* column name. The function returns the column with the
* same name on the target table.
*
* Note that due to dropping columns, the parent's distribution key may not
* match the partition's distribution key. See issue #5123.
*
* The function throws an error if the input or output is not valid or does
* not exist.
*/
Var *
FindColumnWithNameOnTargetRelation(Oid sourceRelationId, char *sourceColumnName,
Oid targetRelationId)
{
if (sourceColumnName == NULL || sourceColumnName[0] == '\0')
{
ereport(ERROR, (errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("cannot find the given column on table \"%s\"",
generate_qualified_relation_name(sourceRelationId))));
}
AttrNumber attributeNumberOnTarget = get_attnum(targetRelationId, sourceColumnName);
if (attributeNumberOnTarget == InvalidAttrNumber)
{
ereport(ERROR, (errmsg("Column \"%s\" does not exist on "
"relation \"%s\"", sourceColumnName,
get_rel_name(targetRelationId))));
}
Index varNo = 1;
Oid targetTypeId = InvalidOid;
int32 targetTypMod = 0;
Oid targetCollation = InvalidOid;
Index varlevelsup = 0;
/* this function throws an error in case anything goes wrong */
get_atttypetypmodcoll(targetRelationId, attributeNumberOnTarget,
&targetTypeId, &targetTypMod, &targetCollation);
Var *targetColumn =
makeVar(varNo, attributeNumberOnTarget, targetTypeId, targetTypMod,
targetCollation, varlevelsup);
return targetColumn;
}
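For illustration, a minimal sketch of how a caller might resolve a partition's distribution column through this helper; parentRelationId, partitionRelationId and parentDistKey are hypothetical locals, not part of this patch:
/*
 * Hypothetical usage sketch: translate the parent's distribution key into the
 * column with the same name on the partition, so that columns dropped on the
 * parent cannot shift the attribute number (see issue #5123).
 */
char *distributionColumnName =
	ColumnToColumnName(parentRelationId, nodeToString(parentDistKey));
Var *partitionDistributionKey =
	FindColumnWithNameOnTargetRelation(parentRelationId, distributionColumnName,
									   partitionRelationId);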
/*
* BuildDistributionKeyFromColumnName builds a simple distribution key consisting
* only out of a reference to the column of name columnName. Errors out if the

View File

@ -344,7 +344,7 @@ ReplicateShardToNode(ShardInterval *shardInterval, char *nodeName, int nodePort)
List *ddlCommandList =
CopyShardCommandList(shardInterval, srcNodeName, srcNodePort, includeData);
List *shardPlacementList = ShardPlacementList(shardId);
List *shardPlacementList = ShardPlacementListIncludingOrphanedPlacements(shardId);
ShardPlacement *targetPlacement = SearchShardPlacementInList(shardPlacementList,
nodeName, nodePort);
char *tableOwner = TableOwner(shardInterval->relationId);

View File

@ -73,6 +73,11 @@ alter_role_if_exists(PG_FUNCTION_ARGS)
Datum
worker_create_or_alter_role(PG_FUNCTION_ARGS)
{
if (PG_ARGISNULL(0))
{
ereport(ERROR, (errmsg("role name cannot be NULL")));
}
text *rolenameText = PG_GETARG_TEXT_P(0);
const char *rolename = text_to_cstring(rolenameText);
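For illustration, the effect of the new guard as seen from SQL; the three-argument signature of worker_create_or_alter_role is assumed here and the call itself is hypothetical:
SELECT worker_create_or_alter_role(NULL, 'CREATE ROLE dummy_role', NULL);
-- expected to fail with: ERROR:  role name cannot be NULL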

View File

@ -476,7 +476,7 @@ SingleReplicatedTable(Oid relationId)
/* checking only for the first shard id should suffice */
uint64 shardId = *(uint64 *) linitial(shardList);
shardPlacementList = ShardPlacementList(shardId);
shardPlacementList = ShardPlacementListWithoutOrphanedPlacements(shardId);
if (list_length(shardPlacementList) != 1)
{
return false;
@ -489,7 +489,7 @@ SingleReplicatedTable(Oid relationId)
foreach_ptr(shardIdPointer, shardIntervalList)
{
uint64 shardId = *shardIdPointer;
shardPlacementList = ShardPlacementList(shardId);
shardPlacementList = ShardPlacementListWithoutOrphanedPlacements(shardId);
if (list_length(shardPlacementList) != 1)
{

View File

@ -120,7 +120,7 @@ worker_drop_distributed_table(PG_FUNCTION_ARGS)
{
uint64 shardId = *shardIdPointer;
List *shardPlacementList = ShardPlacementList(shardId);
List *shardPlacementList = ShardPlacementListIncludingOrphanedPlacements(shardId);
ShardPlacement *placement = NULL;
foreach_ptr(placement, shardPlacementList)
{

View File

@ -101,9 +101,11 @@ typedef enum TableDDLCommandType
typedef enum IndexDefinitionDeparseFlags
{
INCLUDE_CREATE_INDEX_STATEMENTS = 1 << 0,
INCLUDE_INDEX_CLUSTERED_STATEMENTS = 1 << 1,
INCLUDE_INDEX_STATISTICS_STATEMENTTS = 1 << 2,
INCLUDE_CREATE_CONSTRAINT_STATEMENTS = 1 << 1,
INCLUDE_INDEX_CLUSTERED_STATEMENTS = 1 << 2,
INCLUDE_INDEX_STATISTICS_STATEMENTTS = 1 << 3,
INCLUDE_INDEX_ALL_STATEMENTS = INCLUDE_CREATE_INDEX_STATEMENTS |
INCLUDE_CREATE_CONSTRAINT_STATEMENTS |
INCLUDE_INDEX_CLUSTERED_STATEMENTS |
INCLUDE_INDEX_STATISTICS_STATEMENTTS
} IndexDefinitionDeparseFlags;
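Since the flag values are distinct powers of two, callers can OR them together to pick which statements to deparse; a tiny illustrative sketch (the variable name is made up):
/* deparse the CREATE INDEX statements plus the newly added constraint statements */
int deparseFlags = INCLUDE_CREATE_INDEX_STATEMENTS | INCLUDE_CREATE_CONSTRAINT_STATEMENTS;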
@ -280,7 +282,6 @@ extern ShardPlacement * SearchShardPlacementInListOrError(List *shardPlacementLi
uint32 nodePort);
extern void ErrorIfTargetNodeIsNotSafeToMove(const char *targetNodeName, int
targetNodePort);
extern void ErrorIfMoveCitusLocalTable(Oid relationId);
extern char LookupShardTransferMode(Oid shardReplicationModeOid);
extern void BlockWritesToShardList(List *shardList);
extern List * WorkerApplyShardDDLCommandList(List *ddlCommandList, int64 shardId);

View File

@ -28,7 +28,7 @@ extern void SetTaskQueryString(Task *task, char *queryString);
extern void SetTaskQueryStringList(Task *task, List *queryStringList);
extern char * TaskQueryString(Task *task);
extern char * TaskQueryStringAtIndex(Task *task, int index);
extern bool UpdateRelationsToLocalShardTables(Node *node, List *relationShardList);
extern int GetTaskQueryType(Task *task);
#endif /* DEPARSE_SHARD_QUERY_H */

View File

@ -19,6 +19,9 @@
/* Remaining metadata utility functions */
extern Var * FindColumnWithNameOnTargetRelation(Oid sourceRelationId,
char *sourceColumnName,
Oid targetRelationId);
extern Var * BuildDistributionKeyFromColumnName(Relation distributedRelation,
char *columnName);
extern char * ColumnToColumnName(Oid relationId, char *columnNodeString);

View File

@ -46,5 +46,8 @@ extern bool TaskAccessesLocalNode(Task *task);
extern void ErrorIfTransactionAccessedPlacementsLocally(void);
extern void DisableLocalExecution(void);
extern void SetLocalExecutionStatus(LocalExecutionStatus newStatus);
extern void ExtractParametersForLocalExecution(ParamListInfo paramListInfo,
Oid **parameterTypes,
const char ***parameterValues);
#endif /* LOCAL_EXECUTION_H */

View File

@ -5,6 +5,7 @@ extern bool IsLocalPlanCachingSupported(Job *currentJob,
DistributedPlan *originalDistributedPlan);
extern PlannedStmt * GetCachedLocalPlan(Task *task, DistributedPlan *distributedPlan);
extern void CacheLocalPlanForShardQuery(Task *task,
DistributedPlan *originalDistributedPlan);
DistributedPlan *originalDistributedPlan,
ParamListInfo paramListInfo);
#endif /* LOCAL_PLAN_CACHE */

View File

@ -148,7 +148,9 @@ extern List * CitusTableList(void);
extern ShardInterval * LoadShardInterval(uint64 shardId);
extern Oid RelationIdForShard(uint64 shardId);
extern bool ReferenceTableShardId(uint64 shardId);
extern ShardPlacement * FindShardPlacementOnGroup(int32 groupId, uint64 shardId);
extern ShardPlacement * ShardPlacementOnGroupIncludingOrphanedPlacements(int32 groupId,
uint64 shardId);
extern ShardPlacement * ActiveShardPlacementOnGroup(int32 groupId, uint64 shardId);
extern GroupShardPlacement * LoadGroupShardPlacement(uint64 shardId, uint64 placementId);
extern ShardPlacement * LoadShardPlacement(uint64 shardId, uint64 placementId);
extern CitusTableCacheEntry * GetCitusTableCacheEntry(Oid distributedRelationId);
@ -158,7 +160,7 @@ extern DistObjectCacheEntry * LookupDistObjectCacheEntry(Oid classid, Oid objid,
extern int32 GetLocalGroupId(void);
extern void CitusTableCacheFlushInvalidatedEntries(void);
extern Oid LookupShardRelationFromCatalog(int64 shardId, bool missing_ok);
extern List * ShardPlacementList(uint64 shardId);
extern List * ShardPlacementListIncludingOrphanedPlacements(uint64 shardId);
extern bool ShardExists(int64 shardId);
extern void CitusInvalidateRelcacheByRelid(Oid relationId);
extern void CitusInvalidateRelcacheByShardId(int64 shardId);

View File

@ -214,6 +214,7 @@ extern bool NodeGroupHasShardPlacements(int32 groupId,
bool onlyConsiderActivePlacements);
extern List * ActiveShardPlacementListOnGroup(uint64 shardId, int32 groupId);
extern List * ActiveShardPlacementList(uint64 shardId);
extern List * ShardPlacementListWithoutOrphanedPlacements(uint64 shardId);
extern ShardPlacement * ActiveShardPlacement(uint64 shardId, bool missingOk);
extern List * BuildShardPlacementList(ShardInterval *shardInterval);
extern List * AllShardPlacementsOnNodeGroup(int32 groupId);
@ -223,7 +224,6 @@ extern StringInfo GenerateSizeQueryOnMultiplePlacements(List *shardIntervalList,
SizeQueryType sizeQueryType,
bool optimizePartitionCalculations);
extern List * RemoveCoordinatorPlacementIfNotSingleNode(List *placementList);
extern ShardPlacement * ShardPlacementOnGroup(uint64 shardId, int groupId);
/* Function declarations to modify shard and shard placement data */
extern void InsertShardRow(Oid relationId, uint64 shardId, char storageType,

View File

@ -13,26 +13,31 @@
#define MULTI_PROGRESS_H
#include "postgres.h"
#include "fmgr.h"
#include "nodes/pg_list.h"
#include "storage/dsm.h"
typedef struct ProgressMonitorData
{
uint64 processId;
int stepCount;
void *steps;
} ProgressMonitorData;
extern ProgressMonitorData * CreateProgressMonitor(uint64 progressTypeMagicNumber,
int stepCount, Size stepSize,
Oid relationId);
extern ProgressMonitorData * CreateProgressMonitor(int stepCount, Size stepSize,
dsm_handle *dsmHandle);
extern void RegisterProgressMonitor(uint64 progressTypeMagicNumber,
Oid relationId,
dsm_handle dsmHandle);
extern ProgressMonitorData * GetCurrentProgressMonitor(void);
extern void FinalizeCurrentProgressMonitor(void);
extern List * ProgressMonitorList(uint64 commandTypeMagicNumber,
List **attachedDSMSegmentList);
extern void DetachFromDSMSegments(List *dsmSegmentList);
extern void * ProgressMonitorSteps(ProgressMonitorData *monitor);
#endif /* MULTI_PROGRESS_H */
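Taken together with the create_progress changes above, the reworked API separates allocating the monitor from publishing it under a magic number; a minimal usage sketch, assuming magicNumber, relationId and stepCount are supplied by the caller:
dsm_handle dsmHandle;
ProgressMonitorData *monitor = CreateProgressMonitor(stepCount, sizeof(uint64),
													 &dsmHandle);
if (monitor != NULL)
{
	/* fill in the step array before the monitor becomes visible to others */
	uint64 *steps = (uint64 *) ProgressMonitorSteps(monitor);
	steps[0] = 0;
}
RegisterProgressMonitor(magicNumber, relationId, dsmHandle);
/* ... perform the tracked work, updating the steps along the way ... */
FinalizeCurrentProgressMonitor();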

View File

@ -13,3 +13,4 @@
extern uint64 ShardListSizeInBytes(List *colocatedShardList,
char *workerNodeName, uint32 workerNodePort);
extern void ErrorIfMoveUnsupportedTableType(Oid relationId);

View File

@ -0,0 +1,379 @@
CREATE SCHEMA drop_column_partitioned_table;
SET search_path TO drop_column_partitioned_table;
SET citus.shard_replication_factor TO 1;
SET citus.next_shard_id TO 2580000;
-- create a partitioned table with some columns that
-- are going to be dropped within the tests
CREATE TABLE sensors(
col_to_drop_0 text,
col_to_drop_1 text,
col_to_drop_2 date,
col_to_drop_3 inet,
col_to_drop_4 date,
measureid integer,
eventdatetime date,
measure_data jsonb)
PARTITION BY RANGE(eventdatetime);
-- drop column even before attaching any partitions
ALTER TABLE sensors DROP COLUMN col_to_drop_1;
-- now attach the first partition and create the distributed table
CREATE TABLE sensors_2000 PARTITION OF sensors FOR VALUES FROM ('2000-01-01') TO ('2001-01-01');
SELECT create_distributed_table('sensors', 'measureid');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- prepared statements should work fine even after columns are dropped
PREPARE drop_col_prepare_insert(int, date, jsonb) AS INSERT INTO sensors (measureid, eventdatetime, measure_data) VALUES ($1, $2, $3);
PREPARE drop_col_prepare_select(int, date) AS SELECT count(*) FROM sensors WHERE measureid = $1 AND eventdatetime = $2;
-- execute 7 times to make sure it is cached
EXECUTE drop_col_prepare_insert(1, '2000-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-02', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-03', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-04', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-05', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-06', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-07', row_to_json(row(1)));
EXECUTE drop_col_prepare_select(1, '2000-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-02');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-03');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-04');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-05');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-06');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(1, '2000-10-07');
count
---------------------------------------------------------------------
1
(1 row)
-- drop another column before attaching another partition
-- with .. PARTITION OF .. syntax
ALTER TABLE sensors DROP COLUMN col_to_drop_0;
CREATE TABLE sensors_2001 PARTITION OF sensors FOR VALUES FROM ('2001-01-01') TO ('2002-01-01');
-- drop another column before attaching another partition
-- with ALTER TABLE .. ATTACH PARTITION
ALTER TABLE sensors DROP COLUMN col_to_drop_2;
CREATE TABLE sensors_2002(
col_to_drop_4 date, col_to_drop_3 inet, measureid integer, eventdatetime date, measure_data jsonb,
PRIMARY KEY (measureid, eventdatetime, measure_data));
ALTER TABLE sensors ATTACH PARTITION sensors_2002 FOR VALUES FROM ('2002-01-01') TO ('2003-01-01');
-- drop another column before attaching another partition
-- that is already distributed
ALTER TABLE sensors DROP COLUMN col_to_drop_3;
CREATE TABLE sensors_2003(
col_to_drop_4 date, measureid integer, eventdatetime date, measure_data jsonb,
PRIMARY KEY (measureid, eventdatetime, measure_data));
SELECT create_distributed_table('sensors_2003', 'measureid');
create_distributed_table
---------------------------------------------------------------------
(1 row)
ALTER TABLE sensors ATTACH PARTITION sensors_2003 FOR VALUES FROM ('2003-01-01') TO ('2004-01-01');
CREATE TABLE sensors_2004(
col_to_drop_4 date, measureid integer NOT NULL, eventdatetime date NOT NULL, measure_data jsonb NOT NULL);
ALTER TABLE sensors ATTACH PARTITION sensors_2004 FOR VALUES FROM ('2004-01-01') TO ('2005-01-01');
ALTER TABLE sensors DROP COLUMN col_to_drop_4;
-- show that all partitions have the same distribution key
SELECT
p.logicalrelid::regclass, column_to_column_name(p.logicalrelid, p.partkey)
FROM
pg_dist_partition p
WHERE
logicalrelid IN ('sensors'::regclass, 'sensors_2000'::regclass,
'sensors_2001'::regclass, 'sensors_2002'::regclass,
'sensors_2003'::regclass, 'sensors_2004'::regclass);
logicalrelid | column_to_column_name
---------------------------------------------------------------------
sensors | measureid
sensors_2000 | measureid
sensors_2001 | measureid
sensors_2002 | measureid
sensors_2003 | measureid
sensors_2004 | measureid
(6 rows)
-- show that all the tables prune to the same shard for the same distribution key
WITH
sensors_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors', 3)),
sensors_2000_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2000', 3)),
sensors_2001_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2001', 3)),
sensors_2002_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2002', 3)),
sensors_2003_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2003', 3)),
sensors_2004_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2004', 3)),
all_shardids AS (SELECT * FROM sensors_shardid UNION SELECT * FROM sensors_2000_shardid UNION
SELECT * FROM sensors_2001_shardid UNION SELECT * FROM sensors_2002_shardid
UNION SELECT * FROM sensors_2003_shardid UNION SELECT * FROM sensors_2004_shardid)
SELECT logicalrelid, shardid, shardminvalue, shardmaxvalue FROM pg_dist_shard WHERE shardid IN (SELECT * FROM all_shardids);
logicalrelid | shardid | shardminvalue | shardmaxvalue
---------------------------------------------------------------------
sensors | 2580001 | -1073741824 | -1
sensors_2000 | 2580005 | -1073741824 | -1
sensors_2001 | 2580009 | -1073741824 | -1
sensors_2002 | 2580013 | -1073741824 | -1
sensors_2003 | 2580017 | -1073741824 | -1
sensors_2004 | 2580021 | -1073741824 | -1
(6 rows)
VACUUM ANALYZE sensors, sensors_2000, sensors_2001, sensors_2002, sensors_2003;
-- show that both INSERT and SELECT can route to a single node when distribution
-- key is provided in the query
EXPLAIN (COSTS FALSE) INSERT INTO sensors VALUES (3, '2000-02-02', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2580001
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2000 VALUES (3, '2000-01-01', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2000_2580005
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2001 VALUES (3, '2001-01-01', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2001_2580009
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2002 VALUES (3, '2002-01-01', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2002_2580013
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2003 VALUES (3, '2003-01-01', row_to_json(row(1)));
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Insert on sensors_2003_2580017
-> Result
(7 rows)
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors WHERE measureid = 3 AND eventdatetime = '2000-02-02';
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Aggregate
-> Seq Scan on sensors_2000_2580005 sensors
Filter: ((measureid = 3) AND (eventdatetime = '2000-02-02'::date))
(8 rows)
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2000 WHERE measureid = 3;
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Aggregate
-> Seq Scan on sensors_2000_2580005 sensors_2000
Filter: (measureid = 3)
(8 rows)
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2001 WHERE measureid = 3;
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Aggregate
-> Seq Scan on sensors_2001_2580009 sensors_2001
Filter: (measureid = 3)
(8 rows)
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2002 WHERE measureid = 3;
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Aggregate
-> Bitmap Heap Scan on sensors_2002_2580013 sensors_2002
Recheck Cond: (measureid = 3)
-> Bitmap Index Scan on sensors_2002_pkey_2580013
Index Cond: (measureid = 3)
(10 rows)
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2003 WHERE measureid = 3;
QUERY PLAN
---------------------------------------------------------------------
Custom Scan (Citus Adaptive)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=xxxxx dbname=regression
-> Aggregate
-> Bitmap Heap Scan on sensors_2003_2580017 sensors_2003
Recheck Cond: (measureid = 3)
-> Bitmap Index Scan on sensors_2003_pkey_2580017
Index Cond: (measureid = 3)
(10 rows)
-- execute 7 times to make sure it is re-cached
EXECUTE drop_col_prepare_insert(3, '2000-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2001-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2002-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2003-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2003-10-02', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(4, '2003-10-03', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(5, '2003-10-04', row_to_json(row(1)));
EXECUTE drop_col_prepare_select(3, '2000-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(3, '2001-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(3, '2002-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(3, '2003-10-01');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(3, '2003-10-02');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(4, '2003-10-03');
count
---------------------------------------------------------------------
1
(1 row)
EXECUTE drop_col_prepare_select(5, '2003-10-04');
count
---------------------------------------------------------------------
1
(1 row)
-- non-fast-path router planner queries should also work,
-- so we switch to DEBUG2 to show the dist. key value
-- and that the query is router planned
SET client_min_messages TO DEBUG2;
SELECT count(*) FROM (
SELECT * FROM sensors WHERE measureid = 3
UNION
SELECT * FROM sensors_2000 WHERE measureid = 3
UNION
SELECT * FROM sensors_2001 WHERE measureid = 3
UNION
SELECT * FROM sensors_2002 WHERE measureid = 3
UNION
SELECT * FROM sensors_2003 WHERE measureid = 3
UNION
SELECT * FROM sensors_2004 WHERE measureid = 3
) as foo;
DEBUG: Creating router plan
DEBUG: query has a single distribution column value: 3
count
---------------------------------------------------------------------
5
(1 row)
RESET client_min_messages;
-- show that all partitions have the same distribution key
-- even after alter_distributed_table changes the shard count
-- remove this comment once https://github.com/citusdata/citus/issues/5137 is fixed
--SELECT alter_distributed_table('sensors', shard_count:='3');
SELECT
p.logicalrelid::regclass, column_to_column_name(p.logicalrelid, p.partkey)
FROM
pg_dist_partition p
WHERE
logicalrelid IN ('sensors'::regclass, 'sensors_2000'::regclass,
'sensors_2001'::regclass, 'sensors_2002'::regclass,
'sensors_2003'::regclass, 'sensors_2004'::regclass);
logicalrelid | column_to_column_name
---------------------------------------------------------------------
sensors | measureid
sensors_2000 | measureid
sensors_2001 | measureid
sensors_2002 | measureid
sensors_2003 | measureid
sensors_2004 | measureid
(6 rows)
SET client_min_messages TO WARNING;
DROP SCHEMA drop_column_partitioned_table CASCADE;

View File

@ -77,13 +77,13 @@ SELECT create_distributed_table('referencing_table', 'ref_id');
ALTER TABLE referencing_table ADD CONSTRAINT fkey_ref FOREIGN KEY(ref_id) REFERENCES referenced_table(id) ON UPDATE SET NULL;
ERROR: cannot create foreign key constraint
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DROP TABLE referencing_table;
BEGIN;
CREATE TABLE referencing_table(id int, ref_id int, FOREIGN KEY(ref_id) REFERENCES referenced_table(id) ON UPDATE SET NULL);
SELECT create_distributed_table('referencing_table', 'ref_id');
ERROR: cannot create foreign key constraint
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
ROLLBACK;
-- try with multiple columns including the distribution column
DROP TABLE referenced_table;
@ -103,12 +103,12 @@ SELECT create_distributed_table('referencing_table', 'ref_id');
ALTER TABLE referencing_table ADD CONSTRAINT fkey_ref FOREIGN KEY(id, ref_id) REFERENCES referenced_table(id, test_column) ON UPDATE SET DEFAULT;
ERROR: cannot create foreign key constraint
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DROP TABLE referencing_table;
CREATE TABLE referencing_table(id int, ref_id int, FOREIGN KEY(id, ref_id) REFERENCES referenced_table(id, test_column) ON UPDATE SET DEFAULT);
SELECT create_distributed_table('referencing_table', 'ref_id');
ERROR: cannot create foreign key constraint
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DROP TABLE referencing_table;
CREATE TABLE referencing_table(id int, ref_id int);
SELECT create_distributed_table('referencing_table', 'ref_id');
@ -119,13 +119,13 @@ SELECT create_distributed_table('referencing_table', 'ref_id');
ALTER TABLE referencing_table ADD CONSTRAINT fkey_ref FOREIGN KEY(id, ref_id) REFERENCES referenced_table(id, test_column) ON UPDATE CASCADE;
ERROR: cannot create foreign key constraint
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DROP TABLE referencing_table;
BEGIN;
CREATE TABLE referencing_table(id int, ref_id int, FOREIGN KEY(id, ref_id) REFERENCES referenced_table(id, test_column) ON UPDATE CASCADE);
SELECT create_distributed_table('referencing_table', 'ref_id');
ERROR: cannot create foreign key constraint
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
DETAIL: SET NULL, SET DEFAULT or CASCADE is not supported in ON UPDATE operation when distribution key included in the foreign constraint.
ROLLBACK;
-- all of the above is supported if the foreign key does not include the distribution column
DROP TABLE referenced_table;
@ -349,8 +349,9 @@ SELECT create_distributed_table('referencing_table', 'ref_id', 'append');
(1 row)
ALTER TABLE referencing_table ADD CONSTRAINT fkey_ref FOREIGN KEY (id) REFERENCES referenced_table(id);
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
ERROR: cannot create foreign key constraint
DETAIL: Citus currently supports foreign key constraints only for "citus.shard_replication_factor = 1".
HINT: Please change "citus.shard_replication_factor to 1". To learn more about using foreign keys with other replication factors, please contact us at https://citusdata.com/about/contact_us.
SELECT * FROM table_fkeys_in_workers WHERE name LIKE 'fkey_ref%' ORDER BY 1,2,3;
name | relid | refd_relid
---------------------------------------------------------------------
@ -365,8 +366,9 @@ SELECT create_distributed_table('referencing_table', 'ref_id', 'range');
(1 row)
ALTER TABLE referencing_table ADD CONSTRAINT fkey_ref FOREIGN KEY (id) REFERENCES referenced_table(id);
ERROR: cannot create foreign key constraint since relations are not colocated or not referencing a reference table
DETAIL: A distributed table can only have foreign keys if it is referencing another colocated hash distributed table or a reference table
ERROR: cannot create foreign key constraint
DETAIL: Citus currently supports foreign key constraints only for "citus.shard_replication_factor = 1".
HINT: Please change "citus.shard_replication_factor to 1". To learn more about using foreign keys with other replication factors, please contact us at https://citusdata.com/about/contact_us.
SELECT * FROM table_fkeys_in_workers WHERE name LIKE 'fkey_ref%' ORDER BY 1,2,3;
name | relid | refd_relid
---------------------------------------------------------------------

View File

@ -0,0 +1,355 @@
CREATE SCHEMA ignoring_orphaned_shards;
SET search_path TO ignoring_orphaned_shards;
-- Use a weird shard count that we don't use in any other tests
SET citus.shard_count TO 13;
SET citus.shard_replication_factor TO 1;
SET citus.next_shard_id TO 92448000;
CREATE TABLE ref(id int PRIMARY KEY);
SELECT * FROM create_reference_table('ref');
create_reference_table
---------------------------------------------------------------------
(1 row)
SET citus.next_shard_id TO 92448100;
ALTER SEQUENCE pg_catalog.pg_dist_colocationid_seq RESTART 92448100;
CREATE TABLE dist1(id int);
SELECT * FROM create_distributed_table('dist1', 'id');
create_distributed_table
---------------------------------------------------------------------
(1 row)
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448100 ORDER BY 1;
logicalrelid
---------------------------------------------------------------------
dist1
(1 row)
-- Move first shard, so that the first shard now has 2 placements. One that's
-- active and one that's orphaned.
SELECT citus_move_shard_placement(92448100, 'localhost', :worker_1_port, 'localhost', :worker_2_port, 'block_writes');
citus_move_shard_placement
---------------------------------------------------------------------
(1 row)
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448100 ORDER BY placementid;
shardid | shardstate | nodeport
---------------------------------------------------------------------
92448100 | 4 | 57637
92448100 | 1 | 57638
(2 rows)
-- Add a new table that should get colocated with dist1 automatically, but
-- should not get a shard for the orphaned placement.
SET citus.next_shard_id TO 92448200;
CREATE TABLE dist2(id int);
SELECT * FROM create_distributed_table('dist2', 'id');
create_distributed_table
---------------------------------------------------------------------
(1 row)
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448100 ORDER BY 1;
logicalrelid
---------------------------------------------------------------------
dist1
dist2
(2 rows)
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448200 ORDER BY placementid;
shardid | shardstate | nodeport
---------------------------------------------------------------------
92448200 | 1 | 57638
(1 row)
-- uncolocate it
SELECT update_distributed_table_colocation('dist2', 'none');
update_distributed_table_colocation
---------------------------------------------------------------------
(1 row)
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448100 ORDER BY 1;
logicalrelid
---------------------------------------------------------------------
dist1
(1 row)
-- Make sure we can add it back to the colocation, even though it has a
-- different number of shard placements for the first shard.
SELECT update_distributed_table_colocation('dist2', 'dist1');
update_distributed_table_colocation
---------------------------------------------------------------------
(1 row)
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448100 ORDER BY 1;
logicalrelid
---------------------------------------------------------------------
dist1
dist2
(2 rows)
-- Make sure that replication count check in FOR UPDATE ignores orphaned
-- shards.
SELECT * FROM dist1 WHERE id = 1 FOR UPDATE;
id
---------------------------------------------------------------------
(0 rows)
-- Make sure we don't send a query to the orphaned shard
BEGIN;
SET LOCAL citus.log_remote_commands TO ON;
INSERT INTO dist1 VALUES (1);
NOTICE: issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(xx, xx, 'xxxxxxx');
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing INSERT INTO ignoring_orphaned_shards.dist1_92448100 (id) VALUES (1)
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
ROLLBACK;
NOTICE: issuing ROLLBACK
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
-- Make sure we can create a foreign key on community edition, because
-- replication factor is 1
ALTER TABLE dist1
ADD CONSTRAINT dist1_ref_fk
FOREIGN KEY (id)
REFERENCES ref(id);
SET citus.shard_replication_factor TO 2;
SET citus.next_shard_id TO 92448300;
ALTER SEQUENCE pg_catalog.pg_dist_colocationid_seq RESTART 92448300;
CREATE TABLE rep1(id int);
SELECT * FROM create_distributed_table('rep1', 'id');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- Add the coordinator, so we can have a replicated shard
SELECT 1 FROM citus_add_node('localhost', :master_port, 0);
NOTICE: Replicating reference table "ref" to the node localhost:xxxxx
?column?
---------------------------------------------------------------------
1
(1 row)
SELECT 1 FROM citus_set_node_property('localhost', :master_port, 'shouldhaveshards', true);
?column?
---------------------------------------------------------------------
1
(1 row)
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448300 ORDER BY 1;
logicalrelid
---------------------------------------------------------------------
rep1
(1 row)
SELECT citus_move_shard_placement(92448300, 'localhost', :worker_1_port, 'localhost', :master_port);
citus_move_shard_placement
---------------------------------------------------------------------
(1 row)
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
shardid | shardstate | nodeport
---------------------------------------------------------------------
92448300 | 4 | 57637
92448300 | 1 | 57638
92448300 | 1 | 57636
(3 rows)
-- Add a new table that should get colocated with rep1 automatically, but
-- should not get a shard for the orphaned placement.
SET citus.next_shard_id TO 92448400;
CREATE TABLE rep2(id int);
SELECT * FROM create_distributed_table('rep2', 'id');
create_distributed_table
---------------------------------------------------------------------
(1 row)
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448300 ORDER BY 1;
logicalrelid
---------------------------------------------------------------------
rep1
rep2
(2 rows)
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448400 ORDER BY placementid;
shardid | shardstate | nodeport
---------------------------------------------------------------------
92448400 | 1 | 57636
92448400 | 1 | 57638
(2 rows)
-- uncolocate it
SELECT update_distributed_table_colocation('rep2', 'none');
update_distributed_table_colocation
---------------------------------------------------------------------
(1 row)
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448300 ORDER BY 1;
logicalrelid
---------------------------------------------------------------------
rep1
(1 row)
-- Make sure we can add it back to the colocation, even though it has a
-- different number of shard placements for the first shard.
SELECT update_distributed_table_colocation('rep2', 'rep1');
update_distributed_table_colocation
---------------------------------------------------------------------
(1 row)
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448300 ORDER BY 1;
logicalrelid
---------------------------------------------------------------------
rep1
rep2
(2 rows)
UPDATE pg_dist_placement SET shardstate = 3 WHERE shardid = 92448300 AND groupid = 0;
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
shardid | shardstate | nodeport
---------------------------------------------------------------------
92448300 | 4 | 57637
92448300 | 1 | 57638
92448300 | 3 | 57636
(3 rows)
-- cannot copy from an orphaned shard
SELECT * FROM citus_copy_shard_placement(92448300, 'localhost', :worker_1_port, 'localhost', :master_port);
ERROR: source placement must be in active state
-- cannot copy to an orphaned shard
SELECT * FROM citus_copy_shard_placement(92448300, 'localhost', :worker_2_port, 'localhost', :worker_1_port);
ERROR: target placement must be in inactive state
-- can still copy to an inactive shard
SELECT * FROM citus_copy_shard_placement(92448300, 'localhost', :worker_2_port, 'localhost', :master_port);
citus_copy_shard_placement
---------------------------------------------------------------------
(1 row)
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
shardid | shardstate | nodeport
---------------------------------------------------------------------
92448300 | 4 | 57637
92448300 | 1 | 57638
92448300 | 1 | 57636
(3 rows)
-- Make sure we don't send a query to the orphaned shard
BEGIN;
SET LOCAL citus.log_remote_commands TO ON;
SET LOCAL citus.log_local_commands TO ON;
INSERT INTO rep1 VALUES (1);
NOTICE: issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(xx, xx, 'xxxxxxx');
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing INSERT INTO ignoring_orphaned_shards.rep1_92448300 (id) VALUES (1)
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: executing the command locally: INSERT INTO ignoring_orphaned_shards.rep1_92448300 (id) VALUES (1)
ROLLBACK;
NOTICE: issuing ROLLBACK
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
-- Cause the orphaned shard to be local
SELECT 1 FROM citus_drain_node('localhost', :master_port);
NOTICE: Moving shard xxxxx from localhost:xxxxx to localhost:xxxxx ...
?column?
---------------------------------------------------------------------
1
(1 row)
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
shardid | shardstate | nodeport
---------------------------------------------------------------------
92448300 | 1 | 57638
92448300 | 4 | 57636
92448300 | 1 | 57637
(3 rows)
-- Make sure we don't send a query to the orphaned shard if it's local
BEGIN;
SET LOCAL citus.log_remote_commands TO ON;
SET LOCAL citus.log_local_commands TO ON;
INSERT INTO rep1 VALUES (1);
NOTICE: issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(xx, xx, 'xxxxxxx');
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing INSERT INTO ignoring_orphaned_shards.rep1_92448300 (id) VALUES (1)
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(xx, xx, 'xxxxxxx');
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing INSERT INTO ignoring_orphaned_shards.rep1_92448300 (id) VALUES (1)
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
ROLLBACK;
NOTICE: issuing ROLLBACK
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
NOTICE: issuing ROLLBACK
DETAIL: on server postgres@localhost:xxxxx connectionId: xxxxxxx
SET citus.shard_replication_factor TO 1;
SET citus.next_shard_id TO 92448500;
CREATE TABLE range1(id int);
SELECT create_distributed_table('range1', 'id', 'range');
create_distributed_table
---------------------------------------------------------------------
(1 row)
CALL public.create_range_partitioned_shards('range1', '{0,3}','{2,5}');
-- Move shard placement and clean it up
SELECT citus_move_shard_placement(92448500, 'localhost', :worker_2_port, 'localhost', :worker_1_port, 'block_writes');
citus_move_shard_placement
---------------------------------------------------------------------
(1 row)
CALL citus_cleanup_orphaned_shards();
NOTICE: cleaned up 3 orphaned shards
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
shardid | shardstate | nodeport
---------------------------------------------------------------------
92448300 | 1 | 57638
92448300 | 1 | 57637
(2 rows)
SET citus.next_shard_id TO 92448600;
CREATE TABLE range2(id int);
SELECT create_distributed_table('range2', 'id', 'range');
create_distributed_table
---------------------------------------------------------------------
(1 row)
CALL public.create_range_partitioned_shards('range2', '{0,3}','{2,5}');
-- Move shard placement and DON'T clean it up; now range1 and range2 are
-- colocated, but only range2 has an orphaned shard.
SELECT citus_move_shard_placement(92448600, 'localhost', :worker_2_port, 'localhost', :worker_1_port, 'block_writes');
citus_move_shard_placement
---------------------------------------------------------------------
(1 row)
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448600 ORDER BY placementid;
shardid | shardstate | nodeport
---------------------------------------------------------------------
92448600 | 4 | 57638
92448600 | 1 | 57637
(2 rows)
-- Make sure that tables are detected as colocated
SELECT * FROM range1 JOIN range2 ON range1.id = range2.id;
id | id
---------------------------------------------------------------------
(0 rows)
-- Make sure we can create a foreign key on community edition, because
-- replication factor is 1
ALTER TABLE range1
ADD CONSTRAINT range1_ref_fk
FOREIGN KEY (id)
REFERENCES ref(id);
SET client_min_messages TO WARNING;
DROP SCHEMA ignoring_orphaned_shards CASCADE;

View File

@ -65,8 +65,8 @@ step s3-progress:
table_name shardid shard_size sourcename sourceport source_shard_size targetname targetport target_shard_size progress
colocated1 1500001 49152 localhost 57637 49152 localhost 57638 49152 2
colocated2 1500005 376832 localhost 57637 376832 localhost 57638 376832 2
colocated1 1500001 73728 localhost 57637 49152 localhost 57638 73728 2
colocated2 1500005 401408 localhost 57637 376832 localhost 57638 401408 2
colocated1 1500002 196608 localhost 57637 196608 localhost 57638 0 1
colocated2 1500006 8192 localhost 57637 8192 localhost 57638 0 1
step s2-unlock-2:

View File

@ -0,0 +1,334 @@
CREATE SCHEMA local_shard_execution_dropped_column;
SET search_path TO local_shard_execution_dropped_column;
SET citus.next_shard_id TO 2460000;
-- the scenario is described on https://github.com/citusdata/citus/issues/5038
-- first stop the metadata syncing to the nodes so that the drop column
-- is not propagated
SELECT stop_metadata_sync_to_node('localhost',:worker_1_port);
stop_metadata_sync_to_node
---------------------------------------------------------------------
(1 row)
SELECT stop_metadata_sync_to_node('localhost',:worker_2_port);
stop_metadata_sync_to_node
---------------------------------------------------------------------
(1 row)
-- create a distributed table, drop a column and sync the metadata
SET citus.shard_replication_factor TO 1;
CREATE TABLE t1 (a int, b int, c int UNIQUE);
SELECT create_distributed_table('t1', 'c');
create_distributed_table
---------------------------------------------------------------------
(1 row)
ALTER TABLE t1 DROP COLUMN b;
SELECT start_metadata_sync_to_node('localhost',:worker_1_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
(1 row)
SELECT start_metadata_sync_to_node('localhost',:worker_2_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
(1 row)
\c - - - :worker_1_port
SET search_path TO local_shard_execution_dropped_column;
-- show the dropped columns
SELECT attrelid::regclass, attname, attnum, attisdropped
FROM pg_attribute WHERE attrelid IN ('t1'::regclass, 't1_2460000'::regclass) and attname NOT IN ('tableoid','cmax', 'xmax', 'cmin', 'xmin', 'ctid')
ORDER BY 1, 3, 2, 4;
attrelid | attname | attnum | attisdropped
---------------------------------------------------------------------
t1_2460000 | a | 1 | f
t1_2460000 | ........pg.dropped.2........ | 2 | t
t1_2460000 | c | 3 | f
t1 | a | 1 | f
t1 | c | 2 | f
(5 rows)
-- connect to a worker node where local execution is done
prepare p1(int) as insert into t1(a,c) VALUES (5,$1) ON CONFLICT (c) DO NOTHING;
SET citus.log_remote_commands TO ON;
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5, 8) ON CONFLICT(c) DO NOTHING
prepare p2(int) as SELECT count(*) FROM t1 WHERE c = $1 GROUP BY c;
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
prepare p3(int) as INSERT INTO t1(a,c) VALUES (5, $1), (6, $1), (7, $1),(5, $1), (6, $1), (7, $1) ON CONFLICT DO NOTHING;
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (a, c) VALUES (5,8), (6,8), (7,8), (5,8), (6,8), (7,8) ON CONFLICT DO NOTHING
prepare p4(int) as UPDATE t1 SET a = a + 1 WHERE c = $1;
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
execute p4(8);
NOTICE: executing the command locally: UPDATE local_shard_execution_dropped_column.t1_2460000 t1 SET a = (a OPERATOR(pg_catalog.+) 1) WHERE (c OPERATOR(pg_catalog.=) 8)
\c - - - :master_port
-- another combination is that the shell table
-- has a dropped column but the shard does not, via a rebalance operation
SET search_path TO local_shard_execution_dropped_column;
ALTER TABLE t1 DROP COLUMN a;
SELECT citus_move_shard_placement(2460000, 'localhost', :worker_1_port, 'localhost', :worker_2_port, 'block_writes');
citus_move_shard_placement
---------------------------------------------------------------------
(1 row)
\c - - - :worker_2_port
SET search_path TO local_shard_execution_dropped_column;
-- show the dropped columns
SELECT attrelid::regclass, attname, attnum, attisdropped
FROM pg_attribute WHERE attrelid IN ('t1'::regclass, 't1_2460000'::regclass) and attname NOT IN ('tableoid','cmax', 'xmax', 'cmin', 'xmin', 'ctid')
ORDER BY 1, 3, 2, 4;
attrelid | attname | attnum | attisdropped
---------------------------------------------------------------------
t1 | ........pg.dropped.1........ | 1 | t
t1 | c | 2 | f
t1_2460000 | c | 1 | f
(3 rows)
prepare p1(int) as insert into t1(c) VALUES ($1) ON CONFLICT (c) DO NOTHING;
SET citus.log_remote_commands TO ON;
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
execute p1(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8) ON CONFLICT(c) DO NOTHING
prepare p2(int) as SELECT count(*) FROM t1 WHERE c = $1 GROUP BY c;
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
execute p2(8);
NOTICE: executing the command locally: SELECT count(*) AS count FROM local_shard_execution_dropped_column.t1_2460000 t1 WHERE (c OPERATOR(pg_catalog.=) 8) GROUP BY c
count
---------------------------------------------------------------------
1
(1 row)
prepare p3(int) as INSERT INTO t1(c) VALUES ($1),($1),($1),($1),($1),($1),($1),($1),($1),($1),($1),($1),($1),($1) ON CONFLICT DO NOTHING;
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
execute p3(8);
NOTICE: executing the command locally: INSERT INTO local_shard_execution_dropped_column.t1_2460000 AS citus_table_alias (c) VALUES (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8), (8) ON CONFLICT DO NOTHING
\c - - - :master_port
DROP SCHEMA local_shard_execution_dropped_column CASCADE;
NOTICE: drop cascades to table local_shard_execution_dropped_column.t1
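The attnum divergence verified above comes down to a plain PostgreSQL behavior: a dropped column keeps occupying its attnum slot in the table that dropped it, while a table created afterwards never had that slot. A minimal standalone sketch, outside the regression suite and with made-up table names:
CREATE TABLE shell_like(a int, c int);
ALTER TABLE shell_like DROP COLUMN a;  -- leaves a "........pg.dropped.1........" slot at attnum 1
CREATE TABLE shard_like(c int);        -- created after the drop, so c sits at attnum 1
SELECT attrelid::regclass, attname, attnum, attisdropped
FROM pg_attribute
WHERE attrelid IN ('shell_like'::regclass, 'shard_like'::regclass) AND attnum > 0
ORDER BY 1, 3;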

View File

@ -16,6 +16,7 @@ SELECT substring(:'server_version', '\d+')::int > 11 AS version_above_eleven;
(1 row)
SET citus.next_shard_id TO 580000;
CREATE SCHEMA multi_extension;
SELECT $definition$
CREATE OR REPLACE FUNCTION test.maintenance_worker()
RETURNS pg_stat_activity
@ -42,13 +43,14 @@ END;
$$;
$definition$ create_function_test_maintenance_worker
\gset
CREATE TABLE prev_objects(description text);
CREATE TABLE extension_diff(previous_object text COLLATE "C",
CREATE TABLE multi_extension.prev_objects(description text);
CREATE TABLE multi_extension.extension_diff(previous_object text COLLATE "C",
current_object text COLLATE "C");
CREATE FUNCTION print_extension_changes()
CREATE FUNCTION multi_extension.print_extension_changes()
RETURNS TABLE(previous_object text, current_object text)
AS $func$
BEGIN
SET LOCAL search_path TO multi_extension;
TRUNCATE TABLE extension_diff;
CREATE TABLE current_objects AS
@ -130,7 +132,7 @@ ALTER EXTENSION citus UPDATE TO '9.1-1';
ALTER EXTENSION citus UPDATE TO '9.2-1';
ALTER EXTENSION citus UPDATE TO '9.2-2';
-- Snapshot of state at 9.2-2
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
| event trigger citus_cascade_to_partition
@ -327,7 +329,7 @@ SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.2-4';
ALTER EXTENSION citus UPDATE TO '9.2-2';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
@ -342,7 +344,7 @@ ALTER EXTENSION citus UPDATE TO '9.3-1';
ERROR: extension "citus" has no update path from version "9.2-2" to version "9.3-1"
ALTER EXTENSION citus UPDATE TO '9.2-4';
-- Snapshot of state at 9.2-4
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
@ -351,14 +353,14 @@ SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.3-2';
ALTER EXTENSION citus UPDATE TO '9.2-4';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.3-2
ALTER EXTENSION citus UPDATE TO '9.3-2';
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
function citus_extradata_container(internal) void |
@ -374,20 +376,119 @@ SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.4-1';
ALTER EXTENSION citus UPDATE TO '9.3-2';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
| function worker_last_saved_explain_analyze() TABLE(explain_analyze_output text, execution_duration double precision)
| function worker_save_query_explain_analyze(text,jsonb) SETOF record
(2 rows)
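The snapshots printed by print_extension_changes() are diffs of the set of objects owned by the citus extension between two points in the upgrade chain. Independently of that helper, the raw membership list can be pulled from the standard catalogs; a sketch using pg_depend, not the test's exact implementation:
SELECT pg_catalog.pg_describe_object(classid, objid, 0) AS description
FROM pg_catalog.pg_depend
WHERE refclassid = 'pg_catalog.pg_extension'::pg_catalog.regclass
  AND refobjid = (SELECT oid FROM pg_catalog.pg_extension WHERE extname = 'citus')
  AND deptype = 'e'
ORDER BY 1;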
-- Test upgrade paths for backported citus_pg_upgrade functions
ALTER EXTENSION citus UPDATE TO '9.4-2';
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- Should be empty result: even though the downgrade doesn't undo the upgrade,
-- the function signature doesn't change, which is reflected here.
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
ALTER EXTENSION citus UPDATE TO '9.4-2';
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Test upgrade paths for backported improvement of master_update_table_statistics function
ALTER EXTENSION citus UPDATE TO '9.4-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
citus_update_table_statistics
(1 row)
ALTER EXTENSION citus UPDATE TO '9.4-2';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
+
DECLARE +
colocated_tables regclass[]; +
BEGIN +
SELECT get_colocated_table_array(relation) INTO colocated_tables;+
PERFORM +
master_update_shard_statistics(shardid) +
FROM +
pg_dist_shard +
WHERE +
logicalrelid = ANY (colocated_tables); +
END; +
(1 row)
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
ALTER EXTENSION citus UPDATE TO '9.4-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
citus_update_table_statistics
(1 row)
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
+
DECLARE +
colocated_tables regclass[]; +
BEGIN +
SELECT get_colocated_table_array(relation) INTO colocated_tables;+
PERFORM +
master_update_shard_statistics(shardid) +
FROM +
pg_dist_shard +
WHERE +
logicalrelid = ANY (colocated_tables); +
END; +
(1 row)
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Test downgrade to 9.4-1 from 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
BEGIN;
@ -414,14 +515,14 @@ ROLLBACK;
-- now we can downgrade as there is no citus local table
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
function master_drop_sequences(text[]) void |
@ -436,18 +537,119 @@ SELECT * FROM print_extension_changes();
| function worker_record_sequence_dependency(regclass,regclass,name) void
(10 rows)
-- Test downgrade to 9.5-1 from 10.0-1
ALTER EXTENSION citus UPDATE TO '10.0-1';
-- Test upgrade paths for backported citus_pg_upgrade functions
ALTER EXTENSION citus UPDATE TO '9.5-2';
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
-- Should be empty result: even though the downgrade doesn't undo the upgrade,
-- the function signature doesn't change, which is reflected here.
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 10.0-1
ALTER EXTENSION citus UPDATE TO '10.0-1';
SELECT * FROM print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.5-2';
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Test upgrade paths for backported improvement of master_update_table_statistics function
ALTER EXTENSION citus UPDATE TO '9.5-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
citus_update_table_statistics
(1 row)
ALTER EXTENSION citus UPDATE TO '9.5-2';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
+
DECLARE +
colocated_tables regclass[]; +
BEGIN +
SELECT get_colocated_table_array(relation) INTO colocated_tables;+
PERFORM +
master_update_shard_statistics(shardid) +
FROM +
pg_dist_shard +
WHERE +
logicalrelid = ANY (colocated_tables); +
END; +
(1 row)
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
ALTER EXTENSION citus UPDATE TO '9.5-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
citus_update_table_statistics
(1 row)
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
prosrc
---------------------------------------------------------------------
+
DECLARE +
colocated_tables regclass[]; +
BEGIN +
SELECT get_colocated_table_array(relation) INTO colocated_tables;+
PERFORM +
master_update_shard_statistics(shardid) +
FROM +
pg_dist_shard +
WHERE +
logicalrelid = ANY (colocated_tables); +
END; +
(1 row)
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- We removed the upgrade paths to 10.0-1, 10.0-2 and 10.0-3 due to a bug that blocked
-- upgrades to 10.0. Therefore we test upgrades to 10.0-4 instead
-- Test downgrade to 9.5-1 from 10.0-4
ALTER EXTENSION citus UPDATE TO '10.0-4';
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 10.0-4
ALTER EXTENSION citus UPDATE TO '10.0-4';
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
function citus_total_relation_size(regclass) bigint |
@ -488,6 +690,7 @@ SELECT * FROM print_extension_changes();
| function citus_dist_shard_cache_invalidate() trigger
| function citus_drain_node(text,integer,citus.shard_transfer_mode,name) void
| function citus_drop_all_shards(regclass,text,text) integer
| function citus_get_active_worker_nodes() SETOF record
| function citus_internal.columnar_ensure_objects_exist() void
| function citus_move_shard_placement(bigint,text,integer,text,integer,citus.shard_transfer_mode) void
| function citus_remove_node(text,integer) void
@ -515,55 +718,50 @@ SELECT * FROM print_extension_changes();
| table columnar.options
| table columnar.stripe
| view citus_shards
| view citus_tables
| view public.citus_tables
| view time_partitions
(67 rows)
(68 rows)
-- Test downgrade to 10.0-1 from 10.0-2
ALTER EXTENSION citus UPDATE TO '10.0-2';
ALTER EXTENSION citus UPDATE TO '10.0-1';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
previous_object | current_object
-- check that we depend on the existence of the public schema, and we cannot drop it now
DROP SCHEMA public;
ERROR: cannot drop schema public because other objects depend on it
DETAIL: extension citus depends on schema public
HINT: Use DROP ... CASCADE to drop the dependent objects too.
-- verify that the citus_tables view is in pg_catalog if the public schema is absent.
ALTER EXTENSION citus UPDATE TO '9.5-1';
DROP SCHEMA public;
ALTER EXTENSION citus UPDATE TO '10.0-4';
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
view public.citus_tables |
| view citus_tables
(2 rows)
-- Snapshot of state at 10.0-2
ALTER EXTENSION citus UPDATE TO '10.0-2';
SELECT * FROM print_extension_changes();
previous_object | current_object
-- recreate the public schema, and recreate citus_tables in the public schema by default
CREATE SCHEMA public;
GRANT ALL ON SCHEMA public TO public;
ALTER EXTENSION citus UPDATE TO '9.5-1';
ALTER EXTENSION citus UPDATE TO '10.0-4';
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
view citus_tables |
| view public.citus_tables
(2 rows)
-- Test downgrade to 10.0-2 from 10.0-3
ALTER EXTENSION citus UPDATE TO '10.0-3';
ALTER EXTENSION citus UPDATE TO '10.0-2';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 10.0-3
ALTER EXTENSION citus UPDATE TO '10.0-3';
SELECT * FROM print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
| function citus_get_active_worker_nodes() SETOF record
(1 row)
-- Test downgrade to 10.0-3 from 10.1-1
-- Test downgrade to 10.0-4 from 10.1-1
ALTER EXTENSION citus UPDATE TO '10.1-1';
ALTER EXTENSION citus UPDATE TO '10.0-3';
ALTER EXTENSION citus UPDATE TO '10.0-4';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
(0 rows)
-- Snapshot of state at 10.1-1
ALTER EXTENSION citus UPDATE TO '10.1-1';
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
previous_object | current_object
---------------------------------------------------------------------
function citus_add_rebalance_strategy(name,regproc,regproc,regproc,real,real) void |
@ -583,12 +781,12 @@ SELECT * FROM print_extension_changes();
| function worker_partitioned_table_size(regclass) bigint
(15 rows)
DROP TABLE prev_objects, extension_diff;
DROP TABLE multi_extension.prev_objects, multi_extension.extension_diff;
-- show running version
SHOW citus.version;
citus.version
---------------------------------------------------------------------
10.1devel
10.1.2
(1 row)
-- ensure no unexpected objects were created outside pg_catalog
@ -926,3 +1124,7 @@ SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Maintenanc
(1 row)
DROP TABLE version_mismatch_table;
DROP SCHEMA multi_extension;
ERROR: cannot drop schema multi_extension because other objects depend on it
DETAIL: function multi_extension.print_extension_changes() depends on schema multi_extension
HINT: Use DROP ... CASCADE to drop the dependent objects too.
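As a side note, the long chain of ALTER EXTENSION citus UPDATE TO steps exercised in this file can be cross-checked on any installation from the standard extension catalogs; a small sketch:
SELECT extversion FROM pg_extension WHERE extname = 'citus';  -- installed version
SELECT source, target, path
FROM pg_extension_update_paths('citus')
WHERE path IS NOT NULL
ORDER BY source, target;                                      -- upgrade/downgrade paths the packaged scripts allow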

View File

@ -38,14 +38,48 @@ DEBUG: Router planner does not support append-partitioned tables.
-- Partition pruning left three shards for the lineitem and one shard for the
-- orders table. These shard sets don't overlap, so join pruning should prune
-- out all the shards, and leave us with an empty task list.
select * from pg_dist_shard
where logicalrelid='lineitem'::regclass or
logicalrelid='orders'::regclass
order by shardid;
logicalrelid | shardid | shardstorage | shardminvalue | shardmaxvalue
---------------------------------------------------------------------
lineitem | 290000 | t | 1 | 5986
lineitem | 290001 | t | 8997 | 14947
orders | 290002 | t | 1 | 5986
orders | 290003 | t | 8997 | 14947
(4 rows)
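The "join prunable for intervals ..." DEBUG lines further below follow directly from these shard intervals: two shards can only produce join results if their [shardminvalue, shardmaxvalue] ranges overlap. A hedged restatement of that check as a catalog query, assuming integer-valued min/max as in this append-distributed test:
SELECT l.shardid AS lineitem_shard, o.shardid AS orders_shard,
       NOT (l.shardminvalue::bigint <= o.shardmaxvalue::bigint
            AND o.shardminvalue::bigint <= l.shardmaxvalue::bigint) AS join_prunable
FROM pg_dist_shard l, pg_dist_shard o
WHERE l.logicalrelid = 'lineitem'::regclass
  AND o.logicalrelid = 'orders'::regclass
ORDER BY 1, 2;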
set citus.explain_distributed_queries to on;
-- explain the query before actually executing it
EXPLAIN SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND l_orderkey > 6000 AND o_orderkey < 6000;
DEBUG: Router planner does not support append-partitioned tables.
DEBUG: join prunable for intervals [8997,14947] and [1,5986]
QUERY PLAN
---------------------------------------------------------------------
Aggregate (cost=750.01..750.02 rows=1 width=40)
-> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=100000 width=24)
Task Count: 0
Tasks Shown: All
(4 rows)
set citus.explain_distributed_queries to off;
set client_min_messages to debug3;
SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND l_orderkey > 6000 AND o_orderkey < 6000;
DEBUG: Router planner does not support append-partitioned tables.
DEBUG: constraint (gt) value: '6000'::bigint
DEBUG: shard count after pruning for lineitem: 1
DEBUG: constraint (lt) value: '6000'::bigint
DEBUG: shard count after pruning for orders: 1
DEBUG: join prunable for intervals [8997,14947] and [1,5986]
sum | avg
---------------------------------------------------------------------
|
(1 row)
set client_min_messages to debug2;
-- Make sure that we can handle filters without a column
SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND false;

View File

@ -84,7 +84,7 @@ SELECT prune_using_both_values('pruning', 'tomato', 'rose');
-- unit test of the equality expression generation code
SELECT debug_equality_expression('pruning');
debug_equality_expression
debug_equality_expression
---------------------------------------------------------------------
{OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 100 :args ({VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varcollid 100 :varlevelsup 0 :varnoold 1 :varoattno 1 :location -1} {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull true :location -1 :constvalue <>}) :location -1}
(1 row)
@ -543,6 +543,154 @@ SELECT * FROM numeric_test WHERE id = 21.1::numeric;
21.1 | 87
(1 row)
CREATE TABLE range_dist_table_1 (dist_col BIGINT);
SELECT create_distributed_table('range_dist_table_1', 'dist_col', 'range');
create_distributed_table
---------------------------------------------------------------------
(1 row)
CALL public.create_range_partitioned_shards('range_dist_table_1', '{1000,3000,6000}', '{2000,4000,7000}');
INSERT INTO range_dist_table_1 VALUES (1001);
INSERT INTO range_dist_table_1 VALUES (3800);
INSERT INTO range_dist_table_1 VALUES (6500);
-- all were returning false before fixing #5077
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col >= 2999;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 2999;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col >= 2500;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 2000;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 1001;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col >= 1001;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col > 1000;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col >= 1000;
?column?
---------------------------------------------------------------------
t
(1 row)
-- we didn't have such an off-by-one error in upper bound
-- calculation, but let's test such cases too
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col <= 4001;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col < 4001;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col <= 4500;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col < 6000;
?column?
---------------------------------------------------------------------
t
(1 row)
-- now test with composite type and more shards
CREATE TYPE comp_type AS (
int_field_1 BIGINT,
int_field_2 BIGINT
);
CREATE TYPE comp_type_range AS RANGE (
subtype = comp_type);
CREATE TABLE range_dist_table_2 (dist_col comp_type);
SELECT create_distributed_table('range_dist_table_2', 'dist_col', 'range');
create_distributed_table
---------------------------------------------------------------------
(1 row)
CALL public.create_range_partitioned_shards(
'range_dist_table_2',
'{"(10,24)","(10,58)",
"(10,90)","(20,100)"}',
'{"(10,25)","(10,65)",
"(10,99)","(20,100)"}');
INSERT INTO range_dist_table_2 VALUES ((10, 24));
INSERT INTO range_dist_table_2 VALUES ((10, 60));
INSERT INTO range_dist_table_2 VALUES ((10, 91));
INSERT INTO range_dist_table_2 VALUES ((20, 100));
SELECT dist_col='(10, 60)'::comp_type FROM range_dist_table_2
WHERE dist_col >= '(10,26)'::comp_type AND
dist_col <= '(10,75)'::comp_type;
?column?
---------------------------------------------------------------------
t
(1 row)
SELECT * FROM range_dist_table_2
WHERE dist_col >= '(10,57)'::comp_type AND
dist_col <= '(10,95)'::comp_type
ORDER BY dist_col;
dist_col
---------------------------------------------------------------------
(10,60)
(10,91)
(2 rows)
SELECT * FROM range_dist_table_2
WHERE dist_col >= '(10,57)'::comp_type
ORDER BY dist_col;
dist_col
---------------------------------------------------------------------
(10,60)
(10,91)
(20,100)
(3 rows)
SELECT dist_col='(20,100)'::comp_type FROM range_dist_table_2
WHERE dist_col > '(20,99)'::comp_type;
?column?
---------------------------------------------------------------------
t
(1 row)
DROP TABLE range_dist_table_1, range_dist_table_2;
DROP TYPE comp_type CASCADE;
NOTICE: drop cascades to type comp_type_range
SET search_path TO public;
DROP SCHEMA prune_shard_list CASCADE;
NOTICE: drop cascades to 10 other objects
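The composite-type pruning above leans on PostgreSQL's ordinary row-comparison semantics, where fields are compared left to right; a throwaway sketch with an illustrative type name:
CREATE TYPE pair_demo AS (f1 bigint, f2 bigint);
SELECT '(10,60)'::pair_demo >= '(10,26)'::pair_demo AS above_lower_bound,
       '(10,60)'::pair_demo <= '(10,75)'::pair_demo AS below_upper_bound,
       '(20,100)'::pair_demo > '(20,99)'::pair_demo AS strictly_greater;
DROP TYPE pair_demo;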

View File

@ -6,6 +6,8 @@
SET citus.next_shard_id TO 890000;
SET citus.shard_count TO 4;
SET citus.shard_replication_factor TO 1;
CREATE SCHEMA sequence_default;
SET search_path = sequence_default, public;
-- Cannot add a column involving DEFAULT nextval('..') because the table is not empty
CREATE SEQUENCE seq_0;
CREATE TABLE seq_test_0 (x int, y int);
@ -38,7 +40,7 @@ SELECT * FROM seq_test_0 ORDER BY 1, 2 LIMIT 5;
(5 rows)
\d seq_test_0
Table "public.seq_test_0"
Table "sequence_default.seq_test_0"
Column | Type | Collation | Nullable | Default
---------------------------------------------------------------------
x | integer | | |
@ -83,7 +85,7 @@ CREATE SEQUENCE seq_4;
ALTER TABLE seq_test_4 ADD COLUMN b int DEFAULT nextval('seq_4');
-- on worker it should generate high sequence number
\c - - - :worker_1_port
INSERT INTO seq_test_4 VALUES (1,2) RETURNING *;
INSERT INTO sequence_default.seq_test_4 VALUES (1,2) RETURNING *;
x | y | a | b
---------------------------------------------------------------------
1 | 2 | | 268435457
@ -91,6 +93,7 @@ INSERT INTO seq_test_4 VALUES (1,2) RETURNING *;
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
@ -101,7 +104,7 @@ SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
CREATE SEQUENCE seq_1;
-- type is bigint by default
\d seq_1
Sequence "public.seq_1"
Sequence "sequence_default.seq_1"
Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
---------------------------------------------------------------------
bigint | 1 | 1 | 9223372036854775807 | 1 | no | 1
@ -116,14 +119,14 @@ SELECT create_distributed_table('seq_test_1','x');
ALTER TABLE seq_test_1 ADD COLUMN z int DEFAULT nextval('seq_1');
-- type is changed to int
\d seq_1
Sequence "public.seq_1"
Sequence "sequence_default.seq_1"
Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
---------------------------------------------------------------------
integer | 1 | 1 | 2147483647 | 1 | no | 1
-- check insertion is within int bounds in the worker
\c - - - :worker_1_port
INSERT INTO seq_test_1 values (1, 2) RETURNING *;
INSERT INTO sequence_default.seq_test_1 values (1, 2) RETURNING *;
x | y | z
---------------------------------------------------------------------
1 | 2 | 268435457
@ -131,6 +134,7 @@ INSERT INTO seq_test_1 values (1, 2) RETURNING *;
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
@ -163,7 +167,7 @@ SELECT create_distributed_table('seq_test_2','x');
CREATE TABLE seq_test_2_0(x int, y smallint DEFAULT nextval('seq_2'));
-- shouldn't work
SELECT create_distributed_table('seq_test_2_0','x');
ERROR: The sequence public.seq_2 is already used for a different type in column 2 of the table public.seq_test_2
ERROR: The sequence sequence_default.seq_2 is already used for a different type in column 2 of the table sequence_default.seq_test_2
DROP TABLE seq_test_2;
DROP TABLE seq_test_2_0;
-- should work
@ -178,19 +182,20 @@ DROP TABLE seq_test_2;
CREATE TABLE seq_test_2 (x int, y int DEFAULT nextval('seq_2'), z bigint DEFAULT nextval('seq_2'));
-- shouldn't work
SELECT create_distributed_table('seq_test_2','x');
ERROR: The sequence public.seq_2 is already used for a different type in column 3 of the table public.seq_test_2
ERROR: The sequence sequence_default.seq_2 is already used for a different type in column 3 of the table sequence_default.seq_test_2
-- check rename is propagated properly
ALTER SEQUENCE seq_2 RENAME TO sequence_2;
-- check in the worker
\c - - - :worker_1_port
\d sequence_2
Sequence "public.sequence_2"
\d sequence_default.sequence_2
Sequence "sequence_default.sequence_2"
Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
---------------------------------------------------------------------
bigint | 281474976710657 | 281474976710657 | 562949953421313 | 1 | no | 1
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
@ -200,9 +205,8 @@ SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
-- check rename with another schema
-- we notice that schema is also propagated as one of the sequence's dependencies
CREATE SCHEMA sequence_default_0;
SET search_path TO public, sequence_default_0;
CREATE SEQUENCE sequence_default_0.seq_3;
CREATE TABLE seq_test_3 (x int, y bigint DEFAULT nextval('seq_3'));
CREATE TABLE seq_test_3 (x int, y bigint DEFAULT nextval('sequence_default_0.seq_3'));
SELECT create_distributed_table('seq_test_3', 'x');
create_distributed_table
---------------------------------------------------------------------
@ -220,6 +224,7 @@ ALTER SEQUENCE sequence_default_0.seq_3 RENAME TO sequence_3;
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
@ -253,7 +258,7 @@ INSERT INTO seq_test_5 VALUES (1, 2) RETURNING *;
-- but is still present on worker
\c - - - :worker_1_port
INSERT INTO seq_test_5 VALUES (1, 2) RETURNING *;
INSERT INTO sequence_default.seq_test_5 VALUES (1, 2) RETURNING *;
x | y | a
---------------------------------------------------------------------
1 | 2 | 268435457
@ -261,6 +266,7 @@ INSERT INTO seq_test_5 VALUES (1, 2) RETURNING *;
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
@ -277,7 +283,7 @@ SELECT run_command_on_workers('DROP SCHEMA sequence_default_1 CASCADE');
-- now the sequence is gone from the worker as well
\c - - - :worker_1_port
INSERT INTO seq_test_5 VALUES (1, 2) RETURNING *;
INSERT INTO sequence_default.seq_test_5 VALUES (1, 2) RETURNING *;
x | y | a
---------------------------------------------------------------------
1 | 2 |
@ -285,6 +291,7 @@ INSERT INTO seq_test_5 VALUES (1, 2) RETURNING *;
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
@ -319,20 +326,21 @@ CREATE TABLE seq_test_7_par (x text, s bigint DEFAULT nextval('seq_7_par'), t ti
ALTER TABLE seq_test_7 ATTACH PARTITION seq_test_7_par FOR VALUES FROM ('2021-05-31') TO ('2021-06-01');
-- check that both sequences are in worker
\c - - - :worker_1_port
\d seq_7
Sequence "public.seq_7"
\d sequence_default.seq_7
Sequence "sequence_default.seq_7"
Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
---------------------------------------------------------------------
bigint | 281474976710657 | 281474976710657 | 562949953421313 | 1 | no | 1
\d seq_7_par
Sequence "public.seq_7_par"
\d sequence_default.seq_7_par
Sequence "sequence_default.seq_7_par"
Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
---------------------------------------------------------------------
bigint | 281474976710657 | 281474976710657 | 562949953421313 | 1 | no | 1
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
@ -345,7 +353,7 @@ CREATE SEQUENCE seq_8;
CREATE SCHEMA sequence_default_8;
-- can change schema in a sequence not yet distributed
ALTER SEQUENCE seq_8 SET SCHEMA sequence_default_8;
ALTER SEQUENCE sequence_default_8.seq_8 SET SCHEMA public;
ALTER SEQUENCE sequence_default_8.seq_8 SET SCHEMA sequence_default;
CREATE TABLE seq_test_8 (x int, y int DEFAULT nextval('seq_8'));
SELECT create_distributed_table('seq_test_8', 'x');
create_distributed_table
@ -378,12 +386,81 @@ CREATE SEQUENCE seq_10;
CREATE TABLE seq_test_9 (x int, y int DEFAULT nextval('seq_9') - nextval('seq_10'));
SELECT create_distributed_table('seq_test_9', 'x');
ERROR: More than one sequence in a column default is not supported for distribution
-- clean up
DROP TABLE seq_test_0, seq_test_1, seq_test_2, seq_test_3, seq_test_4, seq_test_5, seq_test_6, seq_test_7, seq_test_8, seq_test_9;
DROP SEQUENCE seq_0, seq_1, sequence_2, seq_4, seq_6, seq_7, seq_7_par, seq_8, seq_9, seq_10;
-- Check some cases when default is defined by
-- DEFAULT nextval('seq_name'::text) (not by DEFAULT nextval('seq_name'))
SELECT stop_metadata_sync_to_node('localhost', :worker_1_port);
stop_metadata_sync_to_node
---------------------------------------------------------------------
(1 row)
CREATE SEQUENCE seq_11;
CREATE TABLE seq_test_10 (col0 int, col1 int DEFAULT nextval('seq_11'::text));
SELECT create_reference_table('seq_test_10');
create_reference_table
---------------------------------------------------------------------
(1 row)
INSERT INTO seq_test_10 VALUES (0);
CREATE TABLE seq_test_11 (col0 int, col1 bigint DEFAULT nextval('seq_11'::text));
-- works but doesn't create seq_11 in the workers
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
start_metadata_sync_to_node
---------------------------------------------------------------------
(1 row)
-- works because there is no dependency created between seq_11 and seq_test_10
SELECT create_distributed_table('seq_test_11', 'col1');
create_distributed_table
---------------------------------------------------------------------
(1 row)
-- insertion from workers fails
\c - - - :worker_1_port
INSERT INTO sequence_default.seq_test_10 VALUES (1);
ERROR: relation "seq_11" does not exist
\c - - - :master_port
-- clean up
DROP TABLE sequence_default.seq_test_7_par;
DROP SCHEMA sequence_default CASCADE;
NOTICE: drop cascades to 23 other objects
DETAIL: drop cascades to sequence sequence_default.seq_0
drop cascades to table sequence_default.seq_test_0
drop cascades to table sequence_default.seq_test_4
drop cascades to sequence sequence_default.seq_4
drop cascades to sequence sequence_default.seq_1
drop cascades to table sequence_default.seq_test_1
drop cascades to sequence sequence_default.sequence_2
drop cascades to table sequence_default.seq_test_2
drop cascades to table sequence_default.seq_test_3
drop cascades to table sequence_default.seq_test_5
drop cascades to sequence sequence_default.seq_6
drop cascades to table sequence_default.seq_test_6
drop cascades to sequence sequence_default.seq_7
drop cascades to table sequence_default.seq_test_7
drop cascades to sequence sequence_default.seq_7_par
drop cascades to sequence sequence_default.seq_8
drop cascades to table sequence_default.seq_test_8
drop cascades to sequence sequence_default.seq_9
drop cascades to sequence sequence_default.seq_10
drop cascades to table sequence_default.seq_test_9
drop cascades to sequence sequence_default.seq_11
drop cascades to table sequence_default.seq_test_10
drop cascades to table sequence_default.seq_test_11
SELECT run_command_on_workers('DROP SCHEMA IF EXISTS sequence_default CASCADE');
run_command_on_workers
---------------------------------------------------------------------
(localhost,57637,t,"DROP SCHEMA")
(localhost,57638,t,"DROP SCHEMA")
(2 rows)
SELECT stop_metadata_sync_to_node('localhost', :worker_1_port);
stop_metadata_sync_to_node
---------------------------------------------------------------------
(1 row)
SET search_path TO public;
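The worker-generated default values shown in this file (268435457 for integer columns, 281474976710657 for bigint columns, with a Maximum of 562949953421313) line up with per-node blocks of size 2^28 and 2^48; a quick arithmetic check in plain SQL, with the block-size interpretation taken as an assumption of this sketch:
SELECT (1::bigint << 28) + 1 AS int_block_start,     -- 268435457, the integer-typed worker default above
       (1::bigint << 48) + 1 AS bigint_block_start,  -- 281474976710657, the bigint worker default above
       (1::bigint << 49) + 1 AS bigint_block_end;    -- 562949953421313, the Maximum reported by \d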

View File

@ -364,6 +364,11 @@ SELECT worker_hash('(1, 2)'::test_composite_type);
SELECT citus_truncate_trigger();
ERROR: must be called as trigger
-- make sure worker_create_or_alter_role does not crash with NULL input
SELECT worker_create_or_alter_role(NULL, NULL, NULL);
ERROR: role name cannot be NULL
SELECT worker_create_or_alter_role(NULL, 'create role dontcrash', NULL);
ERROR: role name cannot be NULL
-- confirm that citus_create_restore_point works
SELECT 1 FROM citus_create_restore_point('regression-test');
NOTICE: issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(xx, xx, 'xxxxxxx');

View File

@ -1,13 +1,7 @@
--
-- MUTLI_SHARD_REBALANCER
--
CREATE TABLE dist_table_test(a int primary key);
SELECT create_distributed_table('dist_table_test', 'a');
create_distributed_table
---------------------------------------------------------------------
(1 row)
SET citus.next_shard_id TO 433000;
CREATE TABLE ref_table_test(a int primary key);
SELECT create_reference_table('ref_table_test');
create_reference_table
@ -15,6 +9,14 @@ SELECT create_reference_table('ref_table_test');
(1 row)
CREATE TABLE dist_table_test(a int primary key);
SELECT create_distributed_table('dist_table_test', 'a');
create_distributed_table
---------------------------------------------------------------------
(1 row)
CREATE TABLE postgres_table_test(a int primary key);
-- make sure that all rebalance operations work fine when
-- reference tables are replicated to the coordinator
SELECT 1 FROM master_add_node('localhost', :master_port, groupId=>0);
@ -41,6 +43,7 @@ SELECT rebalance_table_shards();
CALL citus_cleanup_orphaned_shards();
-- test that calling rebalance_table_shards without specifying relation
-- wouldn't move shard of the citus local table.
SET citus.next_shard_id TO 433100;
CREATE TABLE citus_local_table(a int, b int);
SELECT citus_add_local_table_to_metadata('citus_local_table');
citus_add_local_table_to_metadata
@ -56,11 +59,31 @@ SELECT rebalance_table_shards();
(1 row)
CALL citus_cleanup_orphaned_shards();
-- Check that rebalance_table_shards and get_rebalance_table_shards_plan fail
-- for any type of table other than distributed tables.
SELECT rebalance_table_shards('ref_table_test');
ERROR: table public.ref_table_test is a reference table, moving shard of a reference table is not supported
SELECT rebalance_table_shards('postgres_table_test');
ERROR: table public.postgres_table_test is a regular postgres table, you can only move shards of a citus table
SELECT rebalance_table_shards('citus_local_table');
ERROR: table public.citus_local_table is a local table, moving shard of a local table added to metadata is currently not supported
SELECT get_rebalance_table_shards_plan('ref_table_test');
ERROR: table public.ref_table_test is a reference table, moving shard of a reference table is not supported
SELECT get_rebalance_table_shards_plan('postgres_table_test');
ERROR: table public.postgres_table_test is a regular postgres table, you can only move shards of a citus table
SELECT get_rebalance_table_shards_plan('citus_local_table');
ERROR: table public.citus_local_table is a local table, moving shard of a local table added to metadata is currently not supported
-- Check that citus_move_shard_placement fails for shards belonging to reference
-- tables or citus local tables
SELECT citus_move_shard_placement(433000, 'localhost', :worker_1_port, 'localhost', :worker_2_port);
ERROR: table public.ref_table_test is a reference table, moving shard of a reference table is not supported
SELECT citus_move_shard_placement(433100, 'localhost', :worker_1_port, 'localhost', :worker_2_port);
ERROR: table public.citus_local_table is a local table, moving shard of a local table added to metadata is currently not supported
-- show that citus local table shard is still on the coordinator
SELECT tablename FROM pg_catalog.pg_tables where tablename like 'citus_local_table_%';
tablename
---------------------------------------------------------------------
citus_local_table_102047
citus_local_table_433100
(1 row)
-- also check that we still can access shard relation, not the shell table
@ -111,7 +134,7 @@ CALL citus_cleanup_orphaned_shards();
SELECT tablename FROM pg_catalog.pg_tables where tablename like 'citus_local_table_%';
tablename
---------------------------------------------------------------------
citus_local_table_102047
citus_local_table_433100
(1 row)
-- also check that we still can access shard relation, not the shell table
@ -182,7 +205,7 @@ NOTICE: Copying shard xxxxx from localhost:xxxxx to localhost:xxxxx ...
(1 row)
DROP TABLE dist_table_test, dist_table_test_2, ref_table_test;
DROP TABLE dist_table_test, dist_table_test_2, ref_table_test, postgres_table_test;
RESET citus.shard_count;
RESET citus.shard_replication_factor;
-- Create a user to test multiuser usage of rebalancer functions
@ -1665,6 +1688,7 @@ SELECT * FROM public.table_placements_per_node;
57638 | tab2 | 1
(4 rows)
VACUUM FULL tab, tab2;
ANALYZE tab, tab2;
\c - - - :worker_1_port
SELECT table_schema, table_name, row_estimate, total_bytes

View File

@ -54,6 +54,9 @@ test: multi_mx_insert_select_repartition
test: locally_execute_intermediate_results
test: multi_mx_alter_distributed_table
# should be executed sequentially because it modifies metadata
test: local_shard_execution_dropped_column
# test that no tests leaked intermediate results. This should always be last
test: ensure_no_intermediate_data_leak

View File

@ -92,7 +92,7 @@ test: subquery_prepared_statements
test: non_colocated_leaf_subquery_joins non_colocated_subquery_joins non_colocated_join_order
test: cte_inline recursive_view_local_table values
test: pg13 pg12
test: tableam
test: tableam drop_column_partitioned_table
# ----------
# Tests for statistics propagation

View File

@ -7,3 +7,4 @@ test: foreign_key_to_reference_shard_rebalance
test: multi_move_mx
test: shard_move_deferred_delete
test: multi_colocated_shard_rebalance
test: ignoring_orphaned_shards

View File

@ -0,0 +1,168 @@
CREATE SCHEMA drop_column_partitioned_table;
SET search_path TO drop_column_partitioned_table;
SET citus.shard_replication_factor TO 1;
SET citus.next_shard_id TO 2580000;
-- create a partitioned table with some columns that
-- are going to be dropped within the tests
CREATE TABLE sensors(
col_to_drop_0 text,
col_to_drop_1 text,
col_to_drop_2 date,
col_to_drop_3 inet,
col_to_drop_4 date,
measureid integer,
eventdatetime date,
measure_data jsonb)
PARTITION BY RANGE(eventdatetime);
-- drop column even before attaching any partitions
ALTER TABLE sensors DROP COLUMN col_to_drop_1;
-- now attach the first partition and create the distributed table
CREATE TABLE sensors_2000 PARTITION OF sensors FOR VALUES FROM ('2000-01-01') TO ('2001-01-01');
SELECT create_distributed_table('sensors', 'measureid');
-- prepared statements should work fine even after columns are dropped
PREPARE drop_col_prepare_insert(int, date, jsonb) AS INSERT INTO sensors (measureid, eventdatetime, measure_data) VALUES ($1, $2, $3);
PREPARE drop_col_prepare_select(int, date) AS SELECT count(*) FROM sensors WHERE measureid = $1 AND eventdatetime = $2;
-- execute 7 times to make sure it is cached
EXECUTE drop_col_prepare_insert(1, '2000-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-02', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-03', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-04', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-05', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-06', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(1, '2000-10-07', row_to_json(row(1)));
EXECUTE drop_col_prepare_select(1, '2000-10-01');
EXECUTE drop_col_prepare_select(1, '2000-10-02');
EXECUTE drop_col_prepare_select(1, '2000-10-03');
EXECUTE drop_col_prepare_select(1, '2000-10-04');
EXECUTE drop_col_prepare_select(1, '2000-10-05');
EXECUTE drop_col_prepare_select(1, '2000-10-06');
EXECUTE drop_col_prepare_select(1, '2000-10-07');
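The "execute 7 times" pattern exists because PostgreSQL's plan cache only considers switching a prepared statement to a generic, cached plan after its fifth execution, so going past that point exercises the cached plan against the dropped-column layout. When experimenting outside the test, the switch can also be forced explicitly (PostgreSQL 12+):
SET plan_cache_mode TO force_generic_plan;  -- force the generic (cached) plan path
EXECUTE drop_col_prepare_select(1, '2000-10-01');
RESET plan_cache_mode;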
-- drop another column before attaching another partition
-- with .. PARTITION OF .. syntax
ALTER TABLE sensors DROP COLUMN col_to_drop_0;
CREATE TABLE sensors_2001 PARTITION OF sensors FOR VALUES FROM ('2001-01-01') TO ('2002-01-01');
-- drop another column before attaching another partition
-- with ALTER TABLE .. ATTACH PARTITION
ALTER TABLE sensors DROP COLUMN col_to_drop_2;
CREATE TABLE sensors_2002(
col_to_drop_4 date, col_to_drop_3 inet, measureid integer, eventdatetime date, measure_data jsonb,
PRIMARY KEY (measureid, eventdatetime, measure_data));
ALTER TABLE sensors ATTACH PARTITION sensors_2002 FOR VALUES FROM ('2002-01-01') TO ('2003-01-01');
-- drop another column before attaching another partition
-- that is already distributed
ALTER TABLE sensors DROP COLUMN col_to_drop_3;
CREATE TABLE sensors_2003(
col_to_drop_4 date, measureid integer, eventdatetime date, measure_data jsonb,
PRIMARY KEY (measureid, eventdatetime, measure_data));
SELECT create_distributed_table('sensors_2003', 'measureid');
ALTER TABLE sensors ATTACH PARTITION sensors_2003 FOR VALUES FROM ('2003-01-01') TO ('2004-01-01');
CREATE TABLE sensors_2004(
col_to_drop_4 date, measureid integer NOT NULL, eventdatetime date NOT NULL, measure_data jsonb NOT NULL);
ALTER TABLE sensors ATTACH PARTITION sensors_2004 FOR VALUES FROM ('2004-01-01') TO ('2005-01-01');
ALTER TABLE sensors DROP COLUMN col_to_drop_4;
-- show that all partitions have the same distribution key
SELECT
p.logicalrelid::regclass, column_to_column_name(p.logicalrelid, p.partkey)
FROM
pg_dist_partition p
WHERE
logicalrelid IN ('sensors'::regclass, 'sensors_2000'::regclass,
'sensors_2001'::regclass, 'sensors_2002'::regclass,
'sensors_2003'::regclass, 'sensors_2004'::regclass);
-- show that all the tables prune to the same shard for the same distribution key
WITH
sensors_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors', 3)),
sensors_2000_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2000', 3)),
sensors_2001_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2001', 3)),
sensors_2002_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2002', 3)),
sensors_2003_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2003', 3)),
sensors_2004_shardid AS (SELECT * FROM get_shard_id_for_distribution_column('sensors_2004', 3)),
all_shardids AS (SELECT * FROM sensors_shardid UNION SELECT * FROM sensors_2000_shardid UNION
SELECT * FROM sensors_2001_shardid UNION SELECT * FROM sensors_2002_shardid
UNION SELECT * FROM sensors_2003_shardid UNION SELECT * FROM sensors_2004_shardid)
SELECT logicalrelid, shardid, shardminvalue, shardmaxvalue FROM pg_dist_shard WHERE shardid IN (SELECT * FROM all_shardids);
VACUUM ANALYZE sensors, sensors_2000, sensors_2001, sensors_2002, sensors_2003;
-- show that both INSERT and SELECT can route to a single node when the distribution
-- key is provided in the query
EXPLAIN (COSTS FALSE) INSERT INTO sensors VALUES (3, '2000-02-02', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2000 VALUES (3, '2000-01-01', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2001 VALUES (3, '2001-01-01', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2002 VALUES (3, '2002-01-01', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) INSERT INTO sensors_2003 VALUES (3, '2003-01-01', row_to_json(row(1)));
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors WHERE measureid = 3 AND eventdatetime = '2000-02-02';
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2000 WHERE measureid = 3;
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2001 WHERE measureid = 3;
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2002 WHERE measureid = 3;
EXPLAIN (COSTS FALSE) SELECT count(*) FROM sensors_2003 WHERE measureid = 3;
-- execute 7 times to make sure it is re-cached
EXECUTE drop_col_prepare_insert(3, '2000-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2001-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2002-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2003-10-01', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(3, '2003-10-02', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(4, '2003-10-03', row_to_json(row(1)));
EXECUTE drop_col_prepare_insert(5, '2003-10-04', row_to_json(row(1)));
EXECUTE drop_col_prepare_select(3, '2000-10-01');
EXECUTE drop_col_prepare_select(3, '2001-10-01');
EXECUTE drop_col_prepare_select(3, '2002-10-01');
EXECUTE drop_col_prepare_select(3, '2003-10-01');
EXECUTE drop_col_prepare_select(3, '2003-10-02');
EXECUTE drop_col_prepare_select(4, '2003-10-03');
EXECUTE drop_col_prepare_select(5, '2003-10-04');
-- non-fast-path router planner queries should also work,
-- so we switch to DEBUG2 to show the distribution key
-- and that the query is router-planned
SET client_min_messages TO DEBUG2;
SELECT count(*) FROM (
SELECT * FROM sensors WHERE measureid = 3
UNION
SELECT * FROM sensors_2000 WHERE measureid = 3
UNION
SELECT * FROM sensors_2001 WHERE measureid = 3
UNION
SELECT * FROM sensors_2002 WHERE measureid = 3
UNION
SELECT * FROM sensors_2003 WHERE measureid = 3
UNION
SELECT * FROM sensors_2004 WHERE measureid = 3
) as foo;
RESET client_min_messages;
-- show that all partitions have the same distribution key
-- even after alter_distributed_table changes the shard count
-- remove this comment once https://github.com/citusdata/citus/issues/5137 is fixed
--SELECT alter_distributed_table('sensors', shard_count:='3');
SELECT
p.logicalrelid::regclass, column_to_column_name(p.logicalrelid, p.partkey)
FROM
pg_dist_partition p
WHERE
logicalrelid IN ('sensors'::regclass, 'sensors_2000'::regclass,
'sensors_2001'::regclass, 'sensors_2002'::regclass,
'sensors_2003'::regclass, 'sensors_2004'::regclass);
SET client_min_messages TO WARNING;
DROP SCHEMA drop_column_partitioned_table CASCADE;
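The scenario exercised here rests on a plain PostgreSQL detail: dropping a column on a partitioned parent leaves a dropped slot in the parent and in already-attached partitions, while partitions created afterwards never get that slot, so the same logical column can end up at different attnums. A minimal sketch with illustrative names, outside this schema:
CREATE TABLE events(col_to_drop text, key int, val text) PARTITION BY RANGE (key);
CREATE TABLE events_p1 PARTITION OF events FOR VALUES FROM (0) TO (10);   -- key at attnum 2, like the parent
ALTER TABLE events DROP COLUMN col_to_drop;
CREATE TABLE events_p2 PARTITION OF events FOR VALUES FROM (10) TO (20);  -- key at attnum 1 here
SELECT attrelid::regclass, attname, attnum
FROM pg_attribute
WHERE attrelid IN ('events'::regclass, 'events_p1'::regclass, 'events_p2'::regclass)
  AND attname = 'key'
ORDER BY 1;
DROP TABLE events;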

View File

@ -0,0 +1,147 @@
CREATE SCHEMA ignoring_orphaned_shards;
SET search_path TO ignoring_orphaned_shards;
-- Use a weird shard count that we don't use in any other tests
SET citus.shard_count TO 13;
SET citus.shard_replication_factor TO 1;
SET citus.next_shard_id TO 92448000;
CREATE TABLE ref(id int PRIMARY KEY);
SELECT * FROM create_reference_table('ref');
SET citus.next_shard_id TO 92448100;
ALTER SEQUENCE pg_catalog.pg_dist_colocationid_seq RESTART 92448100;
CREATE TABLE dist1(id int);
SELECT * FROM create_distributed_table('dist1', 'id');
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448100 ORDER BY 1;
-- Move the first shard, so that it now has 2 placements: one that's
-- active and one that's orphaned.
SELECT citus_move_shard_placement(92448100, 'localhost', :worker_1_port, 'localhost', :worker_2_port, 'block_writes');
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448100 ORDER BY placementid;
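After this move the old placement is not removed immediately; it stays behind as an orphaned, to-be-deleted placement that later cleanup (citus_cleanup_orphaned_shards) removes. A hedged helper query for spotting such placements, assuming shardstate 4 marks a to-be-deleted placement and 1 an active one in this Citus series:
SELECT shardid, nodename, nodeport, shardstate
FROM pg_dist_shard_placement
WHERE shardstate = 4   -- assumed to-be-deleted / orphaned state
ORDER BY shardid, nodeport;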
-- Add a new table that should get colocated with dist1 automatically, but
-- should not get a shard for the orphaned placement.
SET citus.next_shard_id TO 92448200;
CREATE TABLE dist2(id int);
SELECT * FROM create_distributed_table('dist2', 'id');
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448100 ORDER BY 1;
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448200 ORDER BY placementid;
-- uncolocate it
SELECT update_distributed_table_colocation('dist2', 'none');
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448100 ORDER BY 1;
-- Make sure we can add it back to the colocation, even though it has a
-- different number of shard placements for the first shard.
SELECT update_distributed_table_colocation('dist2', 'dist1');
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448100 ORDER BY 1;
-- Make sure that replication count check in FOR UPDATE ignores orphaned
-- shards.
SELECT * FROM dist1 WHERE id = 1 FOR UPDATE;
-- Make sure we don't send a query to the orphaned shard
BEGIN;
SET LOCAL citus.log_remote_commands TO ON;
INSERT INTO dist1 VALUES (1);
ROLLBACK;
-- Make sure we can create a foreign key on community edition, because
-- replication factor is 1
ALTER TABLE dist1
ADD CONSTRAINT dist1_ref_fk
FOREIGN KEY (id)
REFERENCES ref(id);
SET citus.shard_replication_factor TO 2;
SET citus.next_shard_id TO 92448300;
ALTER SEQUENCE pg_catalog.pg_dist_colocationid_seq RESTART 92448300;
CREATE TABLE rep1(id int);
SELECT * FROM create_distributed_table('rep1', 'id');
-- Add the coordinator, so we can have a replicated shard
SELECT 1 FROM citus_add_node('localhost', :master_port, 0);
SELECT 1 FROM citus_set_node_property('localhost', :master_port, 'shouldhaveshards', true);
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448300 ORDER BY 1;
SELECT citus_move_shard_placement(92448300, 'localhost', :worker_1_port, 'localhost', :master_port);
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
-- Add a new table that should get colocated with rep1 automatically, but
-- should not get a shard for the orphaned placement.
SET citus.next_shard_id TO 92448400;
CREATE TABLE rep2(id int);
SELECT * FROM create_distributed_table('rep2', 'id');
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448300 ORDER BY 1;
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448400 ORDER BY placementid;
-- uncolocate it
SELECT update_distributed_table_colocation('rep2', 'none');
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448300 ORDER BY 1;
-- Make sure we can add it back to the colocation, even though it has a
-- different number of shard placements for the first shard.
SELECT update_distributed_table_colocation('rep2', 'rep1');
SELECT logicalrelid FROM pg_dist_partition WHERE colocationid = 92448300 ORDER BY 1;
UPDATE pg_dist_placement SET shardstate = 3 WHERE shardid = 92448300 AND groupid = 0;
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
-- cannot copy from an orphaned shard
SELECT * FROM citus_copy_shard_placement(92448300, 'localhost', :worker_1_port, 'localhost', :master_port);
-- cannot copy to an orphaned shard
SELECT * FROM citus_copy_shard_placement(92448300, 'localhost', :worker_2_port, 'localhost', :worker_1_port);
-- can still copy to an inactive shard
SELECT * FROM citus_copy_shard_placement(92448300, 'localhost', :worker_2_port, 'localhost', :master_port);
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
-- Make sure we don't send a query to the orphaned shard
BEGIN;
SET LOCAL citus.log_remote_commands TO ON;
SET LOCAL citus.log_local_commands TO ON;
INSERT INTO rep1 VALUES (1);
ROLLBACK;
-- Cause the orphaned shard to be local
SELECT 1 FROM citus_drain_node('localhost', :master_port);
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
-- Make sure we don't send a query to the orphaned shard if it's local
BEGIN;
SET LOCAL citus.log_remote_commands TO ON;
SET LOCAL citus.log_local_commands TO ON;
INSERT INTO rep1 VALUES (1);
ROLLBACK;
SET citus.shard_replication_factor TO 1;
SET citus.next_shard_id TO 92448500;
CREATE TABLE range1(id int);
SELECT create_distributed_table('range1', 'id', 'range');
CALL public.create_range_partitioned_shards('range1', '{0,3}','{2,5}');
-- Move shard placement and clean it up
SELECT citus_move_shard_placement(92448500, 'localhost', :worker_2_port, 'localhost', :worker_1_port, 'block_writes');
CALL citus_cleanup_orphaned_shards();
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448300 ORDER BY placementid;
SET citus.next_shard_id TO 92448600;
CREATE TABLE range2(id int);
SELECT create_distributed_table('range2', 'id', 'range');
CALL public.create_range_partitioned_shards('range2', '{0,3}','{2,5}');
-- Move shard placement and DON'T clean it up, now range1 and range2 are
-- colocated, but only range2 has an orphaned shard.
SELECT citus_move_shard_placement(92448600, 'localhost', :worker_2_port, 'localhost', :worker_1_port, 'block_writes');
SELECT shardid, shardstate, nodeport FROM pg_dist_shard_placement WHERE shardid = 92448600 ORDER BY placementid;
-- Make sure that tables are detected as colocated
SELECT * FROM range1 JOIN range2 ON range1.id = range2.id;
-- Make sure we can create a foreign key on community edition, because
-- replication factor is 1
ALTER TABLE range1
ADD CONSTRAINT range1_ref_fk
FOREIGN KEY (id)
REFERENCES ref(id);
SET client_min_messages TO WARNING;
DROP SCHEMA ignoring_orphaned_shards CASCADE;
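-- A minimal sketch, outside the test, of how the orphaned placements used
-- above can be listed; it assumes shardstate 4 marks a to-be-deleted
-- (orphaned) placement, which the deferred-drop moves rely on.
SELECT shardid, nodename, nodeport
FROM pg_dist_placement JOIN pg_dist_node USING (groupid)
WHERE shardstate = 4
ORDER BY shardid;
-- Deferred drops are reclaimed explicitly, as the test also does:
CALL citus_cleanup_orphaned_shards();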

View File

@ -0,0 +1,135 @@
CREATE SCHEMA local_shard_execution_dropped_column;
SET search_path TO local_shard_execution_dropped_column;
SET citus.next_shard_id TO 2460000;
-- the scenario is described on https://github.com/citusdata/citus/issues/5038
-- first stop metadata syncing to the nodes so that the drop column
-- is not propagated
SELECT stop_metadata_sync_to_node('localhost',:worker_1_port);
SELECT stop_metadata_sync_to_node('localhost',:worker_2_port);
-- create a distributed table, drop a column and sync the metadata
SET citus.shard_replication_factor TO 1;
CREATE TABLE t1 (a int, b int, c int UNIQUE);
SELECT create_distributed_table('t1', 'c');
ALTER TABLE t1 DROP COLUMN b;
SELECT start_metadata_sync_to_node('localhost',:worker_1_port);
SELECT start_metadata_sync_to_node('localhost',:worker_2_port);
\c - - - :worker_1_port
SET search_path TO local_shard_execution_dropped_column;
-- show the dropped columns
SELECT attrelid::regclass, attname, attnum, attisdropped
FROM pg_attribute WHERE attrelid IN ('t1'::regclass, 't1_2460000'::regclass) and attname NOT IN ('tableoid','cmax', 'xmax', 'cmin', 'xmin', 'ctid')
ORDER BY 1, 3, 2, 4;
-- connect to a worker node where local execution is done
prepare p1(int) as insert into t1(a,c) VALUES (5,$1) ON CONFLICT (c) DO NOTHING;
SET citus.log_remote_commands TO ON;
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
prepare p2(int) as SELECT count(*) FROM t1 WHERE c = $1 GROUP BY c;
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
prepare p3(int) as INSERT INTO t1(a,c) VALUES (5, $1), (6, $1), (7, $1),(5, $1), (6, $1), (7, $1) ON CONFLICT DO NOTHING;
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
prepare p4(int) as UPDATE t1 SET a = a + 1 WHERE c = $1;
execute p4(8);
execute p4(8);
execute p4(8);
execute p4(8);
execute p4(8);
execute p4(8);
execute p4(8);
execute p4(8);
execute p4(8);
execute p4(8);
execute p4(8);
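-- Not part of the test: the prepared statements above are executed more than
-- five times because PostgreSQL only considers switching to a cached generic
-- plan after five custom-plan executions. A sketch that forces the generic
-- plan directly (PostgreSQL 12+, assuming default plan cache settings):
SET plan_cache_mode TO force_generic_plan;
PREPARE generic_probe(int) AS SELECT count(*) FROM t1 WHERE c = $1;
EXECUTE generic_probe(8);
DEALLOCATE generic_probe;
RESET plan_cache_mode;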
\c - - - :master_port
-- another combination is that the shell table
-- has a dropped column but the shard does not, via a rebalance operation
SET search_path TO local_shard_execution_dropped_column;
ALTER TABLE t1 DROP COLUMN a;
SELECT citus_move_shard_placement(2460000, 'localhost', :worker_1_port, 'localhost', :worker_2_port, 'block_writes');
\c - - - :worker_2_port
SET search_path TO local_shard_execution_dropped_column;
-- show the dropped columns
SELECT attrelid::regclass, attname, attnum, attisdropped
FROM pg_attribute WHERE attrelid IN ('t1'::regclass, 't1_2460000'::regclass) and attname NOT IN ('tableoid','cmax', 'xmax', 'cmin', 'xmin', 'ctid')
ORDER BY 1, 3, 2, 4;
prepare p1(int) as insert into t1(c) VALUES ($1) ON CONFLICT (c) DO NOTHING;
SET citus.log_remote_commands TO ON;
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
execute p1(8);
prepare p2(int) as SELECT count(*) FROM t1 WHERE c = $1 GROUP BY c;
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
execute p2(8);
prepare p3(int) as INSERT INTO t1(c) VALUES ($1),($1),($1),($1),($1),($1),($1),($1),($1),($1),($1),($1),($1),($1) ON CONFLICT DO NOTHING;
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
execute p3(8);
\c - - - :master_port
DROP SCHEMA local_shard_execution_dropped_column CASCADE;

View File

@ -13,6 +13,7 @@ SHOW server_version \gset
SELECT substring(:'server_version', '\d+')::int > 11 AS version_above_eleven;
SET citus.next_shard_id TO 580000;
CREATE SCHEMA multi_extension;
SELECT $definition$
CREATE OR REPLACE FUNCTION test.maintenance_worker()
@ -41,14 +42,15 @@ $$;
$definition$ create_function_test_maintenance_worker
\gset
CREATE TABLE prev_objects(description text);
CREATE TABLE extension_diff(previous_object text COLLATE "C",
CREATE TABLE multi_extension.prev_objects(description text);
CREATE TABLE multi_extension.extension_diff(previous_object text COLLATE "C",
current_object text COLLATE "C");
CREATE FUNCTION print_extension_changes()
CREATE FUNCTION multi_extension.print_extension_changes()
RETURNS TABLE(previous_object text, current_object text)
AS $func$
BEGIN
SET LOCAL search_path TO multi_extension;
TRUNCATE TABLE extension_diff;
CREATE TABLE current_objects AS
@ -128,13 +130,13 @@ ALTER EXTENSION citus UPDATE TO '9.1-1';
ALTER EXTENSION citus UPDATE TO '9.2-1';
ALTER EXTENSION citus UPDATE TO '9.2-2';
-- Snapshot of state at 9.2-2
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Test downgrade to 9.2-2 from 9.2-4
ALTER EXTENSION citus UPDATE TO '9.2-4';
ALTER EXTENSION citus UPDATE TO '9.2-2';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
/*
* As we mistakenly bumped schema version to 9.3-1 in a bad release, we support
@ -146,27 +148,65 @@ ALTER EXTENSION citus UPDATE TO '9.3-1';
ALTER EXTENSION citus UPDATE TO '9.2-4';
-- Snapshot of state at 9.2-4
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Test downgrade to 9.2-4 from 9.3-2
ALTER EXTENSION citus UPDATE TO '9.3-2';
ALTER EXTENSION citus UPDATE TO '9.2-4';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 9.3-2
ALTER EXTENSION citus UPDATE TO '9.3-2';
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Test downgrade to 9.3-2 from 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
ALTER EXTENSION citus UPDATE TO '9.3-2';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Test upgrade paths for backported citus_pg_upgrade functions
ALTER EXTENSION citus UPDATE TO '9.4-2';
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- Should be empty result; even though the downgrade doesn't undo the upgrade,
-- the function signature doesn't change, which is reflected here.
SELECT * FROM multi_extension.print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.4-2';
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
SELECT * FROM multi_extension.print_extension_changes();
-- Test upgrade paths for backported improvement of master_update_table_statistics function
ALTER EXTENSION citus UPDATE TO '9.4-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
ALTER EXTENSION citus UPDATE TO '9.4-2';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.4-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 9.4-1
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
-- Test downgrade to 9.4-1 from 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
@ -184,53 +224,90 @@ ROLLBACK;
ALTER EXTENSION citus UPDATE TO '9.4-1';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Test downgrade to 9.5-1 from 10.0-1
ALTER EXTENSION citus UPDATE TO '10.0-1';
-- Test upgrade paths for backported citus_pg_upgrade functions
ALTER EXTENSION citus UPDATE TO '9.5-2';
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- Should be empty result; even though the downgrade doesn't undo the upgrade,
-- the function signature doesn't change, which is reflected here.
SELECT * FROM multi_extension.print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.5-2';
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
SELECT * FROM multi_extension.print_extension_changes();
-- Test upgrade paths for backported improvement of master_update_table_statistics function
ALTER EXTENSION citus UPDATE TO '9.5-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
ALTER EXTENSION citus UPDATE TO '9.5-2';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
ALTER EXTENSION citus UPDATE TO '9.5-3';
-- should see the new source code with internal function citus_update_table_statistics
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 9.5-1
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- should see the old source code
SELECT prosrc FROM pg_proc WHERE proname = 'master_update_table_statistics' ORDER BY 1;
-- Should be empty result
SELECT * FROM multi_extension.print_extension_changes();
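-- A small sketch, not part of the test, for confirming which extension schema
-- version is currently installed between the upgrade/downgrade steps above:
SELECT extversion FROM pg_extension WHERE extname = 'citus';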
-- We removed the upgrade paths to 10.0-1, 10.0-2 and 10.0-3 due to a bug that blocked
-- upgrades to 10.0; therefore we test upgrades to 10.0-4 instead
-- Test downgrade to 9.5-1 from 10.0-4
ALTER EXTENSION citus UPDATE TO '10.0-4';
ALTER EXTENSION citus UPDATE TO '9.5-1';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 10.0-1
ALTER EXTENSION citus UPDATE TO '10.0-1';
SELECT * FROM print_extension_changes();
-- Snapshot of state at 10.0-4
ALTER EXTENSION citus UPDATE TO '10.0-4';
SELECT * FROM multi_extension.print_extension_changes();
-- Test downgrade to 10.0-1 from 10.0-2
ALTER EXTENSION citus UPDATE TO '10.0-2';
ALTER EXTENSION citus UPDATE TO '10.0-1';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
-- check that we depend on the existence of the public schema, and we cannot drop it now
DROP SCHEMA public;
-- Snapshot of state at 10.0-2
ALTER EXTENSION citus UPDATE TO '10.0-2';
SELECT * FROM print_extension_changes();
-- verify that the citus_tables view is in pg_catalog if the public schema is absent.
ALTER EXTENSION citus UPDATE TO '9.5-1';
DROP SCHEMA public;
ALTER EXTENSION citus UPDATE TO '10.0-4';
SELECT * FROM multi_extension.print_extension_changes();
-- Test downgrade to 10.0-2 from 10.0-3
ALTER EXTENSION citus UPDATE TO '10.0-3';
ALTER EXTENSION citus UPDATE TO '10.0-2';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
-- recreate public schema, and recreate citus_tables in the public schema by default
CREATE SCHEMA public;
GRANT ALL ON SCHEMA public TO public;
ALTER EXTENSION citus UPDATE TO '9.5-1';
ALTER EXTENSION citus UPDATE TO '10.0-4';
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 10.0-3
ALTER EXTENSION citus UPDATE TO '10.0-3';
SELECT * FROM print_extension_changes();
-- Test downgrade to 10.0-3 from 10.1-1
-- Test downgrade to 10.0-4 from 10.1-1
ALTER EXTENSION citus UPDATE TO '10.1-1';
ALTER EXTENSION citus UPDATE TO '10.0-3';
ALTER EXTENSION citus UPDATE TO '10.0-4';
-- Should be empty result since upgrade+downgrade should be a no-op
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
-- Snapshot of state at 10.1-1
ALTER EXTENSION citus UPDATE TO '10.1-1';
SELECT * FROM print_extension_changes();
SELECT * FROM multi_extension.print_extension_changes();
DROP TABLE prev_objects, extension_diff;
DROP TABLE multi_extension.prev_objects, multi_extension.extension_diff;
-- show running version
SHOW citus.version;
@ -498,3 +575,4 @@ FROM test.maintenance_worker();
SELECT count(*) FROM pg_stat_activity WHERE application_name = 'Citus Maintenance Daemon';
DROP TABLE version_mismatch_table;
DROP SCHEMA multi_extension;
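-- A rough sketch, not the test helper itself, of the information that
-- print_extension_changes() snapshots: every object recorded as belonging to
-- the citus extension in pg_depend.
SELECT pg_describe_object(classid, objid, objsubid) AS description
FROM pg_depend
WHERE refclassid = 'pg_extension'::regclass
  AND refobjid = (SELECT oid FROM pg_extension WHERE extname = 'citus')
  AND deptype = 'e'
ORDER BY 1;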

View File

@ -27,8 +27,21 @@ SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
-- orders table. These shard sets don't overlap, so join pruning should prune
-- out all the shards, and leave us with an empty task list.
select * from pg_dist_shard
where logicalrelid='lineitem'::regclass or
logicalrelid='orders'::regclass
order by shardid;
set citus.explain_distributed_queries to on;
-- explain the query before actually executing it
EXPLAIN SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND l_orderkey > 6000 AND o_orderkey < 6000;
set citus.explain_distributed_queries to off;
set client_min_messages to debug3;
SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders
WHERE l_orderkey = o_orderkey AND l_orderkey > 6000 AND o_orderkey < 6000;
set client_min_messages to debug2;
-- Make sure that we can handle filters without a column
SELECT sum(l_linenumber), avg(l_linenumber) FROM lineitem, orders

View File

@ -218,5 +218,75 @@ SELECT * FROM numeric_test WHERE id = 21;
SELECT * FROM numeric_test WHERE id = 21::numeric;
SELECT * FROM numeric_test WHERE id = 21.1::numeric;
CREATE TABLE range_dist_table_1 (dist_col BIGINT);
SELECT create_distributed_table('range_dist_table_1', 'dist_col', 'range');
CALL public.create_range_partitioned_shards('range_dist_table_1', '{1000,3000,6000}', '{2000,4000,7000}');
INSERT INTO range_dist_table_1 VALUES (1001);
INSERT INTO range_dist_table_1 VALUES (3800);
INSERT INTO range_dist_table_1 VALUES (6500);
-- all were returning false before fixing #5077
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col >= 2999;
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 2999;
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col >= 2500;
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 2000;
SELECT SUM(dist_col)=3800+6500 FROM range_dist_table_1 WHERE dist_col > 1001;
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col >= 1001;
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col > 1000;
SELECT SUM(dist_col)=1001+3800+6500 FROM range_dist_table_1 WHERE dist_col >= 1000;
-- we didn't have such an off-by-one error in upper bound
-- calculation, but let's test such cases too
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col <= 4001;
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col < 4001;
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col <= 4500;
SELECT SUM(dist_col)=1001+3800 FROM range_dist_table_1 WHERE dist_col < 6000;
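-- A sketch, not part of the test, of the metadata the pruning above is checked
-- against: each range shard stores its bounds as text in pg_dist_shard
-- (assumed inclusive on both ends here), so a filter such as dist_col >= 2999
-- must keep every shard whose shardmaxvalue is still above the constant.
SELECT shardid, shardminvalue, shardmaxvalue
FROM pg_dist_shard
WHERE logicalrelid = 'range_dist_table_1'::regclass
ORDER BY shardid;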
-- now test with composite type and more shards
CREATE TYPE comp_type AS (
int_field_1 BIGINT,
int_field_2 BIGINT
);
CREATE TYPE comp_type_range AS RANGE (
subtype = comp_type);
CREATE TABLE range_dist_table_2 (dist_col comp_type);
SELECT create_distributed_table('range_dist_table_2', 'dist_col', 'range');
CALL public.create_range_partitioned_shards(
'range_dist_table_2',
'{"(10,24)","(10,58)",
"(10,90)","(20,100)"}',
'{"(10,25)","(10,65)",
"(10,99)","(20,100)"}');
INSERT INTO range_dist_table_2 VALUES ((10, 24));
INSERT INTO range_dist_table_2 VALUES ((10, 60));
INSERT INTO range_dist_table_2 VALUES ((10, 91));
INSERT INTO range_dist_table_2 VALUES ((20, 100));
SELECT dist_col='(10, 60)'::comp_type FROM range_dist_table_2
WHERE dist_col >= '(10,26)'::comp_type AND
dist_col <= '(10,75)'::comp_type;
SELECT * FROM range_dist_table_2
WHERE dist_col >= '(10,57)'::comp_type AND
dist_col <= '(10,95)'::comp_type
ORDER BY dist_col;
SELECT * FROM range_dist_table_2
WHERE dist_col >= '(10,57)'::comp_type
ORDER BY dist_col;
SELECT dist_col='(20,100)'::comp_type FROM range_dist_table_2
WHERE dist_col > '(20,99)'::comp_type;
DROP TABLE range_dist_table_1, range_dist_table_2;
DROP TYPE comp_type CASCADE;
SET search_path TO public;
DROP SCHEMA prune_shard_list CASCADE;

View File

@ -7,6 +7,8 @@
SET citus.next_shard_id TO 890000;
SET citus.shard_count TO 4;
SET citus.shard_replication_factor TO 1;
CREATE SCHEMA sequence_default;
SET search_path = sequence_default, public;
-- Cannot add a column involving DEFAULT nextval('..') because the table is not empty
@ -52,9 +54,10 @@ CREATE SEQUENCE seq_4;
ALTER TABLE seq_test_4 ADD COLUMN b int DEFAULT nextval('seq_4');
-- on the worker it should generate a high sequence number
\c - - - :worker_1_port
INSERT INTO seq_test_4 VALUES (1,2) RETURNING *;
INSERT INTO sequence_default.seq_test_4 VALUES (1,2) RETURNING *;
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
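-- A sketch, not part of the test, of where the "high sequence number" above
-- comes from; it assumes the synced worker re-creates the sequence with a
-- node-specific min/max range (run on a worker with metadata synced, e.g.
-- after \c - - - :worker_1_port):
SELECT seqmin, seqmax, seqstart
FROM pg_sequence
WHERE seqrelid = 'sequence_default.seq_4'::regclass;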
@ -69,9 +72,10 @@ ALTER TABLE seq_test_1 ADD COLUMN z int DEFAULT nextval('seq_1');
\d seq_1
-- check insertion is within int bounds in the worker
\c - - - :worker_1_port
INSERT INTO seq_test_1 values (1, 2) RETURNING *;
INSERT INTO sequence_default.seq_test_1 values (1, 2) RETURNING *;
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
@ -107,16 +111,16 @@ SELECT create_distributed_table('seq_test_2','x');
ALTER SEQUENCE seq_2 RENAME TO sequence_2;
-- check in the worker
\c - - - :worker_1_port
\d sequence_2
\d sequence_default.sequence_2
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
-- check rename with another schema
-- note that the schema is also propagated as one of the sequence's dependencies
CREATE SCHEMA sequence_default_0;
SET search_path TO public, sequence_default_0;
CREATE SEQUENCE sequence_default_0.seq_3;
CREATE TABLE seq_test_3 (x int, y bigint DEFAULT nextval('seq_3'));
CREATE TABLE seq_test_3 (x int, y bigint DEFAULT nextval('sequence_default_0.seq_3'));
SELECT create_distributed_table('seq_test_3', 'x');
ALTER SEQUENCE sequence_default_0.seq_3 RENAME TO sequence_3;
-- check in the worker
@ -124,6 +128,7 @@ ALTER SEQUENCE sequence_default_0.seq_3 RENAME TO sequence_3;
\d sequence_default_0.sequence_3
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
DROP SEQUENCE sequence_default_0.sequence_3 CASCADE;
DROP SCHEMA sequence_default_0;
@ -140,17 +145,19 @@ DROP SCHEMA sequence_default_1 CASCADE;
INSERT INTO seq_test_5 VALUES (1, 2) RETURNING *;
-- but is still present on worker
\c - - - :worker_1_port
INSERT INTO seq_test_5 VALUES (1, 2) RETURNING *;
INSERT INTO sequence_default.seq_test_5 VALUES (1, 2) RETURNING *;
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
-- apply workaround
SELECT run_command_on_workers('DROP SCHEMA sequence_default_1 CASCADE');
-- now the sequence is gone from the worker as well
\c - - - :worker_1_port
INSERT INTO seq_test_5 VALUES (1, 2) RETURNING *;
INSERT INTO sequence_default.seq_test_5 VALUES (1, 2) RETURNING *;
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
@ -173,10 +180,11 @@ CREATE TABLE seq_test_7_par (x text, s bigint DEFAULT nextval('seq_7_par'), t ti
ALTER TABLE seq_test_7 ATTACH PARTITION seq_test_7_par FOR VALUES FROM ('2021-05-31') TO ('2021-06-01');
-- check that both sequences are in worker
\c - - - :worker_1_port
\d seq_7
\d seq_7_par
\d sequence_default.seq_7
\d sequence_default.seq_7_par
\c - - - :master_port
SET citus.shard_replication_factor TO 1;
SET search_path = sequence_default, public;
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
@ -186,7 +194,7 @@ CREATE SEQUENCE seq_8;
CREATE SCHEMA sequence_default_8;
-- can change the schema of a sequence that is not yet distributed
ALTER SEQUENCE seq_8 SET SCHEMA sequence_default_8;
ALTER SEQUENCE sequence_default_8.seq_8 SET SCHEMA public;
ALTER SEQUENCE sequence_default_8.seq_8 SET SCHEMA sequence_default;
CREATE TABLE seq_test_8 (x int, y int DEFAULT nextval('seq_8'));
SELECT create_distributed_table('seq_test_8', 'x');
-- cannot change sequence specifications
@ -209,7 +217,26 @@ CREATE TABLE seq_test_9 (x int, y int DEFAULT nextval('seq_9') - nextval('seq_10
SELECT create_distributed_table('seq_test_9', 'x');
-- clean up
DROP TABLE seq_test_0, seq_test_1, seq_test_2, seq_test_3, seq_test_4, seq_test_5, seq_test_6, seq_test_7, seq_test_8, seq_test_9;
DROP SEQUENCE seq_0, seq_1, sequence_2, seq_4, seq_6, seq_7, seq_7_par, seq_8, seq_9, seq_10;
-- Check some cases when default is defined by
-- DEFAULT nextval('seq_name'::text) (not by DEFAULT nextval('seq_name'))
SELECT stop_metadata_sync_to_node('localhost', :worker_1_port);
CREATE SEQUENCE seq_11;
CREATE TABLE seq_test_10 (col0 int, col1 int DEFAULT nextval('seq_11'::text));
SELECT create_reference_table('seq_test_10');
INSERT INTO seq_test_10 VALUES (0);
CREATE TABLE seq_test_11 (col0 int, col1 bigint DEFAULT nextval('seq_11'::text));
-- works but doesn't create seq_11 in the workers
SELECT start_metadata_sync_to_node('localhost', :worker_1_port);
-- works because there is no dependency created between seq_11 and seq_test_11
SELECT create_distributed_table('seq_test_11', 'col1');
-- insertion from workers fails
\c - - - :worker_1_port
INSERT INTO sequence_default.seq_test_10 VALUES (1);
\c - - - :master_port
-- clean up
DROP TABLE sequence_default.seq_test_7_par;
DROP SCHEMA sequence_default CASCADE;
SELECT run_command_on_workers('DROP SCHEMA IF EXISTS sequence_default CASCADE');
SELECT stop_metadata_sync_to_node('localhost', :worker_1_port);
SET search_path TO public;

View File

@ -202,5 +202,9 @@ SELECT worker_hash('(1, 2)'::test_composite_type);
SELECT citus_truncate_trigger();
-- make sure worker_create_or_alter_role does not crash with NULL input
SELECT worker_create_or_alter_role(NULL, NULL, NULL);
SELECT worker_create_or_alter_role(NULL, 'create role dontcrash', NULL);
-- confirm that citus_create_restore_point works
SELECT 1 FROM citus_create_restore_point('regression-test');

View File

@ -2,10 +2,12 @@
-- MULTI_SHARD_REBALANCER
--
CREATE TABLE dist_table_test(a int primary key);
SELECT create_distributed_table('dist_table_test', 'a');
SET citus.next_shard_id TO 433000;
CREATE TABLE ref_table_test(a int primary key);
SELECT create_reference_table('ref_table_test');
CREATE TABLE dist_table_test(a int primary key);
SELECT create_distributed_table('dist_table_test', 'a');
CREATE TABLE postgres_table_test(a int primary key);
-- make sure that all rebalance operations work fine when
-- reference tables are replicated to the coordinator
@ -20,6 +22,7 @@ CALL citus_cleanup_orphaned_shards();
-- test that calling rebalance_table_shards without specifying a relation
-- doesn't move the shard of the citus local table.
SET citus.next_shard_id TO 433100;
CREATE TABLE citus_local_table(a int, b int);
SELECT citus_add_local_table_to_metadata('citus_local_table');
INSERT INTO citus_local_table VALUES (1, 2);
@ -27,6 +30,20 @@ INSERT INTO citus_local_table VALUES (1, 2);
SELECT rebalance_table_shards();
CALL citus_cleanup_orphaned_shards();
-- Check that rebalance_table_shards and get_rebalance_table_shards_plan fail
-- for any type of table other than distributed tables.
SELECT rebalance_table_shards('ref_table_test');
SELECT rebalance_table_shards('postgres_table_test');
SELECT rebalance_table_shards('citus_local_table');
SELECT get_rebalance_table_shards_plan('ref_table_test');
SELECT get_rebalance_table_shards_plan('postgres_table_test');
SELECT get_rebalance_table_shards_plan('citus_local_table');
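-- A sketch, not part of the test, of what the planner returns for an actual
-- distributed table; the column list assumes the Citus 10.1 signature of
-- get_rebalance_table_shards_plan (one row per suggested shard move).
SELECT shardid, sourcename, sourceport, targetname, targetport
FROM get_rebalance_table_shards_plan('dist_table_test');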
-- Check that citus_move_shard_placement fails for shards belonging to reference
-- tables or citus local tables
SELECT citus_move_shard_placement(433000, 'localhost', :worker_1_port, 'localhost', :worker_2_port);
SELECT citus_move_shard_placement(433100, 'localhost', :worker_1_port, 'localhost', :worker_2_port);
-- show that citus local table shard is still on the coordinator
SELECT tablename FROM pg_catalog.pg_tables where tablename like 'citus_local_table_%';
-- also check that we can still access the shard relation, not the shell table
@ -83,7 +100,7 @@ SELECT pg_sleep(.1); -- wait to make sure the config has changed before running
SET citus.shard_replication_factor TO 2;
SELECT replicate_table_shards('dist_table_test_2', max_shard_copies := 4, shard_transfer_mode:='block_writes');
DROP TABLE dist_table_test, dist_table_test_2, ref_table_test;
DROP TABLE dist_table_test, dist_table_test_2, ref_table_test, postgres_table_test;
RESET citus.shard_count;
RESET citus.shard_replication_factor;
@ -909,6 +926,7 @@ SELECT * FROM get_rebalance_table_shards_plan('tab', rebalance_strategy := 'by_d
SELECT * FROM rebalance_table_shards('tab', rebalance_strategy := 'by_disk_size', shard_transfer_mode:='block_writes');
CALL citus_cleanup_orphaned_shards();
SELECT * FROM public.table_placements_per_node;
VACUUM FULL tab, tab2;
ANALYZE tab, tab2;
\c - - - :worker_1_port

View File

@ -66,6 +66,8 @@ def main(config):
common.run_pg_regress(config.old_bindir, config.pg_srcdir,
NODE_PORTS[COORDINATOR_NAME], AFTER_PG_UPGRADE_SCHEDULE)
citus_prepare_pg_upgrade(config.old_bindir)
# prepare should be idempotent; calling it a second time should never fail.
citus_prepare_pg_upgrade(config.old_bindir)
common.stop_databases(config.old_bindir, config.old_datadir)