Compare commits

...

173 Commits

Author SHA1 Message Date
Gürkan İndibay f44248c1d5
Bump Citus version to 11.0.10 (#7512) 2024-02-16 13:01:35 +03:00
Gürkan İndibay dcecefce68
Adds changelog for version 11.0.10 (#7510)
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2024-02-15 14:13:24 +00:00
Gürkan İndibay 23290bee6b
Removes pg_send_cancellation and all references (#7509)
Cherry pick from 371f094b68
2024-02-15 15:50:32 +03:00
Gürkan İndibay 5ea5da8ef5
Bump Citus version to 11.0.9 (#7501) 2024-02-13 16:39:50 +03:00
Gürkan İndibay 1dc16c7cd6
Adds changelog for version 11.0.9 (#7493)
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2024-02-12 12:28:20 +00:00
Teja Mupparti 0ae0a86d42 Fix the incorrect column count after ALTER TABLE; this fixes bug #7378 (please read the analysis in the bug report for more information)
(cherry picked from commit 00068e07c5)
2024-01-26 18:16:20 -08:00
Gokhan Gulbiz eca999d6fe
Backport GHA Migration to release-11.0 (#7300)
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2023-11-07 11:22:23 +02:00
Nils Dijk 76655957fd
Fix leaking of memory and memory contexts in Foreign Constraint Graphs (#7236)
DESCRIPTION: Fix leaking of memory and memory contexts in Foreign
Constraint Graphs

Previously, every time we (re)created the Foreign Constraint
Relationship Graph, we created a new memory context while losing the
reference to the previous context. That old context could still hold
leftover memory, causing a memory leak.

With this patch we keep a single static memory context that is lazily
initialized the first time we create the foreign constraint relationship
graph. On every subsequent creation, besides destroying the previous
hashmap, we also reset the memory context to free any leftover
allocations.
2023-10-09 13:13:52 +02:00
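The pattern described above maps onto PostgreSQL's standard memory-context API. A minimal sketch, using illustrative names rather than the actual Citus code:

```c
#include "postgres.h"
#include "utils/memutils.h"

/* One static context, created lazily and reused across re-creations. */
static MemoryContext ConstraintGraphContext = NULL;

static void
RecreateForeignConstraintGraph(void)
{
	if (ConstraintGraphContext == NULL)
	{
		/* first call: create the context once, parented to TopMemoryContext */
		ConstraintGraphContext = AllocSetContextCreate(TopMemoryContext,
													   "Foreign Constraint Graph",
													   ALLOCSET_DEFAULT_SIZES);
	}
	else
	{
		/* later calls: free everything allocated during the previous build */
		MemoryContextReset(ConstraintGraphContext);
	}

	MemoryContext oldContext = MemoryContextSwitchTo(ConstraintGraphContext);
	/* ... (re)build the hash map and graph nodes here ... */
	MemoryContextSwitchTo(oldContext);
}
```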
Hanefi Onaldi df50a2c0ea
Create a new colocation properly after breaking one
When breaking a colocation, we need to create a new colocation group
record in pg_dist_colocation for the relation. It is not sufficient to
have a new colocationid value in pg_dist_partition only.

This patch also fixes a bug when deleting a colocation group if no
tables are left in it. Previously we passed a relation id as a parameter
to the DeleteColocationGroupIfNoTablesBelong function, where we should
have passed a colocation id.

(cherry picked from commit c22547d221)
2023-09-05 11:45:27 +03:00
zhjwpku 57e8bb3891 PQputCopyData's return value 0 should be considered a failure (#7152) 2023-08-29 11:57:42 +02:00
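For context, libpq's PQputCopyData returns 1 when the data is queued, 0 when it could not be queued (possible only in nonblocking mode), and -1 on error, so treating anything other than 1 as a failure is the safe interpretation. A hedged sketch of such a check (the wrapper name is illustrative):

```c
#include <stdio.h>
#include <stdbool.h>
#include <libpq-fe.h>

/* Treat any PQputCopyData result other than 1 as a failure. */
static bool
PutCopyDataChecked(PGconn *connection, const char *buffer, int byteCount)
{
	if (PQputCopyData(connection, buffer, byteCount) != 1)
	{
		fprintf(stderr, "sending COPY data failed: %s",
				PQerrorMessage(connection));
		return false;
	}

	return true;
}
```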
onderkalaci 01c9ee30b5 Improve failure handling of distributed execution
Prior to this commit, the code would skip processing the
errors that happened for local commands.

Prior to https://github.com/citusdata/citus/pull/5379, it might
have made sense to let the execution continue. But, as of today,
if a modification fails on any placement, we can safely fail
the execution.

(cherry picked from commit b4008bc872)
2023-08-01 13:43:07 +03:00
Hanefi Onaldi 7603fe510a
Bump Citus version to 11.0.8 2023-04-25 14:23:44 +03:00
Hanefi Onaldi 367c22ac11
Add changelog entries for 11.0.8
(cherry picked from commit 214bc39a5a)
2023-04-25 14:23:43 +03:00
Gürkan İndibay f9b02ae2c7
Fix packaging test pipelines
We had #6737 fixing the same issue on the main branch, but we need to fix
it on release branches as well. As the patch does not cleanly apply to
earlier release branches, we added the fix manually.

(cherry picked from commit 4f9a344085)
2023-04-25 14:23:43 +03:00
Jelte Fennema 360537bc42 Use pg_total_relation_size in citus_shards (#6748)
DESCRIPTION: Correctly report shard size in citus_shards view

When looking at citus_shards, people are interested in the actual size
that all the data related to the shard takes up on disk.
`pg_total_relation_size` is the function to use for that purpose. The
previously used `pg_relation_size` does not include indexes or TOAST.
Especially the missing TOAST data can have an enormous impact on the size
of the shown data.

(cherry picked from commit b489d763e1)
2023-03-06 11:39:21 +01:00
aykut-bozkurt 01506e8a57 fix single tuple result memory leak (#6724)
We should not omit freeing the PGresult when we receive a single-tuple
result from an internal backend.
Single-tuple results are normally freed by our ReceiveResults for the
`tupleDescriptor != NULL` flow, but not for those with `tupleDescriptor
== NULL`. See PR #6722 for details.

DESCRIPTION: Fixes a memory leak issue with query results that return a
single row.

(cherry picked from commit 9e69dd0e7f)
2023-02-17 14:37:06 +03:00
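The underlying rule is that every PGresult handed out by libpq must be released with PQclear(), whether or not a tuple descriptor is involved. A minimal sketch of a drain loop doing this (illustrative, not the actual ReceiveResults code):

```c
#include <libpq-fe.h>

/* Consume and free every pending result on a connection. */
static void
DrainAndFreeResults(PGconn *connection)
{
	PGresult *result = PQgetResult(connection);

	while (result != NULL)
	{
		/* ... inspect PQresultStatus() / copy out the single tuple ... */

		/* the call that was missing on the tupleDescriptor == NULL path */
		PQclear(result);

		result = PQgetResult(connection);
	}
}
```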
Jelte Fennema 5729f9e690 Quote all identifiers that we use for logical replication (#6604)
In #6598 it was noticed that Citus could generate syntactically invalid
statements during logical replication. With #6603 we resolved the direct
issue by only generating valid subscription names. But there was also
the underlying problem that we did not escape certain identifier
strings. While in theory this should be okay, since we should only
generate names that are valid, this issue reiterated that we should not
take that for granted. As an extra line of defense, this change quotes
all identifiers we use during logical replication setup.

(cherry picked from commit c2b4087ff0)
2023-02-10 16:26:46 +01:00
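Inside the backend, that extra line of defense amounts to routing every generated name through quote_identifier() (and connection strings through quote_literal_cstr()) before splicing them into SQL. A sketch under that assumption; the function shape is illustrative:

```c
#include "postgres.h"
#include "lib/stringinfo.h"
#include "utils/builtins.h"

/* Build a CREATE SUBSCRIPTION command with all identifiers quoted. */
static char *
CreateSubscriptionCommand(const char *subscriptionName, const char *connInfo,
						  const char *publicationName)
{
	StringInfo command = makeStringInfo();

	appendStringInfo(command,
					 "CREATE SUBSCRIPTION %s CONNECTION %s PUBLICATION %s",
					 quote_identifier(subscriptionName),
					 quote_literal_cstr(connInfo),
					 quote_identifier(publicationName));

	return command->data;
}
```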
Gürkan İndibay be8cd00d3f Fixes validate Output phase of packaging pipeline (#6678)
Pyenv is installed in our container images, but I found out that pyenv
is not being activated, since it is activated from the ~/.bashrc script,
and in GitHub Actions (GHA) this script is not executed.
Since pyenv is not activated, the default Python version that comes with
the Docker images is used, and in this case we get errors for Python
version 3.11.
Additionally, the $HOME directory is /github/home for containers executed
under GHA, while our pyenv installation is under the /root directory,
which is normally the home directory for our packaging containers.
This PR activates pyenv and additionally uses the pyenv virtualenv
feature to execute the validate_output function in isolation.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
(cherry picked from commit d919506076)
2023-01-31 14:01:05 +03:00
Onur Tirtir b9e18406fa Fall-back to seq-scan when accessing columnar metadata if the index doesn't exist
Fixes #6570.

In the past, having columnar tables in the cluster was causing pg
upgrades to fail when attempting to access columnar metadata. This is
because pg_dump doesn't see the objects we use for columnar-am related
bookkeeping as dependencies of the tables using columnar-am.
To fix that, in #5456 we inserted some "normal dependency" edges (from
those objects to columnar-am) into pg_depend.

This helped us ensure the existence of a class of metadata objects
--such as columnar.storageid_seq-- and helped fix #5437.

However, the normal-dependency edges that we added for indexes on
columnar metadata tables --such as columnar.stripe_pkey-- didn't help at
all because they were causing dependency loops (#5510) and pg_dump was
not able to take those dependency edges into account.

For this reason, instead of inserting such dependency edges from indexes
to columnar-am, we allow columnar metadata accessors to fall back to
sequential scan during pg upgrades.

(cherry picked from commit 1c51ddae49)
2023-01-30 19:13:58 +03:00
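PostgreSQL's systable_beginscan() already supports exactly this kind of fallback: when indexOK is false (or the index OID is invalid), it scans the heap sequentially. A hedged sketch of the idea, with illustrative names:

```c
#include "postgres.h"
#include "access/genam.h"

/*
 * Prefer the index when it exists; otherwise fall back to a sequential
 * scan, as during pg_upgrade when the metadata index may be missing.
 */
static SysScanDesc
BeginColumnarMetadataScan(Relation metadataRelation, Oid indexId,
						  Snapshot snapshot, int keyCount, ScanKey scanKeys)
{
	bool indexOk = OidIsValid(indexId);

	return systable_beginscan(metadataRelation, indexId, indexOk,
							  snapshot, keyCount, scanKeys);
}
```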
aykut-bozkurt 3eab345b67 fix dropping table_name option from foreign table (#6669)
We should disallow dropping the table_name option if the foreign table
is in metadata. Otherwise, we get a table-not-found error which contains
the shard id.

DESCRIPTION: Fixes an unexpected foreign table error by disallowing dropping the table_name option.

Fixes #6663

(cherry picked from commit 8a9bb272e4)
2023-01-30 17:48:47 +03:00
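A sketch of what such a guard can look like while processing ALTER FOREIGN TABLE options; the helper name, metadata flag, and error wording are illustrative assumptions, not the exact Citus code:

```c
#include "postgres.h"
#include "nodes/parsenodes.h"
#include "nodes/pg_list.h"

/* Reject dropping the table_name option for tables in Citus metadata. */
static void
ErrorIfDropsTableNameOption(List *optionList, bool tableIsInCitusMetadata)
{
	ListCell *optionCell = NULL;

	foreach(optionCell, optionList)
	{
		DefElem *option = (DefElem *) lfirst(optionCell);

		if (tableIsInCitusMetadata &&
			option->defaction == DEFELEM_DROP &&
			strcmp(option->defname, "table_name") == 0)
		{
			ereport(ERROR, (errmsg("cannot drop the table_name option of "
								   "a foreign table that is in metadata")));
		}
	}
}
```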
Gokhan Gulbiz 9b441c21da Allow plain pg foreign tables without a table_name option (#6652)
(cherry picked from commit 4e26464969)
2023-01-30 17:48:07 +03:00
Ahmet Gedemenli 2027d592a1 Fix crash when trying to replicate a ref table that is actually dropped (#6595)
DESCRIPTION: Fix crash when trying to replicate a ref table that is actually dropped

See #6592.
We should have a real solution for it.

(cherry picked from commit bc3383170e)
(cherry picked from commit 9e32e34313)
2023-01-10 14:50:01 +03:00
Ahmet Gedemenli 1f03f13665 Use %u instead of %i for naming subscriptions & roles 2023-01-06 17:01:39 +03:00
Gürkan İndibay db7bd8e1a3
Add jobs to test builds on different distros (release-11.0) (#6542)
With this PR, Citus code will be tested in all packaging environments.
Sometimes there can be compile errors which block packaging, and in
that case unplanned delays may occur.
By testing the code in packaging environments, I'm aiming to detect any
compilation errors before packaging.

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>

2022-12-05 14:23:42 +03:00
Jelte Fennema bd7ee60991 Correctly fix OpenSSL 3.0 warnings (#6502)
In #6038 I tried to fix OpenSSL 3.0 warnings with PG13, but I had made a
mistake when doing that. This actually fixes these warnings.

(cherry picked from commit a477ffdf4b)
2022-12-05 09:30:57 +01:00
Jelte Fennema a9a9aa1098 Fix compilation warning on PG13 + OpenSSL 3.0 (#6038)
This removes some warnings that are present when building on Ubuntu 22.04.
It removes warnings on PG13 + OpenSSL 3.0. OpenSSL 3.0 has marked some
functions that we use as deprecated, but we want to continue to support
OpenSSL 1.0.1 for the time being too. This indicates that to OpenSSL 3.0,
so it doesn't show warnings.

(cherry picked from commit 3fadb98380)
2022-12-05 09:30:54 +01:00
Teja Mupparti 9aea377ce5 Fix the dangling pointer bug in get_merged_argument_list()
(cherry picked from commit edaf88e0ff)
2022-11-22 10:49:04 -08:00
Onur Tirtir 96e20ee42e Fix dangling pointer warning in AnyTableReplicated (#6504)
DESCRIPTION: Fixes a potential dangling pointer issue

Need to backport to 11.0 & 11.1 since we might want to release packages
for debian/bookworm based on those branches in the future.

(cherry picked from commit 80faf47ab5)
2022-11-21 16:43:28 +03:00
Hanefi Onaldi a005db871a
Bump Citus version to 11.0.7 2022-11-08 12:12:17 +03:00
Hanefi Onaldi 011f5cfa83
Add changelog entries for 11.0.7 2022-11-08 12:11:16 +03:00
Naisila Puka b1a72bb822 Fixes empty password issue (#6417)
(cherry picked from commit 89aa9a015f)
2022-10-11 15:59:12 +03:00
Onur Tirtir 39225963c7 Use memcpy instead of memcpy_s to avoid pointless limits in columnar (#6419)
DESCRIPTION: Raises memory limits in columnar from 256MB to 1GB for
reads and writes

This doesn't completely fix #5918 but at least raises the
buffer limits that might otherwise cause an error when reading
from or writing into columnar storage. A far better approach
to fix this is documented in #6420.

Replacing memcpy_s with memcpy is quite safe in those places
since we already make sure to allocate enough memory
before writing into the related buffers.

(cherry picked from commit 0b81f68def)
2022-10-11 15:01:12 +03:00
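The last paragraph is what makes plain memcpy safe here: the destination is allocated to exactly the required size right before the copy, so memcpy_s's fixed cap adds no protection. A minimal sketch of the pattern:

```c
#include "postgres.h"

/*
 * The buffer is palloc'd to exactly dataLength bytes, so a bounded copy
 * with an arbitrary 256MB (or 1GB) cap adds no safety over memcpy.
 */
static char *
CopyIntoNewBuffer(const char *data, Size dataLength)
{
	char *buffer = palloc(dataLength);

	memcpy(buffer, data, dataLength);

	return buffer;
}
```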
Onur Tirtir 99697fb1e5 Fix use-after-free in GetAlterTriggerStateCommand() (#6413)
Fix use-after-free in GetAlterTriggerStateCommand() introduced in #6398.

(cherry picked from commit 517b72a9d5)
2022-10-10 16:38:52 +03:00
Onur Tirtir 9929d9240e Retain trigger settings when re-creating the triggers (on shards) (#6398)
Fixes https://github.com/citusdata/citus/issues/6394.

DESCRIPTION: Fixes a bug that causes disabled triggers to be created as
enabled on shards

Since CREATE TRIGGER doesn't have syntax support to specify
whether the trigger should be enabled/disabled, the underlying
PG function (`pg_get_triggerdef()`) that we use to generate the
command to create the trigger is not enough. For this reason, we
append a second command to enable/disable the trigger, right after
creating it.

We also don't retain explicit extension dependencies set by using
`ALTER TRIGGER ... DEPENDS ON EXTENSION` commands, but apparently
the right fix for that is to throw an error as in
`PreprocessAlterTriggerDependsStmt()`; so a separate PR, #6399, was
opened to fix that.

(cherry picked from commit 86e186f671)
2022-10-10 11:24:01 +03:00
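The appended second command is plain ALTER TABLE ... ENABLE/DISABLE TRIGGER DDL driven by pg_trigger.tgenabled. A simplified sketch that ignores the REPLICA/ALWAYS firing states and uses illustrative names:

```c
#include "postgres.h"
#include "catalog/pg_trigger.h"
#include "lib/stringinfo.h"
#include "utils/builtins.h"

/*
 * Generate the follow-up command that restores the enabled state that a
 * pg_get_triggerdef()-based CREATE TRIGGER cannot express.
 */
static char *
EnableDisableTriggerCommand(const char *qualifiedShardName,
							const char *triggerName, char tgenabled)
{
	StringInfo command = makeStringInfo();
	const char *state = (tgenabled == TRIGGER_DISABLED) ? "DISABLE" : "ENABLE";

	appendStringInfo(command, "ALTER TABLE %s %s TRIGGER %s",
					 qualifiedShardName, state,
					 quote_identifier(triggerName));

	return command->data;
}
```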
Ying Xu d3757ff15d
[Columnar] 11.0 Cherry-Pick Bugfix for Columnar: options ignored during ALTER TABLE (#6410)
DESCRIPTION: Fixes a bug that prevents retaining columnar table options
after a table rewrite. A fix for issue #5927 (Columnar: options ignored
during ALTER TABLE rewrite).
The OID of the temporary table created during ALTER TABLE was not the
same as the original table's OID, so the columnar options were not being
applied during the rewrite.

The change is that I applied the original table's columnar options to
the new table so that it has the correct options during the write. I
also added a test.

cherry-pick from commit f21cbe68f8
2022-10-09 22:30:59 -07:00
Naisila Puka 51da46c021 Use original relation to retrieve column name because of syscache (#6387)
During alter_distributed_table, we create a new table like the
original table but with the altered options.

To retrieve the name of the distribution column, we were using
the attribute syscache of the new table, since we already created
the new table as identical to the original table.

However, the attribute syscaches of these two tables are not
the same if the original table has dropped columns. The reason
is that dropped columns are all still present in the cache.
Hence, for example, the attnos would be different in the syscaches.

So, let's use the attribute syscache of the original table.
2022-10-06 12:11:35 +03:00
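The fix boils down to resolving the attribute name against the original relation's OID, whose attribute numbering still accounts for the dropped columns. A hedged sketch using the standard syscache-backed lookup:

```c
#include "postgres.h"
#include "utils/lsyscache.h"

/*
 * Look up the distribution column name on the ORIGINAL relation: a
 * freshly created copy has different attnos whenever the original
 * relation has dropped columns.
 */
static char *
OriginalDistributionColumnName(Oid originalRelationId,
							   AttrNumber distributionAttrNumber)
{
	/* get_attname() consults the attribute syscache; false = error if absent */
	return get_attname(originalRelationId, distributionAttrNumber, false);
}
```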
Hanefi Onaldi ff6358749d
Ensure no dependencies on the index before drop
(cherry picked from commit 11a9a3771f)
2022-10-04 21:08:39 +03:00
Hanefi Onaldi 628908e990
Document failing downgrades from 10.2-4 to 10.2-2
(cherry picked from commit 5ddd4754a2)
2022-10-04 21:08:39 +03:00
Hanefi Onaldi 3efafe49ba
Fix tests for missing downgrades
(cherry picked from commit 0efd6f7829)
2022-10-04 21:08:39 +03:00
Onur Tirtir b14bae6311 Disallow ON DELETE/UPDATE SET DEFAULT actions on columns that default to sequences (#6340)
Given that we drop DEFAULT nextval('sequence') expressions from
shard relation columns, allowing `ON DELETE/UPDATE SET DEFAULT`
on such columns might cause inserting NULL values as a result
of a delete/update operation.

For this reason, we disallow ON DELETE/UPDATE SET DEFAULT actions
on columns that default to sequences.

DESCRIPTION: Disallows having ON DELETE/UPDATE SET DEFAULT actions on
columns that default to sequences

Fixes #6339.

(cherry picked from commit a868cc049a)

 Conflicts (dropped those changes since pg15 is not supported on 11.0):
	src/test/regress/expected/pg15.out
	src/test/regress/sql/pg15.sql
2022-09-23 14:24:29 +03:00
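A sketch of the corresponding guard, using the parser's FKCONSTR_ACTION_SETDEFAULT action code; the function name and error wording are illustrative, not the exact Citus code:

```c
#include "postgres.h"
#include "nodes/parsenodes.h"

/*
 * Disallow SET DEFAULT actions on columns whose default is a sequence,
 * since shards have such defaults stripped and would insert NULLs.
 */
static void
ErrorIfSetDefaultOnSequenceColumn(char foreignKeyAction,
								  bool columnDefaultsToSequence)
{
	if (foreignKeyAction == FKCONSTR_ACTION_SETDEFAULT &&
		columnDefaultsToSequence)
	{
		ereport(ERROR, (errmsg("cannot use ON DELETE/UPDATE SET DEFAULT on "
							   "a column that defaults to a sequence")));
	}
}
```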
Onur Tirtir 308f9298d7 Include IntegerArrayTypeToList from worker_protocol.h instead of array_type.h
On Citus versions <= 11.0, IntegerArrayTypeToList() doesn't exist and
its helpers (DeconstructArrayObject() & ArrayObjectCount()) are defined
in worker_protocol.h. (See 9476f377b5).

So we add IntegerArrayTypeToList() into worker_protocol.c and include
IntegerArrayTypeToList from worker_protocol.h instead of array_type.h
in foreign_constraint.c.

This is needed to backport a868cc049a into
this (release-11.0) branch, see the next commit.
2022-09-23 14:24:22 +03:00
Onur Tirtir fcd0bdf370 Don't drop default column expressions from shard when adding local table to metadata (#6323)
As we did for GENERATED STORED columns in #4613, we should not drop
column default expressions that are not based on sequences from the
shard relation, since such expressions need to exist, e.g., for foreign
key actions.

For the column default expressions that are based on sequences we cannot
do much, so we need to disallow having ON DELETE SET DEFAULT actions on
such columns in a separate PR, see #6339.

Fixes #6318.

DESCRIPTION: Fixes a bug that might cause inserting incorrect DEFAULT
values when applying foreign key actions

(cherry picked from commit de24a3eda5)

 Conflicts (dropped those changes since pg15 is not supported on 11.0):
	src/test/regress/expected/pg15.out
	src/test/regress/sql/pg15.sql
2022-09-23 13:58:23 +03:00
Naisila Puka 498131b4f6 Use RelationGetPrimaryKeyIndex for citus catalog tables (#6262)
pg_dist_node and pg_dist_colocation have a primary key index, not a replica identity index.

Citus catalog tables are created in the public schema, where the primary key
index serves as the replica identity index by default. Later the Citus
catalog tables are moved to the pg_catalog schema.

During pg_upgrade, all tables are recreated, and given that pg_dist_colocation
is found in the pg_catalog schema, it is recreated in that schema; when it is
recreated it doesn't have a replica identity index, because catalog tables
have no replica identity.

Further action:
Do we even need to acquire this lock on the primary key index?
Postgres doesn't acquire such locks on indexes before deleting catalog tuples.
Also, catalog tuples don't have replica identities by definition.
2022-09-22 12:50:11 +03:00
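The relcache offers a direct lookup for the primary key index, which these catalog tables keep across pg_upgrade. A hedged sketch of locking via that lookup (the function name and lock level are illustrative; the signature matches the PG versions this branch targets):

```c
#include "postgres.h"
#include "storage/lmgr.h"
#include "utils/relcache.h"

/*
 * Lock the primary key index rather than the replica identity index,
 * which Citus catalog tables lose after pg_upgrade.
 */
static void
LockCatalogPrimaryKeyIndex(Relation catalogRelation)
{
	Oid primaryKeyIndexId = RelationGetPrimaryKeyIndex(catalogRelation);

	if (OidIsValid(primaryKeyIndexId))
	{
		LockRelationOid(primaryKeyIndexId, RowExclusiveLock);
	}
}
```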
Jelte Fennema b2d0ac5a9c Revert to using old image tag for upgrade tests 2022-09-07 13:27:49 +02:00
Hanefi Onaldi 38f088d10c Normalize messages from different libpq versions
Historically we have been testing with the 'latest' version of libpq
from when the CI images were built. This has the downside that rebuilding
the images often breaks our tests due to different errors returned from
libpq.

With this change we will actually test with a stable version of libpq
that is based on the postgres minor version that we test against.

This will make it easier to maintain postgres images over time, as well
as running _all_ tests locally, where we change libpq in sync with the
postgres server version.

(cherry picked from commit f944f97d01)
2022-09-07 13:27:49 +02:00
Hanefi Onaldi 0ae1c630cf Introduce one new alternative text output to fix flakiness (#5913)
Here is a flaky test output that is quite hard to fix:

```diff
diff -dU10 -w /home/circleci/project/src/test/regress/expected/isolation_master_update_node_1.out /home/circleci/project/src/test/regress/results/isolation_master_update_node.out
--- /home/circleci/project/src/test/regress/expected/isolation_master_update_node_1.out.modified	2022-03-21 19:03:54.237042562 +0000
+++ /home/circleci/project/src/test/regress/results/isolation_master_update_node.out.modified	2022-03-21 19:03:54.257043084 +0000
@@ -49,18 +49,20 @@
  <waiting ...>
 step s2-update-node-1-force: <... completed>
 master_update_node
 ------------------

 (1 row)

 step s2-abort: ABORT;
 step s1-abort: ABORT;
 FATAL:  terminating connection due to administrator command
-SSL connection has been closed unexpectedly
+server closed the connection unexpectedly
+	This probably means the server terminated abnormally
+	before or while processing the request.
```

I could not come up with a solution that would decrease the flakiness in the test outputs. We already have 3 output files for the same test and now I introduced a 4th one.

I can also add complex regular expressions that span multiple lines, and normalize these error messages. Feel free to suggest a normalized error message in a comment here.

## Current alternative file contents

`isolation_master_update_node.out`
```
step s1-abort: ABORT;
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
SSL connection has been closed unexpectedly
```

`isolation_master_update_node_0.out`
```
step s1-abort: ABORT;
WARNING: this step had a leftover error message
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
```

`isolation_master_update_node_1.out`
```
step s1-abort: ABORT;
FATAL:  terminating connection due to administrator command
SSL connection has been closed unexpectedly
```

new file: `isolation_master_update_node_2.out`
```
step s1-abort: ABORT;
FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
```

(cherry picked from commit 518fb0873e)
2022-09-07 13:27:49 +02:00
Jelte Fennema 4d21903d3e Fix flakiness in failure_setup (#6205)
In CI sometimes failure_setup will fail with the following error:
```diff
 SELECT master_add_node('localhost', :worker_2_proxy_port);  -- an mitmproxy which forwards to the second worker
- master_add_node
----------------------------------------------------------------------
-               2
-(1 row)
-
+ERROR:  connection to the remote node localhost:9060 failed with the following error: could not connect to server: Connection refused
+	Is the server running on host "localhost" (127.0.0.1) and accepting
+	TCP/IP connections on port 9060?
+could not connect to server: Connection refused
+	Is the server running on host "localhost" (127.0.0.1) and accepting
+	TCP/IP connections on port 9060?
+could not connect to server: Cannot assign requested address
+	Is the server running on host "localhost" (::1) and accepting
+	TCP/IP connections on port 9060?
diff -dU10 -w /home/circleci/project/src/test/regress/expected/failure_online_move_shard_placement.out /home/circleci/project/src/test/regress/results/failure_online_move_shard_placement.out
```

This then breaks all the tests run after it as well, because we're
missing one worker node.

Locally I was able to reproduce this error by making the forked process
sleep for 10 seconds before actually starting mitmproxy. So I'm
expecting that what's happening in CI is that, due to limited resources,
mitmproxy is not up yet when we try to add its port as a worker node.

This PR fixes this by waiting until mitmproxy is listening on its socket
before actually starting to run our tests. This fixed it locally for me
when I made the forked process sleep for 10 seconds before starting
mitmproxy.

In passing it also improves the detection and errors that we already
had for the case where something was already listening on the
mitmproxy port.

Because both @gledis69 and I were changing things in our CI images
at the same time, this also includes a bump of the style checker tools.
Closes #6200

(cherry picked from commit 25e5cf2e50)
2022-09-07 13:27:49 +02:00
Gledis Zeneli c2b584c4bf Update stylechecker version (#6194)
Update stylechecker image to include versions similar to the other test images.

(cherry picked from commit 2b74735496)
2022-09-07 13:27:49 +02:00
Jelte Fennema 5901b815a2 Fix flakiness in adaptive_executor (#6275)
Sometimes in CI our adaptive_executor test would fail randomly with the
following error:

```diff
 SELECT sum(result::bigint) FROM run_command_on_workers($$
   SELECT count(*) FROM pg_stat_activity
   WHERE pid <> pg_backend_pid() AND query LIKE '%8010090%'
 $$);
  sum
 -----
-   4
+   2
 (1 row)

 END;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26665/workflows/40665680-0044-4852-8fe4-5fd628f9fb47/jobs/764371

This means that the low slow start interval did not have any effect on
the number of connections being opened. I could see two possibilities
for this to happen:
1. CI was slow and was actually still starting the second connection. I
   tried to solve this by doubling the time a query to the worker takes.
2. The second option is that the shards were queried in the opposite
   order than we expect. This would mean that the first query to the
   worker completes quickly because there's no sleep, since the shard
   doesn't contain any rows. I tried to solve this option by adding a
   row to each shard.

After trying to reproduce the random failure in CI it turned out that I
needed both of these fixes to resolve the random failure.

(cherry picked from commit f22a47981a)
2022-09-07 13:27:49 +02:00
Jelte Fennema 8451dd3554 Hopefully fix flakiness in drop_partitioned_table (#6270)
Sometimes in CI our drop_partitioned_table test would fail with the
following error:

```diff
 NOTICE:  issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
 NOTICE:  issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
 NOTICE:  issuing DROP TABLE IF EXISTS drop_partitioned_table.child1_727001 CASCADE
-NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
-NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
+NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
+NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
 ROLLBACK;
 NOTICE:  issuing ROLLBACK
 NOTICE:  issuing ROLLBACK
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26631/workflows/31536032-e1ba-493b-b12a-f40757f3a7d6/jobs/762170

For some reason the colocationid of the distributed partitioned table
would be one less than we expected. Why this happens I'm not sure, but
it seems fairly harmless that it does.

In an attempt to work around this flakiness I now reset the colocation
id sequence right before creating the table in question. This is good
practice in general, because it allows us to run the test successfully
using `check-minimal` and it also allows us to rerun it multiple times.

(cherry picked from commit 895a484b39)
2022-09-07 13:27:49 +02:00
Jelte Fennema 33913bed37 Fix flakiness in failure_connection_establishment (#6251)
In CI sometimes failure_connection_establishment would fail with the
following error:
```diff
 -- cancel all connections to this node
 SELECT citus.mitmproxy('conn.onAuthenticationOk().cancel(' || pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 SELECT * FROM citus_check_cluster_node_health();
```

The reason for this is that the mitm command that was used is very
broad and doesn't actually do what the comment says. What happens is
that if any connection is made, the current backend is cancelled, which
is not always the same as the backend that made the connection. My
assessment is that likely the maintenance daemon makes a connection to
the node while we are executing the mitmproxy command. The mitmproxy
command goes through, and then triggers a cancel of itself due to the
connection made by the maintenance daemon.

This PR simply removes this test, since it doesn't seem to test what it
intended to test anyway. There's also still the "kill" version of this
test, which does do the intended thing. So I don't think we lose
important coverage by removing this test.

(cherry picked from commit 2a0c0b3ba6)
2022-09-07 13:27:49 +02:00
Jelte Fennema 77596cd62f Fix flakiness in multi_transaction_recovery (#6249)
Sometimes in CI multi_transaction_recovery would fail with the following
error:
```diff
 SET LOCAL citus.defer_drop_after_shard_move TO OFF;
 SELECT citus_move_shard_placement((SELECT * FROM selected_shard), 'localhost', :worker_1_port, 'localhost', :worker_2_port, shard_transfer_mode := 'block_writes');
- citus_move_shard_placement
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  could not find placement matching "localhost:57637"
+HINT:  Confirm the placement still exists and try again.
 COMMIT;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26510/workflows/8269ea93-d9b4-4376-ae0e-8332a5c15fc6/jobs/755548

The reason for this was that when choosing `selected_shard` we didn't
ensure that it was actually located on the node that we were moving it
from. Instead we simply picked the first shard for the table that was
returned by the query.

To fix this issue this PR adds a filter to only choose shards that are
located on the intended node.

(cherry picked from commit 18015ca501)
2022-09-07 13:27:49 +02:00
Jelte Fennema bf63788b98 Fix flakiness in isolation_distributed_deadlock_detection (#6240)
Our isolation_distributed_deadlock_detection test would fail randomly in
CI in three different ways.

The first type of failure looked like this:

```diff
 check_distributed_deadlocks
 ---------------------------
 t
 (1 row)

-step s1-update-5: <... completed>
 step s5-update-1: <... completed>
 ERROR:  canceling the transaction since it was involved in a distributed deadlock
+step s1-update-5: <... completed>
 step s1-commit:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26399/workflows/d213ee85-397a-467a-9ffb-39e4f44e6688/jobs/749533

This random change in output was harmless and happened because when the
deadlock detector cancelled a query, two queries would continue: The one
that was cancelled would throw an error (and thus complete), and the one
that was unblocked would now complete.

It was random which of the two the isolation tester would first detect
as completed. To resolve this, the PR starts using the ["marker" feature][1],
which allows us to make sure one of the steps won't be marked as
completed until the other one has completed first.

The second random failure was very similar:
```diff
 check_distributed_deadlocks
 ---------------------------
 t
 (1 row)

-step s2-update-2: <... completed>
-step s3-update-3: <... completed>
-ERROR:  canceling the transaction since it was involved in a distributed deadlock
 step s6-commit:
   COMMIT;

 step s5-update-6: <... completed>
+step s2-update-2: <... completed>
+step s3-update-3: <... completed>
+ERROR:  canceling the transaction since it was involved in a distributed deadlock
 step s5-commit:
```

Again a harmless difference in test output. In this case it's possible
that the deadlock detector would not detect the unblocked processes
right away, and would thus continue to the next step. This step was
a commit on a session that was not blocked, and which thus could
complete without issues.

To solve this I changed the order of the commits at the end of the
permutation, to always have the first session that would commit be the
session that would be unblocked last. This ensures that no commit
will ever be executed before completing all the queries.

The third issue was different and looked like this:
```diff
 step s4-update-5: <... completed>
 step s4-commit:
   COMMIT;

+step s1-update-4: <... completed>
+isolationtester: canceling step s3-update-4 after 5 seconds
 step s3-update-4: <... completed>
+ERROR:  canceling statement due to user request
+step s2-update-2: <... completed>
 step s3-commit:
   COMMIT;

-step s2-update-2: <... completed>
-step s1-update-4: <... completed>
 step s1-commit:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26411/workflows/9089beec-4f0f-4027-b4ce-0e84889afc06/jobs/750143

The reason for this failure is not entirely clear to me, but I was able
to remove the flakiness without impacting the goal of the test. What was
happening was that both `s1` and `s3` were waiting for `s4` to commit
and release its lock on row 4. For some reason it wasn't
deterministic which of the two sessions would be granted the lock after
it was released. The test expected `s3` to be granted the lock,
but sometimes it would be granted to `s1` instead, which would in turn
cause `s3` to still be blocked.

To solve this I simply removed `s1` completely from this test. It wasn't
actually part of the cycle that the deadlock detector should detect and
was an unrelated appendage:

```mermaid
  graph TD;
      s2-->s3;
      s3-->s4;
      s1-->s4;
      s4-->s5;
      s5-->s6;
      s6-->s5;
```

By removing `s1` completely there was no contention for the lock and
`s3` could always acquire it.

[1]: a73d6c87f2/src/test/isolation/README (L163-L188)

(cherry picked from commit 9749622399)
2022-09-07 13:27:49 +02:00
Jelte Fennema 45838a0139 Fix flakiness in multi_replicate_reference_table (#6235)
In CI multi_replicate_reference_table would sometimes fail like this:

```diff
 -- detects correctly that referecence table doesn't have replica identity
 SELECT replicate_reference_tables();
-ERROR:  cannot use logical replication to transfer shards of the relation initially_not_replicated_reference_table since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
+ERROR:  cannot use logical replication to transfer shards of the relation ref_table since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
 DETAIL:  UPDATE and DELETE commands on the shard will error out during logical replication unless there is a REPLICA IDENTITY or PRIMARY KEY.
 HINT:  If you wish to continue without a replica identity set the shard_transfer_mode to 'force_logical' or 'block_writes'.
```

Because `CitusTableTypeIdList` returns tables in heap order, it's
a bit random which one is first in the list. And the test contained
multiple tables that didn't have a primary key or replica identity. So
it made sense that the error could be for either one of these tables.
This PR makes the test output consistent by changing one of the tables
to have a primary key.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26387/workflows/fc3196e7-ddf2-4000-a70b-5ac71c836321/jobs/748940

(cherry picked from commit 5c0205ce10)
2022-09-07 13:27:49 +02:00
Jelte Fennema f87940221f Fix flakiness in ch_benchmarks_1 (#6228)
One of our arbitrary config tests would sometimes fail like this in CI:
```diff
     su_nationkey,
     cust_nation,
     l_year;
- supp_nation | cust_nation | l_year | revenue
----------------------------------------------------------------------
-           9 | C           |   2008 |    3.00
-(1 row)
-
+ERROR:  cannot connect to localhost:10212 to fetch intermediate results
+CONTEXT:  while executing command on localhost:10211
```

When looking at the logs it seems like we were running out of
connections:
```
2022-08-23 14:03:52.856 UTC [28122] FATAL:  sorry, too many clients already
2022-08-23 14:03:52.860 UTC [21027] ERROR:  cannot connect to localhost:10212 to fetch intermediate results
```

This happened with `CitusThreeWorkersManyShards` config. This test on
purpose tries to push the limits of Citus quite far. And the
`ch_benchmarks_1` test is also run in parallel with a few more ones. So
it's not too weird that it ran out of connections. This doubles the
connection limit in the arbitrary config tests to hopefully not hit this
error again.

Example of failed test: https://app.circleci.com/pipelines/github/citusdata/citus/26365/workflows/7a1b5688-85cc-4bc3-ade5-9bd1d83cd0ed/jobs/747908/parallel-runs/1

(cherry picked from commit 21780b4f65)
2022-09-07 13:27:49 +02:00
Jelte Fennema 5d60bbf7f8 Better test failure debugging for arbitrary-configs (#5861)
This improves debugging of arbitrary configs in two ways:
1. Enable logging of distributed deadlock detection
2. Show output of `psql` commands

(cherry picked from commit a645cb4b94)
2022-09-07 13:27:49 +02:00
Jelte Fennema 0fd78b1bde Fix flakiness in failure_connection_establishment (#6226)
In CI our failure_connection_establishment sometimes failed randomly
with the following error:
```diff
 -- verify a connection attempt was made to the intercepted node, this would have cause the
 -- connection to have been delayed and thus caused a timeout
 SELECT * FROM citus.dump_network_traffic() WHERE conn=0;
  conn | source | message
 ------+--------+---------
-    0 | coordinator | [initial message]
-(1 row)
+(0 rows)

 SELECT citus.mitmproxy('conn.allow()');
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26318/workflows/d3354024-9a67-4b01-9416-5cf79aec6bd8/jobs/745558

The way I fixed this was by removing the dump_network_traffic call. This
might sound simple, but doing this while continuing to let the test
serve its intended purpose required quite some more changes.

This dump_network_traffic call was there because we didn't want to show
warnings in the queries above, because the exact warnings were not
reliable. The main reason they were not reliable was that we
were using round-robin task assignment. We did the same query twice, so
that it would hit the node with the intercepted connection in one of
those connections. Instead of doing that I'm now using the
"first-replica" policy and do the queries only once. This works, because
the first placements by placementid for each of the used tables are on
the second node, so first-replica will cause the first connection to go
there.

This solved most of the flakiness, but when confirming that the
flakiness was fixed I found some additional errors:

```diff
 -- show that INSERT failed
 SELECT citus.mitmproxy('conn.allow()');
  mitmproxy
 -----------

 (1 row)

 SELECT count(*) FROM single_replicatated WHERE key = 100;
- count
----------------------------------------------------------------------
-     0
-(1 row)
-
+ERROR:  could not establish any connections to the node localhost:9060 after 400 ms
 RESET client_min_messages;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26321/workflows/fd5f4622-400c-465e-8d82-83f5f55a87ec/jobs/745666

I addressed this with a combination of two things:
1. Only change citus.node_connection_timeout for the queries that we
   want to test timeout behaviour for. When those queries are done I
   reset the value to the default again.
2. Change our mitm framework to only delay the initial connection packet
   instead of all packets. I think sometimes a follow on packet of a previous
   connection attempt was causing the next connection attempt to be delayed
   even if `conn.allow()` was already called. For our tests we only care about
   connection timeouts, so there's no reason to delay any other packets than
   the initial connection packet.

Then there was some last flakiness in the exact error that was given:

```diff
 -- tests for connectivity checks
 SELECT name FROM r1 WHERE id = 2;
 WARNING:  could not establish any connections to the node localhost:9060 after 900 ms
+WARNING:  connection to the remote node localhost:9060 failed with the following error:
  name
 ------
  bar
 (1 row)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26338/workflows/9610941c-4d01-4f62-84dc-b91abc56c252/jobs/746467

I don't have a good explanation for this slight change in error message, but
given that it is missing the actual error message I expected this to be related
to some small difference in timing: e.g. the server responding to the connection
attempt right after the coordinator determined that the connection timed out.
To solve this last flakiness I increased the connection timeouts and made the
difference between the timeout and the delay a bit bigger. With these tweaks
I wasn't able to reproduce this error on CI anymore.

Finally, I made most of the same changes to failure_failover_to_local_execution,
since it was using the `conn.delay()` mitm method too. The only change that
I left out was the timing increase, since it might not be strictly necessary and
increases time it takes to run the test. If this test ever becomes flaky the first
thing we should try is increase its timeout.

(cherry picked from commit cc7e93a56a)
2022-09-07 13:27:49 +02:00
Jelte Fennema 583080b872 Fix flakiness in failure_single_select (#6223)
The failure_single_select test would sometimes fail with an error that's
similar to this:
```diff
 -- cancel after first SELECT; txn should fail and nothing should be marked as invalid
 SELECT citus.mitmproxy('conn.onQuery(query="^SELECT").cancel(' ||  pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 BEGIN;
```

This error looked very similar to the one from #6217 and indeed the cause
turned out to be similar. Because we were canceling all SELECT queries, we
would actually sometimes cancel our mitmproxy SELECT queries themselves.

This puts some additional restrictions on the queries that we cancel,
most importantly it should contain the name of the table that we're
selecting from.

I was able to reproduce the original issue locally pretty reliably. With
the changes in this PR it didn't happen again.

In passing this also changes one other failure test that was cancelling
all selects and puts similar additional restrictions on those
cancellations.

Example of failed test in CI: https://app.circleci.com/pipelines/github/citusdata/citus/26305/workflows/4d942b91-f83c-453c-8d9a-ae22d608e756/jobs/745071

(cherry picked from commit 506c16efdf)
2022-09-07 13:27:49 +02:00
Jelte Fennema 9b5027917a Fix flakiness in failure_create_distributed_table_non_empty (#6217)
The failure_create_distributed_table_non_empty test would sometimes fail
like this:
```diff
 -- in the first test, cancel the first connection we sent from the coordinator
 SELECT citus.mitmproxy('conn.cancel(' ||  pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 SELECT create_distributed_table('test_table', 'id');
```

Because the cancel command had no filter it would actually sometimes
cancel the mitmproxy cancel command itself. This PR addresses that by
filtering on CREATE TABLE, which is one of the commands that
create_distributed_table will send to the workers.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26252/workflows/1b7e5464-cca4-4ec1-99b3-48ddf25c29fa/jobs/742829

(cherry picked from commit e2a24b921e)
2022-09-07 13:27:49 +02:00
Jelte Fennema a3325c1146 Fix flakiness in columnar_memory test (#6216)
Sometimes in CI the columnar_memory test was using slightly more memory
than expected.
```diff
 SELECT CASE WHEN 1.0 * TopMemoryContext / :top_post BETWEEN 0.98 AND 1.02 THEN 1 ELSE 1.0 * TopMemoryContext / :top_post END AS top_growth
 FROM columnar_test_helpers.columnar_store_memory_stats();
--[ RECORD 1 ]-
-top_growth | 1
+-[ RECORD 1 ]------------------
+top_growth | 1.0206132116232119

 -- before this change, max mem usage while executing inserts was 28MB and
```

This PR changes the expectation to be slightly higher, such that this
random increase in memory usage doesn't cause a flaky test.

Failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26256/workflows/c0870f66-3346-4f8d-a1d3-36dfd7c98289/jobs/743028

(cherry picked from commit 4ce17f015b)
2022-09-07 13:27:49 +02:00
Jelte Fennema 242f40d640 Improve debuggability for columnar_memory flakiness (#6203)
Sometimes the columnar_memory test fails in CI with the following error:
```diff
 SELECT 1.0 * TopMemoryContext / :top_post BETWEEN 0.98 AND 1.02 AS top_growth_ok
 FROM columnar_test_helpers.columnar_store_memory_stats();
 -[ RECORD 1 ]-+--
-top_growth_ok | t
+top_growth_ok | f

 -- before this change, max mem usage while executing inserts was 28MB and
```

This is almost certainly a harmless failure that simply requires bumping
the margin a little bit. However, it's impossible to say with the
current output. I was unable to reproduce this on-demand on my local
machine or even in CI. So this changes the test to include the actual
value difference in the size of TopMemoryContext when it's outside the
expected range. Then, the next time it fails, we at least have some
information about why.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/25966/workflows/d472a57b-419a-4f33-b8bc-2e174a98d4d6/jobs/730576

(cherry picked from commit e6a1a86db0)
2022-09-07 13:27:49 +02:00
Jelte Fennema c1f58fac6c Don't run any isolation tests in parallel (#6212)
By running isolation tests in parallel we're just asking for flaky
tests. The first test might temporarily block one of the commands in the
second test, which we then detect as waiting like this:
```diff
 step s2-vacuum-analyze:
     VACUUM ANALYZE test_insert_vacuum;
-
+ <waiting ...>
 step s1-commit:
     COMMIT;

+step s2-vacuum-analyze: <... completed>
```

Debugging flaky tests is also much harder when they are run in parallel.
This PR starts running all our isolation tests sequentially.

The reason for opening this PR was me seeing this failing test:
https://app.circleci.com/pipelines/github/citusdata/citus/26194/workflows/ff57e2cf-8ac4-40fe-bc0c-74a7f8fecb53/jobs/740454

As well as having fixed a similar issue recently in #6122

(cherry picked from commit 85305b2773)
2022-09-07 13:27:49 +02:00
Jelte Fennema 3d67ae6497 Fix flakiness in failure_insert_select_repartition (#6202)
This fixes the failure test that most commonly fails randomly. The failing
diff is as follows:

```diff
SELECT citus.mitmproxy('conn.onQuery(query="fetch_intermediate_results").kill()');
  mitmproxy
 -----------

 (1 row)

 INSERT INTO target_table SELECT * FROM source_table;
-ERROR:  connection to the remote node localhost:xxxxx failed with the following error: connection not open
+ERROR:  could not open file "base/pgsql_job_cache/10_0_40/repartitioned_results_20770193413_from_4213590_to_1.data": No such file or directory
+CONTEXT:  while executing command on localhost:9060
+while executing command on localhost:57637
 SELECT * FROM target_table ORDER BY a;
```

As far as I can tell this failure is caused by a race condition: After killing
fetch_intermediate_results on worker 9060, the previously created data
file gets cleaned up. The fetch_intermediate_results call that's sent
to worker 57637 will be cancelled and rolled back soon because of the
failure on the other connection. But if that fetch_intermediate_results
call is able to connect to 9060 before it is cancelled, it won't find
the file it's looking for there anymore. So while it's not the error we
expect, it does indicate that we succeeded.

To avoid this issue instead of killing the fetch_intermediate_results
call directly, we kill the COPY command that it uses to do the fetch.
This results in stable output as can be seen here, where 227 runs of
failure_insert_select_repartition succeeded:
https://app.circleci.com/pipelines/github/citusdata/citus/26168/workflows/9c64a3b6-f46c-4725-9fb4-8f6a2d00a023/jobs/739389

To be clear, this changes the test to affect the opposite
fetch_intermediate_results call. This kills the fetch_intermediate_results
call of worker 57637, instead of killing the fetch_intermediate_results call
on worker 9060.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26147/workflows/780e95ea-264a-4c9f-ad2e-cf11449a795e/jobs/738467

(cherry picked from commit 8ce12eb51f)
2022-09-07 13:27:49 +02:00
Önder Kalacı 173b7ed5ad Properly add / remove coordinator for isolation tests (#6181)
We used to rely on a separate session to add the coordinator.
However, that might prevent the existing sessions from getting
assigned proper gpids, which causes flaky tests.

(cherry picked from commit 961fcff5db)
2022-09-07 13:27:49 +02:00
Jelte Fennema eed4d6452d Fix flakiness in columnar_first_row_number test (#6192)
When running columnar_first_row_number in parallel with the
columnar_query test, it would sometimes fail. This bug is tracked
in #6191. For now to make CI less flaky we simply don't run these tests
in parallel.

Example of failed test: https://app.circleci.com/pipelines/github/citusdata/citus/26106/workflows/75d00ea9-23f8-4bff-a927-bced19e1f81b/jobs/736713

Fixes #6184

(cherry picked from commit 0a045afd3a)
2022-09-07 13:27:49 +02:00
Jelte Fennema aa879108b7 Remove the flaky rollback_to_savepoint test (#6190)
This removes a flaky test that I introduced in #3868 after I fixed the
issue described in #3622. This test sometimes fails randomly in CI.
The way it fails indicates that there might be some bug: A connection
breaks after rolling back to a savepoint.

I tried reproducing this issue locally, but I wasn't able to. I don't
understand what causes the failure.

Things that I tried were:

1. Running the test with:
   ```sql
   SET citus.force_max_query_parallelization = true;
   ```
2. Running the test with:
   ```sql
   SET citus.max_adaptive_executor_pool_size = 1;
   ```
3. Running the test in parallel with the same tests that it is run in
   parallel with in multi_schedule.

None of these allowed me to reproduce the issue locally.

So I think it's time to give up on fixing this test and simply remove
it. The regression that this test protects against seems very unlikely
to reappear, since in #3868 I also added a big comment about the need
for the newly added `UnclaimConnection` call. So, I think the need for
the test is quite small, and removing it will make our CI less flaky.

In case the cause of the bug ever gets found, I tracked the bug in #6189

Example of a failing CI run:
https://app.circleci.com/pipelines/github/citusdata/citus/26098/workflows/f84741d9-13b1-4ae7-9155-c21ed3466951/jobs/736424

For reference the unexpected diff is this (so both warnings and an error):
```diff
 INSERT INTO t SELECT i FROM generate_series(1, 100) i;
+WARNING:  connection to the remote node localhost:57638 failed with the following error:
+WARNING:
+CONTEXT:  while executing command on localhost:57638
+ERROR:  connection to the remote node localhost:57638 failed with the following error:
 ROLLBACK;
```

This test is also mentioned as the most failing regression test in #5975

(cherry picked from commit d16b458e2a)
2022-09-07 13:27:49 +02:00
Jelte Fennema ee887ef648 Fix flakiness in create index concurrently isolation tests (#6158)
This creates consistent test output for isolation tests that involve
`CREATE INDEX CONCURRENTLY`. `CREATE INDEX CONCURRENTLY` is sometimes
temporarily detected as blocking, even though it will complete without any other
queries needing to be run. This change makes sure that we wait until that happens
without running any other queries in the meantime. This way we always get consistent
output. We do that by using an empty step in the same
session as the `CREATE INDEX CONCURRENTLY` command. Doing so forces
the isolation tester to wait until the command is finished and not continue with
steps from other sessions. This is [the recommended approach by Postgres][1].

There's two separate cases which are addressed in slightly different ways:
1. If `CREATE INDEX CONCURRENTLY` is actually blocked on another session: Add an
    empty step right after the commit of blocking session.
    e.g. `"s2-ddl-create-index-concurrently" "s1-commit" "s2-empty"`
2. If it's not actually blocked on another session: Add [an asterisk marker][2] to make
    it look like it's blocked (because sometimes this happens randomly) and right
    after that we add an empty step to trigger waiting.
    e.g. `"s2-ddl-create-index-concurrently"(*) "s2-empty" "s1-commit"`

In passing this also enables isolation tests that were disabled due to a
bug that has already been fixed for a while.

Fixes #5993
Related to #5910 and #2966

[1]: 5f0adec253/src/test/isolation/README (L197-L204)
[2]: 5f0adec253/src/test/isolation/README (L174-L179)

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
(cherry picked from commit fd07cc9baf)
2022-09-07 13:27:49 +02:00
Jelte Fennema a9a114145a Fix flakiness in isolation_data_migration.spec (#6122)
The isolation_concurrent_dml and isolation_data_migration tests
were being run in parallel, but they were interfering with each other's
output. Sometimes queries from isolation_concurrent_dml were blocking
create_distributed_table in isolation_data_migration:

1. https://app.circleci.com/pipelines/github/citusdata/citus/25562/workflows/f9d0a6ff-bb7a-4b71-9fcf-1a3e46d54425/jobs/713270
2. https://app.circleci.com/pipelines/github/citusdata/citus/25562/workflows/1e22454c-1623-48a7-97fb-c6803c7959c7/jobs/713223
3. https://app.circleci.com/pipelines/github/citusdata/citus/25562/workflows/618c419e-eefb-4582-9482-322dbb9ac96d/jobs/713110

This fixes it changing the schedule to not run these tests in parallel.

(cherry picked from commit dff71abc32)
2022-09-07 13:27:49 +02:00
Jelte Fennema 86614e3555 Fix flakiness in isolation_replicate_reference_tables_to_coordinator.spec (#6123)
When the deadlock detector kills s2-update-dist-table both sessions
finish at the same time. The order in which they are displayed can be
swapped. To counteract this we start using the ["marker" feature][1] of
the isolationtester framework to create consistent output.

In passing this also sets the next_shard_id to the value expected by
this test so it can be run using `make check-isolation-base`.

Failed CI test: https://app.circleci.com/pipelines/github/citusdata/citus/25562/workflows/dfe6f88a-c306-4d91-b771-d5d1deb1798d/jobs/713417

[1]: ec62ce55a8/src/test/isolation/README (L152)

(cherry picked from commit 8bbc1a45e1)
2022-09-07 13:27:49 +02:00
Hanefi Onaldi a29b689fc9 Replace isolation tester func only once on enterprise tests (#6064)
This is a continuation of a refactor (with commit sha
2b7cf0c097) that aimed to use Citus helper
UDFs by default in iso tests.

PostgreSQL isolation test infrastructure uses some UDFs to detect
whether concurrent sessions block each other. Citus implements
alternatives to that UDF so that we are able to detect and report
distributed transactions that get blocked on the worker nodes as well.

We needed to explicitly replace PG helper functions with Citus
implementations in each isolation file. Now we replace them by default.

(cherry picked from commit ae58ca5783)
2022-09-07 13:27:49 +02:00
Hanefi Onaldi b90523628f Replace iso tester func only once (#5964)
Use Citus helper UDFs by default in iso tests

PostgreSQL isolation test infrastructure uses some UDFs to detect
whether concurrent sessions block each other. Citus implements
alternatives to that UDF so that we are able to detect and report
distributed transactions that get blocked on the worker nodes as well.

We needed to explicitly replace PG helper functions with Citus
implementations in each isolation file. Now we replace them by default.

(cherry picked from commit 2b7cf0c097)
2022-09-07 13:27:49 +02:00
Jelte Fennema 75cf7a748d
Define symbols required for downgrade from 11.1 (#6301)
Since #6300/e29db74 changed the C symbol that our bigint overrides of
pg_cancel_backend and pg_terminate_backend call, we needed to do
something to keep these functions working after downgrading.

Recreating the old definition with a downgrade script is not really
possible, since people are expected to run the downgrade steps when
using the new .so file, which does not contain the old symbols.

So, the easiest way to solve it was also defining the new symbols in our
old Citus versions. Luckily our overrides haven't existed for long, so
these symbol definitions only needed to be backported to 11.0.
2022-09-07 12:18:39 +02:00
Marco Slot 5f57d77899 Allow citus_internal application_name with additional suffix (#6282)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-05 21:41:06 +02:00
Marco Slot 0a11da1291 Add an allow_unsafe_constraints flag for constraints without distribution column (#6237)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-25 16:13:07 +02:00
Gokhan Gulbiz 07143e7d12 Use the same colocation group for child and parent rels when altering a distributed table (#6225)
* Alter_distributed_table colocateWith:none bug fix for partitioned tables.

* Regression tests added for alter_distributed_table colocateWith:none for partitioned tables

* Update query comparison to be more accurate

(cherry picked from commit 69d2fcf5c0)
2022-08-25 11:47:06 +03:00
Marco Slot 006c8eacc0 Verify that we can replicate reference tables using rebalancer 2022-08-23 23:26:49 +02:00
Marco Slot 9dc6273b88 Set application_name to citus_rebalancer when copying reference tables 2022-08-23 23:26:49 +02:00
Onur Tirtir dbfdaca0f0 Add changelog entries for 11.0.6
(cherry picked from commit b6b8f198d9)
2022-08-19 11:09:46 +03:00
Onur Tirtir a5d6e841df Bump citus version to 11.0.6 2022-08-19 10:58:38 +03:00
Jelte Fennema 73e993e908
Fix flakiness in isolation_reference_table (#6193)
The newly introduced isolation_reference_table test had some flakiness,
because the assumption on how the arbitrary reference table gets chosen
was incorrect. This introduces a VACUUM FULL at the start of the test to
ensure the assumption actually holds.

Example of failed test: https://app.circleci.com/pipelines/github/citusdata/citus/26108/workflows/0a5cd526-006b-423e-8b67-7411b9c6be36/jobs/736802
2022-08-18 14:47:59 +02:00
Nils Dijk 08dee6fe08
Fix reference table lock contention (#6173)
DESCRIPTION: Fix reference table lock contention

Dropping and creating reference tables unintentionally blocked on each other due to the use of an ExclusiveLock both for the drop and for conditionally copying existing reference tables to (new) nodes.

The patch does the following:
 - Lower the lock level for dropping (reference) tables to `ShareLock` so they don't self-conflict
 - Treat reference tables and distributed tables equally and acquire the colocation lock when dropping any table that is in a colocation group
 - Perform the precondition check for copying reference tables twice, the first time with a lower lock that doesn't conflict with anything. It could have been NoLock; however, in preparation for dropping a colocation group, it is an `AccessShareLock`

During normal operation the first check will always pass and we don't have to escalate that lock, meaning we won't be blocked on adding and removing reference tables. Only after a node addition will the first `create_reference_table` still need to acquire an `ExclusiveLock` on the colocation group to perform the copy.
2022-08-18 13:22:31 +02:00
Onder Kalaci 87787dd146 Support Sequences owned by columns before distributing tables
There are 3 different ways that a sequence can interact
with tables. (1) and (2) are already supported. This commit adds
support for (3).

     (1) column DEFAULT nextval('seq'):

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
         ^                                     |
         |------------------ sequence <--------|

    (2) serial columns: Bigserial/small serial etc:

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
                                 ^             |
				 |             |
          		     sequence <--------|

   (3) Sequence OWNED BY table.column: Added support for
       this type of resolution in this commit.

       The dependency is almost like the following, and
       ExpandCitusSupportedTypes() is NOT responsible for finding
       the dependency.

        schema <--- table <--- column
                                 ^
				 |
          		     sequence

(cherry picked from commit 9ec8e627c1)
2022-08-18 11:22:25 +02:00
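A minimal sketch of case (3), with hypothetical names; the sequence is tied to the column only through OWNED BY, not through a DEFAULT expression:

```sql
CREATE TABLE items (id bigint NOT NULL);
CREATE SEQUENCE items_id_seq OWNED BY items.id;  -- no DEFAULT references the sequence
SELECT create_distributed_table('items', 'id');  -- the owned sequence still has to be resolved
```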
Marco Slot 56939f0d14
Fix relation access tracking for local only transactions on release-11.0 (#6182)
Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
2022-08-18 10:13:41 +02:00
Ahmet Gedemenli 7df8588107
Fix upgrade paths for 11.0 (#6171)
* Fix upgrade paths for 11.0
2022-08-17 21:34:23 +03:00
aykut-bozkurt e0b4455e45 sysid should be parsed as int. (#6150)
(cherry picked from commit 898801504e)
2022-08-11 11:03:41 +03:00
Onur Tirtir f8e3e8c444 Add CHANGELOG entries for 11.0.5 (#6108)
(cherry picked from commit 0a04b115aa)
2022-08-01 13:40:43 +03:00
Onur Tirtir a18f6c4e40 Bump citus version to 11.0.5 2022-08-01 10:56:31 +03:00
Jelte Fennema d6c885713e Work around flaky test related to search_path (#5894)
For some reason search_path is not always set correctly on the worker
when calling a distributed function, this shows up when calling
`insert_document` in our distributed_triggers test. The underlying
reason is currently unknown and warrants deeper investigation.

Currently this test is one of the main causes of random CI failures, so
this change sets the search_path of each function explicitly to reduce
these failures. That way other devs can be more efficient while I continue
investigating the root cause of this issue.

Also changes explicit `SET citus.enable_unsafe_triggers = false` to
`RESET citus.enable_unsafe_triggers` in passing.

(cherry picked from commit 6d8c5931d6)
2022-08-01 10:17:26 +03:00
Ying Xu a8aa82a3ec Bugfix for IN clause to be considered during planner phase in Columnar (#6030)
Reported bug #5803 shows that we are currently not sending the IN clause to our planner for columnar. This PR fixes it by checking for ScalarArrayOpExpr in ExtractPushdownClause so that we do not skip it. Also added a test case for this new addition.
2022-07-29 17:56:24 +02:00
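That is, filters of the following shape now reach the columnar pushdown logic instead of being skipped (table name hypothetical):

```sql
CREATE TABLE events (event_id int, payload text) USING columnar;
-- the IN clause is a ScalarArrayOpExpr and can now be used to skip chunks
SELECT count(*) FROM events WHERE event_id IN (10, 20, 30);
```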
Ahmet Gedemenli 2f1719c149
Do not create truncate triggers on foreign tables (#6103) 2022-07-29 16:43:09 +03:00
Marco Slot 4eb0749369 Avoid catalog read via superuser() call in DecrementSharedConnectionCounter 2022-07-29 14:22:51 +02:00
Marco Slot 4439124b6d Fix issues with insert..select casts and column ordering 2022-07-28 13:54:04 +02:00
Jelte Fennema 1cf079581f Avoid possible information leakage about existing users (#6090)
(cherry picked from commit 0f50bef696)
2022-07-27 17:58:24 +02:00
Ahmet Gedemenli 4d01af5160 Error out for views with circular dependencies (#6051)
Adds error check for views with circular dependencies

(cherry picked from commit 2b2a529653)
2022-07-27 17:59:49 +03:00
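A sketch of the kind of cycle the new check rejects; the cycle is closed with CREATE OR REPLACE VIEW, and names are hypothetical. Presumably the error surfaces when Citus resolves view dependencies, e.g. while distributing the underlying table:

```sql
CREATE TABLE base (a int);
CREATE VIEW v1 AS SELECT a FROM base;
CREATE VIEW v2 AS SELECT a FROM v1;
CREATE OR REPLACE VIEW v1 AS SELECT a FROM v2;  -- cycle: v1 -> v2 -> v1
SELECT create_distributed_table('base', 'a');   -- now errors out instead of recursing
```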
Marco Slot e45b6ece0d Allow WITH HOLD cursors with parameters 2022-07-27 14:08:18 +02:00
Onder Kalaci 9af736c7a6 Concurrent shard move/copy and colocated table creation fix
It turns out that create_distributed_table
and citus_move/copy_shard_placement do not
work well concurrently.

To fix that, we need to acquire a lock, which
sounds like a good use of the colocation lock.

However, the current usage of the colocation lock is
limited to higher level UDFs like rebalance_table_shards
etc. That usage of the lock is still useful, but
we cannot acquire the same lock in citus_move_shard_placement
etc. because the coordinator connects to itself to acquire
the lock. Hence, the high level UDF blocks itself.

To fix that, we use one more colocation lock, with the placements
as the main objects to consider.

(cherry picked from commit 12fa3aaf6b)
2022-07-27 10:10:46 +02:00
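The concurrent pair that the commit above serializes looks roughly like this; the shard id, node names and table names are hypothetical:

```sql
-- session 1: create a table colocated with an existing distributed table
SELECT create_distributed_table('t_new', 'id', colocate_with => 'existing_table');

-- session 2, concurrently: move a shard of the same colocation group
SELECT citus_move_shard_placement(102008, 'worker-1', 5432, 'worker-2', 5432);
```

With the placement-level colocation lock, one of the two sessions now waits for the other instead of interleaving.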
Onder Kalaci a21a4e128c Optimize StringJoin() for when prefix-postfix is needed
Before this commit, we required multiple copies of the
same stringInfo if we needed to append/prepend data to
the stringInfo. Now, we optionally get prefix/postfix.

For large string operations, this can save up to 10% of
memory.

(cherry picked from commit 26fdcb68f0)
2022-07-27 10:02:32 +02:00
Onder Kalaci 2a684e426c Do not cache all the metadata during fix_all_partition_shard_index_names
(cherry picked from commit f076e81166)
2022-07-27 10:02:05 +02:00
Onder Kalaci 377375de2a Reduce memory consumption while adjust partition index names
Previously, CreateFixPartitionShardIndexNames() created all
the relevant query strings for all the shards, and executed
the large query string. In terms of memory consumption,
this huge command (and the ExprContext generated while running
it) was the main bottleneck.

With this change, we are reducing the total amount of memory
usage to almost 1/shard_count.

On my local machine, for a distributed partitioned table with 120
partitions of 32 shards each, the total memory consumption dropped
from ~3GB to ~0.1GB, while the total execution time increased from
~28 seconds to ~30 seconds. This seems like a good trade-off.

(cherry picked from commit b8008999dc)
2022-07-27 10:02:00 +02:00
Nitish Upreti fcdf4434c6
Fix blocking shard moves failure due to constraint failure.
DESCRIPTION:
Fix bug #4949 where blocking shard moves fail if there is a foreign key between partitioned distributed tables (from child to parent). This is because we try to create constraints before attaching child partitions to the parent, which causes a constraint failure as the parent table will be empty. The fix is to reverse the order, i.e., attach partitions before we create constraints.

TESTING:
Added a new test 'shard_move_constraints_blocking', inspired by the existing 'shard_move_constraints', where we trigger the shard move with 'block_writes' instead of 'force_logical' to add coverage for this scenario.
2022-07-24 21:21:25 -07:00
Hanefi Onaldi 5ca792aef9
Bump Citus version to 11.0.4 2022-07-13 18:06:04 +03:00
Hanefi Onaldi 096047bfbc
Add changelog entry for 11.0.4 2022-07-13 18:05:19 +03:00
Onder Kalaci c51095c462 Add more generic read-replica tests
(cherry picked from commit 6cd7319f12)
2022-07-13 15:16:04 +02:00
Onder Kalaci 857a770b86 Add regression tests for LOCK command citus.use_secondary_nodes=always mode
(cherry picked from commit 3c343d4563)
2022-07-13 15:15:52 +02:00
Onder Kalaci 06e55df141 Make sure citus_is_coordinator works on read replicas
(cherry picked from commit b2e9a5baf1)
2022-07-13 15:15:46 +02:00
Onder Kalaci 06d6ffbb6e LOCK COMMAND does not require primaries at the start
(cherry picked from commit 8ab696f7e2)
2022-07-13 15:15:40 +02:00
Hanefi Onaldi 2bb106508a
Bump Citus version to 11.0.3 2022-07-05 13:19:10 +03:00
Hanefi Onaldi 5f46f2e9f7
Add changelog entry for 11.0.3
(cherry picked from commit c33915c3e6)
2022-07-05 13:19:10 +03:00
Ahmet Gedemenli ac7511de7d Fix matviews for citus_add_local_table_to_metadata (#6023)
(cherry picked from commit c8e1e243b8)
2022-07-04 17:01:40 +03:00
Hanefi Onaldi 0eee7fd9b8
Fix downgrade scripts from 11.0-2 to 11.0-1
(cherry picked from commit f60809a6c1)

Conflicts:
	src/test/regress/expected/multi_extension.out
	src/test/regress/sql/multi_extension.sql
2022-06-29 22:52:07 +03:00
Önder Kalacı 03a4305e06
Fixes a bug that prevents upgrades when there are no worker nodes (#6037)
(cherry picked from commit bab4c0a8c3)
2022-06-29 14:36:24 +03:00
Onder Kalaci d397dd0dfe Fixes a bug that prevents upgrades when there are COMPRESSION and DEFAULT columns 2022-06-29 10:45:33 +02:00
Hanefi Onaldi 9d05c30c13
Bump Citus version to 11.0.2 2022-06-16 16:54:47 +03:00
Hanefi Onaldi bd02bd2dda
Add changelog entries for 11.0.2 (#6007)
(cherry picked from commit 26172636c9)
2022-06-16 16:53:40 +03:00
Ahmet Gedemenli b559ae5813 Fix creating stats bug when CREATE TABLE LIKE (#6006)
(cherry picked from commit 1ee3e8b7f4)
2022-06-16 12:45:23 +03:00
Jelte Fennema a01e45f3df Make enterprise features open source
This PR makes all of the features open source that were previously only
available in Citus Enterprise.

Features that this adds:
1. Non blocking shard moves/shard rebalancer
   (`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
   the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
   authentication options for different users, e.g. you can store
   passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
   in between coordinator and workers
10. Tracking distributed query execution times using
   citus_stat_statements (`citus.stat_statements_max`,
   `citus.stat_statements_purge_interval`,
   `citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
2022-06-16 08:09:45 +02:00
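For example, items 2, 3 and 7 above mean that statements like the following now work, and get propagated, in open source Citus (role and table names hypothetical):

```sql
CREATE ROLE reporting LOGIN;                        -- role propagation
GRANT SELECT ON dist_orders TO reporting;           -- grant propagation
ALTER TABLE dist_orders ENABLE ROW LEVEL SECURITY;  -- row level security
```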
Marco Slot 0861c80c8b Fix bug in unqualified, non-existing DROP DOMAIN IF EXISTS
(cherry picked from commit ee34e1ed9d)
2022-06-15 16:53:25 +02:00
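The failing shape was presumably as simple as (domain name hypothetical):

```sql
DROP DOMAIN IF EXISTS no_such_domain;  -- unqualified and non-existing
```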
Burak Velioglu de6373b842 Fix dropping temporary view without specifying the explicit schema name
(cherry picked from commit 4d533c3c56)
2022-06-15 16:36:52 +02:00
Ahmet Gedemenli 4345627480 Fix materialized view intermediate result filename (#5982)
(cherry picked from commit 268d3fa3a6)
2022-06-14 15:43:18 +03:00
Onder Kalaci 978d31f330 Use citus_finish_citus_upgrade() in the tests
We already have tests relying on citus_finalize_upgrade_to_citus11().
Now, adjust those to rely on citus_finish_citus_upgrade() and
always call citus_finish_citus_upgrade().
2022-06-13 13:28:41 +02:00
Marco Slot 4bcffce036 Introduce a citus_finish_citus_upgrade() function 2022-06-13 13:28:31 +02:00
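A sketch of the resulting upgrade flow; CALL is shown on the assumption that the routine is a procedure, adjust to SELECT if it is defined as a plain function:

```sql
ALTER EXTENSION citus UPDATE;
CALL citus_finish_citus_upgrade();  -- completes any remaining upgrade steps
```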
Halil Ozan Akgul 7166901492 Fixes the bug where undistribute can drop Citus extension
(cherry picked from commit b255706189)
2022-06-01 18:56:56 +03:00
Hanefi Onaldi 8ef705012a
Add normalization rules for flaky isolation tests
We remove `<waiting ...>` and `<... completed>` outputs for some CREATE
INDEX CONCURRENTLY commands since they can cause flakiness in some scenarios.

Postgres calls WaitForOlderSnapshots() and this can cause CREATE INDEX
CONCURRENTLY commands for shards to get blocked by each other for brief
periods of time. The extra waits can pop up, or they can get completed
at different lines in the output files. To remedy that, we rename those
indexes to be captured by the new normalization rule.

(cherry picked from commit 52541c5802)
2022-06-01 16:12:01 +03:00
Hanefi Onaldi 530aafd8ee
Grep logs for deterministic global_cancel test results (#5948)
(cherry picked from commit 313104ab9b)
2022-06-01 16:12:01 +03:00
Gledis Zeneli c440cbb643 Fix memory error with citus_add_node reported by valgrind test (#5967)
The error comes from the jsonb datum in pg_dist_metadata_node.metadata being 0 in some scenarios. This is likely due to not copying the data when receiving a datum from a tuple, and pg deciding to deallocate that memory when the table that the tuple came from is closed.
Also fix another place in the code that might have been susceptible to this issue.
I tested on both multi-vg and multi-1-vg and the tests were successful.

(cherry picked from commit beef392f5a)
2022-06-01 13:06:54 +03:00
gledis69 a64e135a36 Revert "Copy data from heap tuples instead of using references"
This reverts commit 50e8638ede.
2022-06-01 13:06:38 +03:00
gledis69 50e8638ede Copy data from heap tuples instead of using references
The general rule is:
If the data is used within the bounds of table_open ... table_close -> no need to copy
If the data is required for use even after the table is closed -> copy

(cherry picked from commit dc9da7630f)
2022-06-01 12:27:11 +03:00
jeff-davis b34b1ce06b Columnar: fix wraparound bug. (#5962)
columnar_vacuum_rel() now advances relfrozenxid.

Fixes #5958.

(cherry picked from commit 74ce210f8b)
2022-05-31 07:46:12 -07:00
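In other words, a plain VACUUM on a columnar table should now move relfrozenxid forward, which can be spot-checked roughly like this (table name hypothetical):

```sql
VACUUM my_columnar_table;
SELECT relfrozenxid FROM pg_class WHERE relname = 'my_columnar_table';
```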
Onder Kalaci 0d0dd0af1c Show that no metadata is sent when disabled
(cherry picked from commit 89c1ccb7a5)
2022-05-30 17:01:49 +02:00
Onder Kalaci 3227d6551e Do not send metadata changes during add node if citus.enable_metadata_sync is set to false
(cherry picked from commit 7157152f6c)
2022-05-30 17:01:44 +02:00
Onder Kalaci d147d5d0c5 Avoid assertion failure on citus_add_node
(cherry picked from commit 010a2a408e)
2022-05-30 17:01:38 +02:00
Ahmet Gedemenli 4b5f749c23 Propagate dependent views upon distribution (#5950)
(cherry picked from commit 26d927178c)
2022-05-26 18:58:04 +03:00
Burak Velioglu 29c67c660d Create views and materialized views with the right schema and owner while
altering the distributed table.

To be able to alter the view's owner without enforcing sequential mode,
alter view processing functions have been updated to use the metadata
connection.
2022-05-25 10:42:54 +03:00
Gledis Zeneli 6da2d41e00 Do not obtain AccessShareLock before actual lock (#5965)
Do not obtain AccessShareLock before acquiring the distributed locks.

Acquiring an AccessShareLock ensures that the relations on which we are trying to get a distributed lock will not be dropped between the time the LOCK command is issued and the time the LOCK commands are sent to the workers. However, this also leads to distributed deadlocks in scenarios such as:

```sql
-- for dist lock acquiring order coor, w1, w2

-- on w2
LOCK t1 IN ACCESS EXCLUSIVE MODE;
-- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock

      -- concurrently on w1
      LOCK t1 IN ACCESS EXCLUSIVE MODE;
      -- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock
      -- acquire dist lock on coor, w1, gets blocked on local AccessShareLock on w2

-- on w2 continuation of the execution above
-- starts to acquire dist locks and gets blocked on the coor by the lock acquired by w1

-- distributed deadlock

```

We opt to avoid such deadlocks at the cost of possibly running into errors when the relations on which we are trying to acquire locks get dropped.

(cherry picked from commit 27ddb4fc8e)
2022-05-23 17:28:37 +03:00
Onder Kalaci 2d5560537b Due to new commits in master branch, outputs diverged 2022-05-23 09:36:38 +02:00
Onder Kalaci 8b0499c91a Parallelize metadata syncing on node activate
It is often useful to be able to sync the metadata in parallel
across nodes.

Also citus_finalize_upgrade_to_citus11() uses
start_metadata_sync_to_primary_nodes() after this commit.

Note that this commit does not parallelize all pieces of node
activation or metadata syncing. Instead, it tries to parallelize
potentially large parts of the metadata, namely the objects and
distributed tables (in general, Citus tables).

In the future, it would be nice to sync the reference tables
in parallel across nodes.

Create ~720 distributed tables / ~23450 shards
```SQL
-- declaratively partitioned table
CREATE TABLE github_events_looooooooooooooong_name (
  event_id bigint,
  event_type text,
  event_public boolean,
  repo_id bigint,
  payload jsonb,
  repo jsonb,
  actor jsonb,
  org jsonb,
  created_at timestamp
) PARTITION BY RANGE (created_at);

SELECT create_time_partitions(
  table_name         := 'github_events_looooooooooooooong_name',
  partition_interval := '1 day',
  end_at             := now() + '24 months'
);

CREATE INDEX ON github_events_looooooooooooooong_name USING btree (event_id, event_type, event_public, repo_id);
SELECT create_distributed_table('github_events_looooooooooooooong_name', 'repo_id');

SET client_min_messages TO ERROR;

```

across 1 node: almost same as expected
```SQL

SELECT start_metadata_sync_to_primary_nodes();
Time: 15664.418 ms (00:15.664)

select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 14284.069 ms (00:14.284)
```

across 7 nodes: ~3.5x improvement
```SQL

SELECT start_metadata_sync_to_primary_nodes();
┌──────────────────────────────────────┐
│ start_metadata_sync_to_primary_nodes │
├──────────────────────────────────────┤
│ t                                    │
└──────────────────────────────────────┘
(1 row)

Time: 25711.192 ms (00:25.711)

-- across 7 nodes
select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 82126.075 ms (01:22.126)
```

(cherry picked from commit dd02e1755f)
2022-05-23 09:25:31 +02:00
Onder Kalaci 513e073206 Fixes a bug that prevents dropping/altering indexes
There are two problems in this area. First, when the index contains
expressions, we should call `transformIndexExpression()` before
generating the index name. That is what Postgres does.

Second, because of 40c24bfef9
PG 13 and PG 14 generate different names for indexes with function calls, even for local PG tables.
Assume we have:
```SQL
create table t(id int);
select create_distributed_table('t', 'id');
create index ON t (my_very_boring_function(id));
```

On PG 13, the name of the index is `t_expr_idx`
```SQL
\d t
Table "public.t"
┌────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │  Type   │ Collation │ Nullable │ Default │
├────────┼─────────┼───────────┼──────────┼─────────┤
│ id     │ integer │           │          │         │
└────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "t_expr_idx" btree (my_very_boring_function(id::bigint))
```

On PG 14, the name of the index is `t_my_very_boring_function_idx`
```SQL
\d t
 Table "public.t"
┌────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │  Type   │ Collation │ Nullable │ Default │
├────────┼─────────┼───────────┼──────────┼─────────┤
│ id     │ integer │           │          │         │
└────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "t_my_very_boring_function_idx" btree (my_very_boring_function(id::bigint))

```

The second issue is not very critical. The important part is that
we adjust regression tests to drop all the indexes, which ensures
the index names are sane on any version.

(cherry picked from commit 2cc4053fc1)
2022-05-23 09:22:25 +02:00
Onder Kalaci 4b5cb7e2b9 Mark existing views as distributed when upgrading to 11.0+
We have a mechanism which ensures that newly distributed
objects are recorded during `alter extension citus update`.

However, the logic was missing views. With this commit, we make
sure that existing views are also marked as distributed during
upgrade.

(cherry picked from commit ee45e7bfbf)
2022-05-23 09:22:17 +02:00
Gledis Zeneli 97b453e679 Add TRUNCATE arbitrary config tests (#5848)
Adds TRUNCATE arbitrary config tests.
Also adds the ability to skip tests from particular configs.
2022-05-20 19:53:18 +02:00
Marco Slot 8c5035c0a5 Improve nested execution checks and add GUC to disable 2022-05-20 19:35:59 +02:00
Marco Slot 7c6784b1f4 Add caching for functions that check the backend type 2022-05-20 19:35:52 +02:00
Marco Slot 556f43f24a Fix prepared statement bug when switching from local to remote execution 2022-05-20 19:35:45 +02:00
gledis69 909b72b027 Add distributing lock command support
(cherry picked from commit 4731630741)
2022-05-20 18:02:34 +03:00
Gledis Zeneli 3f282c660b Switch to using LOCK instead of lock_relation_if_exists in TRUNCATE (#5930)
Breaking down #5899 into smaller PRs

This particular PR changes the way TRUNCATE acquires distributed locks on the relations it is truncating to use the LOCK command instead of lock_relation_if_exists. This has the benefit of using the recursive locking logic pg implements for the LOCK command, instead of us having to resolve relation dependencies and lock them explicitly. While this does not directly affect truncate, it will allow us to generalize this locking logic to then lock different relations where the pg recursive locking will become useful (e.g. locking views).

This implementation is a bit more complex than it needs to be because pg does not support locking foreign tables. We can, however, still lock foreign tables with lock_relation_if_exists. So for a command:

TRUNCATE dist_table_1, dist_table_2, foreign_table_1, foreign_table_2, dist_table_3;

We generate and send the following command to all the workers in metadata:
```sql
SET citus.enable_ddl_propagation TO FALSE;
LOCK dist_table_1, dist_table_2 IN ACCESS EXCLUSIVE MODE;
SELECT lock_relation_if_exists('foreign_table_1', 'ACCESS EXCLUSIVE');
SELECT lock_relation_if_exists('foreign_table_2', 'ACCESS EXCLUSIVE');
LOCK dist_table_3 IN ACCESS EXCLUSIVE MODE;
SET citus.enable_ddl_propagation TO TRUE;
```

Note that we need to alternate between the LOCK command and lock_relation_if_exists in order to preserve the TRUNCATE order of relations.
When pg supports locking foreign tables, we will be able to massively simplify this logic and send a single LOCK command.

(cherry picked from commit 4c6f62efc6)
2022-05-20 17:24:44 +03:00
Marco Slot 73fd4f7ded Allow distributed execution from run_command_on_* functions 2022-05-20 15:42:50 +02:00
Burak Velioglu 8229d4b7ee Add ALTER VIEW support
Adds support for propagating ALTER VIEW commands to
- Change owner of view
- SET/RESET option
- Rename view and view's column name
- Change schema of the view

Since PG also supports targeting views with ALTER TABLE
commands, related code was also added to direct such ALTER TABLE
commands to ALTER VIEW commands while sending them to workers.
2022-05-20 12:18:14 +03:00
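The supported forms listed above correspond to statements like these (names hypothetical):

```sql
ALTER VIEW active_users OWNER TO app_owner;             -- change owner
ALTER VIEW active_users SET (security_barrier = true);  -- SET/RESET option
ALTER VIEW active_users RENAME COLUMN uid TO user_id;   -- rename a column
ALTER VIEW active_users RENAME TO active_accounts;      -- rename the view
ALTER VIEW active_accounts SET SCHEMA analytics;        -- change schema
```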
Burak Velioglu 0cf769c43a Introduce CREATE/DROP VIEW
Adds support for propagating create/drop view commands and views to
worker nodes while scaling out the cluster. Since views are dropped while
converting the table type, the metadata connection will be used while
propagating view commands so as not to switch to sequential mode.
2022-05-20 12:18:02 +03:00
Burak Velioglu 591f2565cc Use object address instead of relation id on DDLJob to decide on syncing metadata 2022-05-20 12:17:56 +03:00
Ahmet Gedemenli ddfcbfdca1 Add tests for materialized views 2022-05-20 12:17:48 +03:00
Ahmet Gedemenli 16071fac1d Add view tests to arbitrary configs 2022-05-20 12:17:41 +03:00
Onder Kalaci 9c4e3329f6 Rename metadata sync to node metadata sync where applicable 2022-05-19 11:00:51 +02:00
Onder Kalaci 36f641c586 Serialize reference table modifications with node changes & restore point
With Citus MX enabled, when a reference table is modified, Citus performs
some operations on the first worker node (e.g., acquires locks).

If node metadata is locked (via add node or create restore point),
the changes to the reference tables should be blocked.
2022-05-19 11:00:51 +02:00
Onder Kalaci 5fe384329e Adds "sync" option to citus_disable_node() UDF 2022-05-19 11:00:51 +02:00
Marco Slot c20732142e Add a run_command_on_coordinator function 2022-05-19 10:41:10 +02:00
Marco Slot 082a14656d Fix downgrade scripts and add new downgrade tests 2022-05-19 10:37:56 +02:00
Marco Slot 33dede5b75 Add a citus_is_coordinator function 2022-05-19 10:36:22 +02:00
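Taken together, the two functions above allow scripts that work from any node; a minimal sketch, assuming run_command_on_coordinator takes the command as a single text argument:

```sql
SELECT citus_is_coordinator();  -- true only on the coordinator
SELECT run_command_on_coordinator('SELECT count(*) FROM pg_dist_node');
```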
Nils Dijk 5e4c0e4bea
Merge pull request #5931 from citusdata/refactor/dedupe-object-propagation
Refactor: reduce complexity and code duplication for Object Propagation
2022-05-18 18:06:24 +02:00
Ahmet Gedemenli c2d9e88bf5
Fix schema name bug for sequences (#5937) 2022-05-18 17:29:30 +02:00
Ahmet Gedemenli 88369b6b23
Merge pull request #5934 from citusdata/fix-alter-statistics-nspname
Fix alter statistics namespace name
2022-05-18 17:29:30 +02:00
Onder Kalaci b7a39a232d Refrain from reading the metadata cache for all tables during upgrade
First, it is not needed. Second, in the past we had issues regarding
this: https://github.com/citusdata/citus/pull/4344

When I create 10k tables (~120K shards), this saves
40MB of memory during ALTER EXTENSION citus UPDATE.

Before the change:  MetadataCacheMemoryContext: 41943040 ~ 40MB
After the change:  MetadataCacheMemoryContext: 8192

(cherry picked from commit f193e16a01)
2022-05-06 13:53:43 +02:00
Marco Slot e8b41d1e5b Convert citus.hide_shards_from_app_name_prefixes to citus.show_shards_for_app_name_prefixes 2022-05-05 13:24:23 +02:00
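After the rename, shard visibility becomes opt-in per application_name prefix rather than opt-out; a sketch of the intended usage (prefix list hypothetical):

```sql
-- only sessions whose application_name starts with one of these
-- prefixes will see shard relations
SET citus.show_shards_for_app_name_prefixes TO 'psql,pg_dump';
```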
Onder Kalaci b4a65b9c45 Do not set coordinator's metadatasynced column to false
After a disable_node

(cherry picked from commit 5fc7661169)
2022-04-25 09:35:00 +02:00
Onder Kalaci 6ca3478c8d Do not assign distributed transaction ids for local execution
In the past, for all modifications on the local execution,
we enabled 2PC (with 6a7ed7b309).

This also required us to enable coordinated transactions
via https://github.com/citusdata/citus/pull/4831 .

However, it does have a very substantial impact on the
distributed deadlock detection. The distributed deadlock
detection is designed to avoid single-statement transactions
because they cannot lead to any actual deadlocks.

The implementation is to skip backends that have no distributed
transaction id assigned. Now that we assign distributed transaction
ids to single-statement local executions, they show up in the lock
graphs, which conflicts with the design of distributed deadlock
detection.

In general, we should fix it. However, one might
think that it is not a big deal, even if the processes
show up in the lock graphs, the deadlock detection
should not be causing any false positives. That is
false, unless https://github.com/citusdata/citus/issues/1803
is fixed. Now that local processes are considered as a single
distributed backend, the lock graphs might find:

    local execution 1 [tx id: 1] -> any local process [tx id: 0]
    any local process [tx id: 0] -> local execution 2 [tx id: 2]

And, decides that there is a distributed deadlock.

This commit:
   (a) is the right thing to do, as local execution should not need any
       distributed tx id
   (b) eliminates performance issues that might come up when
       deadlock detection does a lot of unnecessary checks
   (c) recognizes that after moving local execution after the remote
       execution via https://github.com/citusdata/citus/pull/4301, the
       vague requirement for assigning distributed tx ids is
       already gone.

(cherry picked from commit a2debe0f02)
2022-04-25 09:34:32 +02:00
Hanefi Onaldi 86df61cae8
Bump Citus to 11.0.1_beta 2022-04-11 16:09:11 +03:00
Hanefi Onaldi e20a6dcd78
Add changelog entries for 11.0.1_beta
(cherry picked from commit 3ec1fc48fc)
2022-04-11 16:08:16 +03:00
Burak Velioglu 6eed51b75c
Create function in transaction according to create object propagation guc
(cherry picked from commit 5d9599f964)
2022-04-11 13:01:14 +03:00
Nils Dijk 675ba65f22
Implement DOMAIN propagation for citus 2022-04-08 16:18:02 +02:00
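A minimal sketch of what DOMAIN propagation enables (names hypothetical); a domain used by a distributed table is now created on the workers as a dependency:

```sql
CREATE DOMAIN positive_amount AS numeric CHECK (VALUE > 0);
CREATE TABLE payments (id bigint, amount positive_amount);
SELECT create_distributed_table('payments', 'id');  -- the domain is propagated first
```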
Marco Slot d611a50a80 Allow adding a unique constraint with an index 2022-04-07 16:41:10 +02:00
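That is, the PostgreSQL form that attaches an existing unique index as a constraint now works on distributed tables too (names hypothetical; the index must still cover the distribution column):

```sql
CREATE UNIQUE INDEX payments_id_idx ON payments (id);
ALTER TABLE payments
    ADD CONSTRAINT payments_id_key UNIQUE USING INDEX payments_id_idx;
```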
Marco Slot c5797030de Fix EXPLAIN ANALYZE JSON format for subplans 2022-04-07 16:00:12 +02:00
Marco Slot a74d991445 Handle user-defined type parameters in EXPLAIN ANALYZE 2022-04-07 11:37:43 +02:00
Marco Slot cb9e510e40 Add TABLESAMPLE support 2022-04-01 16:48:29 +02:00
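For example, sampling a distributed table now works as it does for a local one (table name hypothetical):

```sql
SELECT count(*) FROM payments TABLESAMPLE BERNOULLI (1);  -- roughly 1% of rows
```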
Onder Kalaci e336b92552 Only hide shards from client backends and pg bg workers
The aim of hiding shards is to hide shards from client applications.

Certain bg workers (such as pg_cron or the Citus maintenance daemon)
should be treated like client applications because users can run
queries from such bg workers. And, these bg workers should follow
the same application_name checks as client backends.

Certain other bg workers, such as logical replication or postgres'
parallel workers, should never hide shards. They are internal
operations.

Similarly the other backend types like the walsender or
checkpointer or autovacuum should never hide shards.

(cherry picked from commit 9043a1ed3f)
2022-03-30 17:44:03 +02:00
Hanefi Onaldi 4784d5579b
Bump Citus to 11.0.0_beta 2022-03-24 16:17:47 +03:00
744 changed files with 60321 additions and 12119 deletions

@ -1,724 +0,0 @@
version: 2.1
orbs:
codecov: codecov/codecov@1.1.1
azure-cli: circleci/azure-cli@1.0.0
parameters:
image_suffix:
type: string
default: '-vabaecad'
pg13_version:
type: string
default: '13.4'
pg14_version:
type: string
default: '14.0'
upgrade_pg_versions:
type: string
default: '13.4-14.0'
jobs:
build:
description: Build the citus extension
parameters:
pg_major:
description: postgres major version to build citus for
type: integer
image:
description: docker image to use for the build
type: string
default: citus/extbuilder
image_tag:
description: tag to use for the docker image
type: string
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
steps:
- checkout
- run:
name: 'Configure, Build, and Install'
command: |
./ci/build-citus.sh
- persist_to_workspace:
root: .
paths:
- build-<< parameters.pg_major >>/*
- install-<<parameters.pg_major >>.tar
check-style:
docker:
- image: 'citus/stylechecker:latest'
steps:
- checkout
- run:
name: 'Check Style'
command: citus_indent --check
- run:
name: 'Fix whitespace'
command: ci/editorconfig.sh && git diff --exit-code
- run:
name: 'Remove useless declarations'
command: ci/remove_useless_declarations.sh && git diff --cached --exit-code
- run:
name: 'Normalize test output'
command: ci/normalize_expected.sh && git diff --exit-code
- run:
name: 'Check for C-style comments in migration files'
command: ci/disallow_c_comments_in_migrations.sh && git diff --exit-code
- run:
name: 'Check for comments that start with # character in spec files'
command: ci/disallow_hash_comments_in_spec_files.sh && git diff --exit-code
- run:
name: 'Check for gitignore entries for source files'
command: ci/fix_gitignore.sh && git diff --exit-code
- run:
name: 'Check for lengths of changelog entries'
command: ci/disallow_long_changelog_entries.sh
- run:
name: 'Check for banned C API usage'
command: ci/banned.h.sh
- run:
name: 'Check for tests missing in schedules'
command: ci/check_all_tests_are_run.sh
- run:
name: 'Check if all CI scripts are actually run'
command: ci/check_all_ci_scripts_are_run.sh
- run:
name: 'Check if all GUCs are sorted alphabetically'
command: ci/check_gucs_are_alphabetically_sorted.sh
check-sql-snapshots:
docker:
- image: 'citus/extbuilder:latest'
steps:
- checkout
- run:
name: 'Check Snapshots'
command: ci/check_sql_snapshots.sh
test-pg-upgrade:
description: Runs postgres upgrade tests
parameters:
old_pg_major:
description: 'postgres major version to use before the upgrade'
type: integer
new_pg_major:
description: 'postgres major version to upgrade to'
type: integer
image:
description: 'docker image to use for the tests'
type: string
default: citus/pgupgradetester
image_tag:
description: 'docker image tag to use'
type: string
default: 12-13
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-<< parameters.old_pg_major >>.tar" --directory /
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-<< parameters.new_pg_major >>.tar" --directory /
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Install and test postgres upgrade'
command: |
gosu circleci \
make -C src/test/regress \
check-pg-upgrade \
old-bindir=/usr/lib/postgresql/<< parameters.old_pg_major >>/bin \
new-bindir=/usr/lib/postgresql/<< parameters.new_pg_major >>/bin
no_output_timeout: 2m
- run:
name: 'Regressions'
command: |
if [ -f "src/test/regress/regression.diffs" ]; then
cat src/test/regress/regression.diffs
exit 1
fi
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- run:
name: 'Copy pg_upgrade logs for newData dir'
command: |
mkdir -p /tmp/pg_upgrade_newData_logs
if ls src/test/regress/tmp_upgrade/newData/*.log 1> /dev/null 2>&1; then
cp src/test/regress/tmp_upgrade/newData/*.log /tmp/pg_upgrade_newData_logs
fi
when: on_fail
- store_artifacts:
name: 'Save regressions'
path: src/test/regress/regression.diffs
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- store_artifacts:
name: 'Save pg_upgrade logs for newData dir'
path: /tmp/pg_upgrade_newData_logs
- codecov/upload:
flags: 'test_<< parameters.old_pg_major >>_<< parameters.new_pg_major >>,upgrade'
test-arbitrary-configs:
description: Runs tests on arbitrary configs
parallelism: 6
parameters:
pg_major:
description: 'postgres major version to use'
type: integer
image:
description: 'docker image to use for the tests'
type: string
default: citus/failtester
image_tag:
description: 'docker image tag to use'
type: string
default: 12-13
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
resource_class: xlarge
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-<< parameters.pg_major >>.tar" --directory /
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Test arbitrary configs'
command: |
TESTS=$(src/test/regress/citus_tests/print_test_names.py | circleci tests split)
# Our test suite expects comma separated values
TESTS=$(echo $TESTS | tr ' ' ',')
# TESTS will contain the subset of configs that will be run on this container; we use multiple containers
# to run the test suite
gosu circleci \
make -C src/test/regress \
check-arbitrary-configs parallel=4 CONFIGS=$TESTS
no_output_timeout: 2m
- run:
name: 'Show regressions'
command: |
find src/test/regress/tmp_citus_test/ -name "regression*.diffs" -exec cat {} +
lines=$(find src/test/regress/tmp_citus_test/ -name "regression*.diffs" | wc -l)
if [ $lines -ne 0 ]; then
exit 1
fi
when: on_fail
- run:
name: 'Copy logfiles'
command: |
mkdir src/test/regress/tmp_citus_test/logfiles
find src/test/regress/tmp_citus_test/ -name "logfile_*" -exec cp -t src/test/regress/tmp_citus_test/logfiles/ {} +
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- store_artifacts:
name: 'Save logfiles'
path: src/test/regress/tmp_citus_test/logfiles
- codecov/upload:
flags: 'test_<< parameters.pg_major >>,upgrade'
test-citus-upgrade:
description: Runs citus upgrade tests
parameters:
pg_major:
description: 'postgres major version'
type: integer
image:
description: 'docker image to use for the tests'
type: string
default: citus/citusupgradetester
image_tag:
description: 'docker image tag to use'
type: string
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Install and test citus upgrade'
command: |
# run make check-citus-upgrade for all citus versions
# the image has ${CITUS_VERSIONS} set with all versions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-old-version=${citus_version} \
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
citus-post-tar=/home/circleci/project/install-$PG_MAJOR.tar; \
done;
# run make check-citus-upgrade-mixed for all citus versions
# the image has ${CITUS_VERSIONS} set with all versions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade-mixed \
citus-old-version=${citus_version} \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
citus-post-tar=/home/circleci/project/install-$PG_MAJOR.tar; \
done;
no_output_timeout: 2m
- run:
name: 'Regressions'
command: |
if [ -f "src/test/regress/regression.diffs" ]; then
cat src/test/regress/regression.diffs
exit 1
fi
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save regressions'
path: src/test/regress/regression.diffs
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- codecov/upload:
flags: 'test_<< parameters.pg_major >>,upgrade'
test-citus:
description: Runs the common tests of citus
parameters:
pg_major:
description: 'postgres major version'
type: integer
image:
description: 'docker image to use for the tests'
type: string
default: citus/exttester
image_tag:
description: 'docker image tag to use'
type: string
make:
description: 'make target'
type: string
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-${PG_MAJOR}.tar" --directory /
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Run Test'
command: |
gosu circleci make -C src/test/regress << parameters.make >>
no_output_timeout: 2m
- run:
name: 'Regressions'
command: |
if [ -f "src/test/regress/regression.diffs" ]; then
cat src/test/regress/regression.diffs
exit 1
fi
when: on_fail
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save regressions'
path: src/test/regress/regression.diffs
- store_artifacts:
name: 'Save mitmproxy output (failure test specific)'
path: src/test/regress/proxy.output
- store_artifacts:
name: 'Save results'
path: src/test/regress/results/
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- codecov/upload:
flags: 'test_<< parameters.pg_major >>,<< parameters.make >>'
when: always
tap-test-citus:
description: Runs tap tests for citus
parameters:
pg_major:
description: 'postgres major version'
type: integer
image:
description: 'docker image to use for the tests'
type: string
default: citus/exttester
image_tag:
description: 'docker image tag to use'
type: string
suite:
description: 'name of the tap test suite to run'
type: string
make:
description: 'make target'
type: string
default: installcheck
docker:
- image: '<< parameters.image >>:<< parameters.image_tag >><< pipeline.parameters.image_suffix >>'
working_directory: /home/circleci/project
steps:
- checkout
- attach_workspace:
at: .
- run:
name: 'Install Extension'
command: |
tar xfv "${CIRCLE_WORKING_DIRECTORY}/install-${PG_MAJOR}.tar" --directory /
- run:
name: 'Configure'
command: |
chown -R circleci .
gosu circleci ./configure
- run:
name: 'Enable core dumps'
command: |
ulimit -c unlimited
- run:
name: 'Run Test'
command: |
gosu circleci make -C src/test/<< parameters.suite >> << parameters.make >>
no_output_timeout: 2m
- run:
name: 'Copy coredumps'
command: |
mkdir -p /tmp/core_dumps
if ls core.* 1> /dev/null 2>&1; then
cp core.* /tmp/core_dumps
fi
when: on_fail
- store_artifacts:
name: 'Save tap logs'
path: /home/circleci/project/src/test/recovery/tmp_check/log
- store_artifacts:
name: 'Save core dumps'
path: /tmp/core_dumps
- codecov/upload:
flags: 'test_<< parameters.pg_major >>,tap_<< parameters.suite >>_<< parameters.make >>'
when: always
check-merge-to-enterprise:
docker:
- image: citus/extbuilder:<< pipeline.parameters.pg13_version >>
working_directory: /home/circleci/project
steps:
- checkout
- run:
command: |
ci/check_enterprise_merge.sh
ch_benchmark:
docker:
- image: buildpack-deps:stretch
working_directory: /home/circleci/project
steps:
- checkout
- azure-cli/install
- azure-cli/login-with-service-principal
- run:
command: |
cd ./src/test/hammerdb
sh run_hammerdb.sh citusbot_ch_benchmark_rg
name: install dependencies and run ch_benchmark tests
no_output_timeout: 20m
tpcc_benchmark:
docker:
- image: buildpack-deps:stretch
working_directory: /home/circleci/project
steps:
- checkout
- azure-cli/install
- azure-cli/login-with-service-principal
- run:
command: |
cd ./src/test/hammerdb
sh run_hammerdb.sh citusbot_tpcc_benchmark_rg
name: install dependencies and run tpcc_benchmark tests
no_output_timeout: 20m
workflows:
version: 2
build_and_test:
jobs:
- check-merge-to-enterprise:
filters:
branches:
ignore:
- /release-[0-9]+\.[0-9]+.*/ # match with releaseX.Y.*
- build:
name: build-13
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
- build:
name: build-14
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
- check-style
- check-sql-snapshots
- test-citus:
name: 'test-13_check-multi'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-multi
requires: [build-13]
- test-citus:
name: 'test-13_check-multi-1'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-multi-1
requires: [build-13]
- test-citus:
name: 'test-13_check-mx'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-multi-mx
requires: [build-13]
- test-citus:
name: 'test-13_check-vanilla'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-vanilla
requires: [build-13]
- test-citus:
name: 'test-13_check-isolation'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-isolation
requires: [build-13]
- test-citus:
name: 'test-13_check-worker'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-worker
requires: [build-13]
- test-citus:
name: 'test-13_check-operations'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-operations
requires: [build-13]
- test-citus:
name: 'test-13_check-follower-cluster'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-follower-cluster
requires: [build-13]
- test-citus:
name: 'test-13_check-columnar'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-columnar
requires: [build-13]
- test-citus:
name: 'test-13_check-columnar-isolation'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-columnar-isolation
requires: [build-13]
- tap-test-citus:
name: 'test_13_tap-recovery'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
suite: recovery
requires: [build-13]
- test-citus:
name: 'test-13_check-failure'
pg_major: 13
image: citus/failtester
image_tag: '<< pipeline.parameters.pg13_version >>'
make: check-failure
requires: [build-13]
- test-citus:
name: 'test-14_check-multi'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-multi
requires: [build-14]
- test-citus:
name: 'test-14_check-multi-1'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-multi-1
requires: [build-14]
- test-citus:
name: 'test-14_check-mx'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-multi-mx
requires: [build-14]
- test-citus:
name: 'test-14_check-vanilla'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-vanilla
requires: [build-14]
- test-citus:
name: 'test-14_check-isolation'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-isolation
requires: [build-14]
- test-citus:
name: 'test-14_check-worker'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-worker
requires: [build-14]
- test-citus:
name: 'test-14_check-operations'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-operations
requires: [build-14]
- test-citus:
name: 'test-14_check-follower-cluster'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-follower-cluster
requires: [build-14]
- test-citus:
name: 'test-14_check-columnar'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-columnar
requires: [build-14]
- test-citus:
name: 'test-14_check-columnar-isolation'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-columnar-isolation
requires: [build-14]
- tap-test-citus:
name: 'test_14_tap-recovery'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
suite: recovery
requires: [build-14]
- test-citus:
name: 'test-14_check-failure'
pg_major: 14
image: citus/failtester
image_tag: '<< pipeline.parameters.pg14_version >>'
make: check-failure
requires: [build-14]
- test-arbitrary-configs:
name: 'test-13_check-arbitrary-configs'
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
requires: [build-13]
- test-arbitrary-configs:
name: 'test-14_check-arbitrary-configs'
pg_major: 14
image_tag: '<< pipeline.parameters.pg14_version >>'
requires: [build-14]
- test-pg-upgrade:
name: 'test-13-14_check-pg-upgrade'
old_pg_major: 13
new_pg_major: 14
image_tag: '<< pipeline.parameters.upgrade_pg_versions >>'
requires: [build-13, build-14]
- test-citus-upgrade:
name: test-13_check-citus-upgrade
pg_major: 13
image_tag: '<< pipeline.parameters.pg13_version >>'
requires: [build-13]
- ch_benchmark:
requires: [build-13]
filters:
branches:
only:
- /ch_benchmark\/.*/ # match with ch_benchmark/ prefix
- tpcc_benchmark:
requires: [build-13]
filters:
branches:
only:
- /tpcc_benchmark\/.*/ # match with tpcc_benchmark/ prefix


@ -0,0 +1,23 @@
name: 'Parallelization matrix'
inputs:
count:
required: false
default: 32
outputs:
json:
value: ${{ steps.generate_matrix.outputs.json }}
runs:
using: "composite"
steps:
- name: Generate parallelization matrix
id: generate_matrix
shell: bash
run: |-
json_array="{\"include\": ["
for ((i = 1; i <= ${{ inputs.count }}; i++)); do
json_array+="{\"id\":\"$i\"},"
done
json_array=${json_array%,}
json_array+=" ]}"
echo "json=$json_array" >> "$GITHUB_OUTPUT"
echo "json=$json_array"


@ -0,0 +1,38 @@
name: save_logs_and_results
inputs:
folder:
required: false
default: "log"
runs:
using: composite
steps:
- uses: actions/upload-artifact@v3.1.1
name: Upload logs
with:
name: ${{ inputs.folder }}
if-no-files-found: ignore
path: |
src/test/**/proxy.output
src/test/**/results/
src/test/**/tmp_check/master/log
src/test/**/tmp_check/worker.57638/log
src/test/**/tmp_check/worker.57637/log
src/test/**/*.diffs
src/test/**/out/ddls.sql
src/test/**/out/queries.sql
src/test/**/logfile_*
/tmp/pg_upgrade_newData_logs
- name: Publish regression.diffs
run: |-
diffs="$(find src/test/regress -name "*.diffs" -exec cat {} \;)"
if ! [ -z "$diffs" ]; then
echo '```diff' >> $GITHUB_STEP_SUMMARY
echo -E "$diffs" >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY
echo -E $diffs
fi
shell: bash
- name: Print stack traces
run: "./ci/print_stack_trace.sh"
if: failure()
shell: bash


@ -0,0 +1,35 @@
name: setup_extension
inputs:
pg_major:
required: false
skip_installation:
required: false
default: false
type: boolean
runs:
using: composite
steps:
- name: Expose $PG_MAJOR to Github Env
run: |-
if [ -z "${{ inputs.pg_major }}" ]; then
echo "PG_MAJOR=${PG_MAJOR}" >> $GITHUB_ENV
else
echo "PG_MAJOR=${{ inputs.pg_major }}" >> $GITHUB_ENV
fi
shell: bash
- uses: actions/download-artifact@v3.0.1
with:
name: build-${{ env.PG_MAJOR }}
- name: Install Extension
if: ${{ inputs.skip_installation == 'false' }}
run: tar xfv "install-$PG_MAJOR.tar" --directory /
shell: bash
- name: Configure
run: |-
chown -R circleci .
git config --global --add safe.directory ${GITHUB_WORKSPACE}
gosu circleci ./configure --without-pg-version-check
shell: bash
- name: Enable core dumps
run: ulimit -c unlimited
shell: bash


@ -0,0 +1,15 @@
name: coverage
inputs:
flags:
required: false
codecov_token:
required: true
runs:
using: composite
steps:
- uses: codecov/codecov-action@v3
with:
flags: ${{ inputs.flags }}
token: ${{ inputs.codecov_token }}
verbose: true
gcov: true

View File

@ -0,0 +1,3 @@
base:
- ".* warning: ignoring old recipe for target [`']check'"
- ".* warning: overriding recipe for target [`']check'"

20 .github/packaging/validate_build_output.sh vendored Executable file

@ -0,0 +1,20 @@
package_type=${1}
# Since $HOME is set in GH_Actions as /github/home, pyenv fails to create virtualenvs.
# For this script, we set $HOME to /root and then set it back to /github/home.
GITHUB_HOME="${HOME}"
export HOME="/root"
eval "$(pyenv init -)"
pyenv versions
pyenv virtualenv ${PACKAGING_PYTHON_VERSION} packaging_env
pyenv activate packaging_env
git clone -b v0.8.25 --depth=1 https://github.com/citusdata/tools.git tools
python3 -m pip install -r tools/packaging_automation/requirements.txt
python3 -m tools.packaging_automation.validate_build_output --output_file output.log \
--ignore_file .github/packaging/packaging_ignore.yml \
--package_type ${package_type}
pyenv deactivate
# Set $HOME back to /github/home
export HOME=${GITHUB_HOME}

381 .github/workflows/build_and_test.yml vendored Normal file

@ -0,0 +1,381 @@
name: Build & Test
run-name: Build & Test - ${{ github.event.pull_request.title || github.ref_name }}
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
on:
workflow_dispatch:
inputs:
skip_test_flakyness:
required: false
default: false
type: boolean
pull_request:
types: [opened, reopened, synchronize]
jobs:
# Since GHA does not interpolate env variables in matrix context, we need to
# define them in a separate job and use them in other jobs.
params:
runs-on: ubuntu-latest
name: Initialize parameters
outputs:
build_image_name: "citus/extbuilder"
test_image_name: "citus/exttester"
citusupgrade_image_name: "citus/citusupgradetester"
fail_test_image_name: "citus/failtester"
pgupgrade_image_name: "citus/pgupgradetester"
style_checker_image_name: "citus/stylechecker"
style_checker_tools_version: "0.8.18"
image_suffix: "-vb4dd087"
pg13_version: '{ "major": "13", "full": "13.4" }'
pg14_version: '{ "major": "14", "full": "14.0" }'
upgrade_pg_versions: "13.4-14.0"
steps:
# Since GHA jobs need at least one step, we use a noop step here.
- name: Set up parameters
run: echo 'noop'
check-sql-snapshots:
needs: params
runs-on: ubuntu-20.04
container:
image: ${{ needs.params.outputs.build_image_name }}:latest
options: --user root
steps:
- uses: actions/checkout@v3.5.0
- name: Check Snapshots
run: |
git config --global --add safe.directory ${GITHUB_WORKSPACE}
ci/check_sql_snapshots.sh
check-style:
needs: params
runs-on: ubuntu-20.04
container:
image: ${{ needs.params.outputs.style_checker_image_name }}:${{ needs.params.outputs.style_checker_tools_version }}${{ needs.params.outputs.image_suffix }}
steps:
- name: Check Snapshots
run: |
git config --global --add safe.directory ${GITHUB_WORKSPACE}
- uses: actions/checkout@v3.5.0
with:
fetch-depth: 0
- name: Check C Style
run: citus_indent --check
- name: Fix whitespace
run: ci/editorconfig.sh && git diff --exit-code
- name: Remove useless declarations
run: ci/remove_useless_declarations.sh && git diff --cached --exit-code
- name: Normalize test output
run: ci/normalize_expected.sh && git diff --exit-code
- name: Check for C-style comments in migration files
run: ci/disallow_c_comments_in_migrations.sh && git diff --exit-code
- name: 'Check for comment--cached ns that start with # character in spec files'
run: ci/disallow_hash_comments_in_spec_files.sh && git diff --exit-code
- name: Check for gitignore entries .for source files
run: ci/fix_gitignore.sh && git diff --exit-code
- name: Check for lengths of changelog entries
run: ci/disallow_long_changelog_entries.sh
- name: Check for banned C API usage
run: ci/banned.h.sh
- name: Check for tests missing in schedules
run: ci/check_all_tests_are_run.sh
- name: Check if all CI scripts are actually run
run: ci/check_all_ci_scripts_are_run.sh
- name: Check if all GUCs are sorted alphabetically
run: ci/check_gucs_are_alphabetically_sorted.sh
build:
needs: params
name: Build for PG${{ fromJson(matrix.pg_version).major }}
strategy:
fail-fast: false
matrix:
image_name:
- ${{ needs.params.outputs.build_image_name }}
image_suffix:
- ${{ needs.params.outputs.image_suffix}}
pg_version:
- ${{ needs.params.outputs.pg13_version }}
- ${{ needs.params.outputs.pg14_version }}
runs-on: ubuntu-20.04
container:
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ matrix.image_suffix }}"
options: --user root
steps:
- uses: actions/checkout@v3.5.0
- name: Expose $PG_MAJOR to Github Env
run: echo "PG_MAJOR=${PG_MAJOR}" >> $GITHUB_ENV
shell: bash
- name: Build
run: "./ci/build-citus.sh"
shell: bash
- uses: actions/upload-artifact@v3.1.1
with:
name: build-${{ env.PG_MAJOR }}
path: |-
./build-${{ env.PG_MAJOR }}/*
./install-${{ env.PG_MAJOR }}.tar
test-citus:
name: PG${{ fromJson(matrix.pg_version).major }} - ${{ matrix.make }}
strategy:
fail-fast: false
matrix:
suite:
- regress
image_name:
- ${{ needs.params.outputs.test_image_name }}
pg_version:
- ${{ needs.params.outputs.pg13_version }}
- ${{ needs.params.outputs.pg14_version }}
make:
- check-multi
- check-multi-1
- check-multi-mx
- check-vanilla
- check-isolation
- check-worker
- check-operations
- check-follower-cluster
- check-columnar
- check-columnar-isolation
- check-enterprise
- check-enterprise-isolation
- check-enterprise-isolation-logicalrep-1
- check-enterprise-isolation-logicalrep-2
- check-enterprise-isolation-logicalrep-3
include:
- make: check-failure
pg_version: ${{ needs.params.outputs.pg13_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-failure
pg_version: ${{ needs.params.outputs.pg14_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-enterprise-failure
pg_version: ${{ needs.params.outputs.pg13_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: check-enterprise-failure
pg_version: ${{ needs.params.outputs.pg14_version }}
suite: regress
image_name: ${{ needs.params.outputs.fail_test_image_name }}
- make: installcheck
suite: recovery
image_name: ${{ needs.params.outputs.test_image_name }}
pg_version: ${{ needs.params.outputs.pg13_version }}
- make: installcheck
suite: recovery
image_name: ${{ needs.params.outputs.test_image_name }}
pg_version: ${{ needs.params.outputs.pg14_version }}
- make: installcheck
suite: columnar_freezing
image_name: ${{ needs.params.outputs.test_image_name }}
pg_version: ${{ needs.params.outputs.pg13_version }}
- make: installcheck
suite: columnar_freezing
image_name: ${{ needs.params.outputs.test_image_name }}
pg_version: ${{ needs.params.outputs.pg14_version }}
runs-on: ubuntu-20.04
container:
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ needs.params.outputs.image_suffix }}"
options: --user root --dns=8.8.8.8
# Because GitHub creates a default network for each job, we need to use
# --dns= to have DNS settings similar to our other CI systems and local
# machines. Otherwise, we may see different results.
needs:
- params
- build
steps:
- uses: actions/checkout@v3.5.0
- uses: "./.github/actions/setup_extension"
- name: Run Test
run: gosu circleci make -C src/test/${{ matrix.suite }} ${{ matrix.make }}
timeout-minutes: 20
- uses: "./.github/actions/save_logs_and_results"
if: always()
with:
folder: ${{ fromJson(matrix.pg_version).major }}_${{ matrix.make }}
- uses: "./.github/actions/upload_coverage"
if: always()
with:
flags: ${{ env.PG_MAJOR }}_${{ matrix.suite }}_${{ matrix.make }}
codecov_token: ${{ secrets.CODECOV_TOKEN }}
test-arbitrary-configs:
name: PG${{ fromJson(matrix.pg_version).major }} - check-arbitrary-configs-${{ matrix.parallel }}
runs-on: ["self-hosted", "1ES.Pool=1es-gha-citusdata-pool"]
container:
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ needs.params.outputs.image_suffix }}"
options: --user root
needs:
- params
- build
strategy:
fail-fast: false
matrix:
image_name:
- ${{ needs.params.outputs.fail_test_image_name }}
pg_version:
- ${{ needs.params.outputs.pg13_version }}
- ${{ needs.params.outputs.pg14_version }}
parallel: [0,1,2,3,4,5] # workaround for running 6 parallel jobs
steps:
- uses: actions/checkout@v3.5.0
- uses: "./.github/actions/setup_extension"
- name: Test arbitrary configs
run: |-
# we use parallel jobs to split the tests into 6 parts and run them in parallel
# the script below extracts the tests for the current job
N=6 # Total number of jobs (see matrix.parallel)
X=${{ matrix.parallel }} # Current job number
TESTS=$(src/test/regress/citus_tests/print_test_names.py |
tr '\n' ',' | awk -v N="$N" -v X="$X" -F, '{
split("", parts)
for (i = 1; i <= NF; i++) {
parts[i % N] = parts[i % N] $i ","
}
print substr(parts[X], 1, length(parts[X])-1)
}')
echo $TESTS
gosu circleci \
make -C src/test/regress \
check-arbitrary-configs parallel=4 CONFIGS=$TESTS
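# Illustration with hypothetical test names: for N=6 and tests t1..t8,
# job X=1 is assigned "t1,t7" (fields 1 and 7), job X=2 gets "t2,t8",
# and job X=0 gets "t6" because 6 % 6 == 0.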
- uses: "./.github/actions/save_logs_and_results"
if: always()
- uses: "./.github/actions/upload_coverage"
if: always()
with:
flags: ${{ env.pg_major }}_upgrade
codecov_token: ${{ secrets.CODECOV_TOKEN }}
test-pg-upgrade:
name: PG${{ matrix.old_pg_major }}-PG${{ matrix.new_pg_major }} - check-pg-upgrade
runs-on: ubuntu-20.04
container:
image: "${{ needs.params.outputs.pgupgrade_image_name }}:${{ needs.params.outputs.upgrade_pg_versions }}-vabaecad"
options: --user root
needs:
- params
- build
strategy:
fail-fast: false
matrix:
include:
- old_pg_major: 13
new_pg_major: 14
env:
old_pg_major: ${{ matrix.old_pg_major }}
new_pg_major: ${{ matrix.new_pg_major }}
steps:
- name: Install dependencies
run: |-
# update stretch repositories
# sed -i -e 's/deb.debian.org/archive.debian.org/g' -e 's|security.debian.org|archive.debian.org/|g' -e '/stretch-updates/d' /etc/apt/sources.list
apt update || true
apt install git -y
- uses: actions/checkout@v3.5.0
- uses: "./.github/actions/setup_extension"
with:
pg_major: "${{ env.old_pg_major }}"
- uses: "./.github/actions/setup_extension"
with:
pg_major: "${{ env.new_pg_major }}"
- name: Install and test postgres upgrade
run: |-
gosu circleci \
make -C src/test/regress \
check-pg-upgrade \
old-bindir=/usr/lib/postgresql/${{ env.old_pg_major }}/bin \
new-bindir=/usr/lib/postgresql/${{ env.new_pg_major }}/bin
- name: Copy pg_upgrade logs for newData dir
run: |-
mkdir -p /tmp/pg_upgrade_newData_logs
if ls src/test/regress/tmp_upgrade/newData/*.log 1> /dev/null 2>&1; then
cp src/test/regress/tmp_upgrade/newData/*.log /tmp/pg_upgrade_newData_logs
fi
if: failure()
- uses: "./.github/actions/save_logs_and_results"
if: always()
- uses: "./.github/actions/upload_coverage"
if: always()
with:
flags: ${{ env.old_pg_major }}_${{ env.new_pg_major }}_upgrade
codecov_token: ${{ secrets.CODECOV_TOKEN }}
test-citus-upgrade:
name: PG${{ fromJson(needs.params.outputs.pg13_version).major }} - check-citus-upgrade
runs-on: ubuntu-20.04
container:
image: "${{ needs.params.outputs.citusupgrade_image_name }}:${{ fromJson(needs.params.outputs.pg13_version).full }}${{ needs.params.outputs.image_suffix }}"
options: --user root
needs:
- params
- build
steps:
- uses: actions/checkout@v3.5.0
- uses: "./.github/actions/setup_extension"
with:
skip_installation: true
- name: Install and test citus upgrade
run: |-
# run make check-citus-upgrade for all citus versions
# the image has ${CITUS_VERSIONS} set to all the versions whose binaries it contains
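# Illustration with a hypothetical version: with citus_version=10.2, the
# pre tar is /install-pg${PG_MAJOR}-citus10.2.tar shipped in the image and
# the post tar is the install-${PG_MAJOR}.tar built earlier in this workflow.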
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-old-version=${citus_version} \
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
citus-post-tar=${GITHUB_WORKSPACE}/install-$PG_MAJOR.tar; \
done;
# run make check-citus-upgrade-mixed for all citus versions
# the image has ${CITUS_VERSIONS} set to all the versions whose binaries it contains
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
check-citus-upgrade-mixed \
citus-old-version=${citus_version} \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
citus-post-tar=${GITHUB_WORKSPACE}/install-$PG_MAJOR.tar; \
done;
- uses: "./.github/actions/save_logs_and_results"
if: always()
- uses: "./.github/actions/upload_coverage"
if: always()
with:
flags: ${{ env.pg_major }}_upgrade
codecov_token: ${{ secrets.CODECOV_TOKEN }}
ch_benchmark:
name: CH Benchmark
if: startsWith(github.ref, 'refs/heads/ch_benchmark/')
runs-on: ubuntu-20.04
needs:
- build
steps:
- uses: actions/checkout@v3.5.0
- uses: azure/login@v1
with:
creds: ${{ secrets.AZURE_CREDENTIALS }}
- name: install dependencies and run ch_benchmark tests
uses: azure/CLI@v1
with:
inlineScript: |
cd ./src/test/hammerdb
chmod +x run_hammerdb.sh
run_hammerdb.sh citusbot_ch_benchmark_rg
tpcc_benchmark:
name: TPCC Benchmark
if: startsWith(github.ref, 'refs/heads/tpcc_benchmark/')
runs-on: ubuntu-20.04
needs:
- build
steps:
- uses: actions/checkout@v3.5.0
- uses: azure/login@v1
with:
creds: ${{ secrets.AZURE_CREDENTIALS }}
- name: install dependencies and run tpcc_benchmark tests
uses: azure/CLI@v1
with:
inlineScript: |
cd ./src/test/hammerdb
chmod +x run_hammerdb.sh
run_hammerdb.sh citusbot_tpcc_benchmark_rg

View File

@ -0,0 +1,162 @@
name: Build tests in packaging images
on:
push:
branches: "**"
workflow_dispatch:
jobs:
get_postgres_versions_from_file:
runs-on: ubuntu-latest
outputs:
pg_versions: ${{ steps.get-postgres-versions.outputs.pg_versions }}
steps:
- name: Checkout
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Get Postgres Versions
id: get-postgres-versions
run: |
set -euxo pipefail
# Postgres versions are stored in .github/workflows/build_and_test.yml
# file in json strings with major and full keys.
# The command below extracts the versions and gets the unique values.
pg_versions=$(cat .github/workflows/build_and_test.yml | grep -oE '"major": "[0-9]+", "full": "[0-9.]+"' | sed -E 's/"major": "([0-9]+)", "full": "([0-9.]+)"/\1/g' | sort | uniq | tr '\n' ',')
pg_versions_array="[ ${pg_versions} ]"
echo "Supported PG Versions: ${pg_versions_array}"
# The line below sets the output variable that is used in the next job
echo "pg_versions=${pg_versions_array}" >> $GITHUB_OUTPUT
shell: bash
rpm_build_tests:
name: rpm_build_tests
needs: get_postgres_versions_from_file
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
# We use separate images for different Postgres versions in rpm
# based distros.
# For this reason, we need to use a "matrix" to generate the names of
# the rpm images, e.g. citus/packaging:centos-7-pg12
packaging_docker_image:
- oraclelinux-7
- oraclelinux-8
- centos-7
- almalinux-8
- almalinux-9
POSTGRES_VERSION: ${{ fromJson(needs.get_postgres_versions_from_file.outputs.pg_versions) }}
container:
image: citus/packaging:${{ matrix.packaging_docker_image }}-pg${{ matrix.POSTGRES_VERSION }}
options: --user root
steps:
- name: Checkout repository
uses: actions/checkout@v3
- name: Set Postgres and python parameters for rpm based distros
run: |
echo "/usr/pgsql-${{ matrix.POSTGRES_VERSION }}/bin" >> $GITHUB_PATH
echo "/root/.pyenv/bin:$PATH" >> $GITHUB_PATH
echo "PACKAGING_PYTHON_VERSION=3.8.16" >> $GITHUB_ENV
- name: Configure
run: |
echo "Current Shell:$0"
echo "GCC Version: $(gcc --version)"
./configure 2>&1 | tee output.log
- name: Make clean
run: |
make clean
- name: Make
run: |
make CFLAGS="-Wno-missing-braces" -sj$(cat /proc/cpuinfo | grep "core id" | wc -l) 2>&1 | tee -a output.log
- name: Make install
run: |
make CFLAGS="-Wno-missing-braces" install 2>&1 | tee -a output.log
- name: Validate output
env:
POSTGRES_VERSION: ${{ matrix.POSTGRES_VERSION }}
PACKAGING_DOCKER_IMAGE: ${{ matrix.packaging_docker_image }}
run: |
echo "Postgres version: ${POSTGRES_VERSION}"
## Install required packages to execute packaging tools for rpm based distros
yum install python3-pip python3-devel postgresql-devel -y
python3 -m pip install wheel
./.github/packaging/validate_build_output.sh "rpm"
deb_build_tests:
name: deb_build_tests
needs: get_postgres_versions_from_file
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
# On deb based distros, we use the same docker image for
# builds based on different Postgres versions because deb
# based images include all postgres installations.
# For this reason, we have multiple runs (3 today)
# for each deb based image, and we use POSTGRES_VERSION to set
# the PG_CONFIG variable in each of those runs.
packaging_docker_image:
- debian-buster-all
- debian-bookworm-all
- debian-bullseye-all
- ubuntu-focal-all
- ubuntu-jammy-all
- ubuntu-kinetic-all
POSTGRES_VERSION: ${{ fromJson(needs.get_postgres_versions_from_file.outputs.pg_versions) }}
container:
image: citus/packaging:${{ matrix.packaging_docker_image }}
options: --user root
steps:
- name: Checkout repository
uses: actions/checkout@v3
- name: Set pg_config path and python parameters for deb based distros
run: |
echo "PG_CONFIG=/usr/lib/postgresql/${{ matrix.POSTGRES_VERSION }}/bin/pg_config" >> $GITHUB_ENV
echo "/root/.pyenv/bin:$PATH" >> $GITHUB_PATH
echo "PACKAGING_PYTHON_VERSION=3.8.16" >> $GITHUB_ENV
- name: Configure
run: |
echo "Current Shell:$0"
echo "GCC Version: $(gcc --version)"
./configure 2>&1 | tee output.log
- name: Make clean
run: |
make clean
- name: Make
run: |
make -sj$(cat /proc/cpuinfo | grep "core id" | wc -l) 2>&1 | tee -a output.log
- name: Make install
run: |
make install 2>&1 | tee -a output.log
- name: Validate output
env:
POSTGRES_VERSION: ${{ matrix.POSTGRES_VERSION }}
PACKAGING_DOCKER_IMAGE: ${{ matrix.packaging_docker_image }}
run: |
echo "Postgres version: ${POSTGRES_VERSION}"
sudo apt-get update -y || true # ignore errors
## Install required packages to execute packaging tools for deb based distros
sudo apt-get purge -y python3-yaml
./.github/packaging/validate_build_output.sh "deb"

View File

@ -1,11 +1,164 @@
### citus v11.0.0_beta (March 22, 2022) ###
### citus v11.0.10 (February 15, 2024) ###
* Removes pg_send_cancellation and all references (#7135)
### citus v11.0.9 (February 12, 2024) ###
* Fixes a bug that could cause COPY logic to skip data in case of OOM (#7152)
* Fixes a bug with deleting colocation groups (#6929)
* Fixes memory and memory context leaks in Foreign Constraint Graphs (#7236)
* Fixes the incorrect column count after ALTER TABLE (#7462)
* Improves failure handling of distributed execution (#7090)
### citus v11.0.8 (April 20, 2023) ###
* Correctly reports shard size in `citus_shards` view (#6748)
* Fixes a bug that breaks pg upgrades if the user has a columnar table (#6624)
* Fixes an unexpected foreign table error by disallowing dropping the
`table_name` option (#6669)
* Fixes compilation warning on PG13 + OpenSSL 3.0 (#6038, #6502)
* Fixes crash that happens when trying to replicate a reference table that is
actually dropped (#6595)
* Fixes memory leak issue with query results that return a single row (#6724)
* Fixes the modifiers for subscription and role creation (#6603)
* Fixes two potential dangling pointer issues (#6504, #6507)
* Makes sure to quote all identifiers used for logical replication to prevent
potential issues (#6604)
### citus v11.0.7 (November 8, 2022) ###
* Adds the GUC `citus.allow_unsafe_constraints` to allow unique/exclusion/
primary key constraints without distribution column
* Allows `citus_internal` `application_name` with additional suffix
* Disallows having `ON DELETE/UPDATE SET DEFAULT` actions on columns that
default to sequences
* Fixes a bug in `ALTER EXTENSION citus UPDATE`
* Fixes a bug that causes a crash with empty/null password
* Fixes a bug that causes not retaining trigger enable/disable settings when
re-creating them on shards
* Fixes a bug that might cause inserting incorrect `DEFAULT` values when
applying foreign key actions
* Fixes a bug that prevents retaining columnar table options after a
table-rewrite
* Fixes a bug that prevents setting colocation group of a partitioned
distributed table to `none`
* Fixes an issue that can cause logical reference table replication to fail
* Raises memory limits in columnar from 256MB to 1GB for reads and writes
### citus v11.0.6 (August 19, 2022) ###
* Fixes a bug that could cause failures in `CREATE ROLE` statement
* Fixes a bug that could cause failures in `create_distributed_table`
* Fixes a bug that prevents distributing tables that depend on sequences
* Fixes reference table lock contention
* Fixes upgrade paths for 11.0
### citus v11.0.5 (August 1, 2022) ###
* Avoids possible information leakage about existing users
* Allows using `WITH HOLD` cursors with parameters
* Fixes a bug that could cause failures in `INSERT INTO .. SELECT`
* Fixes a bug that prevents pushing down `IN` expressions when using columnar
custom scan
* Fixes a concurrency bug between creating a co-located distributed table and
shard moves
* Fixes a crash that can happen due to catalog read in `shmem_exit`
* Fixes an unexpected error caused by constraints when moving shards
* Fixes an unexpected error for foreign tables when upgrading Postgres
* Prevents adding local table into metadata if there is a view with circular
dependencies on it
* Reduces memory consumption of index name fix for partitioned tables
### citus v11.0.4 (July 13, 2022) ###
* Fixes a bug that prevents promoting read-replicas as primaries
### citus v11.0.3 (July 5, 2022) ###
* Fixes a bug that prevents adding local tables with materialized views to
Citus metadata
* Fixes a bug that prevents using `COMPRESSION` and `CONSTRAINT` on a column
* Fixes upgrades to Citus 11 when there are no nodes in the metadata
### citus v11.0.2 (June 15, 2022) ###
* Drops support for PostgreSQL 12
* Open sources enterprise features, see the rest of changelog items
* Turns metadata syncing on by default
* Adds `citus_finalize_upgrade_to_citus11()` which is necessary to upgrade to
Citus 11+ from earlier versions
* Introduces `citus_finish_citus_upgrade()` procedure which is necessary to
upgrade from earlier versions
* Open sources non-blocking shard moves/shard rebalancer
(`citus.logical_replication_timeout`)
* Open sources propagation of `CREATE/DROP/ALTER ROLE` statements
* Open sources propagation of `GRANT` statements
* Open sources propagation of `CLUSTER` statements
* Open sources propagation of `ALTER DATABASE ... OWNER TO ...`
* Open sources optimization for `COPY` when loading `JSON` to avoid double
parsing of the `JSON` object (`citus.skip_jsonb_validation_in_copy`)
* Open sources support for row level security
* Open sources support for `pg_dist_authinfo`, which allows storing different
authentication options for different users, e.g. you can store
passwords or certificates here.
* Open sources support for `pg_dist_poolinfo`, which allows using connection
poolers in between coordinator and workers
* Open sources tracking distributed query execution times using
citus_stat_statements (`citus.stat_statements_max`,
`citus.stat_statements_purge_interval`,
`citus.stat_statements_track`). This is disabled by default.
* Open sources tenant_isolation
* Open sources support for `sslkey` and `sslcert` in `citus.node_conninfo`
* Adds `citus.max_client_connections` GUC to limit non-Citus connections
@ -33,6 +186,8 @@
* Adds propagation for foreign server commands
* Adds propagation of `DOMAIN` objects
* Adds propagation of `TEXT SEARCH CONFIGURATION` objects
* Adds propagation of `TEXT SEARCH DICTIONARY` objects
@ -41,6 +196,8 @@
* Adds support for `CREATE SCHEMA AUTHORIZATION` statements without schema name
* Adds support for `CREATE/DROP/ALTER VIEW` commands
* Adds support for `TRUNCATE` for foreign tables
* Adds support for adding local tables to metadata using
@ -60,6 +217,12 @@
* Adds support for shard replication > 1 hash distributed tables on Citus MX
* Adds support for `LOCK` commands on distributed tables from worker nodes
* Adds support for `TABLESAMPLE`
* Adds support for propagating views when syncing Citus table metadata
* Improves handling of `IN`, `OUT` and `INOUT` parameters for functions
* Introduces `citus_backend_gpid()` UDF to get global pid of the current backend
@ -79,12 +242,23 @@
* Introduces a new flag `force_delegation` in `create_distributed_function()`
* Introduces `run_command_on_coordinator` UDF
* Introduces `synchronous` option to `citus_disable_node()` UDF
* Introduces `citus_is_coordinator` UDF to check whether a node is the
coordinator
* Allows adding a unique constraint with an index
* Allows `create_distributed_function()` on a function owned by an extension
* Allows creating distributed tables in sequential mode
* Allows disabling nodes when multiple failures happen
* Allows `lock_table_if_exits` to be called outside of transaction blocks
* Adds support for pushing procedures with `OUT` arguments down to the worker
nodes
@ -108,6 +282,8 @@
* `citus_shards_on_worker` shows all local shards regardless of `search_path`
* Enables distributed execution from `run_command_on_*` functions
* Deprecates inactive shard state, never marks any placement inactive
* Disables distributed & reference foreign tables
@ -130,6 +306,20 @@
* Avoids unnecessary errors for `ALTER STATISTICS IF EXISTS` when the statistics
does not exist
* Fixes a bug that prevents dropping/altering indexes
* Fixes a bug that prevents non-client backends from accessing shards
* Fixes columnar freezing/wraparound bug
* Fixes `invalid read of size 1` memory error with `citus_add_node`
* Fixes schema name qualification for `ALTER/DROP SEQUENCE`
* Fixes schema name qualification for `ALTER/DROP STATISTICS`
* Fixes schema name qualification for `CREATE STATISTICS`
* Fixes a bug that causes columnar storage pages to have zero LSN
* Fixes a bug that causes issues while creating dependencies from multiple
@ -157,6 +347,26 @@
* Fixes a bug that could cause re-partition joins involving local shards to fail
* Fixes a bug that could cause false positive distributed deadlocks due to local
execution
* Fixes a bug that could cause leaking files when materialized views are
refreshed
* Fixes a bug that could cause unqualified `DROP DOMAIN IF EXISTS` to fail
* Fixes a bug that could cause wrong schema and ownership after
`alter_distributed_table`
* Fixes a bug that could cause `EXPLAIN ANALYZE` to fail for prepared statements
with custom type
* Fixes a bug that could cause Citus not to create function in transaction block
properly
* Fixes a bug that could cause returning invalid JSON when running
`EXPLAIN ANALYZE` with subplans
* Fixes a bug that limits usage of sequences in non-int columns
* Fixes a bug that prevents `DROP SCHEMA CASCADE`
@ -188,17 +398,28 @@
* Fixes naming issues of newly created partitioned indexes
* Honors `enable_metadata_sync` in node operations
* Improves nested execution checks and adds GUC to control
(`citus.allow_nested_distributed_execution`)
* Improves self-deadlock prevention for `CREATE INDEX / REINDEX CONCURRENTLY`
commands for builds using PG14 or higher
* Moves `pg_dist_object` to `pg_catalog` schema
* Parallelizes metadata syncing on node activation
* Partitions shards to be co-located with the parent shards
* Prevents Citus table functions from being called on shards
* Prevents creating distributed functions when there are out of sync nodes
* Prevents alter table functions from dropping extensions
* Refrains from reading the metadata cache for all tables during upgrade
* Provides notice message for idempotent `create_distributed_function` calls
* Reinstates optimisation for uniform shard interval ranges

View File

@ -30,13 +30,15 @@ clean-extension:
clean-full:
$(MAKE) -C src/backend/distributed/ clean-full
.PHONY: extension install-extension clean-extension clean-full
# Add to generic targets
install: install-extension install-headers
install-downgrades:
$(MAKE) -C src/backend/distributed/ install-downgrades
install-all: install-headers
$(MAKE) -C src/backend/distributed/ install-all
# Add to generic targets
install: install-extension install-headers
clean: clean-extension
# apply or check style

View File

@ -15,9 +15,6 @@ PG_MAJOR=${PG_MAJOR:?please provide the postgres major version}
codename=${VERSION#*(}
codename=${codename%)*}
# get project from argument
project="${CIRCLE_PROJECT_REPONAME}"
# we'll do everything with absolute paths
basedir="$(pwd)"
@ -28,7 +25,7 @@ build_ext() {
pg_major="$1"
builddir="${basedir}/build-${pg_major}"
echo "Beginning build of ${project} for PostgreSQL ${pg_major}..." >&2
echo "Beginning build for PostgreSQL ${pg_major}..." >&2
# do everything in a subdirectory to avoid clutter in current directory
mkdir -p "${builddir}" && cd "${builddir}"

View File

@ -14,8 +14,8 @@ ci_scripts=$(
grep -v -E '^(ci_helpers.sh|fix_style.sh)$'
)
for script in $ci_scripts; do
if ! grep "\\bci/$script\\b" .circleci/config.yml > /dev/null; then
echo "ERROR: CI script with name \"$script\" is not actually used in .circleci/config.yml"
if ! grep "\\bci/$script\\b" -r .github > /dev/null; then
echo "ERROR: CI script with name \"$script\" is not actually used in .github folder"
exit 1
fi
if ! grep "^## \`$script\`\$" ci/README.md > /dev/null; then

View File

@ -1,96 +0,0 @@
#!/bin/bash
# Testing this script locally requires you to set the following environment
# variables:
# CIRCLE_BRANCH, GIT_USERNAME and GIT_TOKEN
# fail if trying to reference a variable that is not set.
set -u
# exit immediately if a command fails
set -e
# Fail on pipe failures
set -o pipefail
PR_BRANCH="${CIRCLE_BRANCH}"
ENTERPRISE_REMOTE="https://${GIT_USERNAME}:${GIT_TOKEN}@github.com/citusdata/citus-enterprise"
# shellcheck disable=SC1091
source ci/ci_helpers.sh
# List executed commands. This is done so debugging this script is easier when
# it fails. It's explicitly done after git remote add so username and password
# are not shown in CI output (even though it's also filtered out by CircleCI)
set -x
check_compile () {
echo "INFO: checking if merged code can be compiled"
./configure --without-libcurl
make -j10
}
# Clone current git repo (which should be community) to a temporary working
# directory and go there
GIT_DIR_ROOT="$(git rev-parse --show-toplevel)"
TMP_GIT_DIR="$(mktemp --directory -t citus-merge-check.XXXXXXXXX)"
git clone "$GIT_DIR_ROOT" "$TMP_GIT_DIR"
cd "$TMP_GIT_DIR"
# Fails in CI without this
git config user.email "citus-bot@microsoft.com"
git config user.name "citus bot"
# Disable "set -x" temporarily, because $ENTERPRISE_REMOTE contains passwords
{ set +x ; } 2> /dev/null
git remote add enterprise "$ENTERPRISE_REMOTE"
set -x
git remote set-url --push enterprise no-pushing
# Fetch enterprise-master
git fetch enterprise enterprise-master
git checkout "enterprise/enterprise-master"
if git merge --no-commit "origin/$PR_BRANCH"; then
echo "INFO: community PR branch could be merged into enterprise-master"
# check that we can compile after the merge
if check_compile; then
exit 0
fi
echo "WARN: Failed to compile after community PR branch was merged into enterprise"
fi
# undo partial merge
git merge --abort
# If we have a conflict on enterprise merge on the master branch, we have a problem.
# Provide an error message to indicate that enterprise merge is needed to fix this check.
if [[ $PR_BRANCH = master ]]; then
echo "ERROR: Master branch has merge conflicts with enterprise-master."
echo "Try re-running this CI job after merging your changes into enterprise-master."
exit 1
fi
if ! git fetch enterprise "$PR_BRANCH" ; then
echo "ERROR: enterprise/$PR_BRANCH was not found and community PR branch could not be merged into enterprise-master"
exit 1
fi
# Show the top commit of the enterprise PR branch to make debugging easier
git log -n 1 "enterprise/$PR_BRANCH"
# Check that this branch contains the top commit of the current community PR
# branch. If it does not it means it's not up to date with the current PR, so
# the enterprise branch should be updated.
if ! git merge-base --is-ancestor "origin/$PR_BRANCH" "enterprise/$PR_BRANCH" ; then
echo "ERROR: enterprise/$PR_BRANCH is not up to date with community PR branch"
exit 1
fi
# Now check if we can merge the enterprise PR into enterprise-master without
# issues.
git merge --no-commit "enterprise/$PR_BRANCH"
# check that we can compile after the merge
check_compile

configure vendored
View File

@ -1,6 +1,6 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for Citus 11.0devel.
# Generated by GNU Autoconf 2.69 for Citus 11.0.10.
#
#
# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
@ -579,8 +579,8 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='Citus'
PACKAGE_TARNAME='citus'
PACKAGE_VERSION='11.0devel'
PACKAGE_STRING='Citus 11.0devel'
PACKAGE_VERSION='11.0.10'
PACKAGE_STRING='Citus 11.0.10'
PACKAGE_BUGREPORT=''
PACKAGE_URL=''
@ -1260,7 +1260,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures Citus 11.0devel to adapt to many kinds of systems.
\`configure' configures Citus 11.0.10 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@ -1322,7 +1322,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of Citus 11.0devel:";;
short | recursive ) echo "Configuration of Citus 11.0.10:";;
esac
cat <<\_ACEOF
@ -1425,7 +1425,7 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
Citus configure 11.0devel
Citus configure 11.0.10
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
@ -1908,7 +1908,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by Citus $as_me 11.0devel, which was
It was created by Citus $as_me 11.0.10, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
@ -5360,7 +5360,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by Citus $as_me 11.0devel, which was
This file was extended by Citus $as_me 11.0.10, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@ -5422,7 +5422,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
Citus config.status 11.0devel
Citus config.status 11.0.10
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"

View File

@ -5,7 +5,7 @@
# everyone needing autoconf installed, the resulting files are checked
# into the SCM.
AC_INIT([Citus], [11.0devel])
AC_INIT([Citus], [11.0.10])
AC_COPYRIGHT([Copyright (c) Citus Data, Inc.])
# we'll need sed and awk for some of the version commands

View File

@ -811,6 +811,18 @@ ExtractPushdownClause(PlannerInfo *root, RelOptInfo *rel, Node *node)
}
}
if (IsA(node, ScalarArrayOpExpr))
{
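/*
 * ScalarArrayOpExpr covers constructs such as "a = ANY (ARRAY[1, 2, 3])",
 * which is also what IN-lists are parsed into; these are safe to push
 * down into the columnar scan as long as they call no volatile functions.
 */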
if (!contain_volatile_functions(node))
{
return (Expr *) node;
}
else
{
return NULL;
}
}
if (!IsA(node, OpExpr) || list_length(((OpExpr *) node)->args) != 2)
{
ereport(ColumnarPlannerDebugLevel,

View File

@ -59,6 +59,10 @@
#include "utils/rel.h"
#include "utils/relfilenodemap.h"
#define SLOW_METADATA_ACCESS_WARNING \
"Metadata index %s is not available, this might mean slower read/writes " \
"on columnar tables. This is expected during Postgres upgrades and not " \
"expected otherwise."
typedef struct
{
@ -551,15 +555,23 @@ ReadStripeSkipList(RelFileNode relfilenode, uint64 stripe, TupleDesc tupleDescri
Oid columnarChunkOid = ColumnarChunkRelationId();
Relation columnarChunk = table_open(columnarChunkOid, AccessShareLock);
Relation index = index_open(ColumnarChunkIndexRelationId(), AccessShareLock);
ScanKeyInit(&scanKey[0], Anum_columnar_chunk_storageid,
BTEqualStrategyNumber, F_OIDEQ, UInt64GetDatum(storageId));
ScanKeyInit(&scanKey[1], Anum_columnar_chunk_stripe,
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(stripe));
SysScanDesc scanDescriptor = systable_beginscan_ordered(columnarChunk, index,
snapshot, 2, scanKey);
Oid indexId = ColumnarChunkIndexRelationId();
bool indexOk = OidIsValid(indexId);
SysScanDesc scanDescriptor = systable_beginscan(columnarChunk, indexId,
indexOk, snapshot, 2, scanKey);
static bool loggedSlowMetadataAccessWarning = false;
if (!indexOk && !loggedSlowMetadataAccessWarning)
{
ereport(WARNING, (errmsg(SLOW_METADATA_ACCESS_WARNING, "chunk_pkey")));
loggedSlowMetadataAccessWarning = true;
}
StripeSkipList *chunkList = palloc0(sizeof(StripeSkipList));
chunkList->chunkCount = chunkCount;
@ -571,8 +583,7 @@ ReadStripeSkipList(RelFileNode relfilenode, uint64 stripe, TupleDesc tupleDescri
palloc0(chunkCount * sizeof(ColumnChunkSkipNode));
}
while (HeapTupleIsValid(heapTuple = systable_getnext_ordered(scanDescriptor,
ForwardScanDirection)))
while (HeapTupleIsValid(heapTuple = systable_getnext(scanDescriptor)))
{
Datum datumArray[Natts_columnar_chunk];
bool isNullArray[Natts_columnar_chunk];
@ -637,8 +648,7 @@ ReadStripeSkipList(RelFileNode relfilenode, uint64 stripe, TupleDesc tupleDescri
}
}
systable_endscan_ordered(scanDescriptor);
index_close(index, AccessShareLock);
systable_endscan(scanDescriptor);
table_close(columnarChunk, AccessShareLock);
chunkList->chunkGroupRowCounts =
@ -649,9 +659,9 @@ ReadStripeSkipList(RelFileNode relfilenode, uint64 stripe, TupleDesc tupleDescri
/*
* FindStripeByRowNumber returns StripeMetadata for the stripe whose
* firstRowNumber is greater than given rowNumber. If no such stripe
* exists, then returns NULL.
* FindNextStripeByRowNumber returns StripeMetadata for the stripe that has
* the smallest firstRowNumber among the stripes whose firstRowNumber is
* greater than the given rowNumber. If no such stripe exists, then returns NULL.
*/
StripeMetadata *
FindNextStripeByRowNumber(Relation relation, uint64 rowNumber, Snapshot snapshot)
@ -741,8 +751,7 @@ StripeGetHighestRowNumber(StripeMetadata *stripeMetadata)
/*
* StripeMetadataLookupRowNumber returns StripeMetadata for the stripe whose
* firstRowNumber is less than or equal to (FIND_LESS_OR_EQUAL), or is
* greater than (FIND_GREATER) given rowNumber by doing backward index
* scan on stripe_first_row_number_idx.
* greater than (FIND_GREATER) given rowNumber.
* If no such stripe exists, then returns NULL.
*/
static StripeMetadata *
@ -773,31 +782,71 @@ StripeMetadataLookupRowNumber(Relation relation, uint64 rowNumber, Snapshot snap
ScanKeyInit(&scanKey[1], Anum_columnar_stripe_first_row_number,
strategyNumber, procedure, UInt64GetDatum(rowNumber));
Relation columnarStripes = table_open(ColumnarStripeRelationId(), AccessShareLock);
Relation index = index_open(ColumnarStripeFirstRowNumberIndexRelationId(),
AccessShareLock);
SysScanDesc scanDescriptor = systable_beginscan_ordered(columnarStripes, index,
snapshot, 2,
scanKey);
ScanDirection scanDirection = NoMovementScanDirection;
if (lookupMode == FIND_LESS_OR_EQUAL)
Oid indexId = ColumnarStripeFirstRowNumberIndexRelationId();
bool indexOk = OidIsValid(indexId);
SysScanDesc scanDescriptor = systable_beginscan(columnarStripes, indexId, indexOk,
snapshot, 2, scanKey);
static bool loggedSlowMetadataAccessWarning = false;
if (!indexOk && !loggedSlowMetadataAccessWarning)
{
scanDirection = BackwardScanDirection;
}
else if (lookupMode == FIND_GREATER)
{
scanDirection = ForwardScanDirection;
}
HeapTuple heapTuple = systable_getnext_ordered(scanDescriptor, scanDirection);
if (HeapTupleIsValid(heapTuple))
{
foundStripeMetadata = BuildStripeMetadata(columnarStripes, heapTuple);
ereport(WARNING, (errmsg(SLOW_METADATA_ACCESS_WARNING,
"stripe_first_row_number_idx")));
loggedSlowMetadataAccessWarning = true;
}
systable_endscan_ordered(scanDescriptor);
index_close(index, AccessShareLock);
if (indexOk)
{
ScanDirection scanDirection = NoMovementScanDirection;
if (lookupMode == FIND_LESS_OR_EQUAL)
{
scanDirection = BackwardScanDirection;
}
else if (lookupMode == FIND_GREATER)
{
scanDirection = ForwardScanDirection;
}
HeapTuple heapTuple = systable_getnext_ordered(scanDescriptor, scanDirection);
if (HeapTupleIsValid(heapTuple))
{
foundStripeMetadata = BuildStripeMetadata(columnarStripes, heapTuple);
}
}
else
{
HeapTuple heapTuple = NULL;
while (HeapTupleIsValid(heapTuple = systable_getnext(scanDescriptor)))
{
StripeMetadata *stripe = BuildStripeMetadata(columnarStripes, heapTuple);
if (!foundStripeMetadata)
{
/* first match */
foundStripeMetadata = stripe;
}
else if (lookupMode == FIND_LESS_OR_EQUAL &&
stripe->firstRowNumber > foundStripeMetadata->firstRowNumber)
{
/*
* Among the stripes with firstRowNumber less-than-or-equal-to given,
* we're looking for the one with the greatest firstRowNumber.
*/
foundStripeMetadata = stripe;
}
else if (lookupMode == FIND_GREATER &&
stripe->firstRowNumber < foundStripeMetadata->firstRowNumber)
{
/*
* Among the stripes with firstRowNumber greater-than given,
* we're looking for the one with the smallest firstRowNumber.
*/
foundStripeMetadata = stripe;
}
}
}
systable_endscan(scanDescriptor);
table_close(columnarStripes, AccessShareLock);
return foundStripeMetadata;
@ -871,8 +920,8 @@ CheckStripeMetadataConsistency(StripeMetadata *stripeMetadata)
/*
* FindStripeWithHighestRowNumber returns StripeMetadata for the stripe that
* has the row with highest rowNumber by doing backward index scan on
* stripe_first_row_number_idx. If given relation is empty, then returns NULL.
* has the row with highest rowNumber. If given relation is empty, then returns
* NULL.
*/
StripeMetadata *
FindStripeWithHighestRowNumber(Relation relation, Snapshot snapshot)
@ -885,19 +934,46 @@ FindStripeWithHighestRowNumber(Relation relation, Snapshot snapshot)
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(storageId));
Relation columnarStripes = table_open(ColumnarStripeRelationId(), AccessShareLock);
Relation index = index_open(ColumnarStripeFirstRowNumberIndexRelationId(),
AccessShareLock);
SysScanDesc scanDescriptor = systable_beginscan_ordered(columnarStripes, index,
snapshot, 1, scanKey);
HeapTuple heapTuple = systable_getnext_ordered(scanDescriptor, BackwardScanDirection);
if (HeapTupleIsValid(heapTuple))
Oid indexId = ColumnarStripeFirstRowNumberIndexRelationId();
bool indexOk = OidIsValid(indexId);
SysScanDesc scanDescriptor = systable_beginscan(columnarStripes, indexId, indexOk,
snapshot, 1, scanKey);
static bool loggedSlowMetadataAccessWarning = false;
if (!indexOk && !loggedSlowMetadataAccessWarning)
{
stripeWithHighestRowNumber = BuildStripeMetadata(columnarStripes, heapTuple);
ereport(WARNING, (errmsg(SLOW_METADATA_ACCESS_WARNING,
"stripe_first_row_number_idx")));
loggedSlowMetadataAccessWarning = true;
}
systable_endscan_ordered(scanDescriptor);
index_close(index, AccessShareLock);
if (indexOk)
{
/* do one-time fetch using the index */
HeapTuple heapTuple = systable_getnext_ordered(scanDescriptor,
BackwardScanDirection);
if (HeapTupleIsValid(heapTuple))
{
stripeWithHighestRowNumber = BuildStripeMetadata(columnarStripes, heapTuple);
}
}
else
{
HeapTuple heapTuple = NULL;
while (HeapTupleIsValid(heapTuple = systable_getnext(scanDescriptor)))
{
StripeMetadata *stripe = BuildStripeMetadata(columnarStripes, heapTuple);
if (!stripeWithHighestRowNumber ||
stripe->firstRowNumber > stripeWithHighestRowNumber->firstRowNumber)
{
/* first or a greater match */
stripeWithHighestRowNumber = stripe;
}
}
}
systable_endscan(scanDescriptor);
table_close(columnarStripes, AccessShareLock);
return stripeWithHighestRowNumber;
@ -914,7 +990,6 @@ ReadChunkGroupRowCounts(uint64 storageId, uint64 stripe, uint32 chunkGroupCount,
{
Oid columnarChunkGroupOid = ColumnarChunkGroupRelationId();
Relation columnarChunkGroup = table_open(columnarChunkGroupOid, AccessShareLock);
Relation index = index_open(ColumnarChunkGroupIndexRelationId(), AccessShareLock);
ScanKeyData scanKey[2];
ScanKeyInit(&scanKey[0], Anum_columnar_chunkgroup_storageid,
@ -922,15 +997,22 @@ ReadChunkGroupRowCounts(uint64 storageId, uint64 stripe, uint32 chunkGroupCount,
ScanKeyInit(&scanKey[1], Anum_columnar_chunkgroup_stripe,
BTEqualStrategyNumber, F_OIDEQ, Int32GetDatum(stripe));
Oid indexId = ColumnarChunkGroupIndexRelationId();
bool indexOk = OidIsValid(indexId);
SysScanDesc scanDescriptor =
systable_beginscan_ordered(columnarChunkGroup, index, snapshot, 2, scanKey);
systable_beginscan(columnarChunkGroup, indexId, indexOk, snapshot, 2, scanKey);
static bool loggedSlowMetadataAccessWarning = false;
if (!indexOk && !loggedSlowMetadataAccessWarning)
{
ereport(WARNING, (errmsg(SLOW_METADATA_ACCESS_WARNING, "chunk_group_pkey")));
loggedSlowMetadataAccessWarning = true;
}
uint32 chunkGroupIndex = 0;
HeapTuple heapTuple = NULL;
uint32 *chunkGroupRowCounts = palloc0(chunkGroupCount * sizeof(uint32));
while (HeapTupleIsValid(heapTuple = systable_getnext_ordered(scanDescriptor,
ForwardScanDirection)))
while (HeapTupleIsValid(heapTuple = systable_getnext(scanDescriptor)))
{
Datum datumArray[Natts_columnar_chunkgroup];
bool isNullArray[Natts_columnar_chunkgroup];
@ -941,24 +1023,16 @@ ReadChunkGroupRowCounts(uint64 storageId, uint64 stripe, uint32 chunkGroupCount,
uint32 tupleChunkGroupIndex =
DatumGetUInt32(datumArray[Anum_columnar_chunkgroup_chunk - 1]);
if (chunkGroupIndex >= chunkGroupCount ||
tupleChunkGroupIndex != chunkGroupIndex)
if (tupleChunkGroupIndex >= chunkGroupCount)
{
elog(ERROR, "unexpected chunk group");
}
chunkGroupRowCounts[chunkGroupIndex] =
chunkGroupRowCounts[tupleChunkGroupIndex] =
(uint32) DatumGetUInt64(datumArray[Anum_columnar_chunkgroup_row_count - 1]);
chunkGroupIndex++;
}
if (chunkGroupIndex != chunkGroupCount)
{
elog(ERROR, "unexpected chunk group count");
}
systable_endscan_ordered(scanDescriptor);
index_close(index, AccessShareLock);
systable_endscan(scanDescriptor);
table_close(columnarChunkGroup, AccessShareLock);
return chunkGroupRowCounts;
@ -1155,14 +1229,20 @@ UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId, bool *update,
Oid columnarStripesOid = ColumnarStripeRelationId();
Relation columnarStripes = table_open(columnarStripesOid, AccessShareLock);
Relation columnarStripePkeyIndex = index_open(ColumnarStripePKeyIndexRelationId(),
AccessShareLock);
SysScanDesc scanDescriptor = systable_beginscan_ordered(columnarStripes,
columnarStripePkeyIndex,
&dirtySnapshot, 2, scanKey);
Oid indexId = ColumnarStripePKeyIndexRelationId();
bool indexOk = OidIsValid(indexId);
SysScanDesc scanDescriptor = systable_beginscan(columnarStripes, indexId, indexOk,
&dirtySnapshot, 2, scanKey);
HeapTuple oldTuple = systable_getnext_ordered(scanDescriptor, ForwardScanDirection);
static bool loggedSlowMetadataAccessWarning = false;
if (!indexOk && !loggedSlowMetadataAccessWarning)
{
ereport(WARNING, (errmsg(SLOW_METADATA_ACCESS_WARNING, "stripe_pkey")));
loggedSlowMetadataAccessWarning = true;
}
HeapTuple oldTuple = systable_getnext(scanDescriptor);
if (!HeapTupleIsValid(oldTuple))
{
ereport(ERROR, (errmsg("attempted to modify an unexpected stripe, "
@ -1197,8 +1277,7 @@ UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId, bool *update,
CommandCounterIncrement();
systable_endscan_ordered(scanDescriptor);
index_close(columnarStripePkeyIndex, AccessShareLock);
systable_endscan(scanDescriptor);
table_close(columnarStripes, AccessShareLock);
/* return StripeMetadata object built from modified tuple */
@ -1209,6 +1288,10 @@ UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId, bool *update,
/*
* ReadDataFileStripeList reads the stripe list for a given storageId
* in the given snapshot.
*
* Doesn't sort the stripes by their ids before returning if
* stripe_first_row_number_idx is not available (this normally can only
* happen during pg upgrades).
*/
static List *
ReadDataFileStripeList(uint64 storageId, Snapshot snapshot)
@ -1223,22 +1306,27 @@ ReadDataFileStripeList(uint64 storageId, Snapshot snapshot)
Oid columnarStripesOid = ColumnarStripeRelationId();
Relation columnarStripes = table_open(columnarStripesOid, AccessShareLock);
Relation index = index_open(ColumnarStripeFirstRowNumberIndexRelationId(),
AccessShareLock);
SysScanDesc scanDescriptor = systable_beginscan_ordered(columnarStripes, index,
snapshot, 1,
scanKey);
Oid indexId = ColumnarStripeFirstRowNumberIndexRelationId();
bool indexOk = OidIsValid(indexId);
SysScanDesc scanDescriptor = systable_beginscan(columnarStripes, indexId,
indexOk, snapshot, 1, scanKey);
while (HeapTupleIsValid(heapTuple = systable_getnext_ordered(scanDescriptor,
ForwardScanDirection)))
static bool loggedSlowMetadataAccessWarning = false;
if (!indexOk && !loggedSlowMetadataAccessWarning)
{
ereport(WARNING, (errmsg(SLOW_METADATA_ACCESS_WARNING,
"stripe_first_row_number_idx")));
loggedSlowMetadataAccessWarning = true;
}
while (HeapTupleIsValid(heapTuple = systable_getnext(scanDescriptor)))
{
StripeMetadata *stripeMetadata = BuildStripeMetadata(columnarStripes, heapTuple);
stripeMetadataList = lappend(stripeMetadataList, stripeMetadata);
}
systable_endscan_ordered(scanDescriptor);
index_close(index, AccessShareLock);
systable_endscan(scanDescriptor);
table_close(columnarStripes, AccessShareLock);
return stripeMetadataList;
@ -1349,25 +1437,30 @@ DeleteStorageFromColumnarMetadataTable(Oid metadataTableId,
return;
}
Relation index = index_open(storageIdIndexId, AccessShareLock);
bool indexOk = OidIsValid(storageIdIndexId);
SysScanDesc scanDescriptor = systable_beginscan(metadataTable, storageIdIndexId,
indexOk, NULL, 1, scanKey);
SysScanDesc scanDescriptor = systable_beginscan_ordered(metadataTable, index, NULL,
1, scanKey);
static bool loggedSlowMetadataAccessWarning = false;
if (!indexOk && !loggedSlowMetadataAccessWarning)
{
ereport(WARNING, (errmsg(SLOW_METADATA_ACCESS_WARNING,
"on a columnar metadata table")));
loggedSlowMetadataAccessWarning = true;
}
ModifyState *modifyState = StartModifyRelation(metadataTable);
HeapTuple heapTuple;
while (HeapTupleIsValid(heapTuple = systable_getnext_ordered(scanDescriptor,
ForwardScanDirection)))
while (HeapTupleIsValid(heapTuple = systable_getnext(scanDescriptor)))
{
DeleteTupleAndEnforceConstraints(modifyState, heapTuple);
}
systable_endscan_ordered(scanDescriptor);
systable_endscan(scanDescriptor);
FinishModifyRelation(modifyState);
index_close(index, AccessShareLock);
table_close(metadataTable, AccessShareLock);
}
@ -1500,6 +1593,9 @@ create_estate_for_relation(Relation rel)
/*
* DatumToBytea serializes a datum into a bytea value.
*
* Since we don't want to limit datum size to RSIZE_MAX unnecessarily,
* we use memcpy instead of memcpy_s in several places in this function.
*/
static bytea *
DatumToBytea(Datum value, Form_pg_attribute attrForm)
@ -1516,19 +1612,16 @@ DatumToBytea(Datum value, Form_pg_attribute attrForm)
Datum tmp;
store_att_byval(&tmp, value, attrForm->attlen);
memcpy_s(VARDATA(result), datumLength + VARHDRSZ,
&tmp, attrForm->attlen);
memcpy(VARDATA(result), &tmp, attrForm->attlen); /* IGNORE-BANNED */
}
else
{
memcpy_s(VARDATA(result), datumLength + VARHDRSZ,
DatumGetPointer(value), attrForm->attlen);
memcpy(VARDATA(result), DatumGetPointer(value), attrForm->attlen); /* IGNORE-BANNED */
}
}
else
{
memcpy_s(VARDATA(result), datumLength + VARHDRSZ,
DatumGetPointer(value), datumLength);
memcpy(VARDATA(result), DatumGetPointer(value), datumLength); /* IGNORE-BANNED */
}
return result;
@ -1547,8 +1640,12 @@ ByteaToDatum(bytea *bytes, Form_pg_attribute attrForm)
* after the byteaDatum is freed.
*/
char *binaryDataCopy = palloc0(VARSIZE_ANY_EXHDR(bytes));
memcpy_s(binaryDataCopy, VARSIZE_ANY_EXHDR(bytes),
VARDATA_ANY(bytes), VARSIZE_ANY_EXHDR(bytes));
/*
* We use IGNORE-BANNED here since we don't want to limit datum size to
* RSIZE_MAX unnecessarily.
*/
memcpy(binaryDataCopy, VARDATA_ANY(bytes), VARSIZE_ANY_EXHDR(bytes)); /* IGNORE-BANNED */
return fetch_att(binaryDataCopy, attrForm->attbyval, attrForm->attlen);
}

View File

@ -742,7 +742,9 @@ columnar_tuple_insert(Relation relation, TupleTableSlot *slot, CommandId cid,
*/
ColumnarWriteState *writeState = columnar_init_write_state(relation,
RelationGetDescr(relation),
slot->tts_tableOid,
GetCurrentSubTransactionId());
MemoryContext oldContext = MemoryContextSwitchTo(ColumnarWritePerTupleContext(
writeState));
@ -784,8 +786,14 @@ columnar_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
{
CheckCitusColumnarVersion(ERROR);
/*
* The callback to .multi_insert is table_multi_insert() and this is only used for the COPY
* command, so slot[i]->tts_tableoid will always be equal to relation->rd_id. Thus, we can send
* RelationGetRelid(relation) as the tupSlotRelationId
*/
ColumnarWriteState *writeState = columnar_init_write_state(relation,
RelationGetDescr(relation),
RelationGetRelid(relation),
GetCurrentSubTransactionId());
ColumnarCheckLogicalReplication(relation);
@ -1033,6 +1041,27 @@ NeededColumnsList(TupleDesc tupdesc, Bitmapset *attr_needed)
}
/*
* ColumnarTableTupleCount returns the number of tuples that the given
* columnar table has, based on its stripe metadata.
*/
static uint64
ColumnarTableTupleCount(Relation relation)
{
List *stripeList = StripesForRelfilenode(relation->rd_node);
uint64 tupleCount = 0;
ListCell *lc = NULL;
foreach(lc, stripeList)
{
StripeMetadata *stripe = lfirst(lc);
tupleCount += stripe->rowCount;
}
return tupleCount;
}
/*
* columnar_vacuum_rel implements VACUUM without FULL option.
*/
@ -1049,6 +1078,9 @@ columnar_vacuum_rel(Relation rel, VacuumParams *params,
return;
}
pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM,
RelationGetRelid(rel));
/*
* If metapage version of relation is older, then we hint users to VACUUM
* the relation in ColumnarMetapageCheckVersion. So if needed, upgrade
@ -1072,6 +1104,79 @@ columnar_vacuum_rel(Relation rel, VacuumParams *params,
{
TruncateColumnar(rel, elevel);
}
RelationOpenSmgr(rel);
BlockNumber new_rel_pages = smgrnblocks(rel->rd_smgr, MAIN_FORKNUM);
/* get the number of indexes */
List *indexList = RelationGetIndexList(rel);
int nindexes = list_length(indexList);
TransactionId oldestXmin;
TransactionId freezeLimit;
MultiXactId multiXactCutoff;
/* initialize xids */
#if PG_VERSION_NUM >= PG_VERSION_15
MultiXactId oldestMxact;
vacuum_set_xid_limits(rel,
params->freeze_min_age,
params->freeze_table_age,
params->multixact_freeze_min_age,
params->multixact_freeze_table_age,
&oldestXmin, &oldestMxact,
&freezeLimit, &multiXactCutoff);
Assert(MultiXactIdPrecedesOrEquals(multiXactCutoff, oldestMxact));
#else
TransactionId xidFullScanLimit;
MultiXactId mxactFullScanLimit;
vacuum_set_xid_limits(rel,
params->freeze_min_age,
params->freeze_table_age,
params->multixact_freeze_min_age,
params->multixact_freeze_table_age,
&oldestXmin, &freezeLimit, &xidFullScanLimit,
&multiXactCutoff, &mxactFullScanLimit);
#endif
Assert(TransactionIdPrecedesOrEquals(freezeLimit, oldestXmin));
/*
* Columnar storage doesn't hold any transaction IDs, so we can always
* just advance to the most aggressive value.
*/
TransactionId newRelFrozenXid = oldestXmin;
#if PG_VERSION_NUM >= PG_VERSION_15
MultiXactId newRelminMxid = oldestMxact;
#else
MultiXactId newRelminMxid = multiXactCutoff;
#endif
double new_live_tuples = ColumnarTableTupleCount(rel);
/* all visible pages are always 0 */
BlockNumber new_rel_allvisible = 0;
#if PG_VERSION_NUM >= PG_VERSION_15
bool frozenxid_updated;
bool minmulti_updated;
vac_update_relstats(rel, new_rel_pages, new_live_tuples,
new_rel_allvisible, nindexes > 0,
newRelFrozenXid, newRelminMxid,
&frozenxid_updated, &minmulti_updated, false);
#else
vac_update_relstats(rel, new_rel_pages, new_live_tuples,
new_rel_allvisible, nindexes > 0,
newRelFrozenXid, newRelminMxid, false);
#endif
pgstat_report_vacuum(RelationGetRelid(rel),
rel->rd_rel->relisshared,
Max(new_live_tuples, 0),
0);
pgstat_progress_end_command();
}
@ -2270,8 +2375,13 @@ detoast_values(TupleDesc tupleDesc, Datum *orig_values, bool *isnull)
if (values == orig_values)
{
values = palloc(sizeof(Datum) * natts);
memcpy_s(values, sizeof(Datum) * natts,
orig_values, sizeof(Datum) * natts);
/*
* We use IGNORE-BANNED here since we don't want to limit
* size of the buffer that holds the datum array to RSIZE_MAX
* unnecessarily.
*/
memcpy(values, orig_values, sizeof(Datum) * natts); /* IGNORE-BANNED */
}
/* will be freed when per-tuple context is reset */

View File

@ -531,6 +531,9 @@ SerializeBoolArray(bool *boolArray, uint32 boolArrayLength)
/*
* SerializeSingleDatum serializes the given datum value and appends it to the
* provided string info buffer.
*
* Since we don't want to limit datum buffer size to RSIZE_MAX unnecessarily,
* we use memcpy instead of memcpy_s in several places in this function.
*/
static void
SerializeSingleDatum(StringInfo datumBuffer, Datum datum, bool datumTypeByValue,
@ -552,15 +555,13 @@ SerializeSingleDatum(StringInfo datumBuffer, Datum datum, bool datumTypeByValue,
}
else
{
memcpy_s(currentDatumDataPointer, datumBuffer->maxlen - datumBuffer->len,
DatumGetPointer(datum), datumTypeLength);
memcpy(currentDatumDataPointer, DatumGetPointer(datum), datumTypeLength); /* IGNORE-BANNED */
}
}
else
{
Assert(!datumTypeByValue);
memcpy_s(currentDatumDataPointer, datumBuffer->maxlen - datumBuffer->len,
DatumGetPointer(datum), datumLength);
memcpy(currentDatumDataPointer, DatumGetPointer(datum), datumLength); /* IGNORE-BANNED */
}
datumBuffer->len += datumLengthAligned;
@ -714,7 +715,12 @@ DatumCopy(Datum datum, bool datumTypeByValue, int datumTypeLength)
{
uint32 datumLength = att_addlength_datum(0, datumTypeLength, datum);
char *datumData = palloc0(datumLength);
memcpy_s(datumData, datumLength, DatumGetPointer(datum), datumLength);
/*
* We use IGNORE-BANNED here since we don't want to limit datum size to
* RSIZE_MAX unnecessarily.
*/
memcpy(datumData, DatumGetPointer(datum), datumLength); /* IGNORE-BANNED */
datumCopy = PointerGetDatum(datumData);
}
@ -737,8 +743,12 @@ CopyStringInfo(StringInfo sourceString)
targetString->data = palloc0(sourceString->len);
targetString->len = sourceString->len;
targetString->maxlen = sourceString->len;
memcpy_s(targetString->data, sourceString->len,
sourceString->data, sourceString->len);
/*
* We use IGNORE-BANNED here since we don't want to limit string
* buffer size to RSIZE_MAX unnecessarily.
*/
memcpy(targetString->data, sourceString->data, sourceString->len); /* IGNORE-BANNED */
}
return targetString;

View File

@ -10,6 +10,17 @@
--
-- To do that, drop stripe_first_row_number_idx and create a unique
-- constraint with the same name to keep the code change at minimum.
--
-- If we have a pg_depend entry for this index, we can not drop it as
-- the extension depends on it. Remove the pg_depend entry if it exists.
DELETE FROM pg_depend
WHERE classid = 'pg_am'::regclass::oid
AND objid IN (select oid from pg_am where amname = 'columnar')
AND objsubid = 0
AND refclassid = 'pg_class'::regclass::oid
AND refobjid = 'columnar.stripe_first_row_number_idx'::regclass::oid
AND refobjsubid = 0
AND deptype = 'n';
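-- (deptype 'n' denotes a NORMAL dependency; extension membership rows,
-- which use deptype 'e', are left untouched.)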
DROP INDEX columnar.stripe_first_row_number_idx;
ALTER TABLE columnar.stripe ADD CONSTRAINT stripe_first_row_number_idx
UNIQUE (storage_id, first_row_number);
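-- The UNIQUE constraint creates a backing index with the same name, so
-- readers that expect stripe_first_row_number_idx keep working.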

View File

@ -0,0 +1 @@
-- no changes needed

View File

@ -0,0 +1 @@
-- no changes needed

View File

@ -8,5 +8,16 @@ DROP FUNCTION citus_internal.upgrade_columnar_storage(regclass);
DROP FUNCTION citus_internal.downgrade_columnar_storage(regclass);
-- drop "first_row_number" column and the index defined on it
--
-- If we have a pg_depend entry for this index, we can not drop it as
-- the extension depends on it. Remove the pg_depend entry if it exists.
DELETE FROM pg_depend
WHERE classid = 'pg_am'::regclass::oid
AND objid IN (select oid from pg_am where amname = 'columnar')
AND objsubid = 0
AND refclassid = 'pg_class'::regclass::oid
AND refobjid = 'columnar.stripe_first_row_number_idx'::regclass::oid
AND refobjsubid = 0
AND deptype = 'n';
DROP INDEX columnar.stripe_first_row_number_idx;
ALTER TABLE columnar.stripe DROP COLUMN first_row_number;

View File

@ -1,4 +1,14 @@
-- columnar--10.2-3--10.2-2.sql
--
-- If we have a pg_depend entry for this index, we can not drop it as
-- the extension depends on it. Remove the pg_depend entry if it exists.
DELETE FROM pg_depend
WHERE classid = 'pg_am'::regclass::oid
AND objid IN (select oid from pg_am where amname = 'columnar')
AND objsubid = 0
AND refclassid = 'pg_class'::regclass::oid
AND refobjid = 'columnar.stripe_first_row_number_idx'::regclass::oid
AND refobjsubid = 0
AND deptype = 'n';
ALTER TABLE columnar.stripe DROP CONSTRAINT stripe_first_row_number_idx;
CREATE INDEX stripe_first_row_number_idx ON columnar.stripe USING BTREE(storage_id, first_row_number);

View File

@ -117,6 +117,7 @@ CleanupWriteStateMap(void *arg)
ColumnarWriteState *
columnar_init_write_state(Relation relation, TupleDesc tupdesc,
Oid tupSlotRelationId,
SubTransactionId currentSubXid)
{
bool found;
@ -180,7 +181,16 @@ columnar_init_write_state(Relation relation, TupleDesc tupdesc,
MemoryContext oldContext = MemoryContextSwitchTo(WriteStateContext);
ColumnarOptions columnarOptions = { 0 };
ReadColumnarOptions(relation->rd_id, &columnarOptions);
/*
* In case of a table rewrite, we need to fetch table options based on the
* relation id of the source tuple slot.
*
* For this reason, we always pass tupSlotRelationId here, which should be
* the same as the target table if the write operation is not related to a
* table rewrite etc.
*/
ReadColumnarOptions(tupSlotRelationId, &columnarOptions);
SubXidWriteState *stackEntry = palloc0(sizeof(SubXidWriteState));
stackEntry->writeState = ColumnarBeginWrite(relation->rd_node,

View File

@ -22,7 +22,7 @@ SUBDIRS = . commands connection ddl deparser executor metadata operations planne
# columnar modules
SUBDIRS += ../columnar
# enterprise modules
SUBDIRS +=
SUBDIRS += replication
# Symlinks are not copied over to the build directory if a separate build
# directory is used during configure (such as on CI)

View File

@ -1,6 +1,6 @@
# Citus extension
comment = 'Citus distributed database'
default_version = '11.0-1'
default_version = '11.0-4'
module_pathname = '$libdir/citus'
relocatable = false
schema = pg_catalog

View File

@ -12,8 +12,8 @@ they are often moved to files that are named after the command.
| `create_distributed_table.c` | Implementation of UDF's for creating distributed tables |
| `drop_distributed_table.c` | Implementation for dropping metadata for partitions of distributed tables |
| `extension.c` | Implementation of `CREATE EXTENSION` commands for citus specific checks |
| `foreign_constraint.c` | Implementation of helper functions for foreign key constraints |
| `grant.c` | Placeholder for code granting users access to relations, implemented as enterprise feature |
| `foreign_constraint.c` | Implementation of, and helper functions for, foreign key constraints |
| `grant.c` | Implementation of `GRANT` commands for roles/users on relations |
| `index.c` | Implementation of commands specific to indices on distributed tables |
| `multi_copy.c` | Implementation of `COPY` command. There are multiple different copy modes which are described in detail below |
| `policy.c` | Implementation of `CREATE/ALTER POLICY` commands. |

View File

@ -1,89 +0,0 @@
/*-------------------------------------------------------------------------
*
* aggregate.c
* Commands for distributing AGGREGATE statements.
*
* Copyright (c) Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "distributed/commands.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "distributed/metadata/dependency.h"
#include "distributed/metadata_sync.h"
#include "distributed/metadata/distobject.h"
#include "distributed/multi_executor.h"
#include "nodes/parsenodes.h"
#include "utils/lsyscache.h"
/*
* PreprocessDefineAggregateStmt only qualifies the node with schema name.
* We will handle the rest in the Postprocess phase.
*/
List *
PreprocessDefineAggregateStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
QualifyTreeNode((Node *) node);
return NIL;
}
/*
* PostprocessDefineAggregateStmt actually creates the plan we need to execute for
* aggregate propagation.
* This is the downside of using the locally created aggregate to get the sql statement.
*
* If the aggregate depends on any non-distributed relation, Citus cannot distribute it.
* In order not to prevent users from creating local aggregates on the coordinator,
* a WARNING message will be sent to the user about the case instead of erroring out.
*
* Besides creating the plan we also make sure all (new) dependencies of the aggregate
* are created on all nodes.
*/
List *
PostprocessDefineAggregateStmt(Node *node, const char *queryString)
{
DefineStmt *stmt = castNode(DefineStmt, node);
if (!ShouldPropagate())
{
return NIL;
}
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return NIL;
}
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
EnsureCoordinator();
EnsureSequentialMode(OBJECT_AGGREGATE);
/* If the aggregate has any unsupported dependency, create it locally */
DeferredErrorMessage *depError = DeferErrorIfHasUnsupportedDependency(&address);
if (depError != NULL)
{
RaiseDeferredError(depError, WARNING);
return NIL;
}
EnsureDependenciesExistOnAllNodes(&address);
List *commands = CreateFunctionDDLCommandsIdempotent(&address);
commands = lcons(DISABLE_DDL_PROPAGATION, commands);
commands = lappend(commands, ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}

View File

@ -32,6 +32,8 @@
#include "access/xact.h"
#include "catalog/dependency.h"
#include "catalog/pg_am.h"
#include "catalog/pg_depend.h"
#include "catalog/pg_rewrite_d.h"
#include "columnar/columnar.h"
#include "columnar/columnar_tableam.h"
#include "commands/defrem.h"
@ -193,7 +195,6 @@ static void EnsureTableNotPartition(Oid relationId);
static TableConversionState * CreateTableConversion(TableConversionParameters *params);
static void CreateDistributedTableLike(TableConversionState *con);
static void CreateCitusTableLike(TableConversionState *con);
static List * GetViewCreationCommandsOfTable(Oid relationId);
static void ReplaceTable(Oid sourceId, Oid targetId, List *justBeforeDropCommands,
bool suppressNoticeMessages);
static bool HasAnyGeneratedStoredColumns(Oid relationId);
@ -205,16 +206,25 @@ static char * CreateWorkerChangeSequenceDependencyCommand(char *sequenceSchemaNa
char *sourceName,
char *targetSchemaName,
char *targetName);
static void ErrorIfMatViewSizeExceedsTheLimit(Oid matViewOid);
static char * CreateMaterializedViewDDLCommand(Oid matViewOid);
static char * GetAccessMethodForMatViewIfExists(Oid viewOid);
static bool WillRecreateForeignKeyToReferenceTable(Oid relationId,
CascadeToColocatedOption cascadeOption);
static void WarningsForDroppingForeignKeysWithDistributedTables(Oid relationId);
static void ErrorIfUnsupportedCascadeObjects(Oid relationId);
static bool DoesCascadeDropUnsupportedObject(Oid classId, Oid id, HTAB *nodeMap);
PG_FUNCTION_INFO_V1(undistribute_table);
PG_FUNCTION_INFO_V1(alter_distributed_table);
PG_FUNCTION_INFO_V1(alter_table_set_access_method);
PG_FUNCTION_INFO_V1(worker_change_sequence_dependency);
/* global variable keeping track of whether we are in a table type conversion function */
bool InTableTypeConversionFunctionCall = false;
/* controlled by GUC, in MB */
int MaxMatViewSizeToAutoRecreate = 1024;
/*
* undistribute_table gets a distributed table name and
@ -385,6 +395,8 @@ UndistributeTable(TableConversionParameters *params)
ErrorIfAnyPartitionRelationInvolvedInNonInheritedFKey(partitionList);
}
ErrorIfUnsupportedCascadeObjects(params->relationId);
params->conversionType = UNDISTRIBUTE_TABLE;
params->shardCountIsNull = true;
TableConversionState *con = CreateTableConversion(params);
@ -416,6 +428,8 @@ AlterDistributedTable(TableConversionParameters *params)
EnsureTableNotPartition(params->relationId);
EnsureHashDistributedTable(params->relationId);
ErrorIfUnsupportedCascadeObjects(params->relationId);
params->conversionType = ALTER_DISTRIBUTED_TABLE;
TableConversionState *con = CreateTableConversion(params);
CheckAlterDistributedTableConversionParameters(con);
@ -472,6 +486,8 @@ AlterTableSetAccessMethod(TableConversionParameters *params)
}
}
ErrorIfUnsupportedCascadeObjects(params->relationId);
params->conversionType = ALTER_TABLE_SET_ACCESS_METHOD;
params->shardCountIsNull = true;
TableConversionState *con = CreateTableConversion(params);
@ -503,10 +519,16 @@ AlterTableSetAccessMethod(TableConversionParameters *params)
*
* The function returns a TableConversionReturn object that stores variables
* that can be used by the caller's operations.
*
* To be able to provide more meaningful messages while converting a table type,
* Citus keeps the InTableTypeConversionFunctionCall flag. Don't forget to set it
* properly in case you add a new way to return from this function.
*/
TableConversionReturn *
ConvertTable(TableConversionState *con)
{
InTableTypeConversionFunctionCall = true;
/*
* We undistribute citus local tables that are not chained with any reference
* tables via foreign keys at the end of the utility hook.
@ -535,6 +557,7 @@ ConvertTable(TableConversionState *con)
* subgraph including itself, so return here.
*/
SetLocalEnableLocalReferenceForeignKeys(oldEnableLocalReferenceForeignKeys);
InTableTypeConversionFunctionCall = false;
return NULL;
}
char *newAccessMethod = con->accessMethod ? con->accessMethod :
@ -562,8 +585,9 @@ ConvertTable(TableConversionState *con)
List *justBeforeDropCommands = NIL;
List *attachPartitionCommands = NIL;
postLoadCommands = list_concat(postLoadCommands,
GetViewCreationCommandsOfTable(con->relationId));
postLoadCommands =
list_concat(postLoadCommands,
GetViewCreationTableDDLCommandsOfTable(con->relationId));
List *foreignKeyCommands = NIL;
if (con->conversionType == ALTER_DISTRIBUTED_TABLE)
@ -614,6 +638,8 @@ ConvertTable(TableConversionState *con)
Oid partitionRelationId = InvalidOid;
foreach_oid(partitionRelationId, partitionList)
{
char *tableQualifiedName = generate_qualified_relation_name(
partitionRelationId);
char *detachPartitionCommand = GenerateDetachPartitionCommand(
partitionRelationId);
char *attachPartitionCommand = GenerateAlterTableAttachPartitionCommand(
@ -659,6 +685,19 @@ ConvertTable(TableConversionState *con)
foreignKeyCommands = list_concat(foreignKeyCommands,
partitionReturn->foreignKeyCommands);
}
/*
* If we are altering a partitioned distributed table with
* colocate_with => 'none', we override the con->colocateWith parameter
* with the first newly created partition table so that the rest of the
* partitions and the partitioned table share the same colocation group.
*/
if (con->colocateWith != NULL && IsColocateWithNone(con->colocateWith))
{
con->colocateWith = tableQualifiedName;
}
}
}
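The override above means that with colocate_with => 'none' the first converted partition seeds a new colocation group and the remaining partitions follow it. A sketch with a hypothetical partitioned table named events:

SELECT alter_distributed_table('events', shard_count := 48, colocate_with := 'none');
-- the parent and all partitions should now report the same (new) colocationid
SELECT logicalrelid, colocationid
FROM pg_dist_partition
WHERE logicalrelid::text LIKE 'events%';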
@ -819,6 +858,7 @@ ConvertTable(TableConversionState *con)
SetLocalEnableLocalReferenceForeignKeys(oldEnableLocalReferenceForeignKeys);
InTableTypeConversionFunctionCall = false;
return ret;
}
@ -1176,8 +1216,15 @@ CreateDistributedTableLike(TableConversionState *con)
newShardCount = con->shardCount;
}
/*
* To get the correct column name, we use the original relation id, not the
* new relation id. The reason is that the cached attributes of the original
* and newly created tables are not the same if the original table has
* dropped columns (dropped columns are still present in the attribute cache).
* A detailed example is in https://github.com/citusdata/citus/pull/6387
*/
char *distributionColumnName =
ColumnToColumnName(con->newRelationId, (Node *) newDistributionKey);
ColumnToColumnName(con->relationId, (Node *) newDistributionKey);
Oid originalRelationId = con->relationId;
if (con->originalDistributionKey != NULL && PartitionTable(originalRelationId))
@ -1237,6 +1284,94 @@ CreateCitusTableLike(TableConversionState *con)
}
/*
* ErrorIfUnsupportedCascadeObjects gets the oid of a relation, finds the objects
* that dropping this relation cascades into, and errors out if there are any
* extensions that would be dropped.
*/
static void
ErrorIfUnsupportedCascadeObjects(Oid relationId)
{
HASHCTL info;
memset(&info, 0, sizeof(info));
info.keysize = sizeof(Oid);
info.entrysize = sizeof(Oid);
info.hash = oid_hash;
uint32 hashFlags = (HASH_ELEM | HASH_FUNCTION);
HTAB *nodeMap = hash_create("object dependency map (oid)", 64, &info, hashFlags);
bool unsupportedObjectInDepGraph =
DoesCascadeDropUnsupportedObject(RelationRelationId, relationId, nodeMap);
if (unsupportedObjectInDepGraph)
{
ereport(ERROR, (errmsg("cannot alter table because an extension depends on it")));
}
}
/*
* DoesCascadeDropUnsupportedObject walks through the objects that depend on the
* object with the given object id and returns true if it finds any unsupported objects.
*
* This function only checks extensions as unsupported objects.
*
* Extension dependency is different from the rest: if an object depends on an
* extension, dropping the object would drop the extension too.
* So we check with the IsObjectAddressOwnedByExtension function.
*/
static bool
DoesCascadeDropUnsupportedObject(Oid classId, Oid objectId, HTAB *nodeMap)
{
bool found = false;
hash_search(nodeMap, &objectId, HASH_ENTER, &found);
if (found)
{
return false;
}
ObjectAddress objectAddress = { 0 };
ObjectAddressSet(objectAddress, classId, objectId);
if (IsObjectAddressOwnedByExtension(&objectAddress, NULL))
{
return true;
}
Oid targetObjectClassId = classId;
Oid targetObjectId = objectId;
List *dependencyTupleList = GetPgDependTuplesForDependingObjects(targetObjectClassId,
targetObjectId);
HeapTuple depTup = NULL;
foreach_ptr(depTup, dependencyTupleList)
{
Form_pg_depend pg_depend = (Form_pg_depend) GETSTRUCT(depTup);
Oid dependingOid = InvalidOid;
Oid dependingClassId = InvalidOid;
if (pg_depend->classid == RewriteRelationId)
{
dependingOid = GetDependingView(pg_depend);
dependingClassId = RelationRelationId;
}
else
{
dependingOid = pg_depend->objid;
dependingClassId = pg_depend->classid;
}
if (DoesCascadeDropUnsupportedObject(dependingClassId, dependingOid, nodeMap))
{
return true;
}
}
return false;
}
/*
* GetViewCreationCommandsOfTable takes a table oid and generates the CREATE VIEW
* commands for views that depend on the given table. This includes the views
@ -1246,46 +1381,152 @@ List *
GetViewCreationCommandsOfTable(Oid relationId)
{
List *views = GetDependingViews(relationId);
List *commands = NIL;
Oid viewOid = InvalidOid;
foreach_oid(viewOid, views)
{
Datum viewDefinitionDatum = DirectFunctionCall1(pg_get_viewdef,
ObjectIdGetDatum(viewOid));
char *viewDefinition = TextDatumGetCString(viewDefinitionDatum);
StringInfo query = makeStringInfo();
char *viewName = get_rel_name(viewOid);
char *schemaName = get_namespace_name(get_rel_namespace(viewOid));
char *qualifiedViewName = quote_qualified_identifier(schemaName, viewName);
bool isMatView = get_rel_relkind(viewOid) == RELKIND_MATVIEW;
/* here we need to get the access method of the view to recreate it */
char *accessMethodName = GetAccessMethodForMatViewIfExists(viewOid);
appendStringInfoString(query, "CREATE ");
if (isMatView)
/* See comments on CreateMaterializedViewDDLCommand for its limitations */
if (get_rel_relkind(viewOid) == RELKIND_MATVIEW)
{
appendStringInfoString(query, "MATERIALIZED ");
ErrorIfMatViewSizeExceedsTheLimit(viewOid);
char *matViewCreateCommands = CreateMaterializedViewDDLCommand(viewOid);
appendStringInfoString(query, matViewCreateCommands);
}
else
{
char *viewCreateCommand = CreateViewDDLCommand(viewOid);
appendStringInfoString(query, viewCreateCommand);
}
appendStringInfo(query, "VIEW %s ", qualifiedViewName);
char *alterViewCommmand = AlterViewOwnerCommand(viewOid);
appendStringInfoString(query, alterViewCommmand);
if (accessMethodName)
{
appendStringInfo(query, "USING %s ", accessMethodName);
}
appendStringInfo(query, "AS %s", viewDefinition);
commands = lappend(commands, makeTableDDLCommandString(query->data));
commands = lappend(commands, query->data);
}
return commands;
}
/*
* GetViewCreationTableDDLCommandsOfTable is the same as GetViewCreationCommandsOfTable,
* but the returned list contains TableDDLCommand objects, not strings.
*/
List *
GetViewCreationTableDDLCommandsOfTable(Oid relationId)
{
List *commands = GetViewCreationCommandsOfTable(relationId);
List *tableDDLCommands = NIL;
char *command = NULL;
foreach_ptr(command, commands)
{
tableDDLCommands = lappend(tableDDLCommands, makeTableDDLCommandString(command));
}
return tableDDLCommands;
}
/*
* ErrorIfMatViewSizeExceedsTheLimit takes the oid of a materialized view and errors
* out if the size of the matview exceeds the limit set by the GUC
* citus.max_matview_size_to_auto_recreate.
*/
static void
ErrorIfMatViewSizeExceedsTheLimit(Oid matViewOid)
{
if (MaxMatViewSizeToAutoRecreate >= 0)
{
/* if it's below 0, it means the user has removed the limit */
Datum relSizeDatum = DirectFunctionCall1(pg_total_relation_size,
ObjectIdGetDatum(matViewOid));
uint64 matViewSize = DatumGetInt64(relSizeDatum);
/* convert from MB to bytes */
uint64 limitSizeInBytes = MaxMatViewSizeToAutoRecreate * 1024L * 1024L;
if (matViewSize > limitSizeInBytes)
{
ereport(ERROR, (errmsg("size of the materialized view %s exceeds "
"citus.max_matview_size_to_auto_recreate "
"(currently %d MB)", get_rel_name(matViewOid),
MaxMatViewSizeToAutoRecreate),
errdetail("Citus restricts automatically recreating "
"materialized views that are larger than the "
"limit, because it could take too long."),
errhint(
"Consider increasing the size limit by setting "
"citus.max_matview_size_to_auto_recreate; "
"or you can remove the limit by setting it to -1")));
}
}
}
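The GUC is expressed in MB, so the default of 1024 corresponds to 1 GB. A sketch of adjusting it, with the GUC name taken from the error text above:

SET citus.max_matview_size_to_auto_recreate TO 4096;  -- allow matviews up to 4 GB
SET citus.max_matview_size_to_auto_recreate TO -1;    -- remove the limit entirely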
/*
* CreateMaterializedViewDDLCommand creates the command to create a materialized view.
* Note that this function doesn't support
* - Aliases
* - Storage parameters
* - Tablespace
* - WITH [NO] DATA
* options for the given materialized view. Parser functions for materialized views
* should be added to handle them.
*
* Related issue: https://github.com/citusdata/citus/issues/5968
*/
static char *
CreateMaterializedViewDDLCommand(Oid matViewOid)
{
StringInfo query = makeStringInfo();
char *viewName = get_rel_name(matViewOid);
char *schemaName = get_namespace_name(get_rel_namespace(matViewOid));
char *qualifiedViewName = quote_qualified_identifier(schemaName, viewName);
/* here we need to get the access method of the view to recreate it */
char *accessMethodName = GetAccessMethodForMatViewIfExists(matViewOid);
appendStringInfo(query, "CREATE MATERIALIZED VIEW %s ", qualifiedViewName);
if (accessMethodName)
{
appendStringInfo(query, "USING %s ", accessMethodName);
}
/*
* Set search_path to NIL so that all objects outside of pg_catalog will be
* schema-prefixed.
*/
OverrideSearchPath *overridePath = GetOverrideSearchPath(CurrentMemoryContext);
overridePath->schemas = NIL;
overridePath->addCatalog = true;
PushOverrideSearchPath(overridePath);
/*
* Push the transaction snapshot to be able to get the view definition with pg_get_viewdef.
*/
PushActiveSnapshot(GetTransactionSnapshot());
Datum viewDefinitionDatum = DirectFunctionCall1(pg_get_viewdef,
ObjectIdGetDatum(matViewOid));
char *viewDefinition = TextDatumGetCString(viewDefinitionDatum);
PopActiveSnapshot();
PopOverrideSearchPath();
appendStringInfo(query, "AS %s", viewDefinition);
return query->data;
}
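Given the limitations listed above, the generated statement has roughly this shape for a hypothetical columnar materialized view (no aliases, storage parameters, tablespace, or WITH [NO] DATA clause):

CREATE MATERIALIZED VIEW public.daily_totals USING columnar
AS SELECT date_trunc('day', created_at) AS day, count(*) AS total
   FROM public.events GROUP BY 1;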
/*
* ReplaceTable replaces the source table with the target table.
* It moves all the rows of the source table to the target table with INSERT SELECT.
@ -1789,6 +2030,19 @@ ExecuteQueryViaSPI(char *query, int SPIOK)
}
/*
* ExecuteAndLogQueryViaSPI is a wrapper around ExecuteQueryViaSPI that logs
* the query to be executed at the given log level.
*/
void
ExecuteAndLogQueryViaSPI(char *query, int SPIOK, int logLevel)
{
ereport(logLevel, (errmsg("executing \"%s\"", query)));
ExecuteQueryViaSPI(query, SPIOK);
}
/*
* SwitchToSequentialAndLocalExecutionIfRelationNameTooLong generates the longest shard name
* on the shards of a distributed table, and if it exceeds the limit switches to sequential and

View File

@ -17,6 +17,7 @@
#include "catalog/pg_proc.h"
#include "commands/defrem.h"
#include "distributed/backend_data.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/colocation_utils.h"
#include "distributed/commands.h"

View File

@ -26,6 +26,7 @@
#include "distributed/reference_table_utils.h"
#include "distributed/relation_access_tracking.h"
#include "distributed/worker_protocol.h"
#include "executor/spi.h"
#include "miscadmin.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
@ -512,6 +513,51 @@ ExecuteCascadeOperationForRelationIdList(List *relationIdList,
}
/*
* ExecuteAndLogUtilityCommandListInTableTypeConversionViaSPI is a wrapper function
* around ExecuteAndLogQueryViaSPI that executes view creation commands
* with the InTableTypeConversionFunctionCall flag set to true.
*/
void
ExecuteAndLogUtilityCommandListInTableTypeConversionViaSPI(List *utilityCommandList)
{
bool oldValue = InTableTypeConversionFunctionCall;
InTableTypeConversionFunctionCall = true;
MemoryContext savedMemoryContext = CurrentMemoryContext;
PG_TRY();
{
char *utilityCommand = NULL;
foreach_ptr(utilityCommand, utilityCommandList)
{
/*
* CREATE MATERIALIZED VIEW commands need to be parsed/transformed,
* which SPI does for us.
*/
ExecuteAndLogQueryViaSPI(utilityCommand, SPI_OK_UTILITY, DEBUG1);
}
}
PG_CATCH();
{
InTableTypeConversionFunctionCall = oldValue;
MemoryContextSwitchTo(savedMemoryContext);
ErrorData *errorData = CopyErrorData();
FlushErrorState();
if (errorData->elevel != ERROR)
{
PG_RE_THROW();
}
ThrowErrorData(errorData);
}
PG_END_TRY();
InTableTypeConversionFunctionCall = oldValue;
}
/*
* ExecuteAndLogUtilityCommandList takes a list of utility commands and calls
* ExecuteAndLogUtilityCommand function for each of them.

View File

@ -46,6 +46,7 @@
#include "utils/lsyscache.h"
#include "utils/ruleutils.h"
#include "utils/syscache.h"
#include "foreign/foreign.h"
/*
@ -61,6 +62,8 @@ static void ErrorIfAddingPartitionTableToMetadata(Oid relationId);
static void ErrorIfUnsupportedCreateCitusLocalTable(Relation relation);
static void ErrorIfUnsupportedCitusLocalTableKind(Oid relationId);
static void ErrorIfUnsupportedCitusLocalColumnDefinition(Relation relation);
static void EnsureIfPostgresFdwHasTableName(Oid relationId);
static void ErrorIfOptionListHasNoTableName(List *optionList);
static void NoticeIfAutoConvertingLocalTables(bool autoConverted, Oid relationId);
static CascadeOperationType GetCascadeTypeForCitusLocalTables(bool autoConverted);
static List * GetShellTableDDLEventsForCitusLocalTable(Oid relationId);
@ -81,12 +84,14 @@ static char * GetRenameShardTriggerCommand(Oid shardRelationId, char *triggerNam
uint64 shardId);
static void DropRelationTruncateTriggers(Oid relationId);
static char * GetDropTriggerCommand(Oid relationId, char *triggerName);
static void DropViewsOnTable(Oid relationId);
static List * GetRenameStatsCommandList(List *statsOidList, uint64 shardId);
static List * ReversedOidList(List *oidList);
static void AppendExplicitIndexIdsToList(Form_pg_index indexForm,
List **explicitIndexIdList,
int flags);
static void DropDefaultExpressionsAndMoveOwnedSequenceOwnerships(Oid sourceRelationId,
Oid targetRelationId);
static void DropNextValExprsAndMoveOwnedSeqOwnerships(Oid sourceRelationId,
Oid targetRelationId);
static void DropDefaultColumnDefinition(Oid relationId, char *columnName);
static void TransferSequenceOwnership(Oid ownedSequenceId, Oid targetRelationId,
char *columnName);
@ -328,6 +333,7 @@ CreateCitusLocalTable(Oid relationId, bool cascadeViaForeignKeys, bool autoConve
EnsureReferenceTablesExistOnAllNodes();
List *shellTableDDLEvents = GetShellTableDDLEventsForCitusLocalTable(relationId);
List *tableViewCreationCommands = GetViewCreationCommandsOfTable(relationId);
char *relationName = get_rel_name(relationId);
Oid relationSchemaId = get_rel_namespace(relationId);
@ -342,6 +348,12 @@ CreateCitusLocalTable(Oid relationId, bool cascadeViaForeignKeys, bool autoConve
*/
ExecuteAndLogUtilityCommandList(shellTableDDLEvents);
/*
* Execute the view creation commands with the shell table.
* Views will be distributed via FinalizeCitusLocalTableCreation below.
*/
ExecuteAndLogUtilityCommandListInTableTypeConversionViaSPI(tableViewCreationCommands);
/*
* Set shellRelationId as the relation with relationId now points
* to the shard relation.
@ -354,11 +366,11 @@ CreateCitusLocalTable(Oid relationId, bool cascadeViaForeignKeys, bool autoConve
/*
* Move sequence ownerships from shard table to shell table and also drop
* DEFAULT expressions from shard relation as we should evaluate such columns
* in shell table when needed.
* DEFAULT expressions based on sequences from shard relation as we should
* evaluate such columns in shell table when needed.
*/
DropDefaultExpressionsAndMoveOwnedSequenceOwnerships(shardRelationId,
shellRelationId);
DropNextValExprsAndMoveOwnedSeqOwnerships(shardRelationId,
shellRelationId);
InsertMetadataForCitusLocalTable(shellRelationId, shardId, autoConverted);
@ -476,6 +488,16 @@ ErrorIfUnsupportedCreateCitusLocalTable(Relation relation)
EnsureTableNotDistributed(relationId);
ErrorIfUnsupportedCitusLocalColumnDefinition(relation);
/*
* Error out with a hint if the foreign table is using postgres_fdw and
* the option table_name is not provided.
* Citus relays all the logic for a Citus local foreign table to the placement of
* the Citus local table. If table_name is NOT provided, Citus would try to talk to
* the foreign postgres table over the shard's table name, which would not exist
* on the remote server.
*/
EnsureIfPostgresFdwHasTableName(relationId);
/*
* When creating other citus table types, we don't need to check that case as
* EnsureTableNotDistributed already errors out if the given relation implies
@ -491,6 +513,93 @@ ErrorIfUnsupportedCreateCitusLocalTable(Relation relation)
}
/*
* ServerUsesPostgresFdw gets a foreign server Oid and returns true if the FDW that
* the server depends on is postgres_fdw. Returns false otherwise.
*/
bool
ServerUsesPostgresFdw(Oid serverId)
{
ForeignServer *server = GetForeignServer(serverId);
ForeignDataWrapper *fdw = GetForeignDataWrapper(server->fdwid);
if (strcmp(fdw->fdwname, "postgres_fdw") == 0)
{
return true;
}
return false;
}
/*
* EnsureIfPostgresFdwHasTableName errors out with a hint if the foreign table is using postgres_fdw and
* the option table_name is not provided.
*/
static void
EnsureIfPostgresFdwHasTableName(Oid relationId)
{
char relationKind = get_rel_relkind(relationId);
if (relationKind == RELKIND_FOREIGN_TABLE)
{
ForeignTable *foreignTable = GetForeignTable(relationId);
if (ServerUsesPostgresFdw(foreignTable->serverid))
{
ErrorIfOptionListHasNoTableName(foreignTable->options);
}
}
}
/*
* ErrorIfOptionListHasNoTableName gets an option list (DefElem) and errors out
* if the list does not contain a table_name element.
*/
static void
ErrorIfOptionListHasNoTableName(List *optionList)
{
char *table_nameString = "table_name";
DefElem *option = NULL;
foreach_ptr(option, optionList)
{
char *optionName = option->defname;
if (strcmp(optionName, table_nameString) == 0)
{
return;
}
}
ereport(ERROR, (errmsg(
"table_name option must be provided when using postgres_fdw with Citus"),
errhint("Provide the option \"table_name\" with value target table's"
" name")));
}
/*
* ForeignTableDropsTableNameOption returns true if given option list contains
* (DROP table_name).
*/
bool
ForeignTableDropsTableNameOption(List *optionList)
{
char *table_nameString = "table_name";
DefElem *option = NULL;
foreach_ptr(option, optionList)
{
char *optionName = option->defname;
DefElemAction optionAction = option->defaction;
if (strcmp(optionName, table_nameString) == 0 &&
optionAction == DEFELEM_DROP)
{
return true;
}
}
return false;
}
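A foreign table that passes the check above carries an explicit table_name option, and the ALTER form below is what ForeignTableDropsTableNameOption detects; server and table names are hypothetical:

CREATE FOREIGN TABLE public.remote_orders (id bigint, total numeric)
    SERVER remote_pg_server
    OPTIONS (schema_name 'public', table_name 'orders');
-- dropping the option would break the mapping, hence the dedicated check:
ALTER FOREIGN TABLE public.remote_orders OPTIONS (DROP table_name);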
/*
* ErrorIfUnsupportedCitusLocalTableKind errors out if the relation kind of
* relation with relationId is not supported for citus local table creation.
@ -699,6 +808,9 @@ ConvertLocalTableToShard(Oid relationId)
*/
DropRelationTruncateTriggers(relationId);
/* drop views that depend on the shard table */
DropViewsOnTable(relationId);
/*
* We create INSERT|DELETE|UPDATE triggers on shard relation too.
* This is because citus prevents postgres executor to fire those
@ -1019,6 +1131,55 @@ GetDropTriggerCommand(Oid relationId, char *triggerName)
}
/*
* DropViewsOnTable drops the views that depend on the given relation.
*/
static void
DropViewsOnTable(Oid relationId)
{
List *views = GetDependingViews(relationId);
/*
* GetDependingViews returns views in dependency order. We should drop views
* in reverse order since dropping views can cascade to other views below.
*/
List *reverseOrderedViews = ReversedOidList(views);
Oid viewId = InvalidOid;
foreach_oid(viewId, reverseOrderedViews)
{
char *viewName = get_rel_name(viewId);
char *schemaName = get_namespace_name(get_rel_namespace(viewId));
char *qualifiedViewName = quote_qualified_identifier(schemaName, viewName);
StringInfo dropCommand = makeStringInfo();
appendStringInfo(dropCommand, "DROP %sVIEW IF EXISTS %s",
get_rel_relkind(viewId) == RELKIND_MATVIEW ? "MATERIALIZED " :
"",
qualifiedViewName);
ExecuteAndLogUtilityCommand(dropCommand->data);
}
}
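For a chain of dependent views this produces bottom-up drops; with hypothetical names where v2 is defined on top of v1, the emitted commands would look like:

DROP VIEW IF EXISTS public.v2;
DROP VIEW IF EXISTS public.v1;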
/*
* ReversedOidList takes a list of oids and returns the reverse-ordered version of it.
*/
static List *
ReversedOidList(List *oidList)
{
List *reversed = NIL;
Oid oid = InvalidOid;
foreach_oid(oid, oidList)
{
reversed = lcons_oid(oid, reversed);
}
return reversed;
}
/*
* GetExplicitIndexOidList returns a list of index oids defined "explicitly"
* on the relation with relationId by the "CREATE INDEX" commands. That means,
@ -1092,14 +1253,15 @@ GetRenameStatsCommandList(List *statsOidList, uint64 shardId)
/*
* DropDefaultExpressionsAndMoveOwnedSequenceOwnerships drops default column
* definitions for relation with sourceRelationId. Also, for each column that
* defaults to an owned sequence, it grants ownership to the same named column
* of the relation with targetRelationId.
* DropNextValExprsAndMoveOwnedSeqOwnerships drops default column definitions
* that are based on sequences for relation with sourceRelationId.
*
* Also, for each such column that owns a sequence, it grants ownership to the
* same named column of the relation with targetRelationId.
*/
static void
DropDefaultExpressionsAndMoveOwnedSequenceOwnerships(Oid sourceRelationId,
Oid targetRelationId)
DropNextValExprsAndMoveOwnedSeqOwnerships(Oid sourceRelationId,
Oid targetRelationId)
{
List *columnNameList = NIL;
List *ownedSequenceIdList = NIL;
@ -1110,9 +1272,28 @@ DropDefaultExpressionsAndMoveOwnedSequenceOwnerships(Oid sourceRelationId,
Oid ownedSequenceId = InvalidOid;
forboth_ptr_oid(columnName, columnNameList, ownedSequenceId, ownedSequenceIdList)
{
DropDefaultColumnDefinition(sourceRelationId, columnName);
/*
* We drop nextval() expressions because Citus currently evaluates
* nextval() on the shell table, not on the shards. Hence, there is
* no reason for keeping nextval(). Also, distributed/reference table
* shards do not have nextval() defaults, so be consistent with those.
*
* Note that we keep other kinds of DEFAULT expressions on shards
* because we still want to be able to evaluate DEFAULT expressions
* that are not based on sequences on shards, e.g., for foreign key
* SET DEFAULT actions.
*/
AttrNumber columnAttrNumber = get_attnum(sourceRelationId, columnName);
if (ColumnDefaultsToNextVal(sourceRelationId, columnAttrNumber))
{
DropDefaultColumnDefinition(sourceRelationId, columnName);
}
/* column might not own a sequence */
/*
* Column might own a sequence without having a nextval() expr on it
* --e.g., due to ALTER SEQUENCE OWNED BY .. --, so check if that is
* the case even if the column doesn't have a DEFAULT.
*/
if (OidIsValid(ownedSequenceId))
{
TransferSequenceOwnership(ownedSequenceId, targetRelationId, columnName);

View File

@ -15,6 +15,7 @@
#include "distributed/backend_data.h"
#include "distributed/metadata_cache.h"
#include "distributed/remote_commands.h"
#include "distributed/worker_manager.h"
#include "lib/stringinfo.h"
#include "signal.h"
@ -23,6 +24,8 @@ static bool CitusSignalBackend(uint64 globalPID, uint64 timeout, int sig);
PG_FUNCTION_INFO_V1(pg_cancel_backend);
PG_FUNCTION_INFO_V1(pg_terminate_backend);
PG_FUNCTION_INFO_V1(citus_cancel_backend);
PG_FUNCTION_INFO_V1(citus_terminate_backend);
/*
* pg_cancel_backend overrides Postgres' pg_cancel_backend to cancel
@ -47,6 +50,18 @@ pg_cancel_backend(PG_FUNCTION_ARGS)
}
/*
* citus_cancel_backend is needed to make the pg_cancel_backend SQL function
* still work after downgrading from 11.1, which changed its definition to call
* a different symbol. See #6300/e29db74 for details.
*/
Datum
citus_cancel_backend(PG_FUNCTION_ARGS)
{
return pg_cancel_backend(fcinfo);
}
/*
* pg_terminate_backend overrides Postgres' pg_terminate_backend to terminate
* a query with a global pid so a query can be terminated from another node.
@ -70,6 +85,18 @@ pg_terminate_backend(PG_FUNCTION_ARGS)
}
/*
* citus_terminate_backend is needed to make the pg_terminate_backend SQL
* function still work after downgrading from 11.1, which changed its
* definition to call a different symbol. See #6300/e29db74 for details.
*/
Datum
citus_terminate_backend(PG_FUNCTION_ARGS)
{
return pg_terminate_backend(fcinfo);
}
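After a downgrade the SQL-level function definitions still reference the citus_* symbols, so the overridden functions keep working; an illustrative call with a made-up global pid:

SELECT pg_terminate_backend(10000000000123);  -- global pid, signalled on the node that owns it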
/*
* CitusSignalBackend gets a global pid and ends the original query with the global pid
* that might have started in another node by connecting to that node and running either
@ -111,18 +138,39 @@ CitusSignalBackend(uint64 globalPID, uint64 timeout, int sig)
#endif
}
StringInfo queryResult = makeStringInfo();
int connectionFlags = 0;
MultiConnection *connection = GetNodeConnection(connectionFlags,
workerNode->workerName,
workerNode->workerPort);
bool reportResultError = true;
bool success = ExecuteRemoteQueryOrCommand(workerNode->workerName,
workerNode->workerPort, cancelQuery->data,
queryResult, reportResultError);
if (success && queryResult && strcmp(queryResult->data, "f") == 0)
if (!SendRemoteCommand(connection, cancelQuery->data))
{
/* if we cannot connect, we warn and report false */
ReportConnectionError(connection, WARNING);
return false;
}
bool raiseInterrupts = true;
PGresult *queryResult = GetRemoteCommandResult(connection, raiseInterrupts);
/* if remote node throws an error, we also throw an error */
if (!IsResponseOK(queryResult))
{
ReportResultError(connection, queryResult, ERROR);
}
StringInfo queryResultString = makeStringInfo();
bool success = EvaluateSingleQueryResult(connection, queryResult, queryResultString);
if (success && strcmp(queryResultString->data, "f") == 0)
{
/* worker node returned "f" */
success = false;
}
PQclear(queryResult);
bool raiseErrors = false;
ClearResults(connection, raiseErrors);
return success;
}

View File

@ -10,41 +10,105 @@
#include "postgres.h"
#include "distributed/pg_version_constants.h"
#include "commands/defrem.h"
#include "catalog/namespace.h"
#include "distributed/commands.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/listutils.h"
#include "distributed/metadata_cache.h"
/* placeholder for PreprocessClusterStmt */
static bool IsClusterStmtVerbose_compat(ClusterStmt *clusterStmt);
/*
* PreprocessClusterStmt first determines whether a given cluster statement involves
* a distributed table. If so (and if it is supported, i.e., not VERBOSE), it
* creates a DDLJob to encapsulate information needed during the worker node
* portion of DDL execution before returning that DDLJob in a List. If no
* distributed table is involved, this function returns NIL.
*/
List *
PreprocessClusterStmt(Node *node, const char *clusterCommand,
ProcessUtilityContext processUtilityContext)
{
ClusterStmt *clusterStmt = castNode(ClusterStmt, node);
bool showPropagationWarning = false;
bool missingOK = false;
DDLJob *ddlJob = NULL;
/* CLUSTER all */
if (clusterStmt->relation == NULL)
{
showPropagationWarning = true;
ereport(WARNING, (errmsg("not propagating CLUSTER command to worker nodes"),
errhint("Provide a specific table in order to CLUSTER "
"distributed tables.")));
return NIL;
}
else
/* PostgreSQL uses access exclusive lock for CLUSTER command */
Oid relationId = RangeVarGetRelid(clusterStmt->relation, AccessExclusiveLock,
missingOK);
/*
* If the table does not exist, don't do anything here to allow PostgreSQL
* to throw the appropriate error or notice message later.
*/
if (!OidIsValid(relationId))
{
bool missingOK = false;
return NIL;
}
Oid relationId = RangeVarGetRelid(clusterStmt->relation, AccessShareLock,
missingOK);
/* we have no planning to do unless the table is distributed */
bool isCitusRelation = IsCitusTable(relationId);
if (!isCitusRelation)
{
return NIL;
}
if (OidIsValid(relationId))
#if PG_VERSION_NUM >= 120000
if (IsClusterStmtVerbose_compat(clusterStmt))
#else
if (clusterStmt->verbose)
#endif
{
ereport(ERROR, (errmsg("cannot run CLUSTER command"),
errdetail("VERBOSE option is currently unsupported "
"for distributed tables.")));
}
ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->metadataSyncCommand = clusterCommand;
ddlJob->taskList = DDLTaskList(relationId, clusterCommand);
return list_make1(ddlJob);
}
/*
* IsClusterStmtVerbose_compat returns true if the given statement
* is a cluster statement with the verbose option.
*/
static bool
IsClusterStmtVerbose_compat(ClusterStmt *clusterStmt)
{
#if PG_VERSION_NUM < PG_VERSION_14
if (clusterStmt->options & CLUOPT_VERBOSE)
{
return true;
}
return false;
#else
DefElem *opt = NULL;
foreach_ptr(opt, clusterStmt->params)
{
if (strcmp(opt->defname, "verbose") == 0)
{
showPropagationWarning = IsCitusTable(relationId);
return defGetBoolean(opt);
}
}
if (showPropagationWarning)
{
ereport(WARNING, (errmsg("not propagating CLUSTER command to worker nodes")));
}
return NIL;
return false;
#endif
}
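With the logic above, a plain CLUSTER on a distributed table is propagated while the VERBOSE variant is rejected; a sketch with a hypothetical table and index:

CLUSTER dist_table USING dist_table_pkey;          -- propagated to worker shards
CLUSTER VERBOSE dist_table USING dist_table_pkey;
-- ERROR:  cannot run CLUSTER command
-- DETAIL:  VERBOSE option is currently unsupported for distributed tables.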

View File

@ -36,9 +36,6 @@
static char * CreateCollationDDLInternal(Oid collationId, Oid *collowner,
char **quotedCollationName);
static List * FilterNameListForDistributedCollations(List *objects, bool missing_ok,
List **addresses);
static bool ShouldPropagateDefineCollationStmt(void);
/*
* GetCreateCollationDDLInternal returns a CREATE COLLATE sql string for the
@ -162,267 +159,6 @@ AlterCollationOwnerObjectAddress(Node *node, bool missing_ok)
}
/*
* FilterNameListForDistributedCollations takes a list of objects to delete.
* This list is filtered against the collations that are distributed.
*
* The original list will not be touched; a new list will be created with only the
* objects in there.
*
* objectAddresses is replaced with a list of object addresses for the filtered objects.
*/
static List *
FilterNameListForDistributedCollations(List *objects, bool missing_ok,
List **objectAddresses)
{
List *result = NIL;
*objectAddresses = NIL;
List *collName = NULL;
foreach_ptr(collName, objects)
{
Oid collOid = get_collation_oid(collName, true);
ObjectAddress collAddress = { 0 };
if (!OidIsValid(collOid))
{
continue;
}
ObjectAddressSet(collAddress, CollationRelationId, collOid);
if (IsObjectDistributed(&collAddress))
{
ObjectAddress *address = palloc0(sizeof(ObjectAddress));
*address = collAddress;
*objectAddresses = lappend(*objectAddresses, address);
result = lappend(result, collName);
}
}
return result;
}
List *
PreprocessDropCollationStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
/*
* We swap the list of objects to remove during deparse so we need a reference back to
* the old list to put back
*/
List *distributedTypeAddresses = NIL;
if (!ShouldPropagate())
{
return NIL;
}
QualifyTreeNode((Node *) stmt);
List *oldCollations = stmt->objects;
List *distributedCollations =
FilterNameListForDistributedCollations(oldCollations, stmt->missing_ok,
&distributedTypeAddresses);
if (list_length(distributedCollations) <= 0)
{
/* no distributed types to drop */
return NIL;
}
/*
* managing collations can only be done on the coordinator if ddl propagation is on. when
* it is off we will never get here. MX workers don't have a notion of distributed
* collations, so we block the call.
*/
EnsureCoordinator();
/*
* remove the entries for the distributed objects on dropping
*/
ObjectAddress *addressItem = NULL;
foreach_ptr(addressItem, distributedTypeAddresses)
{
UnmarkObjectDistributed(addressItem);
}
/*
* temporary swap the lists of objects to delete with the distributed objects and
* deparse to an executable sql statement for the workers
*/
stmt->objects = distributedCollations;
char *dropStmtSql = DeparseTreeNode((Node *) stmt);
stmt->objects = oldCollations;
EnsureSequentialMode(OBJECT_COLLATION);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) dropStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterCollationOwnerStmt is called for change of ownership of collations
* before the ownership is changed on the local instance.
*
* If the type for which the owner is changed is distributed we execute the change on all
* the workers to keep the type in sync across the cluster.
*/
List *
PreprocessAlterCollationOwnerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_COLLATION);
ObjectAddress collationAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&collationAddress))
{
return NIL;
}
EnsureCoordinator();
QualifyTreeNode((Node *) stmt);
char *sql = DeparseTreeNode((Node *) stmt);
EnsureSequentialMode(OBJECT_COLLATION);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessAlterCollationOwnerStmt is invoked after the owner has been changed locally.
* Since changing the owner could result in new dependencies being found for this object
* we re-ensure all the dependencies for the collation do exist.
*
* This is solely to propagate the new owner (and all its dependencies) if it was not
* already distributed in the cluster.
*/
List *
PostprocessAlterCollationOwnerStmt(Node *node, const char *queryString)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_COLLATION);
ObjectAddress collationAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&collationAddress))
{
return NIL;
}
EnsureDependenciesExistOnAllNodes(&collationAddress);
return NIL;
}
/*
* PreprocessRenameCollationStmt is called when the user is renaming the collation. The invocation happens
* before the statement is applied locally.
*
* As the collation already exists we have access to the ObjectAddress for the collation, this is
* used to check if the collation is distributed. If the collation is distributed the rename is
* executed on all the workers to keep the collation in sync across the cluster.
*/
List *
PreprocessRenameCollationStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
RenameStmt *stmt = castNode(RenameStmt, node);
ObjectAddress collationAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&collationAddress))
{
return NIL;
}
EnsureCoordinator();
/* fully qualify */
QualifyTreeNode((Node *) stmt);
/* deparse sql */
char *renameStmtSql = DeparseTreeNode((Node *) stmt);
EnsureSequentialMode(OBJECT_COLLATION);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) renameStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterCollationSchemaStmt is executed before the statement is applied to the local
* postgres instance.
*
* In this stage we can prepare the commands that need to be run on all workers.
*/
List *
PreprocessAlterCollationSchemaStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_COLLATION);
ObjectAddress collationAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&collationAddress))
{
return NIL;
}
EnsureCoordinator();
QualifyTreeNode((Node *) stmt);
char *sql = DeparseTreeNode((Node *) stmt);
EnsureSequentialMode(OBJECT_COLLATION);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessAlterCollationSchemaStmt is executed after the change has been applied locally, we
* can now use the new dependencies of the type to ensure all its dependencies exist on
* the workers before we apply the commands remotely.
*/
List *
PostprocessAlterCollationSchemaStmt(Node *node, const char *queryString)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_COLLATION);
ObjectAddress collationAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&collationAddress))
{
return NIL;
}
/* dependencies have changed (schema) let's ensure they exist */
EnsureDependenciesExistOnAllNodes(&collationAddress);
return NIL;
}
/*
* RenameCollationStmtObjectAddress returns the ObjectAddress of the type that is the object
* of the RenameStmt. Errors if missing_ok is false.
@ -544,89 +280,3 @@ DefineCollationStmtObjectAddress(Node *node, bool missing_ok)
return address;
}
/*
* PreprocessDefineCollationStmt is executed before the collation has been
* created locally to ensure that if the collation create statement will
* be propagated, the node is a coordinator node.
*/
List *
PreprocessDefineCollationStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
Assert(castNode(DefineStmt, node)->kind == OBJECT_COLLATION);
if (!ShouldPropagateDefineCollationStmt())
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_COLLATION);
return NIL;
}
/*
* PostprocessDefineCollationStmt is executed after the collation has been
* created locally and before we create it on the worker nodes.
* As we now have access to ObjectAddress of the collation that is just
* created, we can mark it as distributed to make sure that its
* dependencies exist on all nodes.
*/
List *
PostprocessDefineCollationStmt(Node *node, const char *queryString)
{
Assert(castNode(DefineStmt, node)->kind == OBJECT_COLLATION);
if (!ShouldPropagateDefineCollationStmt())
{
return NIL;
}
ObjectAddress collationAddress =
DefineCollationStmtObjectAddress(node, false);
DeferredErrorMessage *errMsg = DeferErrorIfHasUnsupportedDependency(
&collationAddress);
if (errMsg != NULL)
{
RaiseDeferredError(errMsg, WARNING);
return NIL;
}
EnsureDependenciesExistOnAllNodes(&collationAddress);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make1(DISABLE_DDL_PROPAGATION);
commands = list_concat(commands, CreateCollationDDLsIdempotent(
collationAddress.objectId));
commands = lappend(commands, ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* ShouldPropagateDefineCollationStmt checks if a collation define
* statement should be propagated. Don't propagate if:
* - metadata syncing is off
* - the create statement should not be propagated according to the ddl propagation policy
*/
static bool
ShouldPropagateDefineCollationStmt()
{
if (!ShouldPropagate())
{
return false;
}
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return false;
}
return true;
}

View File

@ -0,0 +1,274 @@
/*-------------------------------------------------------------------------
*
* common.c
*
* Most of the object propagation code consists of the same operations,
* varying slightly in the parameters passed around. This file contains
* most of the reusable logic in object propagation.
*
* Copyright (c) Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "catalog/objectaddress.h"
#include "nodes/parsenodes.h"
#include "tcop/utility.h"
#include "distributed/commands.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "distributed/metadata_sync.h"
#include "distributed/metadata/dependency.h"
#include "distributed/metadata/distobject.h"
#include "distributed/multi_executor.h"
#include "distributed/worker_transaction.h"
/*
* PostprocessCreateDistributedObjectFromCatalogStmt is a common function that can be used
* for most objects during their creation phase. After the creation has happened locally
* this function creates idempotent statements to recreate the object addressed by the
* ObjectAddress resolved from the creation statement.
*
* Since objects already need to be able to create idempotent creation sql to support
* scaleout operations, we can reuse this logic during the initial creation of the objects
* to reduce the complexity of implementing new DDL commands.
*/
List *
PostprocessCreateDistributedObjectFromCatalogStmt(Node *stmt, const char *queryString)
{
const DistributeObjectOps *ops = GetDistributeObjectOps(stmt);
Assert(ops != NULL);
if (!ShouldPropagate())
{
return NIL;
}
/* check creation against multi-statement transaction policy */
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return NIL;
}
if (ops->featureFlag && *ops->featureFlag == false)
{
/* not propagating when a configured feature flag is turned off by the user */
return NIL;
}
ObjectAddress address = GetObjectAddressFromParseTree(stmt, false);
EnsureCoordinator();
EnsureSequentialMode(ops->objectType);
/* If the object has any unsupported dependency warn, and only create locally */
DeferredErrorMessage *depError = DeferErrorIfHasUnsupportedDependency(&address);
if (depError != NULL)
{
RaiseDeferredError(depError, WARNING);
return NIL;
}
EnsureDependenciesExistOnAllNodes(&address);
List *commands = GetDependencyCreateDDLCommands(&address);
commands = lcons(DISABLE_DDL_PROPAGATION, commands);
commands = lappend(commands, ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterDistributedObjectStmt handles any updates to distributed objects by
* creating the fully qualified sql to apply to all workers after checking all
* preconditions that apply to propagating changes.
*
* Preconditions are (in order):
* - not in a CREATE/ALTER EXTENSION code block
* - citus.enable_metadata_sync is turned on
* - object being altered is distributed
* - any object specific feature flag is turned on when a feature flag is available
*
* Once we conclude to propagate the changes to the workers we make sure that the command
* has been executed on the coordinator and force any ongoing transaction to run in
* sequential mode. If any of these steps fail we raise an error to inform the user.
*
* Lastly we recreate a fully qualified version of the original sql and prepare the tasks
* to send these sql commands to the workers. These tasks include instructions to prevent
* recursion of propagation with Citus' MX functionality.
*/
List *
PreprocessAlterDistributedObjectStmt(Node *stmt, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
const DistributeObjectOps *ops = GetDistributeObjectOps(stmt);
Assert(ops != NULL);
ObjectAddress address = GetObjectAddressFromParseTree(stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
if (ops->featureFlag && *ops->featureFlag == false)
{
/* not propagating when a configured feature flag is turned off by the user */
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(ops->objectType);
QualifyTreeNode(stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessAlterDistributedObjectStmt is the counter part of
* PreprocessAlterDistributedObjectStmt that should be executed after the object has been
* changed locally.
*
* We perform the same precondition checks as before to skip this operation if any of
* them failed during preprocessing. Since we already raised an error on other checks we don't
* have to repeat them here, as they will never fail during postprocessing.
*
* When objects get altered they can start depending on undistributed objects. Now that
* the object has been changed locally we can find these new dependencies and make sure
* they get created on the workers before we send the command list to the workers.
*/
List *
PostprocessAlterDistributedObjectStmt(Node *stmt, const char *queryString)
{
const DistributeObjectOps *ops = GetDistributeObjectOps(stmt);
Assert(ops != NULL);
ObjectAddress address = GetObjectAddressFromParseTree(stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
if (ops->featureFlag && *ops->featureFlag == false)
{
/* not propagating when a configured feature flag is turned off by the user */
return NIL;
}
EnsureDependenciesExistOnAllNodes(&address);
return NIL;
}
/*
* PreprocessDropDistributedObjectStmt is a general purpose hook that can propagate any
* DROP statement.
*
* DROP statements are one of the few DDL statements that can work on many different
* objects at once. Instead of resolving just one ObjectAddress and checking it is
* distributed, we will need to look up many different object addresses. Only if an object
* was _not_ distributed will we need to remove it from the list of objects before we
* recreate the sql statement.
*
* Given that we actually _do_ need to drop them locally we can't simply remove them from
* the object list. Instead we create a new list to which we only add distributed objects.
* Before we recreate the sql statement we put this list on the drop statement, so that
* the SQL created will only contain the objects that are actually distributed in the
* cluster. After we have the SQL we restore the old list so that all objects get deleted
* locally.
*
* The reason we need to go through all this effort is that we can't resolve the object
* addresses anymore after the objects have been removed locally. Meaning during the
* postprocessing we cannot understand which objects were distributed to begin with.
*/
List *
PreprocessDropDistributedObjectStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
/*
* We swap the list of objects to remove during deparse so we need a reference back to
* the old list to put back
*/
List *originalObjects = stmt->objects;
if (!ShouldPropagate())
{
return NIL;
}
QualifyTreeNode(node);
List *distributedObjects = NIL;
List *distributedObjectAddresses = NIL;
Node *object = NULL;
foreach_ptr(object, stmt->objects)
{
/* TODO understand if the lock should be sth else */
Relation rel = NULL; /* not used, but required to pass to get_object_address */
ObjectAddress address = get_object_address(stmt->removeType, object, &rel,
AccessShareLock, stmt->missing_ok);
if (IsObjectDistributed(&address))
{
ObjectAddress *addressPtr = palloc0(sizeof(ObjectAddress));
*addressPtr = address;
distributedObjects = lappend(distributedObjects, object);
distributedObjectAddresses = lappend(distributedObjectAddresses, addressPtr);
}
}
if (list_length(distributedObjects) <= 0)
{
/* no distributed objects to drop */
return NIL;
}
/*
* managing objects can only be done on the coordinator if ddl propagation is on. when
* it is off we will never get here. MX workers don't have a notion of distributed
* types, so we block the call.
*/
EnsureCoordinator();
/*
* remove the entries for the distributed objects on dropping
*/
ObjectAddress *address = NULL;
foreach_ptr(address, distributedObjectAddresses)
{
UnmarkObjectDistributed(address);
}
/*
* temporary swap the lists of objects to delete with the distributed objects and
* deparse to an executable sql statement for the workers
*/
stmt->objects = distributedObjects;
char *dropStmtSql = DeparseTreeNode((Node *) stmt);
stmt->objects = originalObjects;
EnsureSequentialMode(stmt->removeType);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
dropStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
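The list swap matters for mixed DROPs. Assuming one distributed and one local-only collation, only the distributed one appears in the command deparsed for the workers, while both are dropped locally:

DROP COLLATION IF EXISTS german_phonebook, scratch_collation;
-- workers receive only: DROP COLLATION IF EXISTS public.german_phonebook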

View File

@ -59,7 +59,9 @@
#include "distributed/reference_table_utils.h"
#include "distributed/relation_access_tracking.h"
#include "distributed/remote_commands.h"
#include "distributed/resource_lock.h"
#include "distributed/shared_library_init.h"
#include "distributed/shard_rebalancer.h"
#include "distributed/worker_protocol.h"
#include "distributed/worker_shard_visibility.h"
#include "distributed/worker_transaction.h"
@ -116,8 +118,7 @@ static bool ShouldLocalTableBeEmpty(Oid relationId, char distributionMethod, boo
viaDeprecatedAPI);
static void EnsureCitusTableCanBeCreated(Oid relationOid);
static void EnsureDistributedSequencesHaveOneType(Oid relationId,
List *dependentSequenceList,
List *attnumList);
List *seqInfoList);
static List * GetFKeyCreationCommandsRelationInvolvedWithTableType(Oid relationId,
int tableTypeFlag);
static Oid DropFKeysAndUndistributeTable(Oid relationId);
@ -471,9 +472,22 @@ CreateDistributedTable(Oid relationId, char *distributionColumnName,
/*
* Make sure that existing reference tables have been replicated to all the nodes
* such that we can create foreign keys, and joins work immediately after creation.
*
* This will take a lock on the nodes to make sure no nodes are added after we have
* verified and ensured the reference tables are copied everywhere.
* Although copying reference tables here is only strictly needed when creating a new
* colocation group, avoiding it in the other cases would require significant
* refactoring which we don't want to perform now.
*/
EnsureReferenceTablesExistOnAllNodes();
/*
* While adding tables to a colocation group we need to make sure no concurrent
* mutations happen on the colocation group with regards to its placements. It is
* important that we have already copied any reference tables before acquiring this
* lock as these are competing operations.
*/
LockColocationId(colocationId, ShareLock);
/* we need to calculate these variables before creating distributed metadata */
bool localTableEmpty = TableEmpty(relationId);
Oid colocatedTableId = ColocatedTableId(colocationId);
@ -588,15 +602,24 @@ EnsureSequenceTypeSupported(Oid seqOid, Oid attributeTypeId, Oid ownerRelationId
Oid citusTableId = InvalidOid;
foreach_oid(citusTableId, citusTableIdList)
{
List *attnumList = NIL;
List *dependentSequenceList = NIL;
GetDependentSequencesWithRelation(citusTableId, &attnumList,
&dependentSequenceList, 0);
AttrNumber currentAttnum = InvalidAttrNumber;
Oid currentSeqOid = InvalidOid;
forboth_int_oid(currentAttnum, attnumList, currentSeqOid,
dependentSequenceList)
List *seqInfoList = NIL;
GetDependentSequencesWithRelation(citusTableId, &seqInfoList, 0);
SequenceInfo *seqInfo = NULL;
foreach_ptr(seqInfo, seqInfoList)
{
AttrNumber currentAttnum = seqInfo->attributeNumber;
Oid currentSeqOid = seqInfo->sequenceOid;
if (!seqInfo->isNextValDefault)
{
/*
* If a sequence is not used in a nextval() default, we don't need any check.
* This is a dependent sequence via ALTER SEQUENCE .. OWNED BY col.
*/
continue;
}
/*
* If another distributed table is using the same sequence
* in one of its column defaults, make sure the types of the
@ -655,11 +678,10 @@ AlterSequenceType(Oid seqOid, Oid typeOid)
void
EnsureRelationHasCompatibleSequenceTypes(Oid relationId)
{
List *attnumList = NIL;
List *dependentSequenceList = NIL;
List *seqInfoList = NIL;
GetDependentSequencesWithRelation(relationId, &attnumList, &dependentSequenceList, 0);
EnsureDistributedSequencesHaveOneType(relationId, dependentSequenceList, attnumList);
GetDependentSequencesWithRelation(relationId, &seqInfoList, 0);
EnsureDistributedSequencesHaveOneType(relationId, seqInfoList);
}
@ -669,17 +691,26 @@ EnsureRelationHasCompatibleSequenceTypes(Oid relationId)
* dependentSequenceList, and then alters the sequence type if it is not the same as the column type.
*/
static void
EnsureDistributedSequencesHaveOneType(Oid relationId, List *dependentSequenceList,
List *attnumList)
EnsureDistributedSequencesHaveOneType(Oid relationId, List *seqInfoList)
{
AttrNumber attnum = InvalidAttrNumber;
Oid sequenceOid = InvalidOid;
forboth_int_oid(attnum, attnumList, sequenceOid, dependentSequenceList)
SequenceInfo *seqInfo = NULL;
foreach_ptr(seqInfo, seqInfoList)
{
if (!seqInfo->isNextValDefault)
{
/*
* If a sequence is not used in a nextval() default, we don't need any check.
* This is a dependent sequence via ALTER SEQUENCE .. OWNED BY col
*/
continue;
}
/*
* We should make sure that the type of the column that uses
* that sequence is supported
*/
Oid sequenceOid = seqInfo->sequenceOid;
AttrNumber attnum = seqInfo->attributeNumber;
Oid attributeTypeId = GetAttributeTypeOid(relationId, attnum);
EnsureSequenceTypeSupported(sequenceOid, attributeTypeId, relationId);
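Both loops above iterate over SequenceInfo elements; judging from the fields they access (sequenceOid, attributeNumber, isNextValDefault), the struct introduced by this change presumably looks roughly like the sketch below. The exact layout and comments are assumptions.

typedef struct SequenceInfo
{
    Oid sequenceOid;            /* sequence the relation depends on */
    AttrNumber attributeNumber; /* column associated with the sequence */

    /*
     * true when the column's DEFAULT is nextval('sequence'); false when the
     * sequence is merely attached via ALTER SEQUENCE .. OWNED BY col
     */
    bool isNextValDefault;
} SequenceInfo;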
@ -851,6 +882,17 @@ CreateHashDistributedTableShards(Oid relationId, int shardCount,
if (colocatedTableId != InvalidOid)
{
/*
* We currently allow concurrent distribution of colocated tables (which
* we probably should not be allowing because of foreign keys /
* partitioning etc).
*
* We also prevent concurrent shard moves / copies / splits while creating
* a colocated table.
*/
AcquirePlacementColocationLock(colocatedTableId, ShareLock,
"colocate distributed table");
CreateColocatedShards(relationId, colocatedTableId, useExclusiveConnection);
}
else


@ -33,76 +33,7 @@ static AlterOwnerStmt * RecreateAlterDatabaseOwnerStmt(Oid databaseOid);
static Oid get_database_owner(Oid db_oid);
/* controlled via GUC */
bool EnableAlterDatabaseOwner = false;
/*
* PreprocessAlterDatabaseOwnerStmt is called during the utility hook before the alter
* command is applied locally on the coordinator. This will verify if the command needs to
* be propagated to the workers and if so prepares a list of ddl commands to execute.
*/
List *
PreprocessAlterDatabaseOwnerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_DATABASE);
ObjectAddress typeAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&typeAddress))
{
return NIL;
}
if (!EnableAlterDatabaseOwner)
{
/* don't propagate if GUC is turned off */
return NIL;
}
EnsureCoordinator();
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
EnsureSequentialMode(OBJECT_DATABASE);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessAlterDatabaseOwnerStmt is called during the utility hook after the alter
* database command has been applied locally.
*
* Its main purpose is to propagate the newly formed dependencies onto the nodes before
* applying the change of owner of the database. This ensures, for systems that have role
* management, that the roles will be created before applying the alter owner command.
*/
List *
PostprocessAlterDatabaseOwnerStmt(Node *node, const char *queryString)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_DATABASE);
ObjectAddress typeAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&typeAddress))
{
return NIL;
}
if (!EnableAlterDatabaseOwner)
{
/* don't propagate if GUC is turned off */
return NIL;
}
EnsureDependenciesExistOnAllNodes(&typeAddress);
return NIL;
}
bool EnableAlterDatabaseOwner = true;
/*


@ -34,7 +34,6 @@ typedef bool (*AddressPredicate)(const ObjectAddress *);
static void EnsureDependenciesCanBeDistributed(const ObjectAddress *relationAddress);
static void ErrorIfCircularDependencyExists(const ObjectAddress *objectAddress);
static int ObjectAddressComparator(const void *a, const void *b);
static List * GetDependencyCreateDDLCommands(const ObjectAddress *dependency);
static List * FilterObjectAddressListByPredicate(List *objectAddressList,
AddressPredicate predicate);
@ -166,13 +165,30 @@ EnsureDependenciesCanBeDistributed(const ObjectAddress *objectAddress)
/*
* ErrorIfCircularDependencyExists checks whether the given object has a circular
* dependency with itself via existing objects of pg_dist_object.
* ErrorIfCircularDependencyExists is a wrapper around
* DeferErrorIfCircularDependencyExists(), and throws an error
* if a circular dependency exists.
*/
static void
ErrorIfCircularDependencyExists(const ObjectAddress *objectAddress)
{
List *dependencies = GetAllSupportedDependenciesForObject(objectAddress);
DeferredErrorMessage *depError =
DeferErrorIfCircularDependencyExists(objectAddress);
if (depError != NULL)
{
RaiseDeferredError(depError, ERROR);
}
}
/*
* DeferErrorIfCircularDependencyExists checks whether the given object has a
* circular dependency with itself. If so, it returns a deferred error.
*/
DeferredErrorMessage *
DeferErrorIfCircularDependencyExists(const ObjectAddress *objectAddress)
{
List *dependencies = GetAllDependenciesForObject(objectAddress);
ObjectAddress *dependency = NULL;
foreach_ptr(dependency, dependencies)
@ -189,13 +205,18 @@ ErrorIfCircularDependencyExists(const ObjectAddress *objectAddress)
objectDescription = getObjectDescription(objectAddress);
#endif
ereport(ERROR, (errmsg("Citus can not handle circular dependencies "
"between distributed objects"),
errdetail("\"%s\" circularly depends itself, resolve "
"circular dependency first",
objectDescription)));
StringInfo detailInfo = makeStringInfo();
appendStringInfo(detailInfo, "\"%s\" circularly depends on itself, resolve "
"circular dependency first", objectDescription);
return DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,
"Citus can not handle circular dependencies "
"between distributed objects", detailInfo->data,
NULL);
}
}
return NULL;
}
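Splitting the check into a deferred-error variant lets callers choose the severity instead of hard-coding ERROR. A minimal sketch of a hypothetical caller that only warns:

DeferredErrorMessage *depError =
    DeferErrorIfCircularDependencyExists(&objectAddress);
if (depError != NULL)
{
    /* a caller may downgrade the severity rather than aborting */
    RaiseDeferredError(depError, WARNING);
}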
@ -289,7 +310,7 @@ GetDistributableDependenciesForObject(const ObjectAddress *target)
* GetDependencyCreateDDLCommands returns a list (potentially empty or NIL) of ddl
* commands to execute on a worker to create the object.
*/
static List *
List *
GetDependencyCreateDDLCommands(const ObjectAddress *dependency)
{
switch (getObjectClass(dependency))
@ -349,6 +370,14 @@ GetDependencyCreateDDLCommands(const ObjectAddress *dependency)
return DDLCommandsForSequence(dependency->objectId, sequenceOwnerName);
}
if (relKind == RELKIND_VIEW)
{
char *createViewCommand = CreateViewDDLCommand(dependency->objectId);
char *alterViewOwnerCommand = AlterViewOwnerCommand(dependency->objectId);
return list_make2(createViewCommand, alterViewOwnerCommand);
}
/* if this relation is not supported, break to the error at the end */
break;
}
@ -358,6 +387,15 @@ GetDependencyCreateDDLCommands(const ObjectAddress *dependency)
return CreateCollationDDLsIdempotent(dependency->objectId);
}
case OCLASS_CONSTRAINT:
{
/*
* Constraints can only be reached via domains, where they resolve functions.
* The constraints themselves are recreated by the domain recreation.
*/
return NIL;
}
case OCLASS_DATABASE:
{
List *databaseDDLCommands = NIL;
@ -374,7 +412,10 @@ GetDependencyCreateDDLCommands(const ObjectAddress *dependency)
case OCLASS_PROC:
{
return CreateFunctionDDLCommandsIdempotent(dependency);
List *DDLCommands = CreateFunctionDDLCommandsIdempotent(dependency);
List *grantDDLCommands = GrantOnFunctionDDLCommands(dependency->objectId);
DDLCommands = list_concat(DDLCommands, grantDDLCommands);
return DDLCommands;
}
case OCLASS_ROLE:
@ -417,7 +458,13 @@ GetDependencyCreateDDLCommands(const ObjectAddress *dependency)
case OCLASS_FOREIGN_SERVER:
{
return GetForeignServerCreateDDLCommand(dependency->objectId);
Oid serverId = dependency->objectId;
List *DDLCommands = GetForeignServerCreateDDLCommand(serverId);
List *grantDDLCommands = GrantOnForeignServerDDLCommands(serverId);
DDLCommands = list_concat(DDLCommands, grantDDLCommands);
return DDLCommands;
}
default:


@ -16,6 +16,7 @@
#include "distributed/deparser.h"
#include "distributed/pg_version_constants.h"
#include "distributed/version_compat.h"
#include "distributed/commands/utility_hook.h"
static DistributeObjectOps NoDistributeOps = {
.deparse = NULL,
@ -28,31 +29,34 @@ static DistributeObjectOps NoDistributeOps = {
static DistributeObjectOps Aggregate_AlterObjectSchema = {
.deparse = DeparseAlterFunctionSchemaStmt,
.qualify = QualifyAlterFunctionSchemaStmt,
.preprocess = PreprocessAlterFunctionSchemaStmt,
.postprocess = PostprocessAlterFunctionSchemaStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_FUNCTION,
.address = AlterFunctionSchemaStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Aggregate_AlterOwner = {
.deparse = DeparseAlterFunctionOwnerStmt,
.qualify = QualifyAlterFunctionOwnerStmt,
.preprocess = PreprocessAlterFunctionOwnerStmt,
.postprocess = PostprocessAlterFunctionOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_FUNCTION,
.address = AlterFunctionOwnerObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Aggregate_Define = {
.deparse = NULL,
.qualify = QualifyDefineAggregateStmt,
.preprocess = PreprocessDefineAggregateStmt,
.postprocess = PostprocessDefineAggregateStmt,
.preprocess = NULL,
.postprocess = PostprocessCreateDistributedObjectFromCatalogStmt,
.objectType = OBJECT_AGGREGATE,
.address = DefineAggregateStmtObjectAddress,
.markDistributed = true,
};
static DistributeObjectOps Aggregate_Drop = {
.deparse = DeparseDropFunctionStmt,
.qualify = NULL,
.preprocess = PreprocessDropFunctionStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
@ -60,16 +64,18 @@ static DistributeObjectOps Aggregate_Drop = {
static DistributeObjectOps Aggregate_Rename = {
.deparse = DeparseRenameFunctionStmt,
.qualify = QualifyRenameFunctionStmt,
.preprocess = PreprocessRenameFunctionStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_FUNCTION,
.address = RenameFunctionStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Any_AlterEnum = {
.deparse = DeparseAlterEnumStmt,
.qualify = QualifyAlterEnumStmt,
.preprocess = PreprocessAlterEnumStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_TYPE,
.address = AlterEnumStmtObjectAddress,
.markDistributed = false,
};
@ -92,9 +98,10 @@ static DistributeObjectOps Any_AlterExtensionContents = {
static DistributeObjectOps Any_AlterForeignServer = {
.deparse = DeparseAlterForeignServerStmt,
.qualify = NULL,
.preprocess = PreprocessAlterForeignServerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.objectType = OBJECT_FOREIGN_SERVER,
.address = AlterForeignServerStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Any_AlterFunction = {
@ -148,16 +155,29 @@ static DistributeObjectOps Any_Cluster = {
static DistributeObjectOps Any_CompositeType = {
.deparse = DeparseCompositeTypeStmt,
.qualify = QualifyCompositeTypeStmt,
.preprocess = PreprocessCompositeTypeStmt,
.postprocess = PostprocessCompositeTypeStmt,
.preprocess = NULL,
.postprocess = PostprocessCreateDistributedObjectFromCatalogStmt,
.objectType = OBJECT_TYPE,
.featureFlag = &EnableCreateTypePropagation,
.address = CompositeTypeStmtObjectAddress,
.markDistributed = true,
};
static DistributeObjectOps Any_CreateDomain = {
.deparse = DeparseCreateDomainStmt,
.qualify = QualifyCreateDomainStmt,
.preprocess = NULL,
.postprocess = PostprocessCreateDistributedObjectFromCatalogStmt,
.objectType = OBJECT_DOMAIN,
.address = CreateDomainStmtObjectAddress,
.markDistributed = true,
};
static DistributeObjectOps Any_CreateEnum = {
.deparse = DeparseCreateEnumStmt,
.qualify = QualifyCreateEnumStmt,
.preprocess = PreprocessCreateEnumStmt,
.postprocess = PostprocessCreateEnumStmt,
.preprocess = NULL,
.postprocess = PostprocessCreateDistributedObjectFromCatalogStmt,
.objectType = OBJECT_TYPE,
.featureFlag = &EnableCreateTypePropagation,
.address = CreateEnumStmtObjectAddress,
.markDistributed = true,
};
@ -177,10 +197,34 @@ static DistributeObjectOps Any_CreateFunction = {
.address = CreateFunctionStmtObjectAddress,
.markDistributed = true,
};
static DistributeObjectOps Any_View = {
.deparse = NULL,
.qualify = NULL,
.preprocess = PreprocessViewStmt,
.postprocess = PostprocessViewStmt,
.address = ViewStmtObjectAddress,
.markDistributed = true,
};
static DistributeObjectOps Any_CreatePolicy = {
.deparse = NULL,
.qualify = NULL,
.preprocess = PreprocessCreatePolicyStmt,
.preprocess = NULL,
.postprocess = PostprocessCreatePolicyStmt,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Any_CreateRole = {
.deparse = DeparseCreateRoleStmt,
.qualify = NULL,
.preprocess = PreprocessCreateRoleStmt,
.postprocess = NULL,
.address = CreateRoleStmtObjectAddress,
.markDistributed = true,
};
static DistributeObjectOps Any_DropRole = {
.deparse = DeparseDropRoleStmt,
.qualify = NULL,
.preprocess = PreprocessDropRoleStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
@ -188,8 +232,9 @@ static DistributeObjectOps Any_CreatePolicy = {
static DistributeObjectOps Any_CreateForeignServer = {
.deparse = DeparseCreateForeignServerStmt,
.qualify = NULL,
.preprocess = PreprocessCreateForeignServerStmt,
.postprocess = PostprocessCreateForeignServerStmt,
.preprocess = NULL,
.postprocess = PostprocessCreateDistributedObjectFromCatalogStmt,
.objectType = OBJECT_FOREIGN_SERVER,
.address = CreateForeignServerStmtObjectAddress,
.markDistributed = true,
};
@ -225,6 +270,14 @@ static DistributeObjectOps Any_Grant = {
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Any_GrantRole = {
.deparse = DeparseGrantRoleStmt,
.qualify = NULL,
.preprocess = PreprocessGrantRoleStmt,
.postprocess = PostprocessGrantRoleStmt,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Any_Index = {
.deparse = NULL,
.qualify = NULL,
@ -260,31 +313,34 @@ static DistributeObjectOps Attribute_Rename = {
static DistributeObjectOps Collation_AlterObjectSchema = {
.deparse = DeparseAlterCollationSchemaStmt,
.qualify = QualifyAlterCollationSchemaStmt,
.preprocess = PreprocessAlterCollationSchemaStmt,
.postprocess = PostprocessAlterCollationSchemaStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_COLLATION,
.address = AlterCollationSchemaStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Collation_AlterOwner = {
.deparse = DeparseAlterCollationOwnerStmt,
.qualify = QualifyAlterCollationOwnerStmt,
.preprocess = PreprocessAlterCollationOwnerStmt,
.postprocess = PostprocessAlterCollationOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_COLLATION,
.address = AlterCollationOwnerObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Collation_Define = {
.deparse = NULL,
.qualify = NULL,
.preprocess = PreprocessDefineCollationStmt,
.postprocess = PostprocessDefineCollationStmt,
.preprocess = NULL,
.postprocess = PostprocessCreateDistributedObjectFromCatalogStmt,
.objectType = OBJECT_COLLATION,
.address = DefineCollationStmtObjectAddress,
.markDistributed = true,
};
static DistributeObjectOps Collation_Drop = {
.deparse = DeparseDropCollationStmt,
.qualify = QualifyDropCollationStmt,
.preprocess = PreprocessDropCollationStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
@ -292,19 +348,76 @@ static DistributeObjectOps Collation_Drop = {
static DistributeObjectOps Collation_Rename = {
.deparse = DeparseRenameCollationStmt,
.qualify = QualifyRenameCollationStmt,
.preprocess = PreprocessRenameCollationStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_COLLATION,
.address = RenameCollationStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Database_AlterOwner = {
.deparse = DeparseAlterDatabaseOwnerStmt,
.qualify = NULL,
.preprocess = PreprocessAlterDatabaseOwnerStmt,
.postprocess = PostprocessAlterDatabaseOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_DATABASE,
.featureFlag = &EnableAlterDatabaseOwner,
.address = AlterDatabaseOwnerObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Domain_Alter = {
.deparse = DeparseAlterDomainStmt,
.qualify = QualifyAlterDomainStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_DOMAIN,
.address = AlterDomainStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Domain_AlterObjectSchema = {
.deparse = DeparseAlterDomainSchemaStmt,
.qualify = QualifyAlterDomainSchemaStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_DOMAIN,
.address = AlterTypeSchemaStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Domain_AlterOwner = {
.deparse = DeparseAlterDomainOwnerStmt,
.qualify = QualifyAlterDomainOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_DOMAIN,
.address = AlterDomainOwnerStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Domain_Drop = {
.deparse = DeparseDropDomainStmt,
.qualify = QualifyDropDomainStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Domain_Rename = {
.deparse = DeparseRenameDomainStmt,
.qualify = QualifyRenameDomainStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_DOMAIN,
.address = RenameDomainStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Domain_RenameConstraint = {
.deparse = DeparseDomainRenameConstraintStmt,
.qualify = QualifyDomainRenameConstraintStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_DOMAIN,
.address = DomainRenameConstraintStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Extension_AlterObjectSchema = {
.deparse = DeparseAlterExtensionSchemaStmt,
.qualify = NULL,
@ -321,10 +434,26 @@ static DistributeObjectOps Extension_Drop = {
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps FDW_Grant = {
.deparse = DeparseGrantOnFDWStmt,
.qualify = NULL,
.preprocess = PreprocessGrantOnFDWStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps ForeignServer_Drop = {
.deparse = DeparseDropForeignServerStmt,
.qualify = NULL,
.preprocess = PreprocessDropForeignServerStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps ForeignServer_Grant = {
.deparse = DeparseGrantOnForeignServerStmt,
.qualify = NULL,
.preprocess = PreprocessGrantOnForeignServerStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
@ -332,16 +461,18 @@ static DistributeObjectOps ForeignServer_Drop = {
static DistributeObjectOps ForeignServer_Rename = {
.deparse = DeparseAlterForeignServerRenameStmt,
.qualify = NULL,
.preprocess = PreprocessRenameForeignServerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.objectType = OBJECT_FOREIGN_SERVER,
.address = RenameForeignServerStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps ForeignServer_AlterOwner = {
.deparse = DeparseAlterForeignServerOwnerStmt,
.qualify = NULL,
.preprocess = PreprocessAlterForeignServerOwnerStmt,
.postprocess = PostprocessAlterForeignServerOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_FOREIGN_SERVER,
.address = AlterForeignServerOwnerStmtObjectAddress,
.markDistributed = false,
};
@ -364,23 +495,41 @@ static DistributeObjectOps Function_AlterObjectDepends = {
static DistributeObjectOps Function_AlterObjectSchema = {
.deparse = DeparseAlterFunctionSchemaStmt,
.qualify = QualifyAlterFunctionSchemaStmt,
.preprocess = PreprocessAlterFunctionSchemaStmt,
.postprocess = PostprocessAlterFunctionSchemaStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_FUNCTION,
.address = AlterFunctionSchemaStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Function_AlterOwner = {
.deparse = DeparseAlterFunctionOwnerStmt,
.qualify = QualifyAlterFunctionOwnerStmt,
.preprocess = PreprocessAlterFunctionOwnerStmt,
.postprocess = PostprocessAlterFunctionOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_FUNCTION,
.address = AlterFunctionOwnerObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Function_Drop = {
.deparse = DeparseDropFunctionStmt,
.qualify = NULL,
.preprocess = PreprocessDropFunctionStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Function_Grant = {
.deparse = DeparseGrantOnFunctionStmt,
.qualify = NULL,
.preprocess = PreprocessGrantOnFunctionStmt,
.postprocess = PostprocessGrantOnFunctionStmt,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps View_Drop = {
.deparse = DeparseDropViewStmt,
.qualify = QualifyDropViewStmt,
.preprocess = PreprocessDropViewStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
@ -388,8 +537,9 @@ static DistributeObjectOps Function_Drop = {
static DistributeObjectOps Function_Rename = {
.deparse = DeparseRenameFunctionStmt,
.qualify = QualifyRenameFunctionStmt,
.preprocess = PreprocessRenameFunctionStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_FUNCTION,
.address = RenameFunctionStmtObjectAddress,
.markDistributed = false,
};
@ -428,32 +578,43 @@ static DistributeObjectOps Procedure_AlterObjectDepends = {
static DistributeObjectOps Procedure_AlterObjectSchema = {
.deparse = DeparseAlterFunctionSchemaStmt,
.qualify = QualifyAlterFunctionSchemaStmt,
.preprocess = PreprocessAlterFunctionSchemaStmt,
.postprocess = PostprocessAlterFunctionSchemaStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_FUNCTION,
.address = AlterFunctionSchemaStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Procedure_AlterOwner = {
.deparse = DeparseAlterFunctionOwnerStmt,
.qualify = QualifyAlterFunctionOwnerStmt,
.preprocess = PreprocessAlterFunctionOwnerStmt,
.postprocess = PostprocessAlterFunctionOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_FUNCTION,
.address = AlterFunctionOwnerObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Procedure_Drop = {
.deparse = DeparseDropFunctionStmt,
.qualify = NULL,
.preprocess = PreprocessDropFunctionStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Procedure_Grant = {
.deparse = DeparseGrantOnFunctionStmt,
.qualify = NULL,
.preprocess = PreprocessGrantOnFunctionStmt,
.postprocess = PostprocessGrantOnFunctionStmt,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Procedure_Rename = {
.deparse = DeparseRenameFunctionStmt,
.qualify = QualifyRenameFunctionStmt,
.preprocess = PreprocessRenameFunctionStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_FUNCTION,
.address = RenameFunctionStmtObjectAddress,
.markDistributed = false,
};
@ -491,12 +652,20 @@ static DistributeObjectOps Sequence_AlterOwner = {
};
static DistributeObjectOps Sequence_Drop = {
.deparse = DeparseDropSequenceStmt,
.qualify = NULL,
.qualify = QualifyDropSequenceStmt,
.preprocess = PreprocessDropSequenceStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Sequence_Grant = {
.deparse = DeparseGrantOnSequenceStmt,
.qualify = QualifyGrantOnSequenceStmt,
.preprocess = PreprocessGrantOnSequenceStmt,
.postprocess = PostprocessGrantOnSequenceStmt,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Sequence_Rename = {
.deparse = DeparseRenameSequenceStmt,
.qualify = QualifyRenameSequenceStmt,
@ -508,32 +677,36 @@ static DistributeObjectOps Sequence_Rename = {
static DistributeObjectOps TextSearchConfig_Alter = {
.deparse = DeparseAlterTextSearchConfigurationStmt,
.qualify = QualifyAlterTextSearchConfigurationStmt,
.preprocess = PreprocessAlterTextSearchConfigurationStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_TSCONFIGURATION,
.address = AlterTextSearchConfigurationStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps TextSearchConfig_AlterObjectSchema = {
.deparse = DeparseAlterTextSearchConfigurationSchemaStmt,
.qualify = QualifyAlterTextSearchConfigurationSchemaStmt,
.preprocess = PreprocessAlterTextSearchConfigurationSchemaStmt,
.postprocess = PostprocessAlterTextSearchConfigurationSchemaStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_TSCONFIGURATION,
.address = AlterTextSearchConfigurationSchemaStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps TextSearchConfig_AlterOwner = {
.deparse = DeparseAlterTextSearchConfigurationOwnerStmt,
.qualify = QualifyAlterTextSearchConfigurationOwnerStmt,
.preprocess = PreprocessAlterTextSearchConfigurationOwnerStmt,
.postprocess = PostprocessAlterTextSearchConfigurationOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_TSCONFIGURATION,
.address = AlterTextSearchConfigurationOwnerObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps TextSearchConfig_Comment = {
.deparse = DeparseTextSearchConfigurationCommentStmt,
.qualify = QualifyTextSearchConfigurationCommentStmt,
.preprocess = PreprocessTextSearchConfigurationCommentStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_TSCONFIGURATION,
.address = TextSearchConfigurationCommentObjectAddress,
.markDistributed = false,
};
@ -541,14 +714,15 @@ static DistributeObjectOps TextSearchConfig_Define = {
.deparse = DeparseCreateTextSearchConfigurationStmt,
.qualify = NULL,
.preprocess = NULL,
.postprocess = PostprocessCreateTextSearchConfigurationStmt,
.postprocess = PostprocessCreateDistributedObjectFromCatalogStmt,
.objectType = OBJECT_TSCONFIGURATION,
.address = CreateTextSearchConfigurationObjectAddress,
.markDistributed = true,
};
static DistributeObjectOps TextSearchConfig_Drop = {
.deparse = DeparseDropTextSearchConfigurationStmt,
.qualify = QualifyDropTextSearchConfigurationStmt,
.preprocess = PreprocessDropTextSearchConfigurationStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
@ -556,40 +730,45 @@ static DistributeObjectOps TextSearchConfig_Drop = {
static DistributeObjectOps TextSearchConfig_Rename = {
.deparse = DeparseRenameTextSearchConfigurationStmt,
.qualify = QualifyRenameTextSearchConfigurationStmt,
.preprocess = PreprocessRenameTextSearchConfigurationStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_TSCONFIGURATION,
.address = RenameTextSearchConfigurationStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps TextSearchDict_Alter = {
.deparse = DeparseAlterTextSearchDictionaryStmt,
.qualify = QualifyAlterTextSearchDictionaryStmt,
.preprocess = PreprocessAlterTextSearchDictionaryStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_TSDICTIONARY,
.address = AlterTextSearchDictionaryStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps TextSearchDict_AlterObjectSchema = {
.deparse = DeparseAlterTextSearchDictionarySchemaStmt,
.qualify = QualifyAlterTextSearchDictionarySchemaStmt,
.preprocess = PreprocessAlterTextSearchDictionarySchemaStmt,
.postprocess = PostprocessAlterTextSearchDictionarySchemaStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_TSDICTIONARY,
.address = AlterTextSearchDictionarySchemaStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps TextSearchDict_AlterOwner = {
.deparse = DeparseAlterTextSearchDictionaryOwnerStmt,
.qualify = QualifyAlterTextSearchDictionaryOwnerStmt,
.preprocess = PreprocessAlterTextSearchDictionaryOwnerStmt,
.postprocess = PostprocessAlterTextSearchDictionaryOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_TSDICTIONARY,
.address = AlterTextSearchDictOwnerObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps TextSearchDict_Comment = {
.deparse = DeparseTextSearchDictionaryCommentStmt,
.qualify = QualifyTextSearchDictionaryCommentStmt,
.preprocess = PreprocessTextSearchDictionaryCommentStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_TSDICTIONARY,
.address = TextSearchDictCommentObjectAddress,
.markDistributed = false,
};
@ -597,14 +776,15 @@ static DistributeObjectOps TextSearchDict_Define = {
.deparse = DeparseCreateTextSearchDictionaryStmt,
.qualify = NULL,
.preprocess = NULL,
.postprocess = PostprocessCreateTextSearchDictionaryStmt,
.postprocess = PostprocessCreateDistributedObjectFromCatalogStmt,
.objectType = OBJECT_TSDICTIONARY,
.address = CreateTextSearchDictObjectAddress,
.markDistributed = true,
};
static DistributeObjectOps TextSearchDict_Drop = {
.deparse = DeparseDropTextSearchDictionaryStmt,
.qualify = QualifyDropTextSearchDictionaryStmt,
.preprocess = PreprocessDropTextSearchDictionaryStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
@ -612,8 +792,9 @@ static DistributeObjectOps TextSearchDict_Drop = {
static DistributeObjectOps TextSearchDict_Rename = {
.deparse = DeparseRenameTextSearchDictionaryStmt,
.qualify = QualifyRenameTextSearchDictionaryStmt,
.preprocess = PreprocessRenameTextSearchDictionaryStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_TSDICTIONARY,
.address = RenameTextSearchDictionaryStmtObjectAddress,
.markDistributed = false,
};
@ -628,32 +809,43 @@ static DistributeObjectOps Trigger_AlterObjectDepends = {
static DistributeObjectOps Routine_AlterObjectSchema = {
.deparse = DeparseAlterFunctionSchemaStmt,
.qualify = QualifyAlterFunctionSchemaStmt,
.preprocess = PreprocessAlterFunctionSchemaStmt,
.postprocess = PostprocessAlterFunctionSchemaStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_FUNCTION,
.address = AlterFunctionSchemaStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Routine_AlterOwner = {
.deparse = DeparseAlterFunctionOwnerStmt,
.qualify = QualifyAlterFunctionOwnerStmt,
.preprocess = PreprocessAlterFunctionOwnerStmt,
.postprocess = PostprocessAlterFunctionOwnerStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_FUNCTION,
.address = AlterFunctionOwnerObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Routine_Drop = {
.deparse = DeparseDropFunctionStmt,
.qualify = NULL,
.preprocess = PreprocessDropFunctionStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Routine_Grant = {
.deparse = DeparseGrantOnFunctionStmt,
.qualify = NULL,
.preprocess = PreprocessGrantOnFunctionStmt,
.postprocess = PostprocessGrantOnFunctionStmt,
.address = NULL,
.markDistributed = false,
};
static DistributeObjectOps Routine_Rename = {
.deparse = DeparseRenameFunctionStmt,
.qualify = QualifyRenameFunctionStmt,
.preprocess = PreprocessRenameFunctionStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_FUNCTION,
.address = RenameFunctionStmtObjectAddress,
.markDistributed = false,
};
@ -676,8 +868,9 @@ static DistributeObjectOps Schema_Grant = {
static DistributeObjectOps Schema_Rename = {
.deparse = DeparseAlterSchemaRenameStmt,
.qualify = NULL,
.preprocess = PreprocessAlterSchemaRenameStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_SCHEMA,
.address = AlterSchemaRenameStmtObjectAddress,
.markDistributed = false,
};
@ -750,31 +943,66 @@ static DistributeObjectOps Table_Drop = {
static DistributeObjectOps Type_AlterObjectSchema = {
.deparse = DeparseAlterTypeSchemaStmt,
.qualify = QualifyAlterTypeSchemaStmt,
.preprocess = PreprocessAlterTypeSchemaStmt,
.postprocess = PostprocessAlterTypeSchemaStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_TYPE,
.address = AlterTypeSchemaStmtObjectAddress,
.markDistributed = false,
};
/*
* PreprocessAlterViewSchemaStmt and PostprocessAlterViewSchemaStmt functions can be called
* internally by ALTER TABLE view_name SET SCHEMA ... if the ALTER TABLE command targets a
* view. In other words, ALTER VIEW view_name SET SCHEMA uses View_AlterObjectSchema
* directly, while ALTER TABLE view_name SET SCHEMA uses Table_AlterObjectSchema but
* calls the process functions of View_AlterObjectSchema internally.
*/
static DistributeObjectOps View_AlterObjectSchema = {
.deparse = DeparseAlterViewSchemaStmt,
.qualify = QualifyAlterViewSchemaStmt,
.preprocess = PreprocessAlterViewSchemaStmt,
.postprocess = PostprocessAlterViewSchemaStmt,
.address = AlterViewSchemaStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Type_AlterOwner = {
.deparse = DeparseAlterTypeOwnerStmt,
.qualify = QualifyAlterTypeOwnerStmt,
.preprocess = PreprocessAlterTypeOwnerStmt,
.postprocess = NULL,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = PostprocessAlterDistributedObjectStmt,
.objectType = OBJECT_TYPE,
.address = AlterTypeOwnerObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Type_AlterTable = {
.deparse = DeparseAlterTypeStmt,
.qualify = QualifyAlterTypeStmt,
.preprocess = PreprocessAlterTypeStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_TYPE,
.address = AlterTypeStmtObjectAddress,
.markDistributed = false,
};
/*
* PreprocessAlterViewStmt and PostprocessAlterViewStmt functions can be called internally
* by ALTER TABLE view_name SET/RESET ... if the ALTER TABLE command targets a view. In
* other words, ALTER VIEW view_name SET/RESET uses View_AlterView directly, while
* ALTER TABLE view_name SET/RESET uses Table_AlterTable but calls the process
* functions of View_AlterView internally.
*/
static DistributeObjectOps View_AlterView = {
.deparse = DeparseAlterViewStmt,
.qualify = QualifyAlterViewStmt,
.preprocess = PreprocessAlterViewStmt,
.postprocess = PostprocessAlterViewStmt,
.address = AlterViewStmtObjectAddress,
.markDistributed = false,
};
static DistributeObjectOps Type_Drop = {
.deparse = DeparseDropTypeStmt,
.qualify = NULL,
.preprocess = PreprocessDropTypeStmt,
.preprocess = PreprocessDropDistributedObjectStmt,
.postprocess = NULL,
.address = NULL,
.markDistributed = false,
@ -790,11 +1018,27 @@ static DistributeObjectOps Trigger_Drop = {
static DistributeObjectOps Type_Rename = {
.deparse = DeparseRenameTypeStmt,
.qualify = QualifyRenameTypeStmt,
.preprocess = PreprocessRenameTypeStmt,
.preprocess = PreprocessAlterDistributedObjectStmt,
.postprocess = NULL,
.objectType = OBJECT_TYPE,
.address = RenameTypeStmtObjectAddress,
.markDistributed = false,
};
/*
* PreprocessRenameViewStmt function can be called internally by ALTER TABLE view_name
* RENAME ... if the ALTER TABLE command targets a view or a view's column. In other
* words, ALTER VIEW view_name RENAME uses View_Rename directly, while ALTER TABLE
* view_name RENAME uses Any_Rename but calls the process functions of View_Rename
* internally.
*/
static DistributeObjectOps View_Rename = {
.deparse = DeparseRenameViewStmt,
.qualify = QualifyRenameViewStmt,
.preprocess = PreprocessRenameViewStmt,
.postprocess = NULL,
.address = RenameViewStmtObjectAddress,
.markDistributed = false,
};
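Each static struct above is one entry in a dispatch table keyed on the parse-node type. A rough sketch of how the utility hook presumably consumes the selected ops; the variable names and surrounding control flow are assumptions, not the actual hook code:

const DistributeObjectOps *ops = GetDistributeObjectOps(parsetree);
List *ddlJobs = NIL;

if (ops->qualify != NULL)
{
    ops->qualify(parsetree); /* schema-qualify names before deparsing */
}
if (ops->preprocess != NULL)
{
    ddlJobs = ops->preprocess(parsetree, queryString, context);
}

/* ... the statement itself executes locally here ... */

if (ops->postprocess != NULL)
{
    ddlJobs = ops->postprocess(parsetree, queryString);
}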
static DistributeObjectOps Trigger_Rename = {
.deparse = NULL,
.qualify = NULL,
@ -815,6 +1059,11 @@ GetDistributeObjectOps(Node *node)
{
switch (nodeTag(node))
{
case T_AlterDomainStmt:
{
return &Domain_Alter;
}
case T_AlterEnumStmt:
{
return &Any_AlterEnum;
@ -887,6 +1136,11 @@ GetDistributeObjectOps(Node *node)
return &Collation_AlterObjectSchema;
}
case OBJECT_DOMAIN:
{
return &Domain_AlterObjectSchema;
}
case OBJECT_EXTENSION:
{
return &Extension_AlterObjectSchema;
@ -938,6 +1192,11 @@ GetDistributeObjectOps(Node *node)
return &Type_AlterObjectSchema;
}
case OBJECT_VIEW:
{
return &View_AlterObjectSchema;
}
default:
{
return &NoDistributeOps;
@ -965,6 +1224,11 @@ GetDistributeObjectOps(Node *node)
return &Database_AlterOwner;
}
case OBJECT_DOMAIN:
{
return &Domain_AlterOwner;
}
case OBJECT_FOREIGN_SERVER:
{
return &ForeignServer_AlterOwner;
@ -1069,6 +1333,11 @@ GetDistributeObjectOps(Node *node)
return &Sequence_AlterOwner;
}
case OBJECT_VIEW:
{
return &View_AlterView;
}
default:
{
return &NoDistributeOps;
@ -1123,6 +1392,11 @@ GetDistributeObjectOps(Node *node)
return &Any_CompositeType;
}
case T_CreateDomainStmt:
{
return &Any_CreateDomain;
}
case T_CreateEnumStmt:
{
return &Any_CreateEnum;
@ -1148,6 +1422,11 @@ GetDistributeObjectOps(Node *node)
return &Any_CreatePolicy;
}
case T_CreateRoleStmt:
{
return &Any_CreateRole;
}
case T_CreateSchemaStmt:
{
return &Any_CreateSchema;
@ -1195,6 +1474,11 @@ GetDistributeObjectOps(Node *node)
}
}
case T_DropRoleStmt:
{
return &Any_DropRole;
}
case T_DropStmt:
{
DropStmt *stmt = castNode(DropStmt, node);
@ -1210,6 +1494,11 @@ GetDistributeObjectOps(Node *node)
return &Collation_Drop;
}
case OBJECT_DOMAIN:
{
return &Domain_Drop;
}
case OBJECT_EXTENSION:
{
return &Extension_Drop;
@ -1285,6 +1574,11 @@ GetDistributeObjectOps(Node *node)
return &Trigger_Drop;
}
case OBJECT_VIEW:
{
return &View_Drop;
}
default:
{
return &NoDistributeOps;
@ -1292,6 +1586,11 @@ GetDistributeObjectOps(Node *node)
}
}
case T_GrantRoleStmt:
{
return &Any_GrantRole;
}
case T_GrantStmt:
{
GrantStmt *stmt = castNode(GrantStmt, node);
@ -1302,6 +1601,36 @@ GetDistributeObjectOps(Node *node)
return &Schema_Grant;
}
case OBJECT_SEQUENCE:
{
return &Sequence_Grant;
}
case OBJECT_FDW:
{
return &FDW_Grant;
}
case OBJECT_FOREIGN_SERVER:
{
return &ForeignServer_Grant;
}
case OBJECT_FUNCTION:
{
return &Function_Grant;
}
case OBJECT_PROCEDURE:
{
return &Procedure_Grant;
}
case OBJECT_ROUTINE:
{
return &Routine_Grant;
}
default:
{
return &Any_Grant;
@ -1314,6 +1643,11 @@ GetDistributeObjectOps(Node *node)
return &Any_Index;
}
case T_ViewStmt:
{
return &Any_View;
}
case T_ReindexStmt:
{
return &Any_Reindex;
@ -1339,6 +1673,16 @@ GetDistributeObjectOps(Node *node)
return &Collation_Rename;
}
case OBJECT_DOMAIN:
{
return &Domain_Rename;
}
case OBJECT_DOMCONSTRAINT:
{
return &Domain_RenameConstraint;
}
case OBJECT_FOREIGN_SERVER:
{
return &ForeignServer_Rename;
@ -1394,6 +1738,27 @@ GetDistributeObjectOps(Node *node)
return &Trigger_Rename;
}
case OBJECT_VIEW:
{
return &View_Rename;
}
case OBJECT_COLUMN:
{
switch (stmt->relationType)
{
case OBJECT_VIEW:
{
return &View_Rename;
}
default:
{
return &Any_Rename;
}
}
}
default:
{
return &Any_Rename;


@ -0,0 +1,328 @@
/*-------------------------------------------------------------------------
*
* domain.c
* Hooks to handle the creation, altering and removal of domains.
* These hooks are responsible for duplicating the changes to the
* workers nodes.
*
* Copyright (c) Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/genam.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_type.h"
#include "nodes/makefuncs.h"
#include "parser/parse_type.h"
#include "tcop/utility.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/regproc.h"
#include "utils/syscache.h"
#include "distributed/commands.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "distributed/metadata/distobject.h"
#include "distributed/metadata_sync.h"
#include "distributed/metadata_utility.h"
#include "distributed/multi_executor.h"
#include "distributed/worker_create_or_replace.h"
#include "distributed/worker_transaction.h"
static CollateClause * MakeCollateClauseFromOid(Oid collationOid);
static ObjectAddress GetDomainAddressByName(TypeName *domainName, bool missing_ok);
/*
* GetDomainAddressByName returns the ObjectAddress of the domain identified by
* domainName. When missing_ok is true the object id part of the ObjectAddress can be
* InvalidOid. When missing_ok is false this function will raise an error instead when the
* domain can't be found.
*/
static ObjectAddress
GetDomainAddressByName(TypeName *domainName, bool missing_ok)
{
ObjectAddress address = { 0 };
Oid domainOid = LookupTypeNameOid(NULL, domainName, missing_ok);
ObjectAddressSet(address, TypeRelationId, domainOid);
return address;
}
/*
* RecreateDomainStmt returns a CreateDomainStmt pointer representing the creation of
* the domain, so that the domain can be recreated on a different postgres node based
* on its current representation in the local catalog.
*/
CreateDomainStmt *
RecreateDomainStmt(Oid domainOid)
{
CreateDomainStmt *stmt = makeNode(CreateDomainStmt);
stmt->domainname = stringToQualifiedNameList(format_type_be_qualified(domainOid));
HeapTuple tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(domainOid));
if (!HeapTupleIsValid(tup))
{
elog(ERROR, "cache lookup failed for type %u", domainOid);
}
Form_pg_type typTup = (Form_pg_type) GETSTRUCT(tup);
if (typTup->typtype != TYPTYPE_DOMAIN)
{
elog(ERROR, "type is not a domain type");
}
stmt->typeName = makeTypeNameFromOid(typTup->typbasetype, typTup->typtypmod);
if (OidIsValid(typTup->typcollation))
{
stmt->collClause = MakeCollateClauseFromOid(typTup->typcollation);
}
/*
* typdefault and typdefaultbin are potentially null, so don't try to
* access 'em as struct fields. Must do it the hard way with
* SysCacheGetAttr.
*/
bool isNull = false;
Datum typeDefaultDatum = SysCacheGetAttr(TYPEOID,
tup,
Anum_pg_type_typdefaultbin,
&isNull);
if (!isNull)
{
/* when not null there is a default value which we should add as a constraint */
Constraint *constraint = makeNode(Constraint);
constraint->contype = CONSTR_DEFAULT;
constraint->cooked_expr = TextDatumGetCString(typeDefaultDatum);
stmt->constraints = lappend(stmt->constraints, constraint);
}
/* NOT NULL constraints are unnamed on the actual type */
if (typTup->typnotnull)
{
Constraint *constraint = makeNode(Constraint);
constraint->contype = CONSTR_NOTNULL;
stmt->constraints = lappend(stmt->constraints, constraint);
}
/* look up all constraints to add them to the CreateDomainStmt */
Relation conRel = table_open(ConstraintRelationId, AccessShareLock);
/* Look for CHECK Constraints on this domain */
ScanKeyData key[1];
ScanKeyInit(&key[0],
Anum_pg_constraint_contypid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(domainOid));
SysScanDesc scan = systable_beginscan(conRel, ConstraintTypidIndexId, true, NULL, 1,
key);
HeapTuple conTup = NULL;
while (HeapTupleIsValid(conTup = systable_getnext(scan)))
{
Form_pg_constraint c = (Form_pg_constraint) GETSTRUCT(conTup);
if (c->contype != CONSTRAINT_CHECK)
{
/* Ignore non-CHECK constraints, shouldn't be any */
continue;
}
/*
* We create a constraint, completely ignoring c->convalidated because we can't
* create a domain with an invalidated constraint. Once a constraint is added to
* a domain - even a non-valid one - all new data is validated. Meaning, creating a
* domain with a non-valid constraint doesn't make any sense.
*
* Given it would be too hard to defer the creation of a constraint until we
* validate the constraint on the coordinator, we simply create the
* non-validated constraint to adhere to validating all new data.
*
* An edge case here would be when moving existing data that hasn't been validated
* before to another node. This behaviour is consistent between sending it to an
* already existing node (that has the constraint created but not validated) and a
* new node.
*/
Constraint *constraint = makeNode(Constraint);
constraint->conname = pstrdup(NameStr(c->conname));
constraint->contype = CONSTR_CHECK; /* we only come here with check constraints */
/* Not expecting conbin to be NULL, but we'll test for it anyway */
Datum conbin = heap_getattr(conTup, Anum_pg_constraint_conbin, conRel->rd_att,
&isNull);
if (isNull)
{
elog(ERROR, "domain \"%s\" constraint \"%s\" has NULL conbin",
NameStr(typTup->typname), NameStr(c->conname));
}
/*
* The conbin contains the cooked expression from when the constraint was
* inserted into the catalog. We store it here for the deparser to distinguish
* between cooked expressions and raw expressions.
*
* There is no supported way to go from a cooked expression to a raw expression.
*/
constraint->cooked_expr = TextDatumGetCString(conbin);
stmt->constraints = lappend(stmt->constraints, constraint);
}
systable_endscan(scan);
table_close(conRel, NoLock);
ReleaseSysCache(tup);
QualifyTreeNode((Node *) stmt);
return stmt;
}
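As a sketch of the intended round-trip: the reconstructed parse tree can be deparsed back into SQL to be sent to a worker. DeparseTreeNode appears elsewhere in this diff; the propagation code around it is assumed here.

CreateDomainStmt *stmt = RecreateDomainStmt(domainOid);

/* e.g. "CREATE DOMAIN public.positive_int AS integer CHECK (VALUE > 0)" */
char *createDomainSql = DeparseTreeNode((Node *) stmt);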
/*
* MakeCollateClauseFromOid returns a CollateClause describing the COLLATE segment of a
* CREATE DOMAIN statement based on the Oid of the collation used for the domain.
*/
static CollateClause *
MakeCollateClauseFromOid(Oid collationOid)
{
CollateClause *collateClause = makeNode(CollateClause);
ObjectAddress collateAddress = { 0 };
ObjectAddressSet(collateAddress, CollationRelationId, collationOid);
List *objName = NIL;
List *objArgs = NIL;
#if PG_VERSION_NUM >= PG_VERSION_14
getObjectIdentityParts(&collateAddress, &objName, &objArgs, false);
#else
getObjectIdentityParts(&collateAddress, &objName, &objArgs);
#endif
char *name = NULL;
foreach_ptr(name, objName)
{
collateClause->collname = lappend(collateClause->collname, makeString(name));
}
collateClause->location = -1;
return collateClause;
}
/*
* CreateDomainStmtObjectAddress returns the ObjectAddress of the domain that would be
* created by the statement. When missing_ok is false the function will raise an error if
* the domain cannot be found in the local catalog.
*/
ObjectAddress
CreateDomainStmtObjectAddress(Node *node, bool missing_ok)
{
CreateDomainStmt *stmt = castNode(CreateDomainStmt, node);
TypeName *typeName = makeTypeNameFromNameList(stmt->domainname);
Oid typeOid = LookupTypeNameOid(NULL, typeName, missing_ok);
ObjectAddress address = { 0 };
ObjectAddressSet(address, TypeRelationId, typeOid);
return address;
}
/*
* AlterDomainStmtObjectAddress returns the ObjectAddress of the domain being altered.
* When missing_ok is false this function will raise an error when the domain is not
* found.
*/
ObjectAddress
AlterDomainStmtObjectAddress(Node *node, bool missing_ok)
{
AlterDomainStmt *stmt = castNode(AlterDomainStmt, node);
TypeName *domainName = makeTypeNameFromNameList(stmt->typeName);
return GetDomainAddressByName(domainName, missing_ok);
}
/*
* DomainRenameConstraintStmtObjectAddress returns the ObjectAddress of the domain for
* which the constraint is being renamed. When missing_ok is false this function will
* raise an error if the domain cannot be found.
*/
ObjectAddress
DomainRenameConstraintStmtObjectAddress(Node *node, bool missing_ok)
{
RenameStmt *stmt = castNode(RenameStmt, node);
TypeName *domainName = makeTypeNameFromNameList(castNode(List, stmt->object));
return GetDomainAddressByName(domainName, missing_ok);
}
/*
* AlterDomainOwnerStmtObjectAddress returns the ObjectAddress of the domain whose owner
* is being changed. When missing_ok is false this function will raise an error if the
* domain cannot be found.
*/
ObjectAddress
AlterDomainOwnerStmtObjectAddress(Node *node, bool missing_ok)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_DOMAIN);
TypeName *domainName = makeTypeNameFromNameList(castNode(List, stmt->object));
return GetDomainAddressByName(domainName, missing_ok);
}
/*
* RenameDomainStmtObjectAddress returns the ObjectAddress of the domain being renamed.
* When missing_ok is false this function will raise an error when the domain cannot be
* found.
*/
ObjectAddress
RenameDomainStmtObjectAddress(Node *node, bool missing_ok)
{
RenameStmt *stmt = castNode(RenameStmt, node);
Assert(stmt->renameType == OBJECT_DOMAIN);
TypeName *domainName = makeTypeNameFromNameList(castNode(List, stmt->object));
return GetDomainAddressByName(domainName, missing_ok);
}
/*
* get_constraint_typid returns the contypid of a constraint. This field is only set for
* constraints on domain types. Returns InvalidOid if conoid is an invalid constraint, as
* well as for constraints that are not on domain types.
*/
Oid
get_constraint_typid(Oid conoid)
{
HeapTuple tp = SearchSysCache1(CONSTROID, ObjectIdGetDatum(conoid));
if (HeapTupleIsValid(tp))
{
Form_pg_constraint contup = (Form_pg_constraint) GETSTRUCT(tp);
Oid result = contup->contypid;
ReleaseSysCache(tp);
return result;
}
else
{
return InvalidOid;
}
}
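Because only domain constraints carry a contypid, a hypothetical caller can rely on the InvalidOid convention to filter for them:

Oid domainTypeId = get_constraint_typid(constraintOid);
if (OidIsValid(domainTypeId))
{
    /* constraintOid is a constraint on the domain type domainTypeId */
}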


@ -10,8 +10,12 @@
#include "postgres.h"
#include "access/genam.h"
#include "citus_version.h"
#include "catalog/dependency.h"
#include "catalog/pg_depend.h"
#include "catalog/pg_extension_d.h"
#include "catalog/pg_foreign_data_wrapper.h"
#include "commands/defrem.h"
#include "commands/extension.h"
#include "distributed/citus_ruleutils.h"
@ -26,9 +30,12 @@
#include "distributed/multi_executor.h"
#include "distributed/relation_access_tracking.h"
#include "distributed/transaction_management.h"
#include "foreign/foreign.h"
#include "nodes/makefuncs.h"
#include "utils/lsyscache.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/syscache.h"
/* Local functions forward declarations for helper functions */
@ -37,9 +44,11 @@ static void AddSchemaFieldIfMissing(CreateExtensionStmt *stmt);
static List * FilterDistributedExtensions(List *extensionObjectList);
static List * ExtensionNameListToObjectAddressList(List *extensionObjectList);
static void MarkExistingObjectDependenciesDistributedIfSupported(void);
static List * GetAllViews(void);
static bool ShouldPropagateExtensionCommand(Node *parseTree);
static bool IsAlterExtensionSetSchemaCitus(Node *parseTree);
static Node * RecreateExtensionStmt(Oid extensionOid);
static List * GenerateGrantCommandsOnExtesionDependentFDWs(Oid extensionId);
/*
@ -510,26 +519,78 @@ MarkExistingObjectDependenciesDistributedIfSupported()
Oid citusTableId = InvalidOid;
foreach_oid(citusTableId, citusTableIdList)
{
ObjectAddress tableAddress = { 0 };
ObjectAddressSet(tableAddress, RelationRelationId, citusTableId);
if (ShouldSyncTableMetadata(citusTableId))
if (!ShouldMarkRelationDistributed(citusTableId))
{
/* we need to pass a pointer allocated on the heap */
ObjectAddress *addressPointer = palloc0(sizeof(ObjectAddress));
*addressPointer = tableAddress;
/* as of Citus 11, tables that should be synced are also considered objects */
resultingObjectAddresses = lappend(resultingObjectAddresses, addressPointer);
continue;
}
List *distributableDependencyObjectAddresses =
GetDistributableDependenciesForObject(&tableAddress);
/* refrain from reading the metadata cache for all tables */
if (ShouldSyncTableMetadataViaCatalog(citusTableId))
{
ObjectAddress tableAddress = { 0 };
ObjectAddressSet(tableAddress, RelationRelationId, citusTableId);
resultingObjectAddresses = list_concat(resultingObjectAddresses,
distributableDependencyObjectAddresses);
/*
* We mark tables distributed immediately because we also need to mark
* views as distributed. Below, we check whether the views that depend on
* the table have any auto-distributable dependencies. Citus
* currently cannot "auto" distribute tables as dependencies, so we
* mark them distributed immediately.
*/
MarkObjectDistributedLocally(&tableAddress);
/*
* All the distributable dependencies of a table should be marked as
* distributed.
*/
List *distributableDependencyObjectAddresses =
GetDistributableDependenciesForObject(&tableAddress);
resultingObjectAddresses =
list_concat(resultingObjectAddresses,
distributableDependencyObjectAddresses);
}
}
/*
* As of Citus 11, views on Citus tables that do not have any unsupported
* dependency should also be distributed.
*
* In general, we mark a view distributed as long as it does not have
* any unsupported dependencies.
*/
List *viewList = GetAllViews();
Oid viewOid = InvalidOid;
foreach_oid(viewOid, viewList)
{
if (!ShouldMarkRelationDistributed(viewOid))
{
continue;
}
ObjectAddress viewAddress = { 0 };
ObjectAddressSet(viewAddress, RelationRelationId, viewOid);
/*
* If a view depends on multiple views, that view will be marked
* as distributed while the last view it depends on is processed.
*/
MarkObjectDistributedLocally(&viewAddress);
/* we need to pass a pointer allocated on the heap */
ObjectAddress *addressPointer = palloc0(sizeof(ObjectAddress));
*addressPointer = viewAddress;
List *distributableDependencyObjectAddresses =
GetDistributableDependenciesForObject(&viewAddress);
resultingObjectAddresses =
list_concat(resultingObjectAddresses,
distributableDependencyObjectAddresses);
}
/* resolve dependencies of the objects in pg_dist_object */
List *distributedObjectAddressList = GetDistributedObjectAddressList();
@ -565,6 +626,40 @@ MarkExistingObjectDependenciesDistributedIfSupported()
}
/*
* GetAllViews returns the list of oids of the views that exist on this server.
*/
static List *
GetAllViews(void)
{
List *viewOidList = NIL;
Relation pgClass = table_open(RelationRelationId, AccessShareLock);
SysScanDesc scanDescriptor = systable_beginscan(pgClass, InvalidOid, false, NULL,
0, NULL);
HeapTuple heapTuple = systable_getnext(scanDescriptor);
while (HeapTupleIsValid(heapTuple))
{
Form_pg_class relationForm = (Form_pg_class) GETSTRUCT(heapTuple);
/* we're only interested in views */
if (relationForm->relkind == RELKIND_VIEW)
{
viewOidList = lappend_oid(viewOidList, relationForm->oid);
}
heapTuple = systable_getnext(scanDescriptor);
}
systable_endscan(scanDescriptor);
table_close(pgClass, NoLock);
return viewOidList;
}
/*
* PreprocessAlterExtensionContentsStmt issues a notice. It does not propagate.
*/
@ -732,6 +827,12 @@ CreateExtensionDDLCommand(const ObjectAddress *extensionAddress)
List *ddlCommands = list_make1((void *) ddlCommand);
/* any privilege granted on FDWs that belong to the extension should be included */
List *FDWGrants =
GenerateGrantCommandsOnExtesionDependentFDWs(extensionAddress->objectId);
ddlCommands = list_concat(ddlCommands, FDWGrants);
return ddlCommands;
}
@ -790,6 +891,88 @@ RecreateExtensionStmt(Oid extensionOid)
}
/*
* GenerateGrantCommandsOnExtesionDependentFDWs returns a list of commands that GRANT
* the privileges on FDWs that depend on the given extension.
*/
static List *
GenerateGrantCommandsOnExtesionDependentFDWs(Oid extensionId)
{
List *commands = NIL;
List *FDWOids = GetDependentFDWsToExtension(extensionId);
Oid FDWOid = InvalidOid;
foreach_oid(FDWOid, FDWOids)
{
Acl *aclEntry = GetPrivilegesForFDW(FDWOid);
if (aclEntry == NULL)
{
continue;
}
AclItem *privileges = ACL_DAT(aclEntry);
int numberOfPrivsGranted = ACL_NUM(aclEntry);
for (int i = 0; i < numberOfPrivsGranted; i++)
{
commands = list_concat(commands,
GenerateGrantOnFDWQueriesFromAclItem(FDWOid,
&privileges[i]));
}
}
return commands;
}
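The loop above walks the ACL with the standard PostgreSQL accessors; condensed, the pattern looks like the sketch below, assuming aclEntry is a non-NULL Acl *:

AclItem *items = ACL_DAT(aclEntry);  /* array of grantor/grantee/privilege entries */
int numberOfItems = ACL_NUM(aclEntry);
for (int i = 0; i < numberOfItems; i++)
{
    AclItem *item = &items[i];
    /* item->ai_grantee, item->ai_grantor and item->ai_privs describe one grant */
}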
/*
* GetDependentFDWsToExtension gets an extension oid and returns the list of oids of FDWs
* that depend on the given extension.
*/
List *
GetDependentFDWsToExtension(Oid extensionId)
{
List *extensionFDWs = NIL;
ScanKeyData key[3];
int scanKeyCount = 3;
HeapTuple tup;
Relation pgDepend = table_open(DependRelationId, AccessShareLock);
ScanKeyInit(&key[0],
Anum_pg_depend_refclassid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(ExtensionRelationId));
ScanKeyInit(&key[1],
Anum_pg_depend_refobjid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(extensionId));
ScanKeyInit(&key[2],
Anum_pg_depend_classid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(ForeignDataWrapperRelationId));
SysScanDesc scan = systable_beginscan(pgDepend, InvalidOid, false,
NULL, scanKeyCount, key);
while (HeapTupleIsValid(tup = systable_getnext(scan)))
{
Form_pg_depend pgDependEntry = (Form_pg_depend) GETSTRUCT(tup);
if (pgDependEntry->deptype == DEPENDENCY_EXTENSION)
{
extensionFDWs = lappend_oid(extensionFDWs, pgDependEntry->objid);
}
}
systable_endscan(scan);
table_close(pgDepend, AccessShareLock);
return extensionFDWs;
}
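As a reading aid (not part of the source), the three scan keys plus the deptype filter above amount to the following catalog query, where deptype 'e' corresponds to DEPENDENCY_EXTENSION:
/*
 * SELECT objid
 * FROM pg_depend
 * WHERE refclassid = 'pg_extension'::regclass
 *   AND refobjid = <extension oid>
 *   AND classid = 'pg_foreign_data_wrapper'::regclass
 *   AND deptype = 'e';
 */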
/*
* AlterExtensionSchemaStmtObjectAddress returns the ObjectAddress of the extension that is
* the subject of the AlterObjectSchemaStmt. Errors if missing_ok is false.


@ -23,6 +23,7 @@
#include "catalog/pg_type.h"
#include "distributed/colocation_utils.h"
#include "distributed/commands.h"
#include "distributed/commands/sequence.h"
#include "distributed/coordinator_protocol.h"
#include "distributed/listutils.h"
#include "distributed/coordinator_protocol.h"
@ -30,6 +31,7 @@
#include "distributed/namespace_utils.h"
#include "distributed/reference_table_utils.h"
#include "distributed/version_compat.h"
#include "distributed/worker_protocol.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/inval.h"
@ -56,6 +58,8 @@ typedef bool (*CheckRelationFunc)(Oid);
/* Local functions forward declarations */
static void EnsureReferencingTableNotReplicated(Oid referencingTableId);
static void EnsureSupportedFKeyOnDistKey(Form_pg_constraint constraintForm);
static bool ForeignKeySetsNextValColumnToDefault(HeapTuple pgConstraintTuple);
static List * ForeignKeyGetDefaultingAttrs(HeapTuple pgConstraintTuple);
static void EnsureSupportedFKeyBetweenCitusLocalAndRefTable(Form_pg_constraint
constraintForm,
char
@ -196,6 +200,23 @@ ErrorIfUnsupportedForeignConstraintExists(Relation relation, char referencingDis
referencedReplicationModel = referencingReplicationModel;
}
/*
* Given that we drop DEFAULT nextval('sequence') expressions from
* shard relation columns, allowing ON DELETE/UPDATE SET DEFAULT
* on such columns causes inserting NULL values to referencing relation
* as a result of a delete/update operation on referenced relation.
*
* For this reason, we disallow ON DELETE/UPDATE SET DEFAULT actions
* on columns that default to sequences.
*/
if (ForeignKeySetsNextValColumnToDefault(heapTuple))
{
ereport(ERROR, (errmsg("cannot create foreign key constraint "
"since Citus does not support ON DELETE "
"/ UPDATE SET DEFAULT actions on the "
"columns that default to sequences")));
}
bool referencingIsCitusLocalOrRefTable =
(referencingDistMethod == DISTRIBUTE_BY_NONE);
bool referencedIsCitusLocalOrRefTable =
@ -298,6 +319,104 @@ ErrorIfUnsupportedForeignConstraintExists(Relation relation, char referencingDis
}
/*
* ForeignKeySetsNextValColumnToDefault returns true if at least one of the
* columns specified in ON DELETE / UPDATE SET DEFAULT clauses default to
* nextval().
*/
static bool
ForeignKeySetsNextValColumnToDefault(HeapTuple pgConstraintTuple)
{
Form_pg_constraint pgConstraintForm =
(Form_pg_constraint) GETSTRUCT(pgConstraintTuple);
List *setDefaultAttrs = ForeignKeyGetDefaultingAttrs(pgConstraintTuple);
AttrNumber setDefaultAttr = InvalidAttrNumber;
foreach_int(setDefaultAttr, setDefaultAttrs)
{
if (ColumnDefaultsToNextVal(pgConstraintForm->conrelid, setDefaultAttr))
{
return true;
}
}
return false;
}
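To make the restriction concrete, here is a hypothetical schema (invented names) that this check rejects, because orders.item_id defaults to a sequence and the foreign key would SET DEFAULT on it:
/*
 * CREATE TABLE items (item_id bigserial PRIMARY KEY);
 * CREATE TABLE orders (
 *     item_id bigint DEFAULT nextval('items_item_id_seq')
 *                    REFERENCES items (item_id) ON DELETE SET DEFAULT
 * );
 */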
/*
* ForeignKeyGetDefaultingAttrs returns a list of the AttrNumbers that
* might be set to DEFAULT by the ON DELETE or ON UPDATE actions.
*
* For example, if the foreign key has a SET DEFAULT clause for
* both actions, then it returns a superset of the attributes that
* might be set to DEFAULT on either of those actions.
*/
static List *
ForeignKeyGetDefaultingAttrs(HeapTuple pgConstraintTuple)
{
bool isNull = false;
Datum referencingColumnsDatum = SysCacheGetAttr(CONSTROID, pgConstraintTuple,
Anum_pg_constraint_conkey, &isNull);
if (isNull)
{
ereport(ERROR, (errmsg("got NULL conkey from catalog")));
}
List *referencingColumns =
IntegerArrayTypeToList(DatumGetArrayTypeP(referencingColumnsDatum));
Form_pg_constraint pgConstraintForm =
(Form_pg_constraint) GETSTRUCT(pgConstraintTuple);
if (pgConstraintForm->confupdtype == FKCONSTR_ACTION_SETDEFAULT)
{
/*
* Postgres doesn't allow specifying SET DEFAULT for a subset of
* the referencing columns for ON UPDATE action, so in that case
* we return all referencing columns regardless of what ON DELETE
* action says.
*/
return referencingColumns;
}
if (pgConstraintForm->confdeltype != FKCONSTR_ACTION_SETDEFAULT)
{
return NIL;
}
List *onDeleteSetDefColumnList = NIL;
#if PG_VERSION_NUM >= PG_VERSION_15
Datum onDeleteSetDefColumnsDatum = SysCacheGetAttr(CONSTROID, pgConstraintTuple,
Anum_pg_constraint_confdelsetcols,
&isNull);
/*
* confdelsetcols being NULL means that "ON DELETE SET DEFAULT" doesn't
* specify which subset of columns should be set to DEFAULT, so fetching
* NULL from the catalog is also possible.
*/
if (!isNull)
{
onDeleteSetDefColumnList =
IntegerArrayTypeToList(DatumGetArrayTypeP(onDeleteSetDefColumnsDatum));
}
#endif
if (list_length(onDeleteSetDefColumnList) == 0)
{
/*
* That means that all referencing columns need to be set to
* DEFAULT.
*/
return referencingColumns;
}
else
{
return onDeleteSetDefColumnList;
}
}
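The PG_VERSION_15 branch above exists because PostgreSQL 15 added per-column lists to SET DEFAULT actions; a hedged illustration with invented names, where confdelsetcols would store only subset_col:
/*
 * FOREIGN KEY (subset_col, other_col) REFERENCES parent (a, b)
 *     ON DELETE SET DEFAULT (subset_col)
 */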
/*
* EnsureSupportedFKeyBetweenCitusLocalAndRefTable is a helper function that
* takes a foreign key constraint form for a foreign key between two citus


@ -0,0 +1,144 @@
/*-------------------------------------------------------------------------
*
* foreign_data_wrapper.c
* Commands for FOREIGN DATA WRAPPER statements.
*
* Copyright (c) Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "catalog/pg_foreign_data_wrapper.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/commands.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "distributed/metadata_sync.h"
#include "distributed/metadata/distobject.h"
#include "foreign/foreign.h"
#include "nodes/makefuncs.h"
#include "nodes/parsenodes.h"
#include "utils/syscache.h"
static bool NameListHasFDWOwnedByDistributedExtension(List *FDWNames);
static ObjectAddress GetObjectAddressByFDWName(char *FDWName, bool missing_ok);
/*
* PreprocessGrantOnFDWStmt is executed before the statement is applied to the
* local postgres instance.
*
* In this stage we can prepare the commands that need to be run on all workers to grant
* on foreign data wrappers.
*/
List *
PreprocessGrantOnFDWStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(stmt->objtype == OBJECT_FDW);
if (!NameListHasFDWOwnedByDistributedExtension(stmt->objects))
{
/*
* We propagate granted privileges on a FDW only if it belongs to a distributed
* extension. For now, we skip custom FDWs, as most users rely on
* FDWs that ship with extensions.
*/
return NIL;
}
if (list_length(stmt->objects) > 1)
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot grant on FDW with other FDWs"),
errhint("Try granting on each object in separate commands")));
}
if (!ShouldPropagate())
{
return NIL;
}
EnsureCoordinator();
Assert(list_length(stmt->objects) == 1);
char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* NameListHasFDWOwnedByDistributedExtension takes a namelist of FDWs and returns true
* if at least one of them depends on a distributed extension. Returns false otherwise.
*/
static bool
NameListHasFDWOwnedByDistributedExtension(List *FDWNames)
{
Value *FDWValue = NULL;
foreach_ptr(FDWValue, FDWNames)
{
/* captures the extension address during lookup */
ObjectAddress extensionAddress = { 0 };
ObjectAddress FDWAddress = GetObjectAddressByFDWName(strVal(FDWValue), false);
if (IsObjectAddressOwnedByExtension(&FDWAddress, &extensionAddress))
{
if (IsObjectDistributed(&extensionAddress))
{
return true;
}
}
}
return false;
}
/*
* GetObjectAddressByFDWName takes a FDW name and returns the object address.
*/
static ObjectAddress
GetObjectAddressByFDWName(char *FDWName, bool missing_ok)
{
ForeignDataWrapper *FDW = GetForeignDataWrapperByName(FDWName, missing_ok);
Oid FDWId = FDW->fdwid;
ObjectAddress address = { 0 };
ObjectAddressSet(address, ForeignDataWrapperRelationId, FDWId);
return address;
}
/*
* GetPrivilegesForFDW takes a FDW object id and returns the privileges granted
* on that FDW as an Acl object. Returns NULL if there is no privilege granted.
*/
Acl *
GetPrivilegesForFDW(Oid FDWOid)
{
HeapTuple fdwtup = SearchSysCache1(FOREIGNDATAWRAPPEROID, ObjectIdGetDatum(FDWOid));
bool isNull = true;
Datum aclDatum = SysCacheGetAttr(FOREIGNDATAWRAPPEROID, fdwtup,
Anum_pg_foreign_data_wrapper_fdwacl, &isNull);
if (isNull)
{
ReleaseSysCache(fdwtup);
return NULL;
}
Acl *aclEntry = DatumGetAclPCopy(aclDatum);
ReleaseSysCache(fdwtup);
return aclEntry;
}


@ -9,6 +9,7 @@
*/
#include "postgres.h"
#include "miscadmin.h"
#include "catalog/pg_foreign_server.h"
#include "distributed/commands/utility_hook.h"
@ -23,240 +24,14 @@
#include "nodes/makefuncs.h"
#include "nodes/parsenodes.h"
#include "nodes/primnodes.h"
#include "utils/builtins.h"
static char * GetForeignServerAlterOwnerCommand(Oid serverId);
static Node * RecreateForeignServerStmt(Oid serverId);
static bool NameListHasDistributedServer(List *serverNames);
static ObjectAddress GetObjectAddressByServerName(char *serverName, bool missing_ok);
/*
* PreprocessCreateForeignServerStmt is called during the planning phase for
* CREATE SERVER.
*/
List *
PreprocessCreateForeignServerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
if (!ShouldPropagate())
{
return NIL;
}
/* check creation against multi-statement transaction policy */
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_FOREIGN_SERVER);
char *sql = DeparseTreeNode(node);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterForeignServerStmt is called during the planning phase for
* ALTER SERVER .. OPTIONS ..
*/
List *
PreprocessAlterForeignServerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterForeignServerStmt *stmt = castNode(AlterForeignServerStmt, node);
ObjectAddress address = GetObjectAddressByServerName(stmt->servername, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
char *sql = DeparseTreeNode(node);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessRenameForeignServerStmt is called during the planning phase for
* ALTER SERVER RENAME.
*/
List *
PreprocessRenameForeignServerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
RenameStmt *stmt = castNode(RenameStmt, node);
Assert(stmt->renameType == OBJECT_FOREIGN_SERVER);
ObjectAddress address = GetObjectAddressByServerName(strVal(stmt->object), false);
/* filter distributed servers */
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
char *sql = DeparseTreeNode(node);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterForeignServerOwnerStmt is called during the planning phase for
* ALTER SERVER .. OWNER TO.
*/
List *
PreprocessAlterForeignServerOwnerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_FOREIGN_SERVER);
ObjectAddress address = GetObjectAddressByServerName(strVal(stmt->object), false);
/* filter distributed servers */
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
char *sql = DeparseTreeNode(node);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessDropForeignServerStmt is called during the planning phase for
* DROP SERVER.
*/
List *
PreprocessDropForeignServerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
Assert(stmt->removeType == OBJECT_FOREIGN_SERVER);
bool includesDistributedServer = NameListHasDistributedServer(stmt->objects);
if (!includesDistributedServer)
{
return NIL;
}
if (list_length(stmt->objects) > 1)
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot drop distributed server with other servers"),
errhint("Try dropping each object in a separate DROP command")));
}
if (!ShouldPropagate())
{
return NIL;
}
EnsureCoordinator();
Assert(list_length(stmt->objects) == 1);
Value *serverValue = linitial(stmt->objects);
ObjectAddress address = GetObjectAddressByServerName(strVal(serverValue), false);
/* unmark distributed server */
UnmarkObjectDistributed(&address);
const char *deparsedStmt = DeparseTreeNode((Node *) stmt);
/*
* To prevent recursive propagation in mx architecture, we disable ddl
* propagation before sending the command to workers.
*/
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) deparsedStmt,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessCreateForeignServerStmt is called after a CREATE SERVER command has
* been executed by standard process utility.
*/
List *
PostprocessCreateForeignServerStmt(Node *node, const char *queryString)
{
if (!ShouldPropagate())
{
return NIL;
}
/* check creation against multi-statement transaction policy */
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return NIL;
}
const bool missingOk = false;
ObjectAddress address = GetObjectAddressFromParseTree(node, missingOk);
EnsureDependenciesExistOnAllNodes(&address);
return NIL;
}
/*
* PostprocessAlterForeignServerOwnerStmt is called after a ALTER SERVER OWNER command
* has been executed by standard process utility.
*/
List *
PostprocessAlterForeignServerOwnerStmt(Node *node, const char *queryString)
{
const bool missingOk = false;
ObjectAddress address = GetObjectAddressFromParseTree(node, missingOk);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureDependenciesExistOnAllNodes(&address);
return NIL;
}
/*
* CreateForeignServerStmtObjectAddress finds the ObjectAddress for the server
* that is created by given CreateForeignServerStmt. If missingOk is false and if
@ -274,6 +49,88 @@ CreateForeignServerStmtObjectAddress(Node *node, bool missing_ok)
}
/*
* AlterForeignServerStmtObjectAddress finds the ObjectAddress for the server that is
* changed by given AlterForeignServerStmt. If missingOk is false and if
* the server does not exist, then it errors out.
*
* Never returns NULL, but the objid in the address can be invalid if missingOk
* was set to true.
*/
ObjectAddress
AlterForeignServerStmtObjectAddress(Node *node, bool missing_ok)
{
AlterForeignServerStmt *stmt = castNode(AlterForeignServerStmt, node);
return GetObjectAddressByServerName(stmt->servername, missing_ok);
}
/*
* PreprocessGrantOnForeignServerStmt is executed before the statement is applied to the
* local postgres instance.
*
* In this stage we can prepare the commands that need to be run on all workers to grant
* on servers.
*/
List *
PreprocessGrantOnForeignServerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(stmt->objtype == OBJECT_FOREIGN_SERVER);
bool includesDistributedServer = NameListHasDistributedServer(stmt->objects);
if (!includesDistributedServer)
{
return NIL;
}
if (list_length(stmt->objects) > 1)
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot grant on distributed server with other servers"),
errhint("Try granting on each object in separate commands")));
}
if (!ShouldPropagate())
{
return NIL;
}
EnsureCoordinator();
Assert(list_length(stmt->objects) == 1);
char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* RenameForeignServerStmtObjectAddress finds the ObjectAddress for the server that is
* renamed by the given RenameStmt. If missingOk is false and if the server does not exist,
* then it errors out.
*
* Never returns NULL, but the objid in the address can be invalid if missingOk
* was set to true.
*/
ObjectAddress
RenameForeignServerStmtObjectAddress(Node *node, bool missing_ok)
{
RenameStmt *stmt = castNode(RenameStmt, node);
Assert(stmt->renameType == OBJECT_FOREIGN_SERVER);
return GetObjectAddressByServerName(strVal(stmt->object), missing_ok);
}
/*
* AlterForeignServerOwnerStmtObjectAddress finds the ObjectAddress for the server
* given in AlterOwnerStmt. If missingOk is false and if
@ -303,14 +160,37 @@ GetForeignServerCreateDDLCommand(Oid serverId)
Node *stmt = RecreateForeignServerStmt(serverId);
/* capture ddl command for the create statement */
const char *createCommand = DeparseTreeNode(stmt);
const char *alterOwnerCommand = GetForeignServerAlterOwnerCommand(serverId);
List *ddlCommands = list_make2((void *) createCommand,
(void *) alterOwnerCommand);
return ddlCommands;
}
/*
* GetForeignServerAlterOwnerCommand returns "ALTER SERVER .. OWNER TO .." statement
* for the specified foreign server.
*/
static char *
GetForeignServerAlterOwnerCommand(Oid serverId)
{
ForeignServer *server = GetForeignServer(serverId);
Oid ownerId = server->owner;
char *ownerName = GetUserNameFromId(ownerId, false);
StringInfo alterCommand = makeStringInfo();
appendStringInfo(alterCommand, "ALTER SERVER %s OWNER TO %s;",
quote_identifier(server->servername),
quote_identifier(ownerName));
return alterCommand->data;
}
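For instance, for a server named logs_server owned by role admin_role (invented names), the function above returns:
/* ALTER SERVER logs_server OWNER TO admin_role; */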
/*
* RecreateForeignServerStmt returns a parsetree for a CREATE SERVER statement
* that would recreate the given server on a new node.


@ -6,7 +6,9 @@
* We currently support replicating function definitions on the
* coordinator in all the worker nodes in the form of
*
* CREATE OR REPLACE FUNCTION ... queries and
* GRANT ... ON FUNCTION queries
*
* ALTER or DROP operations are not yet propagated.
*
@ -104,6 +106,7 @@ static void DistributeFunctionColocatedWithDistributedTable(RegProcedure funcOid
functionAddress);
static void DistributeFunctionColocatedWithReferenceTable(const
ObjectAddress *functionAddress);
static List * FilterDistributedFunctions(GrantStmt *grantStmt);
static void EnsureExtensionFunctionCanBeDistributed(const ObjectAddress functionAddress,
const ObjectAddress extensionAddress,
@ -239,8 +242,17 @@ create_distributed_function(PG_FUNCTION_ARGS)
const char *createFunctionSQL = GetFunctionDDLCommand(funcOid, true);
const char *alterFunctionOwnerSQL = GetFunctionAlterOwnerCommand(funcOid);
initStringInfo(&ddlCommand);
appendStringInfo(&ddlCommand, "%s;%s;%s;%s", DISABLE_METADATA_SYNC,
createFunctionSQL, alterFunctionOwnerSQL, ENABLE_METADATA_SYNC);
appendStringInfo(&ddlCommand, "%s;%s;%s", DISABLE_METADATA_SYNC,
createFunctionSQL, alterFunctionOwnerSQL);
List *grantDDLCommands = GrantOnFunctionDDLCommands(funcOid);
char *grantOnFunctionSQL = NULL;
foreach_ptr(grantOnFunctionSQL, grantDDLCommands)
{
appendStringInfo(&ddlCommand, ";%s", grantOnFunctionSQL);
}
appendStringInfo(&ddlCommand, ";%s", ENABLE_METADATA_SYNC);
SendCommandToWorkersAsUser(NON_COORDINATOR_NODES, CurrentUserName(),
ddlCommand.data);
}
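With this change, the single command string sent to each worker is laid out roughly as follows; the placeholders stand for the deparsed SQL, and the GRANT part may be empty:
/*
 * <DISABLE_METADATA_SYNC>; <createFunctionSQL>; <alterFunctionOwnerSQL>;
 * <grantOnFunctionSQL 1>; ...; <grantOnFunctionSQL N>; <ENABLE_METADATA_SYNC>
 */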
@ -1263,12 +1275,7 @@ ShouldPropagateCreateFunction(CreateFunctionStmt *stmt)
return false;
}
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return false;
}
@ -1507,234 +1514,6 @@ PreprocessAlterFunctionStmt(Node *node, const char *queryString,
}
/*
* PreprocessRenameFunctionStmt is called when the user is renaming a function. The invocation
* happens before the statement is applied locally.
*
* As the function already exists, we have access to its ObjectAddress, which is used to
* check whether it is distributed. If so, the rename is executed on all the workers to
* keep the functions in sync across the cluster.
*/
List *
PreprocessRenameFunctionStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
RenameStmt *stmt = castNode(RenameStmt, node);
AssertObjectTypeIsFunctional(stmt->renameType);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateAlterFunction(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_FUNCTION);
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterFunctionSchemaStmt is executed before the statement is applied to the local
* postgres instance.
*
* In this stage we can prepare the commands that need to be run on all workers.
*/
List *
PreprocessAlterFunctionSchemaStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
AssertObjectTypeIsFunctional(stmt->objectType);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateAlterFunction(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_FUNCTION);
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterFunctionOwnerStmt is called for changes of function ownership, before
* the ownership is changed on the local instance.
*
* If the function whose owner is changed is distributed, we execute the change on
* all the workers to keep the function in sync across the cluster.
*/
List *
PreprocessAlterFunctionOwnerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
AssertObjectTypeIsFunctional(stmt->objectType);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateAlterFunction(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_FUNCTION);
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessAlterFunctionOwnerStmt is invoked after the owner has been changed locally.
* Since changing the owner could result in new dependencies being found for this object
* we re-ensure all the dependencies for the function do exist.
*
* This is solely to propagate the new owner (and all its dependencies) if it was not
* already distributed in the cluster.
*/
List *
PostprocessAlterFunctionOwnerStmt(Node *node, const char *queryString)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
AssertObjectTypeIsFunctional(stmt->objectType);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateAlterFunction(&address))
{
return NIL;
}
EnsureDependenciesExistOnAllNodes(&address);
return NIL;
}
/*
* PreprocessDropFunctionStmt gets called during the planning phase of a DROP FUNCTION statement
* and returns a list of DDLJob's that will drop any distributed functions from the
* workers.
*
* The DropStmt could have multiple objects to drop, the list of objects will be filtered
* to only keep the distributed functions for deletion on the workers. Non-distributed
* functions will still be dropped locally but not on the workers.
*/
List *
PreprocessDropFunctionStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
List *deletingObjectWithArgsList = stmt->objects;
List *distributedObjectWithArgsList = NIL;
List *distributedFunctionAddresses = NIL;
AssertObjectTypeIsFunctional(stmt->removeType);
if (creating_extension)
{
/*
* extensions should be created separately on the workers; functions cascading from an
* extension should therefore not be propagated here.
*/
return NIL;
}
if (!EnableMetadataSync)
{
/*
* we are configured to disable object propagation, should not propagate anything
*/
return NIL;
}
/*
* Our statements need to be fully qualified so we can drop them from the right schema
* on the workers
*/
QualifyTreeNode((Node *) stmt);
/*
* iterate over all functions to be dropped and filter to keep only distributed
* functions.
*/
ObjectWithArgs *func = NULL;
foreach_ptr(func, deletingObjectWithArgsList)
{
ObjectAddress address = FunctionToObjectAddress(stmt->removeType, func,
stmt->missing_ok);
if (!IsObjectDistributed(&address))
{
continue;
}
/* collect information for all distributed functions */
ObjectAddress *addressp = palloc(sizeof(ObjectAddress));
*addressp = address;
distributedFunctionAddresses = lappend(distributedFunctionAddresses, addressp);
distributedObjectWithArgsList = lappend(distributedObjectWithArgsList, func);
}
if (list_length(distributedObjectWithArgsList) <= 0)
{
/* no distributed functions to drop */
return NIL;
}
/*
* Managing functions can only be done on the coordinator if ddl propagation is on. When
* it is off we will never get here. MX workers don't have a notion of distributed
* functions, so we block the call.
*/
EnsureCoordinator();
EnsureSequentialMode(OBJECT_FUNCTION);
/* remove the entries for the distributed objects on dropping */
ObjectAddress *address = NULL;
foreach_ptr(address, distributedFunctionAddresses)
{
UnmarkObjectDistributed(address);
}
/*
* Swap the list of objects before deparsing and restore the old list after. This
* ensures we only have distributed functions in the deparsed drop statement.
*/
DropStmt *stmtCopy = copyObject(stmt);
stmtCopy->objects = distributedObjectWithArgsList;
const char *dropStmtSql = DeparseTreeNode((Node *) stmtCopy);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) dropStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterFunctionDependsStmt is called during the planning phase of an
* ALTER FUNCTION ... DEPENDS ON EXTENSION ... statement. Since functions depending on
@ -1808,30 +1587,6 @@ AlterFunctionDependsStmtObjectAddress(Node *node, bool missing_ok)
}
/*
* PostprocessAlterFunctionSchemaStmt is executed after the change has been applied locally;
* we can now use the new dependencies of the function to ensure all its dependencies
* exist on the workers before we apply the commands remotely.
*/
List *
PostprocessAlterFunctionSchemaStmt(Node *node, const char *queryString)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
AssertObjectTypeIsFunctional(stmt->objectType);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateAlterFunction(&address))
{
return NIL;
}
/* dependencies have changed (schema) let's ensure they exist */
EnsureDependenciesExistOnAllNodes(&address);
return NIL;
}
/*
* AlterFunctionStmtObjectAddress returns the ObjectAddress of the subject in the
* AlterFunctionStmt. If missing_ok is set to false an error will be raised if postgres
@ -2177,3 +1932,162 @@ EnsureExtensionFunctionCanBeDistributed(const ObjectAddress functionAddress,
EnsureDependenciesExistOnAllNodes(&functionAddress);
}
/*
* PreprocessGrantOnFunctionStmt is executed before the statement is applied to the local
* postgres instance.
*
* In this stage we can prepare the commands that need to be run on all workers to grant
* on distributed functions, procedures, routines.
*/
List *
PreprocessGrantOnFunctionStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(isFunction(stmt->objtype));
List *distributedFunctions = FilterDistributedFunctions(stmt);
if (list_length(distributedFunctions) == 0 || !ShouldPropagate())
{
return NIL;
}
EnsureCoordinator();
List *grantFunctionList = NIL;
ObjectAddress *functionAddress = NULL;
foreach_ptr(functionAddress, distributedFunctions)
{
ObjectWithArgs *distFunction = ObjectWithArgsFromOid(
functionAddress->objectId);
grantFunctionList = lappend(grantFunctionList, distFunction);
}
List *originalObjects = stmt->objects;
GrantTargetType originalTargtype = stmt->targtype;
stmt->objects = grantFunctionList;
stmt->targtype = ACL_TARGET_OBJECT;
char *sql = DeparseTreeNode((Node *) stmt);
stmt->objects = originalObjects;
stmt->targtype = originalTargtype;
List *commandList = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commandList);
}
/*
* PostprocessGrantOnFunctionStmt makes sure dependencies of each
* distributed function in the statement exist on all nodes
*/
List *
PostprocessGrantOnFunctionStmt(Node *node, const char *queryString)
{
GrantStmt *stmt = castNode(GrantStmt, node);
List *distributedFunctions = FilterDistributedFunctions(stmt);
if (list_length(distributedFunctions) == 0)
{
return NIL;
}
ObjectAddress *functionAddress = NULL;
foreach_ptr(functionAddress, distributedFunctions)
{
EnsureDependenciesExistOnAllNodes(functionAddress);
}
return NIL;
}
/*
* FilterDistributedFunctions determines and returns a list of the ObjectAddresses of
* the distributed functions in the given grant statement.
*/
static List *
FilterDistributedFunctions(GrantStmt *grantStmt)
{
List *grantFunctionList = NIL;
bool grantOnFunctionCommand = (grantStmt->targtype == ACL_TARGET_OBJECT &&
isFunction(grantStmt->objtype));
bool grantAllFunctionsOnSchemaCommand = (grantStmt->targtype ==
ACL_TARGET_ALL_IN_SCHEMA &&
isFunction(grantStmt->objtype));
/* we are only interested in function/procedure/routine level grants */
if (!grantOnFunctionCommand && !grantAllFunctionsOnSchemaCommand)
{
return NIL;
}
if (grantAllFunctionsOnSchemaCommand)
{
List *distributedFunctionList = DistributedFunctionList();
ObjectAddress *distributedFunction = NULL;
List *namespaceOidList = NIL;
/* iterate over all namespace names provided to get their oid's */
Value *namespaceValue = NULL;
foreach_ptr(namespaceValue, grantStmt->objects)
{
char *nspname = strVal(namespaceValue);
bool missing_ok = false;
Oid namespaceOid = get_namespace_oid(nspname, missing_ok);
namespaceOidList = list_append_unique_oid(namespaceOidList, namespaceOid);
}
/*
* iterate over all distributed functions to filter the ones
* that belong to one of the namespaces from above
*/
foreach_ptr(distributedFunction, distributedFunctionList)
{
Oid namespaceOid = get_func_namespace(distributedFunction->objectId);
/*
* if this distributed function's schema is one of the schemas
* specified in the GRANT .. ALL FUNCTIONS IN SCHEMA ..
* add it to the list
*/
if (list_member_oid(namespaceOidList, namespaceOid))
{
grantFunctionList = lappend(grantFunctionList, distributedFunction);
}
}
}
else
{
bool missingOk = false;
ObjectWithArgs *objectWithArgs = NULL;
foreach_ptr(objectWithArgs, grantStmt->objects)
{
ObjectAddress *functionAddress = palloc0(sizeof(ObjectAddress));
functionAddress->classId = ProcedureRelationId;
functionAddress->objectId = LookupFuncWithArgs(grantStmt->objtype,
objectWithArgs,
missingOk);
functionAddress->objectSubId = 0;
/*
* if this function from GRANT .. ON FUNCTION .. is a distributed
* function, add it to the list
*/
if (IsObjectDistributed(functionAddress))
{
grantFunctionList = lappend(grantFunctionList, functionAddress);
}
}
}
return grantFunctionList;
}
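The two grant forms this filter accepts look as follows (invented names); the first maps to ACL_TARGET_OBJECT, the second to ACL_TARGET_ALL_IN_SCHEMA:
/*
 * GRANT EXECUTE ON FUNCTION my_schema.my_func(int) TO app_role;
 * GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA my_schema TO app_role;
 */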


@ -8,13 +8,244 @@
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/commands.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/metadata/distobject.h"
#include "distributed/metadata_cache.h"
#include "distributed/version_compat.h"
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"
#include "utils/lsyscache.h"
/* Local functions forward declarations for helper functions */
static List * CollectGrantTableIdList(GrantStmt *grantStmt);
/*
* PreprocessGrantStmt determines whether a given GRANT/REVOKE statement involves
* a distributed table. If so, it creates DDLJobs to encapsulate information
* needed during the worker node portion of DDL execution before returning the
* DDLJobs in a List. If no distributed table is involved, this returns NIL.
*
* NB: So far column level privileges are not supported.
*/
List *
PreprocessGrantStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
GrantStmt *grantStmt = castNode(GrantStmt, node);
StringInfoData privsString;
StringInfoData granteesString;
StringInfoData targetString;
StringInfoData ddlString;
ListCell *granteeCell = NULL;
ListCell *tableListCell = NULL;
bool isFirst = true;
List *ddlJobs = NIL;
initStringInfo(&privsString);
initStringInfo(&granteesString);
initStringInfo(&targetString);
initStringInfo(&ddlString);
/*
* So far only table level grants are supported. Most other types of
* grants aren't interesting anyway.
*/
if (grantStmt->objtype != OBJECT_TABLE)
{
return NIL;
}
List *tableIdList = CollectGrantTableIdList(grantStmt);
/* nothing to do if there is no distributed table in the grant list */
if (tableIdList == NIL)
{
return NIL;
}
/* deparse the privileges */
if (grantStmt->privileges == NIL)
{
appendStringInfo(&privsString, "ALL");
}
else
{
ListCell *privilegeCell = NULL;
isFirst = true;
foreach(privilegeCell, grantStmt->privileges)
{
AccessPriv *priv = lfirst(privilegeCell);
if (!isFirst)
{
appendStringInfoString(&privsString, ", ");
}
isFirst = false;
if (priv->cols != NIL)
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("grant/revoke on column list is currently "
"unsupported")));
}
Assert(priv->priv_name != NULL);
appendStringInfo(&privsString, "%s", priv->priv_name);
}
}
/* deparse the grantees */
isFirst = true;
foreach(granteeCell, grantStmt->grantees)
{
RoleSpec *spec = lfirst(granteeCell);
if (!isFirst)
{
appendStringInfoString(&granteesString, ", ");
}
isFirst = false;
appendStringInfoString(&granteesString, RoleSpecString(spec, true));
}
/*
* Deparse the target objects, and issue the deparsed statements to
* workers, if applicable. That's so we easily can replicate statements
* only to distributed relations.
*/
isFirst = true;
foreach(tableListCell, tableIdList)
{
Oid relationId = lfirst_oid(tableListCell);
const char *grantOption = "";
resetStringInfo(&targetString);
appendStringInfo(&targetString, "%s", generate_relation_name(relationId, NIL));
if (grantStmt->is_grant)
{
if (grantStmt->grant_option)
{
grantOption = " WITH GRANT OPTION";
}
appendStringInfo(&ddlString, "GRANT %s ON %s TO %s%s",
privsString.data, targetString.data, granteesString.data,
grantOption);
}
else
{
if (grantStmt->grant_option)
{
grantOption = "GRANT OPTION FOR ";
}
appendStringInfo(&ddlString, "REVOKE %s%s ON %s FROM %s",
grantOption, privsString.data, targetString.data,
granteesString.data);
}
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->metadataSyncCommand = pstrdup(ddlString.data);
ddlJob->taskList = NIL;
if (IsCitusTable(relationId))
{
ddlJob->taskList = DDLTaskList(relationId, ddlString.data);
}
ddlJobs = lappend(ddlJobs, ddlJob);
resetStringInfo(&ddlString);
}
return ddlJobs;
}
/*
* CollectGrantTableIdList determines and returns a list of distributed table
* Oids from the given grant statement.
* A grant statement may appear in two forms:
* 1 - grant on table:
* each distributed table oid in the grant object list is added to the returned list.
* 2 - grant all tables in schema:
* collect the namespace oid list from the grant statement, then add each
* distributed table oid in the target namespaces to the returned list.
*/
static List *
CollectGrantTableIdList(GrantStmt *grantStmt)
{
List *grantTableList = NIL;
bool grantOnTableCommand = (grantStmt->targtype == ACL_TARGET_OBJECT &&
grantStmt->objtype == OBJECT_TABLE);
bool grantAllTablesOnSchemaCommand = (grantStmt->targtype ==
ACL_TARGET_ALL_IN_SCHEMA &&
grantStmt->objtype == OBJECT_TABLE);
/* we are only interested in table level grants */
if (!grantOnTableCommand && !grantAllTablesOnSchemaCommand)
{
return NIL;
}
if (grantAllTablesOnSchemaCommand)
{
List *citusTableIdList = CitusTableTypeIdList(ANY_CITUS_TABLE_TYPE);
ListCell *citusTableIdCell = NULL;
List *namespaceOidList = NIL;
ListCell *objectCell = NULL;
foreach(objectCell, grantStmt->objects)
{
char *nspname = strVal(lfirst(objectCell));
bool missing_ok = false;
Oid namespaceOid = get_namespace_oid(nspname, missing_ok);
Assert(namespaceOid != InvalidOid);
namespaceOidList = list_append_unique_oid(namespaceOidList, namespaceOid);
}
foreach(citusTableIdCell, citusTableIdList)
{
Oid relationId = lfirst_oid(citusTableIdCell);
Oid namespaceOid = get_rel_namespace(relationId);
if (list_member_oid(namespaceOidList, namespaceOid))
{
grantTableList = lappend_oid(grantTableList, relationId);
}
}
}
else
{
ListCell *objectCell = NULL;
foreach(objectCell, grantStmt->objects)
{
RangeVar *relvar = (RangeVar *) lfirst(objectCell);
Oid relationId = RangeVarGetRelid(relvar, NoLock, false);
if (IsCitusTable(relationId))
{
grantTableList = lappend_oid(grantTableList, relationId);
continue;
}
/* check for distributed sequences included in GRANT ON TABLE statement */
ObjectAddress sequenceAddress = { 0 };
ObjectAddressSet(sequenceAddress, RelationRelationId, relationId);
if (IsObjectDistributed(&sequenceAddress))
{
grantTableList = lappend_oid(grantTableList, relationId);
}
}
}
return grantTableList;
}


@ -42,6 +42,7 @@
#include "lib/stringinfo.h"
#include "miscadmin.h"
#include "nodes/parsenodes.h"
#include "parser/parse_utilcmd.h"
#include "storage/lmgr.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
@ -184,9 +185,18 @@ PreprocessIndexStmt(Node *node, const char *createIndexCommand,
*/
ErrorIfCreateIndexHasTooManyColumns(createIndexStatement);
/*
* If there are expressions on the index, we should first transform
* the statement as the default index name depends on that. We do
* it on a copy not to interfere with standard process utility.
*/
IndexStmt *copyCreateIndexStatement =
transformIndexStmt(relation->rd_id, copyObject(createIndexStatement),
createIndexCommand);
/* ensure we copy string into proper context */
MemoryContext relationContext = GetMemoryChunkContext(relationRangeVar);
char *defaultIndexName = GenerateDefaultIndexName(copyCreateIndexStatement);
createIndexStatement->idxname = MemoryContextStrdup(relationContext,
defaultIndexName);
}
@ -464,7 +474,8 @@ GenerateCreateIndexDDLJob(IndexStmt *createIndexStatement, const char *createInd
{
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId,
CreateIndexStmtGetRelationId(createIndexStatement));
ddlJob->startNewTransaction = createIndexStatement->concurrent;
ddlJob->metadataSyncCommand = createIndexCommand;
ddlJob->taskList = CreateIndexTaskList(createIndexStatement);
@ -598,7 +609,7 @@ PreprocessReindexStmt(Node *node, const char *reindexCommand,
}
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->startNewTransaction = IsReindexWithParam_compat(reindexStatement,
"concurrently");
ddlJob->metadataSyncCommand = reindexCommand;
@ -695,7 +706,8 @@ PreprocessDropIndexStmt(Node *node, const char *dropIndexCommand,
MarkInvalidateForeignKeyGraph();
}
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId,
distributedRelationId);
/*
* We do not want DROP INDEX CONCURRENTLY to commit locally before
@ -1137,6 +1149,15 @@ ErrorIfUnsupportedIndexStmt(IndexStmt *createIndexStatement)
"is currently unsupported")));
}
if (AllowUnsafeConstraints)
{
/*
* The user explicitly wants to allow the constraint without
* distribution column.
*/
return;
}
Var *partitionKey = DistPartitionKeyOrError(relationId);
List *indexParameterList = createIndexStatement->indexParams;
IndexElem *indexElement = NULL;


@ -73,10 +73,12 @@
#include "distributed/commands/multi_copy.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/intermediate_results.h"
#include "distributed/listutils.h"
#include "distributed/local_executor.h"
#include "distributed/log_utils.h"
#include "distributed/coordinator_protocol.h"
#include "distributed/metadata_cache.h"
#include "distributed/multi_executor.h"
#include "distributed/multi_partitioning_utils.h"
#include "distributed/multi_physical_planner.h"
#include "distributed/multi_router_planner.h"
@ -102,6 +104,7 @@
#include "libpq/pqformat.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "parser/parse_func.h"
#include "parser/parse_type.h"
#if PG_VERSION_NUM >= PG_VERSION_13
#include "tcop/cmdtag.h"
@ -117,6 +120,9 @@
/* constant used in binary protocol */
static const char BinarySignature[11] = "PGCOPY\n\377\r\n\0";
/* if true, skip validation of JSONB columns during COPY */
bool SkipJsonbValidationInCopy = true;
/* custom Citus option for appending to a shard */
#define APPEND_TO_SHARD_OPTION "append_to_shard"
@ -242,6 +248,9 @@ typedef enum LocalCopyStatus
/* Local functions forward declarations */
static void CopyToExistingShards(CopyStmt *copyStatement,
QueryCompletionCompat *completionTag);
static bool IsCopyInBinaryFormat(CopyStmt *copyStatement);
static List * FindJsonbInputColumns(TupleDesc tupleDescriptor,
List *inputColumnNameList);
static List * RemoveOptionFromList(List *optionList, char *optionName);
static bool BinaryOutputFunctionDefined(Oid typeId);
static bool BinaryInputFunctionDefined(Oid typeId);
@ -452,6 +461,7 @@ CopyToExistingShards(CopyStmt *copyStatement, QueryCompletionCompat *completionT
List *columnNameList = NIL;
int partitionColumnIndex = INVALID_PARTITION_COLUMN_INDEX;
bool isInputFormatBinary = IsCopyInBinaryFormat(copyStatement);
uint64 processedRowCount = 0;
ErrorContextCallback errorCallback;
@ -543,6 +553,72 @@ CopyToExistingShards(CopyStmt *copyStatement, QueryCompletionCompat *completionT
copiedDistributedRelationTuple->relkind = RELKIND_RELATION;
}
/*
* We make an optimisation to skip JSON parsing for JSONB columns, because many
* Citus users have large objects in this column and parsing it on the coordinator
* causes significant CPU overhead. We do this by forcing BeginCopyFrom and
* NextCopyFrom to parse the column as text and then encoding it as JSON again
* by using citus_text_send_as_jsonb as the binary output function.
*
* The main downside of enabling this optimisation is that it defers validation
* until the object is parsed by the worker, which is unable to give an accurate
* line number.
*/
if (SkipJsonbValidationInCopy && !isInputFormatBinary)
{
CopyOutState copyOutState = copyDest->copyOutState;
ListCell *jsonbColumnIndexCell = NULL;
/* get the column indices for all JSONB columns that appear in the input */
List *jsonbColumnIndexList = FindJsonbInputColumns(
copiedDistributedRelation->rd_att,
copyStatement->attlist);
foreach(jsonbColumnIndexCell, jsonbColumnIndexList)
{
int jsonbColumnIndex = lfirst_int(jsonbColumnIndexCell);
Form_pg_attribute currentColumn =
TupleDescAttr(copiedDistributedRelation->rd_att, jsonbColumnIndex);
if (jsonbColumnIndex == partitionColumnIndex)
{
/*
* In the curious case of using a JSONB column as partition column,
* we leave it as is because we want to make sure the hashing works
* correctly.
*/
continue;
}
ereport(DEBUG1, (errmsg("parsing JSONB column %s as text",
NameStr(currentColumn->attname))));
/* parse the column as text instead of JSONB */
currentColumn->atttypid = TEXTOID;
if (copyOutState->binary)
{
Oid textSendAsJsonbFunctionId = CitusTextSendAsJsonbFunctionId();
/*
* If we're using binary encoding between coordinator and workers
* then we should honour the format expected by jsonb_recv, which
* is a version number followed by text. We therefore use an output
* function which sends the text as if it were jsonb, namely by
* prepending a version number.
*/
fmgr_info(textSendAsJsonbFunctionId,
&copyDest->columnOutputFunctions[jsonbColumnIndex]);
}
else
{
Oid textoutFunctionId = TextOutFunctionId();
fmgr_info(textoutFunctionId,
&copyDest->columnOutputFunctions[jsonbColumnIndex]);
}
}
}
/* initialize copy state to read from COPY data source */
CopyFromState copyState = BeginCopyFrom_compat(NULL,
copiedDistributedRelation,
@ -610,6 +686,82 @@ CopyToExistingShards(CopyStmt *copyStatement, QueryCompletionCompat *completionT
}
/*
* IsCopyInBinaryFormat determines whether the given COPY statement has the
* WITH (format binary) option.
*/
static bool
IsCopyInBinaryFormat(CopyStmt *copyStatement)
{
ListCell *optionCell = NULL;
foreach(optionCell, copyStatement->options)
{
DefElem *defel = lfirst_node(DefElem, optionCell);
if (strcmp(defel->defname, "format") == 0 &&
strcmp(defGetString(defel), "binary") == 0)
{
return true;
}
}
return false;
}
/*
* FindJsonbInputColumns finds columns in the tuple descriptor that have
* the JSONB type and appear in inputColumnNameList. If the list is empty then
* all JSONB columns are returned.
*/
static List *
FindJsonbInputColumns(TupleDesc tupleDescriptor, List *inputColumnNameList)
{
List *jsonbColumnIndexList = NIL;
int columnCount = tupleDescriptor->natts;
for (int columnIndex = 0; columnIndex < columnCount; columnIndex++)
{
Form_pg_attribute currentColumn = TupleDescAttr(tupleDescriptor, columnIndex);
if (currentColumn->attisdropped)
{
continue;
}
if (currentColumn->atttypid != JSONBOID)
{
continue;
}
if (inputColumnNameList != NIL)
{
ListCell *inputColumnCell = NULL;
bool isInputColumn = false;
foreach(inputColumnCell, inputColumnNameList)
{
char *inputColumnName = strVal(lfirst(inputColumnCell));
if (namestrcmp(&currentColumn->attname, inputColumnName) == 0)
{
isInputColumn = true;
break;
}
}
if (!isInputColumn)
{
continue;
}
}
jsonbColumnIndexList = lappend_int(jsonbColumnIndexList, columnIndex);
}
return jsonbColumnIndexList;
}
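A worked example for the function above, with an invented table events(id int, payload jsonb, extra jsonb):
/*
 * COPY events (id, payload) FROM STDIN
 *   -> returns a one-element list with the attribute index of payload;
 *      extra is jsonb but absent from the input column list, and id is not jsonb.
 */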
static void
CompleteCopyQueryTagCompat(QueryCompletionCompat *completionTag, uint64 processedRowCount)
{
@ -3430,10 +3582,7 @@ InitializeCopyShardState(CopyShardState *shardState,
ereport(ERROR, (errmsg("could not connect to any active placements")));
}
EnsureTaskExecutionAllowed(hasRemoteCopy);
/*
* We just error out and code execution should never reach to this


@ -12,112 +12,650 @@
#include "catalog/namespace.h"
#include "commands/policy.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/commands.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/coordinator_protocol.h"
#include "distributed/listutils.h"
#include "distributed/metadata_cache.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
#include "parser/parse_clause.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteManip.h"
#include "rewrite/rowsecurity.h"
#include "utils/builtins.h"
#include "utils/ruleutils.h"
static const char * unparse_policy_command(const char aclchar);
static void AddRangeTableEntryToQueryCompat(ParseState *parseState, Relation relation);
static RowSecurityPolicy * GetPolicyByName(Oid relationId, const char *policyName);
static List * GetPolicyListForRelation(Oid relationId);
static char * CreatePolicyCommandForPolicy(Oid relationId, RowSecurityPolicy *policy);
/*
* CreatePolicyCommands takes in a relationId, and returns the list of create policy
* commands needed to reconstruct the policies of that table.
*/
List *
CreatePolicyCommands(Oid relationId)
{
List *commands = NIL;
List *policyList = GetPolicyListForRelation(relationId);
RowSecurityPolicy *policy;
foreach_ptr(policy, policyList)
{
char *createPolicyCommand = CreatePolicyCommandForPolicy(relationId, policy);
commands = lappend(commands, makeTableDDLCommandString(createPolicyCommand));
}
return commands;
}
/*
* GetPolicyListForRelation returns a list of RowSecurityPolicy objects identifying
* the policies on the relation with relationId. Note that this function acquires
* AccessShareLock on the relation and does not release it at the end, to make sure
* that the caller processes valid policies for the rest of the transaction.
*/
static List *
GetPolicyListForRelation(Oid relationId)
{
Relation relation = table_open(relationId, AccessShareLock);
if (!relation_has_policies(relation))
{
table_close(relation, NoLock);
return NIL;
}
if (relation->rd_rsdesc == NULL)
{
/*
* there are policies, but since RLS is not enabled they are not loaded into
* cache, we will do so here for us to access
*/
RelationBuildRowSecurity(relation);
}
List *policyList = NIL;
RowSecurityPolicy *policy;
foreach_ptr(policy, relation->rd_rsdesc->policies)
{
policyList = lappend(policyList, policy);
}
table_close(relation, NoLock);
return policyList;
}
/*
* CreatePolicyCommandForPolicy takes a relationId and a policy, and returns
* the CREATE POLICY command needed to reconstruct the policy identified
* by the "policy" object on the relation with relationId.
*/
static char *
CreatePolicyCommandForPolicy(Oid relationId, RowSecurityPolicy *policy)
{
char *relationName = generate_qualified_relation_name(relationId);
List *relationContext = deparse_context_for(relationName, relationId);
StringInfo createPolicyCommand = makeStringInfo();
appendStringInfo(createPolicyCommand, "CREATE POLICY %s ON %s FOR %s",
quote_identifier(policy->policy_name),
relationName,
unparse_policy_command(policy->polcmd));
appendStringInfoString(createPolicyCommand, " TO ");
/*
* iterate over all roles and append them to the ddl command with commas
* separating the role names
*/
Oid *roles = (Oid *) ARR_DATA_PTR(policy->roles);
for (int roleIndex = 0; roleIndex < ARR_DIMS(policy->roles)[0]; roleIndex++)
{
const char *roleName;
if (roleIndex > 0)
{
appendStringInfoString(createPolicyCommand, ", ");
}
if (roles[roleIndex] == ACL_ID_PUBLIC)
{
roleName = "PUBLIC";
}
else
{
roleName = quote_identifier(GetUserNameFromId(roles[roleIndex], false));
}
appendStringInfoString(createPolicyCommand, roleName);
}
if (policy->qual)
{
char *qualString = deparse_expression((Node *) (policy->qual),
relationContext, false, false);
appendStringInfo(createPolicyCommand, " USING (%s)", qualString);
}
if (policy->with_check_qual)
{
char *withCheckQualString = deparse_expression(
(Node *) (policy->with_check_qual), relationContext, false, false);
appendStringInfo(createPolicyCommand, " WITH CHECK (%s)",
withCheckQualString);
}
return createPolicyCommand->data;
}
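Put together, for an invented policy p_user on table public.events, the function above produces a command of this shape:
/*
 * CREATE POLICY p_user ON public.events FOR SELECT TO PUBLIC
 *     USING ((owner = (CURRENT_USER)::text))
 */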
/*
* unparse_policy_command takes the type of a policy command and converts it to its full
* command string. This function is the exact inverse of parse_policy_command that is in
* postgres.
*/
static const char *
unparse_policy_command(const char aclchar)
{
switch (aclchar)
{
case '*':
{
return "ALL";
}
case ACL_SELECT_CHR:
{
return "SELECT";
}
case ACL_INSERT_CHR:
{
return "INSERT";
}
case ACL_UPDATE_CHR:
{
return "UPDATE";
}
case ACL_DELETE_CHR:
{
return "DELETE";
}
default:
{
elog(ERROR, "unrecognized aclchar: %d", aclchar);
return NULL;
}
}
}
/*
* PostprocessCreatePolicyStmt determines whether a CREATE POLICY statement involves
* a distributed table. If so, it creates DDLJobs to encapsulate information
* needed during the worker node portion of DDL execution before returning the
* DDLJobs in a List. If no distributed table is involved, this returns NIL.
*/
List *
PostprocessCreatePolicyStmt(Node *node, const char *queryString)
{
CreatePolicyStmt *stmt = castNode(CreatePolicyStmt, node);
/* load relation information */
RangeVar *relvar = stmt->table;
Oid relationId = RangeVarGetRelid(relvar, NoLock, false);
if (!IsCitusTable(relationId))
{
return NIL;
}
Relation relation = table_open(relationId, AccessShareLock);
ParseState *qual_pstate = make_parsestate(NULL);
AddRangeTableEntryToQueryCompat(qual_pstate, relation);
Node *qual = transformWhereClause(qual_pstate,
copyObject(stmt->qual),
EXPR_KIND_POLICY,
"POLICY");
if (qual)
{
ErrorIfUnsupportedPolicyExpr(qual);
}
ParseState *with_check_pstate = make_parsestate(NULL);
AddRangeTableEntryToQueryCompat(with_check_pstate, relation);
Node *with_check_qual = transformWhereClause(with_check_pstate,
copyObject(stmt->with_check),
EXPR_KIND_POLICY,
"POLICY");
if (with_check_qual)
{
ErrorIfUnsupportedPolicyExpr(with_check_qual);
}
RowSecurityPolicy *policy = GetPolicyByName(relationId, stmt->policy_name);
if (policy == NULL)
{
/*
* As this function is executed after standard process utility created the
* policy, we should be able to find & deparse the policy with policy_name.
* But to be safe, we error out here.
*/
ereport(ERROR, (errmsg("cannot create policy, policy does not exist.")));
}
EnsureCoordinator();
char *ddlCommand = CreatePolicyCommandForPolicy(relationId, policy);
/*
* create the DDLJob that needs to be executed both on the local relation and all its
* placements.
*/
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->metadataSyncCommand = pstrdup(ddlCommand);
ddlJob->taskList = DDLTaskList(relationId, ddlCommand);
relation_close(relation, NoLock);
return list_make1(ddlJob);
}
/*
* AddRangeTableEntryToQueryCompat adds the given relation to query.
* This method is a compatibility wrapper.
*/
static void
AddRangeTableEntryToQueryCompat(ParseState *parseState, Relation relation)
{
#if PG_VERSION_NUM >= PG_VERSION_13
ParseNamespaceItem *rte = NULL;
#else
RangeTblEntry *rte = NULL;
#endif
rte = addRangeTableEntryForRelation(parseState, relation,
#if PG_VERSION_NUM >= PG_VERSION_12
AccessShareLock,
#endif
NULL, false, false);
#if PG_VERSION_NUM >= PG_VERSION_13
addNSItemToQuery(parseState, rte, false, true, true);
#else
addRTEtoQuery(parseState, rte, false, true, true);
#endif
}
/*
* GetPolicyByName takes a relationId and a policyName, and returns the RowSecurityPolicy
* object that identifies the policy with name "policyName" on the relation
* with relationId. If no such policy exists, this function
* returns NULL.
*/
static RowSecurityPolicy *
GetPolicyByName(Oid relationId, const char *policyName)
{
List *policyList = GetPolicyListForRelation(relationId);
RowSecurityPolicy *policy = NULL;
foreach_ptr(policy, policyList)
{
if (strncmp(policy->policy_name, policyName, NAMEDATALEN) == 0)
{
return policy;
}
}
return NULL;
}
/*
* PreprocessAlterPolicyStmt determines whether a given ALTER POLICY statement involves a
* distributed table. If so, it creates DDLJobs to encapsulate information needed during
* the worker node portion of DDL execution before returning the DDLJobs in a list. If no
* distributed table is involved this returns NIL.
*/
List *
PreprocessAlterPolicyStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterPolicyStmt *stmt = castNode(AlterPolicyStmt, node);
StringInfoData ddlString;
ListCell *roleCell = NULL;
/* load relation information */
RangeVar *relvar = stmt->table;
Oid relOid = RangeVarGetRelid(relvar, NoLock, false);
if (!IsCitusTable(relOid))
{
return NIL;
}
initStringInfo(&ddlString);
Relation relation = relation_open(relOid, AccessShareLock);
char *relationName = generate_relation_name(relOid, NIL);
appendStringInfo(&ddlString, "ALTER POLICY %s ON %s",
quote_identifier(stmt->policy_name),
relationName
);
if (stmt->roles)
{
appendStringInfoString(&ddlString, " TO ");
foreach(roleCell, stmt->roles)
{
RoleSpec *roleSpec = (RoleSpec *) lfirst(roleCell);
appendStringInfoString(&ddlString, RoleSpecString(roleSpec, true));
if (lnext_compat(stmt->roles, roleCell) != NULL)
{
appendStringInfoString(&ddlString, ", ");
}
}
}
List *relationContext = deparse_context_for(relationName, relOid);
ParseState *qual_pstate = make_parsestate(NULL);
AddRangeTableEntryToQueryCompat(qual_pstate, relation);
Node *qual = transformWhereClause(qual_pstate,
copyObject(stmt->qual),
EXPR_KIND_POLICY,
"POLICY");
if (qual)
{
ErrorIfUnsupportedPolicyExpr(qual);
char *qualString = deparse_expression(qual, relationContext, false, false);
appendStringInfo(&ddlString, " USING (%s)", qualString);
}
ParseState *with_check_pstate = make_parsestate(NULL);
AddRangeTableEntryToQueryCompat(with_check_pstate, relation);
Node *with_check_qual = transformWhereClause(with_check_pstate,
copyObject(stmt->with_check),
EXPR_KIND_POLICY,
"POLICY");
if (with_check_qual)
{
ErrorIfUnsupportedPolicyExpr(with_check_qual);
char *withCheckString = deparse_expression(with_check_qual, relationContext,
false,
false);
appendStringInfo(&ddlString, " WITH CHECK (%s)", withCheckString);
}
/*
* create the DDLJob that needs to be executed both on the local relation and all its
* placements.
*/
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relOid);
ddlJob->metadataSyncCommand = pstrdup(ddlString.data);
ddlJob->taskList = DDLTaskList(relOid, ddlString.data);
relation_close(relation, NoLock);
return list_make1(ddlJob);
}
/*
* ErrorIfUnsupportedPolicy runs checks on a Relation and its Policies, and errors
* out if one of the policies cannot be created in a distributed environment.
*
* To support policies we require that:
* - Policy expressions do not contain subqueries.
*/
void
ErrorIfUnsupportedPolicy(Relation relation)
{
ListCell *policyCell = NULL;
if (!relation_has_policies(relation))
{
return;
}
/*
* Even if a relation has policies, they might not be loaded on the Relation
* yet. This happens if policies exist on a Relation without Row Level Security
* enabled. We need to make sure the installed policies are valid for
* distribution if RLS gets enabled after the table has been distributed.
* Therefore we force a build of the policies on the cached Relation.
*/
if (relation->rd_rsdesc == NULL)
{
RelationBuildRowSecurity(relation);
}
foreach(policyCell, relation->rd_rsdesc->policies)
{
RowSecurityPolicy *policy = (RowSecurityPolicy *) lfirst(policyCell);
ErrorIfUnsupportedPolicyExpr((Node *) policy->qual);
ErrorIfUnsupportedPolicyExpr((Node *) policy->with_check_qual);
}
}
/*
* ErrorIfUnsupportedPolicyExpr tests if the provided expression for a policy is
* supported on a distributed table.
*/
void
ErrorIfUnsupportedPolicyExpr(Node *expr)
{
/*
* We do not allow any SubLink, to prevent expressions with subqueries from
* being used in policies on distributed tables.
*/
if (checkExprHasSubLink(expr))
{
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot create policy"),
errdetail("Subqueries are not supported in policies on distributed "
"tables")));
}
}
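/*
 * Editorial example (hypothetical relation names): the SubLink check above
 * rejects
 *
 *   CREATE POLICY p ON dist_orders
 *       USING (owner_id IN (SELECT user_id FROM allowed_users));
 *
 * while a subquery-free qual such as USING (owner = current_user) passes,
 * since checkExprHasSubLink() finds no SubLink node in it.
 */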
/*
* PreprocessDropPolicyStmt determines whether a given DROP POLICY statement involves a
* distributed table. If so it creates DDLJobs to encapsulate information needed during
* the worker node portion of DDL execution before returning the DDLJobs in a List. If no
* distributed table is involved this returns NIL.
*/
List *
PreprocessDropPolicyStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
List *ddlJobs = NIL;
ListCell *dropObjectCell = NULL;
Assert(stmt->removeType == OBJECT_POLICY);
foreach(dropObjectCell, stmt->objects)
{
List *names = (List *) lfirst(dropObjectCell);
/*
* The last element in the list of names is the name of the policy; the ones
* before describe the relation. By removing the last item from the list we
* can use makeRangeVarFromNameList to get to the relation. As list_truncate
* changes the list in place, we make a copy beforehand.
*/
names = list_copy(names);
names = list_truncate(names, list_length(names) - 1);
RangeVar *relation = makeRangeVarFromNameList(names);
Oid relOid = RangeVarGetRelid(relation, NoLock, false);
if (!IsCitusTable(relOid))
{
continue;
}
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relOid);
ddlJob->metadataSyncCommand = queryString;
ddlJob->taskList = DDLTaskList(relOid, queryString);
ddlJobs = lappend(ddlJobs, ddlJob);
}
return ddlJobs;
}
/*
* IsPolicyRenameStmt returns whether the passed-in RenameStmt is one of the following
* forms:
*
* - ALTER POLICY ... ON ... RENAME TO ...
*/
bool
IsPolicyRenameStmt(RenameStmt *stmt)
{
return stmt->renameType == OBJECT_POLICY;
}
/*
* CreatePolicyEventExtendNames extends relation names in the given CreatePolicyStmt
* tree. This function has side effects on the tree as the names are replaced in place.
*/
void
CreatePolicyEventExtendNames(CreatePolicyStmt *stmt, const char *schemaName, uint64
shardId)
{
RangeVar *relation = stmt->table;
char **relationName = &(relation->relname);
char **relationSchemaName = &(relation->schemaname);
/* prefix with schema name if it is not added already */
SetSchemaNameIfNotExist(relationSchemaName, schemaName);
AppendShardIdToName(relationName, shardId);
}
/*
* AlterPolicyEventExtendNames extends relation names in the given AlterPolicyStmt
* tree. This function has side effects on the tree as the names are replaced in place.
*/
void
AlterPolicyEventExtendNames(AlterPolicyStmt *stmt, const char *schemaName, uint64 shardId)
{
RangeVar *relation = stmt->table;
char **relationName = &(relation->relname);
char **relationSchemaName = &(relation->schemaname);
/* prefix with schema name if it is not added already */
SetSchemaNameIfNotExist(relationSchemaName, schemaName);
AppendShardIdToName(relationName, shardId);
}
/*
* RenamePolicyEventExtendNames extends relation names in the given RenameStmt
* tree. This function has side effects on the tree as the names are replaced
* in place.
*/
void
RenamePolicyEventExtendNames(RenameStmt *stmt, const char *schemaName, uint64 shardId)
{
char **relationName = &(stmt->relation->relname);
char **objectSchemaName = &(stmt->relation->schemaname);
/* prefix with schema name if it is not added already */
SetSchemaNameIfNotExist(objectSchemaName, schemaName);
AppendShardIdToName(relationName, shardId);
}
/*
* DropPolicyEventExtendNames extends relation names in the given DropStmt tree
* specific to policies. This function has side effects on the tree as the names
* are replaced in place.
*/
void
DropPolicyEventExtendNames(DropStmt *dropStmt, const char *schemaName, uint64 shardId)
{
Value *relationSchemaNameValue = NULL;
Value *relationNameValue = NULL;
uint32 dropCount = list_length(dropStmt->objects);
if (dropCount > 1)
{
ereport(ERROR, (errmsg("cannot extend name for multiple drop objects")));
}
List *relationNameList = (List *) linitial(dropStmt->objects);
int relationNameListLength = list_length(relationNameList);
switch (relationNameListLength)
{
case 2:
{
relationNameValue = linitial(relationNameList);
break;
}
case 3:
{
relationSchemaNameValue = linitial(relationNameList);
relationNameValue = lsecond(relationNameList);
break;
}
default:
{
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("improper policy name: \"%s\"",
NameListToString(relationNameList))));
break;
}
}
/* prefix with schema name if it is not added already */
if (relationSchemaNameValue == NULL)
{
Value *schemaNameValue = makeString(pstrdup(schemaName));
relationNameList = lcons(schemaNameValue, relationNameList);
}
char **relationName = &(relationNameValue->val.str);
AppendShardIdToName(relationName, shardId);
}
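/*
 * Editorial sketch (hypothetical helper, not part of the source): the
 * *EventExtendNames functions above all reduce to the same two steps on a
 * RangeVar. For relname "orders", schemaName "public" and shardId 102008 the
 * resulting target is public.orders_102008.
 */
static void
ExtendRangeVarToShardSketch(RangeVar *relation, const char *schemaName,
							uint64 shardId)
{
	char **relationName = &(relation->relname);
	char **relationSchemaName = &(relation->schemaname);

	/* prefix with the schema name if it is not set already */
	SetSchemaNameIfNotExist(relationSchemaName, schemaName);

	/* orders -> orders_102008 */
	AppendShardIdToName(relationName, shardId);
}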

View File

@ -36,11 +36,12 @@ PreprocessRenameStmt(Node *node, const char *renameCommand,
/*
* We only support some of the PostgreSQL supported RENAME statements, and
* our list includes only renaming table, index, policy and view (related) objects.
*/
if (!IsAlterTableRenameStmt(renameStmt) &&
!IsIndexRenameStmt(renameStmt) &&
!IsPolicyRenameStmt(renameStmt) &&
!IsViewRenameStmt(renameStmt))
{
return NIL;
}
@ -48,7 +49,7 @@ PreprocessRenameStmt(Node *node, const char *renameCommand,
/*
* The lock levels here should be same as the ones taken in
* RenameRelation(), renameatt() and RenameConstraint(). However, since all
* four statements have identical lock levels, we just use a single statement.
*/
objectRelationId = RangeVarGetRelid(renameStmt->relation,
AccessExclusiveLock,
@ -63,14 +64,31 @@ PreprocessRenameStmt(Node *node, const char *renameCommand,
return NIL;
}
/*
* Check whether we are dealing with a sequence or view here and route queries
* accordingly to the right processor function. We need to check both objects here
* since PG supports targeting sequences and views with ALTER TABLE commands.
*/
char relKind = get_rel_relkind(objectRelationId);
if (relKind == RELKIND_SEQUENCE)
{
RenameStmt *stmtCopy = copyObject(renameStmt);
stmtCopy->renameType = OBJECT_SEQUENCE;
return PreprocessRenameSequenceStmt((Node *) stmtCopy, renameCommand,
processUtilityContext);
}
else if (relKind == RELKIND_VIEW)
{
RenameStmt *stmtCopy = copyObject(renameStmt);
stmtCopy->relationType = OBJECT_VIEW;
if (stmtCopy->renameType == OBJECT_TABLE)
{
stmtCopy->renameType = OBJECT_VIEW;
}
return PreprocessRenameViewStmt((Node *) stmtCopy, renameCommand,
processUtilityContext);
}
/* we have no planning to do unless the table is distributed */
switch (renameStmt->renameType)
@ -127,7 +145,7 @@ PreprocessRenameStmt(Node *node, const char *renameCommand,
}
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, tableRelationId);
ddlJob->metadataSyncCommand = renameCommand;
ddlJob->taskList = DDLTaskList(tableRelationId, renameCommand);

View File

@ -14,7 +14,9 @@
#include "access/heapam.h"
#include "access/htup_details.h"
#include "access/genam.h"
#include "access/table.h"
#include "access/xact.h"
#include "catalog/catalog.h"
#include "catalog/pg_auth_members.h"
#include "catalog/pg_authid.h"
@ -31,6 +33,9 @@
#include "distributed/coordinator_protocol.h"
#include "distributed/metadata/distobject.h"
#include "distributed/metadata_sync.h"
#include "distributed/metadata/distobject.h"
#include "distributed/multi_executor.h"
#include "distributed/relation_access_tracking.h"
#include "distributed/version_compat.h"
#include "distributed/worker_transaction.h"
#include "miscadmin.h"
@ -40,6 +45,7 @@
#include "parser/scansup.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/guc_tables.h"
#include "utils/guc.h"
#include "utils/rel.h"
@ -54,6 +60,9 @@ static char * CreateCreateOrAlterRoleCommand(const char *roleName,
AlterRoleStmt *alterRoleStmt);
static DefElem * makeDefElemInt(char *name, int value);
static List * GenerateRoleOptionsList(HeapTuple tuple);
static List * GenerateGrantRoleStmtsFromOptions(RoleSpec *roleSpec, List *options);
static List * GenerateGrantRoleStmtsOfRole(Oid roleid);
static void EnsureSequentialModeForRoleDDL(void);
static char * GetRoleNameFromDbRoleSetting(HeapTuple tuple,
TupleDesc DbRoleSettingDescription);
@ -68,6 +77,7 @@ static int ConfigGenericNameCompare(const void *lhs, const void *rhs);
static ObjectAddress RoleSpecToObjectAddress(RoleSpec *role, bool missing_ok);
/* controlled via GUC */
bool EnableCreateRolePropagation = true;
bool EnableAlterRolePropagation = true;
bool EnableAlterRoleSetPropagation = true;
@ -133,11 +143,13 @@ PostprocessAlterRoleStmt(Node *node, const char *queryString)
return NIL;
}
if (!EnableAlterRolePropagation)
{
return NIL;
}
EnsureCoordinator();
AlterRoleStmt *stmt = castNode(AlterRoleStmt, node);
DefElem *option = NULL;
@ -161,7 +173,9 @@ PostprocessAlterRoleStmt(Node *node, const char *queryString)
break;
}
}
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) CreateAlterRoleIfExistsCommand(stmt),
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
@ -206,14 +220,7 @@ PreprocessAlterRoleSetStmt(Node *node, const char *queryString,
return NIL;
}
EnsureCoordinator();
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
@ -338,15 +345,22 @@ ExtractEncryptedPassword(Oid roleOid)
Datum passwordDatum = heap_getattr(tuple, Anum_pg_authid_rolpassword,
pgAuthIdDescription, &isNull);
/*
* In PG, an empty password is treated the same as NULL.
* So we propagate NULL password to the other nodes, even if
* the user supplied an empty password
*/
char *passwordCstring = NULL;
if (!isNull)
{
passwordCstring = pstrdup(TextDatumGetCString(passwordDatum));
}
table_close(pgAuthId, AccessShareLock);
ReleaseSysCache(tuple);
return passwordCstring;
}
@ -492,6 +506,14 @@ GenerateCreateOrAlterRoleCommand(Oid roleOid)
Form_pg_authid role = ((Form_pg_authid) GETSTRUCT(roleTuple));
CreateRoleStmt *createRoleStmt = NULL;
if (EnableCreateRolePropagation)
{
createRoleStmt = makeNode(CreateRoleStmt);
createRoleStmt->stmt_type = ROLESTMT_ROLE;
createRoleStmt->role = pstrdup(NameStr(role->rolname));
createRoleStmt->options = GenerateRoleOptionsList(roleTuple);
}
AlterRoleStmt *alterRoleStmt = NULL;
if (EnableAlterRolePropagation)
{
@ -525,6 +547,16 @@ GenerateCreateOrAlterRoleCommand(Oid roleOid)
completeRoleList = list_concat(completeRoleList, alterRoleSetCommands);
}
if (EnableCreateRolePropagation)
{
List *grantRoleStmts = GenerateGrantRoleStmtsOfRole(roleOid);
Node *stmt = NULL;
foreach_ptr(stmt, grantRoleStmts)
{
completeRoleList = lappend(completeRoleList, DeparseTreeNode(stmt));
}
}
return completeRoleList;
}
@ -731,6 +763,157 @@ MakeSetStatementArguments(char *configurationName, char *configurationValue)
}
/*
* GenerateGrantRoleStmtsFromOptions gets a RoleSpec of a role that is being
* created and a list of options of CreateRoleStmt to generate GrantRoleStmts
* for the role's memberships.
*/
static List *
GenerateGrantRoleStmtsFromOptions(RoleSpec *roleSpec, List *options)
{
List *stmts = NIL;
DefElem *option = NULL;
foreach_ptr(option, options)
{
if (strcmp(option->defname, "adminmembers") != 0 &&
strcmp(option->defname, "rolemembers") != 0 &&
strcmp(option->defname, "addroleto") != 0)
{
continue;
}
GrantRoleStmt *grantRoleStmt = makeNode(GrantRoleStmt);
grantRoleStmt->is_grant = true;
if (strcmp(option->defname, "adminmembers") == 0 || strcmp(option->defname,
"rolemembers") == 0)
{
grantRoleStmt->granted_roles = list_make1(roleSpec);
grantRoleStmt->grantee_roles = (List *) option->arg;
}
else
{
grantRoleStmt->granted_roles = (List *) option->arg;
grantRoleStmt->grantee_roles = list_make1(roleSpec);
}
if (strcmp(option->defname, "adminmembers") == 0)
{
grantRoleStmt->admin_opt = true;
}
stmts = lappend(stmts, grantRoleStmt);
}
return stmts;
}
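/*
 * Editorial example (hypothetical role names): for
 *
 *   CREATE ROLE app_user IN ROLE readers ROLE writer_a ADMIN admin_a;
 *
 * the loop above produces three GrantRoleStmts that deparse roughly to
 *
 *   GRANT readers TO app_user;                    -- from "addroleto"
 *   GRANT app_user TO writer_a;                   -- from "rolemembers"
 *   GRANT app_user TO admin_a WITH ADMIN OPTION;  -- from "adminmembers"
 */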
/*
* GenerateGrantRoleStmtsOfRole generates the GrantRoleStmts for the memberships
* of the role whose oid is roleid.
*/
static List *
GenerateGrantRoleStmtsOfRole(Oid roleid)
{
Relation pgAuthMembers = table_open(AuthMemRelationId, AccessShareLock);
HeapTuple tuple = NULL;
List *stmts = NIL;
ScanKeyData skey[1];
ScanKeyInit(&skey[0], Anum_pg_auth_members_member, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(roleid));
SysScanDesc scan = systable_beginscan(pgAuthMembers, AuthMemMemRoleIndexId, true,
NULL, 1, &skey[0]);
while (HeapTupleIsValid(tuple = systable_getnext(scan)))
{
Form_pg_auth_members membership = (Form_pg_auth_members) GETSTRUCT(tuple);
GrantRoleStmt *grantRoleStmt = makeNode(GrantRoleStmt);
grantRoleStmt->is_grant = true;
RoleSpec *grantedRole = makeNode(RoleSpec);
grantedRole->roletype = ROLESPEC_CSTRING;
grantedRole->location = -1;
grantedRole->rolename = GetUserNameFromId(membership->roleid, true);
grantRoleStmt->granted_roles = list_make1(grantedRole);
RoleSpec *granteeRole = makeNode(RoleSpec);
granteeRole->roletype = ROLESPEC_CSTRING;
granteeRole->location = -1;
granteeRole->rolename = GetUserNameFromId(membership->member, true);
grantRoleStmt->grantee_roles = list_make1(granteeRole);
grantRoleStmt->grantor = NULL;
grantRoleStmt->admin_opt = membership->admin_option;
stmts = lappend(stmts, grantRoleStmt);
}
systable_endscan(scan);
table_close(pgAuthMembers, AccessShareLock);
return stmts;
}
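/*
 * Editorial sketch (hypothetical helper, not part of the source): callers such
 * as GenerateCreateOrAlterRoleCommand turn these GrantRoleStmt nodes into SQL
 * with DeparseTreeNode, e.g. "GRANT readers TO app_user WITH ADMIN OPTION" for
 * a pg_auth_members row with admin_option set.
 */
static List *
DeparseGrantRoleStmtsOfRoleSketch(Oid roleid)
{
	List *commands = NIL;
	List *grantRoleStmts = GenerateGrantRoleStmtsOfRole(roleid);

	Node *stmt = NULL;
	foreach_ptr(stmt, grantRoleStmts)
	{
		commands = lappend(commands, DeparseTreeNode(stmt));
	}

	return commands;
}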
/*
* PreprocessCreateRoleStmt creates a worker_create_or_alter_role query for the
* role that is being created. With that query we can create the role on the
* workers, or, if it already exists there, alter it to match the role being
* created right now.
*/
List *
PreprocessCreateRoleStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
if (!EnableCreateRolePropagation || !ShouldPropagate())
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialModeForRoleDDL();
LockRelationOid(DistNodeRelationId(), RowShareLock);
CreateRoleStmt *createRoleStmt = castNode(CreateRoleStmt, node);
AlterRoleStmt *alterRoleStmt = makeNode(AlterRoleStmt);
alterRoleStmt->role = makeNode(RoleSpec);
alterRoleStmt->role->roletype = ROLESPEC_CSTRING;
alterRoleStmt->role->location = -1;
alterRoleStmt->role->rolename = pstrdup(createRoleStmt->role);
alterRoleStmt->action = 1;
alterRoleStmt->options = createRoleStmt->options;
List *grantRoleStmts = GenerateGrantRoleStmtsFromOptions(alterRoleStmt->role,
createRoleStmt->options);
char *createOrAlterRoleQuery = CreateCreateOrAlterRoleCommand(createRoleStmt->role,
createRoleStmt,
alterRoleStmt);
List *commands = NIL;
commands = lappend(commands, DISABLE_DDL_PROPAGATION);
commands = lappend(commands, createOrAlterRoleQuery);
/* deparse all grant statements and add them to the to commands list */
Node *stmt = NULL;
foreach_ptr(stmt, grantRoleStmts)
{
commands = lappend(commands, DeparseTreeNode(stmt));
}
commands = lappend(commands, ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* makeStringConst creates a Const Node that stores a given string
*
@ -785,6 +968,178 @@ makeFloatConst(char *str, int location)
}
/*
* PreprocessDropRoleStmt finds the distributed roles among the ones being
* dropped, unmarks them as distributed, and creates the drop statements
* for the workers.
*/
List *
PreprocessDropRoleStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropRoleStmt *stmt = castNode(DropRoleStmt, node);
List *allDropRoles = stmt->roles;
List *distributedDropRoles = FilterDistributedRoles(allDropRoles);
if (list_length(distributedDropRoles) <= 0)
{
return NIL;
}
if (!EnableCreateRolePropagation || !ShouldPropagate())
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialModeForRoleDDL();
stmt->roles = distributedDropRoles;
char *sql = DeparseTreeNode((Node *) stmt);
stmt->roles = allDropRoles;
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* UnmarkRolesDistributed unmarks the roles in the RoleSpec list as distributed.
*/
void
UnmarkRolesDistributed(List *roles)
{
Node *roleNode = NULL;
foreach_ptr(roleNode, roles)
{
RoleSpec *role = castNode(RoleSpec, roleNode);
ObjectAddress roleAddress = { 0 };
Oid roleOid = get_rolespec_oid(role, true);
if (roleOid == InvalidOid)
{
/*
* If the role was dropped concurrently, we might get an invalid oid for it.
* If the oid is invalid, skip the role.
*/
continue;
}
ObjectAddressSet(roleAddress, AuthIdRelationId, roleOid);
UnmarkObjectDistributed(&roleAddress);
}
}
/*
* FilterDistributedRoles filters the list of RoleSpecs and returns the ones
* that are distributed.
*/
List *
FilterDistributedRoles(List *roles)
{
List *distributedRoles = NIL;
Node *roleNode = NULL;
foreach_ptr(roleNode, roles)
{
RoleSpec *role = castNode(RoleSpec, roleNode);
ObjectAddress roleAddress = { 0 };
Oid roleOid = get_rolespec_oid(role, true);
if (roleOid == InvalidOid)
{
/*
* Non-existent roles are silently ignored here; Postgres itself
* decides whether to raise an error for them.
*/
continue;
}
ObjectAddressSet(roleAddress, AuthIdRelationId, roleOid);
if (IsObjectDistributed(&roleAddress))
{
distributedRoles = lappend(distributedRoles, role);
}
}
return distributedRoles;
}
/*
* PreprocessGrantRoleStmt finds the distributed grantee roles and creates the
* query to run on the workers.
*/
List *
PreprocessGrantRoleStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
if (!EnableCreateRolePropagation || !ShouldPropagate())
{
return NIL;
}
EnsureCoordinator();
GrantRoleStmt *stmt = castNode(GrantRoleStmt, node);
List *allGranteeRoles = stmt->grantee_roles;
RoleSpec *grantor = stmt->grantor;
List *distributedGranteeRoles = FilterDistributedRoles(allGranteeRoles);
if (list_length(distributedGranteeRoles) <= 0)
{
return NIL;
}
/*
* Postgres doesn't seem to use the grantor. Even dropping the grantor doesn't
* seem to affect the membership. If this changes, we might need to add grantors
* to the dependency resolution too. For now we just don't propagate it.
*/
stmt->grantor = NULL;
stmt->grantee_roles = distributedGranteeRoles;
char *sql = DeparseTreeNode((Node *) stmt);
stmt->grantee_roles = allGranteeRoles;
stmt->grantor = grantor;
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessGrantRoleStmt actually creates the plan we need to execute for grant
* role statement.
*/
List *
PostprocessGrantRoleStmt(Node *node, const char *queryString)
{
if (!EnableCreateRolePropagation || !IsCoordinator() || !ShouldPropagate())
{
return NIL;
}
GrantRoleStmt *stmt = castNode(GrantRoleStmt, node);
RoleSpec *role = NULL;
foreach_ptr(role, stmt->grantee_roles)
{
ObjectAddress roleAddress = { 0 };
Oid roleOid = get_rolespec_oid(role, false);
ObjectAddressSet(roleAddress, AuthIdRelationId, roleOid);
if (IsObjectDistributed(&roleAddress))
{
EnsureDependenciesExistOnAllNodes(&roleAddress);
}
}
return NIL;
}
/*
* ConfigGenericNameCompare compares two config_generic structs based on their
* name fields. If the name fields contain the same strings two structs are
@ -805,3 +1160,64 @@ ConfigGenericNameCompare(const void *a, const void *b)
*/
return pg_strcasecmp(confa->name, confb->name);
}
/*
* CreateRoleStmtObjectAddress finds the ObjectAddress for the role described
* by the CreateRoleStmt. If missing_ok is false this function throws an error if the
* role does not exist.
*
* Never returns NULL, but the objid in the address could be invalid if missing_ok was set
* to true.
*/
ObjectAddress
CreateRoleStmtObjectAddress(Node *node, bool missing_ok)
{
CreateRoleStmt *stmt = castNode(CreateRoleStmt, node);
Oid roleOid = get_role_oid(stmt->role, missing_ok);
ObjectAddress roleAddress = { 0 };
ObjectAddressSet(roleAddress, AuthIdRelationId, roleOid);
return roleAddress;
}
/*
* EnsureSequentialModeForRoleDDL makes sure that the current transaction is already in
* sequential mode, or can still safely be put in sequential mode; it errors out
* if that is not possible. The error tells the user to retry the transaction
* with sequential mode set from the beginning.
*
* As roles are node-scoped objects, only one instance of the role exists, used
* by potentially multiple shards. To make sure all shards in the transaction
* can interact with the role, the role needs to be visible on all connections
* used by the transaction, meaning we can only use one connection per node.
*/
static void
EnsureSequentialModeForRoleDDL(void)
{
if (!IsTransactionBlock())
{
/* we do not need to switch to sequential mode if we are not in a transaction */
return;
}
if (ParallelQueryExecutedInTransaction())
{
ereport(ERROR, (errmsg("cannot create or modify role because there was a "
"parallel operation on a distributed table in the "
"transaction"),
errdetail("When creating or altering a role, Citus needs to "
"perform all operations over a single connection per "
"node to ensure consistency."),
errhint("Try re-running the transaction with "
"\"SET LOCAL citus.multi_shard_modify_mode TO "
"\'sequential\';\"")));
}
ereport(DEBUG1, (errmsg("switching to sequential query execution mode"),
errdetail("Role is created or altered. To make sure subsequent "
"commands see the role correctly we need to make sure to "
"use only one connection for all future commands")));
SetLocalMultiShardModifyModeToSequential();
}
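/*
 * Editorial example (hypothetical session, not part of the source): within one
 * transaction block
 *
 *   BEGIN;
 *   UPDATE dist_table SET a = 1;   -- parallel multi-shard modification
 *   CREATE ROLE reporting;         -- reaches EnsureSequentialModeForRoleDDL()
 *
 * the CREATE ROLE errors out, whereas issuing
 * SET LOCAL citus.multi_shard_modify_mode TO 'sequential'; right after BEGIN
 * keeps every node on a single connection and lets both commands succeed.
 */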

View File

@ -161,14 +161,7 @@ PreprocessGrantOnSchemaStmt(Node *node, const char *queryString,
return NIL;
}
EnsureCoordinator();
List *originalObjects = stmt->objects;
@ -178,41 +171,8 @@ PreprocessGrantOnSchemaStmt(Node *node, const char *queryString,
stmt->objects = originalObjects;
return NodeDDLTaskList(NON_COORDINATOR_NODES, list_make1(sql));
}
/*
* PreprocessAlterSchemaRenameStmt is called when the user is renaming a schema.
* The invocation happens before the statement is applied locally.
*
* As the schema already exists we have access to the ObjectAddress for the schema, this
* is used to check if the schema is distributed. If the schema is distributed the rename
* is executed on all the workers to keep the schemas in sync across the cluster.
*/
List *
PreprocessAlterSchemaRenameStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
ObjectAddress schemaAddress = GetObjectAddressFromParseTree(node, false);
if (!ShouldPropagateObject(&schemaAddress))
{
return NIL;
}
EnsureCoordinator();
/* fully qualify */
QualifyTreeNode(node);
/* deparse sql */
const char *renameStmtSql = DeparseTreeNode(node);
EnsureSequentialMode(OBJECT_SCHEMA);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) renameStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);

View File

@ -24,14 +24,17 @@
#include "distributed/metadata/distobject.h"
#include "distributed/metadata_cache.h"
#include "distributed/metadata_sync.h"
#include "nodes/makefuncs.h"
#include "distributed/worker_create_or_replace.h"
#include "nodes/parsenodes.h"
#include "rewrite/rewriteHandler.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
/* Local functions forward declarations for helper functions */
static bool OptionsSpecifyOwnedBy(List *optionList, Oid *ownedByTableId);
static Oid SequenceUsedInDistributedTable(const ObjectAddress *sequenceAddress);
static List * FilterDistributedSequences(GrantStmt *stmt);
/*
@ -170,48 +173,70 @@ ExtractDefaultColumnsAndOwnedSequences(Oid relationId, List **columnNameList,
attributeIndex++)
{
Form_pg_attribute attributeForm = TupleDescAttr(tupleDescriptor, attributeIndex);
if (attributeForm->attisdropped ||
attributeForm->attgenerated == ATTRIBUTE_GENERATED_STORED)
{
/* skip dropped columns and columns with GENERATED AS ALWAYS expressions */
continue;
}
char *columnName = NameStr(attributeForm->attname);
List *columnOwnedSequences =
GetSequencesOwnedByColumn(relationId, attributeIndex + 1);
if (attributeForm->atthasdef && list_length(columnOwnedSequences) == 0)
{
/*
* Even if there are no owned sequences, the code path still
* expects the columnName to be filled such that it can DROP
* DEFAULT for the existing nextval('seq') columns.
*/
*ownedSequenceIdList = lappend_oid(*ownedSequenceIdList, InvalidOid);
*columnNameList = lappend(*columnNameList, columnName);
continue;
}
Oid ownedSequenceId = InvalidOid;
foreach_oid(ownedSequenceId, columnOwnedSequences)
{
/*
* A column might have multiple sequences: one via OWNED BY, another
* via bigserial/DEFAULT nextval.
*/
*ownedSequenceIdList = lappend_oid(*ownedSequenceIdList, ownedSequenceId);
*columnNameList = lappend(*columnNameList, columnName);
}
}
relation_close(relation, NoLock);
}
/*
* ColumnDefaultsToNextVal returns true if the column with attrNumber
* has a default expression that contains nextval().
*/
bool
ColumnDefaultsToNextVal(Oid relationId, AttrNumber attrNumber)
{
AssertArg(AttributeNumberIsValid(attrNumber));
Relation relation = RelationIdGetRelation(relationId);
Node *defExpr = build_column_default(relation, attrNumber);
RelationClose(relation);
if (defExpr == NULL)
{
/* column doesn't have a DEFAULT expression */
return false;
}
return contain_nextval_expression_walker(defExpr, NULL);
}
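/*
 * Editorial sketch (hypothetical helper, not part of the source): for a column
 * declared as id bigint DEFAULT nextval('orders_id_seq') this returns true,
 * while for id bigint DEFAULT 42 it returns false.
 */
static bool
ColumnIsSequenceBackedSketch(Oid relationId, const char *columnName)
{
	AttrNumber attrNumber = get_attnum(relationId, columnName);

	if (!AttributeNumberIsValid(attrNumber))
	{
		/* no such column on this relation */
		return false;
	}

	return ColumnDefaultsToNextVal(relationId, attrNumber);
}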
/*
* PreprocessDropSequenceStmt gets called during the planning phase of a DROP SEQUENCE statement
* and returns a list of DDLJob's that will drop any distributed sequences from the
@ -226,7 +251,6 @@ PreprocessDropSequenceStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
List *distributedSequencesList = NIL;
List *distributedSequenceAddresses = NIL;
@ -259,6 +283,7 @@ PreprocessDropSequenceStmt(Node *node, const char *queryString,
* iterate over all sequences to be dropped and filter to keep only distributed
* sequences.
*/
List *deletingSequencesList = stmt->objects;
List *objectNameList = NULL;
foreach_ptr(objectNameList, deletingSequencesList)
{
@ -445,17 +470,15 @@ SequenceUsedInDistributedTable(const ObjectAddress *sequenceAddress)
Oid citusTableId = InvalidOid;
foreach_oid(citusTableId, citusTableIdList)
{
List *seqInfoList = NIL;
GetDependentSequencesWithRelation(citusTableId, &seqInfoList, 0);
SequenceInfo *seqInfo = NULL;
foreach_ptr(seqInfo, seqInfoList)
{
/*
* This sequence is used in a distributed table
*/
if (seqInfo->sequenceOid == sequenceAddress->objectId)
{
return citusTableId;
}
@ -660,6 +683,97 @@ PostprocessAlterSequenceOwnerStmt(Node *node, const char *queryString)
}
/*
* PreprocessGrantOnSequenceStmt is executed before the statement is applied to the local
* postgres instance.
*
* In this stage we can prepare the commands that need to be run on all workers to grant
* on distributed sequences.
*/
List *
PreprocessGrantOnSequenceStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(stmt->objtype == OBJECT_SEQUENCE);
if (creating_extension)
{
/*
* extensions should be created separately on the workers, sequences cascading
* from an extension should therefore not be propagated here.
*/
return NIL;
}
if (!EnableMetadataSync)
{
/*
* we are configured to disable object propagation, should not propagate anything
*/
return NIL;
}
List *distributedSequences = FilterDistributedSequences(stmt);
if (list_length(distributedSequences) == 0)
{
return NIL;
}
EnsureCoordinator();
GrantStmt *stmtCopy = copyObject(stmt);
stmtCopy->objects = distributedSequences;
/*
* if the original command was targeting schemas, we have expanded to the distributed
* sequences in these schemas through FilterDistributedSequences.
*/
stmtCopy->targtype = ACL_TARGET_OBJECT;
QualifyTreeNode((Node *) stmtCopy);
char *sql = DeparseTreeNode((Node *) stmtCopy);
List *commands = list_make3(DISABLE_DDL_PROPAGATION, (void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_METADATA_NODES, commands);
}
/*
* PostprocessGrantOnSequenceStmt makes sure dependencies of each
* distributed sequence in the statement exist on all nodes
*/
List *
PostprocessGrantOnSequenceStmt(Node *node, const char *queryString)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(stmt->objtype == OBJECT_SEQUENCE);
List *distributedSequences = FilterDistributedSequences(stmt);
if (list_length(distributedSequences) == 0)
{
return NIL;
}
EnsureCoordinator();
RangeVar *sequence = NULL;
foreach_ptr(sequence, distributedSequences)
{
ObjectAddress sequenceAddress = { 0 };
Oid sequenceOid = RangeVarGetRelid(sequence, NoLock, false);
ObjectAddressSet(sequenceAddress, RelationRelationId, sequenceOid);
EnsureDependenciesExistOnAllNodes(&sequenceAddress);
}
return NIL;
}
/*
* GenerateBackupNameForSequenceCollision generates a new sequence name for an existing
* sequence. The name is generated in such a way that the new name doesn't overlap with
@ -702,6 +816,96 @@ GenerateBackupNameForSequenceCollision(const ObjectAddress *address)
}
/*
* FilterDistributedSequences determines and returns a list of distributed sequences
* RangeVar-s from given grant statement.
* - If the stmt's targtype is ACL_TARGET_OBJECT, i.e. of the form GRANT ON SEQUENCE ...
* it returns the distributed sequences in the list of sequences in the statement
* - If targtype is ACL_TARGET_ALL_IN_SCHEMA, i.e. GRANT ON ALL SEQUENCES IN SCHEMA ...
* it expands the ALL IN SCHEMA to the actual sequences, and returns the distributed
* sequences from those.
*/
static List *
FilterDistributedSequences(GrantStmt *stmt)
{
bool grantOnSequenceCommand = (stmt->targtype == ACL_TARGET_OBJECT &&
stmt->objtype == OBJECT_SEQUENCE);
bool grantOnAllSequencesInSchemaCommand = (stmt->targtype ==
ACL_TARGET_ALL_IN_SCHEMA &&
stmt->objtype == OBJECT_SEQUENCE);
/* we are only interested in sequence level grants */
if (!grantOnSequenceCommand && !grantOnAllSequencesInSchemaCommand)
{
return NIL;
}
List *grantSequenceList = NIL;
if (grantOnAllSequencesInSchemaCommand)
{
/* iterate over all namespace names provided to get their oids */
List *namespaceOidList = NIL;
Value *namespaceValue = NULL;
foreach_ptr(namespaceValue, stmt->objects)
{
char *nspname = strVal(namespaceValue);
bool missing_ok = false;
Oid namespaceOid = get_namespace_oid(nspname, missing_ok);
namespaceOidList = list_append_unique_oid(namespaceOidList, namespaceOid);
}
/*
* iterate over all distributed sequences to filter the ones
* that belong to one of the namespaces from above
*/
List *distributedSequenceList = DistributedSequenceList();
ObjectAddress *sequenceAddress = NULL;
foreach_ptr(sequenceAddress, distributedSequenceList)
{
Oid namespaceOid = get_rel_namespace(sequenceAddress->objectId);
/*
* if this distributed sequence's schema is one of the schemas
* specified in the GRANT .. ALL SEQUENCES IN SCHEMA ..
* add it to the list
*/
if (list_member_oid(namespaceOidList, namespaceOid))
{
RangeVar *distributedSequence = makeRangeVar(get_namespace_name(
namespaceOid),
get_rel_name(
sequenceAddress->objectId),
-1);
grantSequenceList = lappend(grantSequenceList, distributedSequence);
}
}
}
else
{
bool missing_ok = false;
RangeVar *sequenceRangeVar = NULL;
foreach_ptr(sequenceRangeVar, stmt->objects)
{
ObjectAddress sequenceAddress = { 0 };
Oid sequenceOid = RangeVarGetRelid(sequenceRangeVar, NoLock, missing_ok);
ObjectAddressSet(sequenceAddress, RelationRelationId, sequenceOid);
/*
* if this sequence from GRANT .. ON SEQUENCE .. is a distributed
* sequence, add it to the list
*/
if (IsObjectDistributed(&sequenceAddress))
{
grantSequenceList = lappend(grantSequenceList, sequenceRangeVar);
}
}
}
return grantSequenceList;
}
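/*
 * Editorial example (hypothetical objects): with distributed sequences
 * public.seq_a and public.seq_b and a non-distributed public.seq_c,
 *
 *   GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO app_user;
 *
 * is filtered down to RangeVars for seq_a and seq_b only, and
 * PreprocessGrantOnSequenceStmt rewrites the statement with
 * targtype = ACL_TARGET_OBJECT so the workers effectively receive
 *
 *   GRANT USAGE ON SEQUENCE public.seq_a, public.seq_b TO app_user;
 */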
/*
* RenameExistingSequenceWithDifferentTypeIfExists renames the sequence's type if
* that sequence exists and the desired sequence type is different than its type.

View File

@ -92,7 +92,7 @@ PreprocessCreateStatisticsStmt(Node *node, const char *queryString,
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->startNewTransaction = false;
ddlJob->metadataSyncCommand = ddlCommand;
ddlJob->taskList = DDLTaskList(relationId, ddlCommand);
@ -197,7 +197,7 @@ PreprocessDropStatisticsStmt(Node *node, const char *queryString,
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->startNewTransaction = false;
ddlJob->metadataSyncCommand = ddlCommand;
ddlJob->taskList = DDLTaskList(relationId, ddlCommand);
@ -236,7 +236,7 @@ PreprocessAlterStatisticsRenameStmt(Node *node, const char *queryString,
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->startNewTransaction = false;
ddlJob->metadataSyncCommand = ddlCommand;
ddlJob->taskList = DDLTaskList(relationId, ddlCommand);
@ -274,7 +274,7 @@ PreprocessAlterStatisticsSchemaStmt(Node *node, const char *queryString,
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->startNewTransaction = false;
ddlJob->metadataSyncCommand = ddlCommand;
ddlJob->taskList = DDLTaskList(relationId, ddlCommand);
@ -376,7 +376,7 @@ PreprocessAlterStatisticsStmt(Node *node, const char *queryString,
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->startNewTransaction = false;
ddlJob->metadataSyncCommand = ddlCommand;
ddlJob->taskList = DDLTaskList(relationId, ddlCommand);
@ -416,7 +416,7 @@ PreprocessAlterStatisticsOwnerStmt(Node *node, const char *queryString,
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->startNewTransaction = false;
ddlJob->metadataSyncCommand = ddlCommand;
ddlJob->taskList = DDLTaskList(relationId, ddlCommand);

View File

@ -10,13 +10,129 @@
#include "postgres.h"
#include "safe_lib.h"
#include <string.h>
#include "commands/defrem.h"
#include "distributed/commands.h"
#include "distributed/connection_management.h"
#include "distributed/pg_version_constants.h"
#include "distributed/version_compat.h"
#include "libpq-fe.h"
#include "nodes/parsenodes.h"
#include "utils/builtins.h"
static char * GenerateConninfoWithAuth(char *conninfo);
/*
* ProcessCreateSubscriptionStmt looks for a special citus_use_authinfo option.
* If it is set to true, then we'll expand the node's authinfo into the create
* statement (see GenerateConninfoWithAuth).
*/
Node *
ProcessCreateSubscriptionStmt(CreateSubscriptionStmt *createSubStmt)
{
ListCell *currCell = NULL;
#if PG_VERSION_NUM < PG_VERSION_13
ListCell *prevCell = NULL;
#endif
bool useAuthinfo = false;
foreach(currCell, createSubStmt->options)
{
DefElem *defElem = (DefElem *) lfirst(currCell);
if (strcmp(defElem->defname, "citus_use_authinfo") == 0)
{
useAuthinfo = defGetBoolean(defElem);
createSubStmt->options = list_delete_cell_compat(createSubStmt->options,
currCell,
prevCell);
break;
}
#if PG_VERSION_NUM < PG_VERSION_13
prevCell = currCell;
#endif
}
if (useAuthinfo)
{
createSubStmt->conninfo = GenerateConninfoWithAuth(createSubStmt->conninfo);
}
return (Node *) createSubStmt;
}
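/*
 * Editorial example (hypothetical subscription, not part of the source):
 *
 *   CREATE SUBSCRIPTION shard_move_sub
 *       CONNECTION 'host=10.0.0.4 port=5432 user=postgres'
 *       PUBLICATION shard_move_pub
 *       WITH (citus_use_authinfo = true);
 *
 * enters this hook with citus_use_authinfo in createSubStmt->options; the
 * option is stripped from the list and the CONNECTION string is expanded by
 * GenerateConninfoWithAuth below before the statement is executed.
 */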
/*
* GenerateConninfoWithAuth extracts the host and port from the provided libpq
* conninfo string, using them to find an appropriate authinfo for the target
* host. If such an authinfo is found, it is added to the (repalloc'd) string,
* which is then returned.
*/
static char *
GenerateConninfoWithAuth(char *conninfo)
{
StringInfo connInfoWithAuth = makeStringInfo();
char *host = NULL, *user = NULL;
int32 port = -1;
PQconninfoOption *option = NULL, *optionArray = NULL;
optionArray = PQconninfoParse(conninfo, NULL);
if (optionArray == NULL)
{
ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR),
errmsg("not a valid libpq connection info string: %s",
conninfo)));
}
for (option = optionArray; option->keyword != NULL; option++)
{
if (option->val == NULL || option->val[0] == '\0')
{
continue;
}
if (strcmp(option->keyword, "host") == 0)
{
host = option->val;
}
else if (strcmp(option->keyword, "port") == 0)
{
port = pg_atoi(option->val, 4, 0);
}
else if (strcmp(option->keyword, "user") == 0)
{
user = option->val;
}
}
/*
* In case of repetition of parameters in connection strings, last value
* wins. So first add the provided connection string, then global
* connection parameters, then node specific ones.
*
* Note that currently lists of parameters in pg_dist_authnode and
* citus.node_conninfo do not overlap.
*
* The only overlapping parameter between these three lists is
* connect_timeout, which is assigned in conninfo (generated
* by CreateShardMoveSubscription) and is also allowed in
* citus.node_conninfo. Prioritizing the value in citus.node_conninfo
* over conninfo gives user the power to control this value.
*/
appendStringInfo(connInfoWithAuth, "%s %s", conninfo, NodeConninfo);
if (host != NULL && port > 0 && user != NULL)
{
char *nodeAuthInfo = GetAuthinfo(host, port, user);
appendStringInfo(connInfoWithAuth, " %s", nodeAuthInfo);
}
PQconninfoFree(optionArray);
return connInfoWithAuth->data;
}
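/*
 * Editorial example (hypothetical values): for conninfo
 * "host=10.0.0.4 port=5432 user=postgres dbname=citus", a citus.node_conninfo
 * of "sslmode=require" and an authinfo of "password=secret" matching
 * (10.0.0.4, 5432, postgres), the returned string is
 *
 *   host=10.0.0.4 port=5432 user=postgres dbname=citus sslmode=require password=secret
 *
 * Because later keywords win on repetition, authinfo overrides
 * citus.node_conninfo, which in turn overrides the caller-provided conninfo.
 */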

View File

@ -40,6 +40,7 @@
#include "distributed/resource_lock.h"
#include "distributed/version_compat.h"
#include "distributed/worker_shard_visibility.h"
#include "foreign/foreign.h"
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"
#include "parser/parse_expr.h"
@ -54,6 +55,12 @@
/* controlled via GUC, should be accessed via GetEnableLocalReferenceForeignKeys() */
bool EnableLocalReferenceForeignKeys = true;
/*
* GUC that controls whether to allow unique/exclude constraints without
* distribution column.
*/
bool AllowUnsafeConstraints = false;
/* Local functions forward declarations for unsupported command checks */
static void PostprocessCreateTableStmtForeignKeys(CreateStmt *createStatement);
static void PostprocessCreateTableStmtPartitionOf(CreateStmt *createStatement,
@ -111,6 +118,8 @@ static char * GetAlterColumnWithNextvalDefaultCmd(Oid sequenceOid, Oid relationI
char *colname);
static char * GetAddColumnWithNextvalDefaultCmd(Oid sequenceOid, Oid relationId,
char *colname, TypeName *typeName);
static void ErrorIfAlterTableDropTableNameFromPostgresFdw(List *optionList, Oid
relationId);
/*
@ -153,11 +162,14 @@ PreprocessDropTableStmt(Node *node, const char *queryString,
continue;
}
/*
* While changing the tables that are part of a colocation group we need to
* prevent concurrent mutations to the placements of the shard groups.
*/
CitusTableCacheEntry *cacheEntry = GetCitusTableCacheEntry(relationId);
if (cacheEntry->colocationId != INVALID_COLOCATION_ID)
{
LockColocationId(cacheEntry->colocationId, ShareLock);
}
/* invalidate foreign key cache if the table is involved in any foreign key */
@ -651,12 +663,21 @@ PostprocessAlterTableSchemaStmt(Node *node, const char *queryString)
*/
ObjectAddress tableAddress = GetObjectAddressFromParseTree((Node *) stmt, true);
/*
* Check whether we are dealing with a sequence or view here and route queries
* accordingly to the right processor function.
*/
char relKind = get_rel_relkind(tableAddress.objectId);
if (relKind == RELKIND_SEQUENCE)
{
stmt->objectType = OBJECT_SEQUENCE;
return PostprocessAlterSequenceSchemaStmt((Node *) stmt, queryString);
}
else if (relKind == RELKIND_VIEW)
{
stmt->objectType = OBJECT_VIEW;
return PostprocessAlterViewSchemaStmt((Node *) stmt, queryString);
}
if (!ShouldPropagate() || !IsCitusTable(tableAddress.objectId))
{
@ -699,18 +720,26 @@ PreprocessAlterTableStmt(Node *node, const char *alterTableCommand,
}
/*
* check whether we are dealing with a sequence or view here
* if yes, it must be ALTER TABLE .. OWNER TO .. command
* since this is the only ALTER command of a sequence or view that
* passes through an AlterTableStmt
*/
char relKind = get_rel_relkind(leftRelationId);
if (relKind == RELKIND_SEQUENCE)
{
AlterTableStmt *stmtCopy = copyObject(alterTableStatement);
AlterTableStmtObjType_compat(stmtCopy) = OBJECT_SEQUENCE;
return PreprocessAlterSequenceOwnerStmt((Node *) stmtCopy, alterTableCommand,
processUtilityContext);
}
else if (relKind == RELKIND_VIEW)
{
AlterTableStmt *stmtCopy = copyObject(alterTableStatement);
AlterTableStmtObjType_compat(stmtCopy) = OBJECT_VIEW;
return PreprocessAlterViewStmt((Node *) stmtCopy, alterTableCommand,
processUtilityContext);
}
/*
* AlterTableStmt applies also to INDEX relations, and we have support for
@ -1102,7 +1131,7 @@ PreprocessAlterTableStmt(Node *node, const char *alterTableCommand,
/* fill them here as it is possible to use them in some conditional blocks below */
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, leftRelationId);
const char *sqlForTaskList = alterTableCommand;
if (deparseAT)
@ -1758,18 +1787,31 @@ PreprocessAlterTableSchemaStmt(Node *node, const char *queryString,
{
return NIL;
}
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt,
stmt->missing_ok);
Oid relationId = address.objectId;
/*
* Check whether we are dealing with a sequence or view here and route queries
* accordingly to the right processor function. We need to check both objects here
* since PG supports targeting sequences and views with ALTER TABLE commands.
*/
char relKind = get_rel_relkind(relationId);
if (relKind == RELKIND_SEQUENCE)
{
AlterObjectSchemaStmt *stmtCopy = copyObject(stmt);
stmtCopy->objectType = OBJECT_SEQUENCE;
return PreprocessAlterSequenceSchemaStmt((Node *) stmtCopy, queryString,
processUtilityContext);
}
else if (relKind == RELKIND_VIEW)
{
AlterObjectSchemaStmt *stmtCopy = copyObject(stmt);
stmtCopy->objectType = OBJECT_VIEW;
return PreprocessAlterViewSchemaStmt((Node *) stmtCopy, queryString,
processUtilityContext);
}
/* first check whether a distributed relation is affected */
if (!OidIsValid(relationId) || !IsCitusTable(relationId))
@ -1779,7 +1821,7 @@ PreprocessAlterTableSchemaStmt(Node *node, const char *queryString,
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
QualifyTreeNode((Node *) stmt);
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->metadataSyncCommand = DeparseTreeNode((Node *) stmt);
ddlJob->taskList = DDLTaskList(relationId, ddlJob->metadataSyncCommand);
return list_make1(ddlJob);
@ -1939,12 +1981,19 @@ PostprocessAlterTableStmt(AlterTableStmt *alterTableStatement)
* since this is the only ALTER command of a sequence that
* passes through an AlterTableStmt
*/
char relKind = get_rel_relkind(relationId);
if (relKind == RELKIND_SEQUENCE)
{
AlterTableStmtObjType_compat(alterTableStatement) = OBJECT_SEQUENCE;
PostprocessAlterSequenceOwnerStmt((Node *) alterTableStatement, NULL);
return;
}
else if (relKind == RELKIND_VIEW)
{
AlterTableStmtObjType_compat(alterTableStatement) = OBJECT_VIEW;
PostprocessAlterViewStmt((Node *) alterTableStatement, NULL);
return;
}
/*
* Before ensuring each dependency exist, update dependent sequences
@ -2509,6 +2558,16 @@ ErrorIfUnsupportedConstraint(Relation relation, char distributionMethod,
errhint("Consider using hash partitioning.")));
}
if (AllowUnsafeConstraints)
{
/*
* The user explicitly wants to allow the constraint without
* including the distribution column.
*/
index_close(indexDesc, NoLock);
continue;
}
int attributeCount = indexInfo->ii_NumIndexAttrs;
AttrNumber *attributeNumberArray = indexInfo->ii_IndexAttrNumbers;
@ -2551,6 +2610,42 @@ ErrorIfUnsupportedConstraint(Relation relation, char distributionMethod,
}
/*
* ErrorIfAlterTableDropTableNameFromPostgresFdw errors if given alter foreign table
* option list drops 'table_name' from a postgresfdw foreign table which is
* inside metadata.
*/
static void
ErrorIfAlterTableDropTableNameFromPostgresFdw(List *optionList, Oid relationId)
{
char relationKind PG_USED_FOR_ASSERTS_ONLY =
get_rel_relkind(relationId);
Assert(relationKind == RELKIND_FOREIGN_TABLE);
ForeignTable *foreignTable = GetForeignTable(relationId);
Oid serverId = foreignTable->serverid;
if (!ServerUsesPostgresFdw(serverId))
{
return;
}
if (IsCitusTableType(relationId, CITUS_LOCAL_TABLE) &&
ForeignTableDropsTableNameOption(optionList))
{
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg(
"alter foreign table alter options (drop table_name) command "
"is not allowed for Citus tables"),
errdetail(
"Table_name option can not be dropped from a foreign table "
"which is inside metadata."),
errhint(
"Try to undistribute foreign table before dropping table_name option.")));
}
}
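/*
 * Editorial example (hypothetical foreign table): for a postgres_fdw foreign
 * table foreign_orders that was added to Citus metadata,
 *
 *   ALTER FOREIGN TABLE foreign_orders OPTIONS (DROP table_name);
 *
 * is rejected by the check above, since dropping table_name would break the
 * remote-table mapping recorded in the metadata.
 */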
/*
* ErrorIfUnsupportedAlterTableStmt checks if the corresponding alter table
* statement is supported for distributed tables and errors out if it is not.
@ -2563,6 +2658,7 @@ ErrorIfUnsupportedConstraint(Relation relation, char distributionMethod,
* ALTER TABLE ADD|DROP CONSTRAINT
* ALTER TABLE REPLICA IDENTITY
* ALTER TABLE SET ()
* ALTER TABLE ENABLE|DISABLE|NO FORCE|FORCE ROW LEVEL SECURITY
* ALTER TABLE RESET ()
* ALTER TABLE ENABLE/DISABLE TRIGGER (if enable_unsafe_triggers is not set, we only support triggers for citus local tables)
*/
@ -2751,11 +2847,9 @@ ErrorIfUnsupportedAlterTableStmt(AlterTableStmt *alterTableStatement)
* changing the type of the column should not be allowed for now
*/
AttrNumber attnum = get_attnum(relationId, command->name);
List *seqInfoList = NIL;
GetDependentSequencesWithRelation(relationId, &seqInfoList, attnum);
if (seqInfoList != NIL)
{
ereport(ERROR, (errmsg("cannot execute ALTER COLUMN TYPE .. command "
"because the column involves a default coming "
@ -2906,6 +3000,10 @@ ErrorIfUnsupportedAlterTableStmt(AlterTableStmt *alterTableStatement)
case AT_SetNotNull:
case AT_ReplicaIdentity:
case AT_ChangeOwner:
case AT_EnableRowSecurity:
case AT_DisableRowSecurity:
case AT_ForceRowSecurity:
case AT_NoForceRowSecurity:
case AT_ValidateConstraint:
case AT_DropConstraint: /* we do the check for invalidation in AlterTableDropsForeignKey */
#if PG_VERSION_NUM >= PG_VERSION_14
@ -2936,6 +3034,8 @@ ErrorIfUnsupportedAlterTableStmt(AlterTableStmt *alterTableStatement)
{
if (IsForeignTable(relationId))
{
List *optionList = (List *) command->def;
ErrorIfAlterTableDropTableNameFromPostgresFdw(optionList, relationId);
break;
}
}
@ -2950,6 +3050,7 @@ ErrorIfUnsupportedAlterTableStmt(AlterTableStmt *alterTableStatement)
errdetail("Only ADD|DROP COLUMN, SET|DROP NOT NULL, "
"SET|DROP DEFAULT, ADD|DROP|VALIDATE CONSTRAINT, "
"SET (), RESET (), "
"ENABLE|DISABLE|NO FORCE|FORCE ROW LEVEL SECURITY, "
"ATTACH|DETACH PARTITION and TYPE subcommands "
"are supported.")));
}

View File

@ -42,8 +42,6 @@
#include "distributed/worker_create_or_replace.h"
static List * GetDistributedTextSearchConfigurationNames(DropStmt *stmt);
static List * GetDistributedTextSearchDictionaryNames(DropStmt *stmt);
static DefineStmt * GetTextSearchConfigDefineStmt(Oid tsconfigOid);
static DefineStmt * GetTextSearchDictionaryDefineStmt(Oid tsdictOid);
static List * GetTextSearchDictionaryInitOptions(HeapTuple tup, Form_pg_ts_dict dict);
@ -59,113 +57,6 @@ static List * get_ts_template_namelist(Oid tstemplateOid);
static Oid get_ts_config_parser_oid(Oid tsconfigOid);
static char * get_ts_parser_tokentype_name(Oid parserOid, int32 tokentype);
/*
* PostprocessCreateTextSearchConfigurationStmt is called after the TEXT SEARCH
* CONFIGURATION has been created locally.
*
* Contrary to many other objects a text search configuration is often created as a copy
* of an existing configuration. After the copy there is no relation to the configuration
* that has been copied. This prevents our normal approach of ensuring dependencies to
* exist before forwarding a close resemblance of the statement the user executed.
*
* Instead we recreate the object based on what we find in our own catalog, hence the
* amount of work we perform in the postprocess function, contrary to other objects.
*/
List *
PostprocessCreateTextSearchConfigurationStmt(Node *node, const char *queryString)
{
DefineStmt *stmt = castNode(DefineStmt, node);
Assert(stmt->kind == OBJECT_TSCONFIGURATION);
if (!ShouldPropagate())
{
return NIL;
}
/* check creation against multi-statement transaction policy */
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSCONFIGURATION);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
DeferredErrorMessage *errMsg = DeferErrorIfHasUnsupportedDependency(&address);
if (errMsg != NULL)
{
RaiseDeferredError(errMsg, WARNING);
return NIL;
}
EnsureDependenciesExistOnAllNodes(&address);
/*
* TEXT SEARCH CONFIGURATION objects are more complex with their mappings and the
* possibility of copying from existing templates that we will require the idempotent
* recreation commands to be run for successful propagation
*/
List *commands = CreateTextSearchConfigDDLCommandsIdempotent(&address);
commands = lcons(DISABLE_DDL_PROPAGATION, commands);
commands = lappend(commands, ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessCreateTextSearchDictionaryStmt is called after the TEXT SEARCH DICTIONARY has been
* created locally.
*/
List *
PostprocessCreateTextSearchDictionaryStmt(Node *node, const char *queryString)
{
DefineStmt *stmt = castNode(DefineStmt, node);
Assert(stmt->kind == OBJECT_TSDICTIONARY);
if (!ShouldPropagate())
{
return NIL;
}
/* check creation against multi-statement transaction policy */
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSDICTIONARY);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
DeferredErrorMessage *errMsg = DeferErrorIfHasUnsupportedDependency(&address);
if (errMsg != NULL)
{
RaiseDeferredError(errMsg, WARNING);
return NIL;
}
EnsureDependenciesExistOnAllNodes(&address);
QualifyTreeNode(node);
const char *createTSDictionaryStmtSql = DeparseTreeNode(node);
/*
* To prevent recursive propagation in mx architecture, we disable ddl
* propagation before sending the command to workers.
*/
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) createTSDictionaryStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
List *
GetCreateTextSearchConfigStatements(const ObjectAddress *address)
{
@ -234,602 +125,6 @@ CreateTextSearchDictDDLCommandsIdempotent(const ObjectAddress *address)
}
/*
* PreprocessDropTextSearchConfigurationStmt prepares the statements we need to send to
* the workers. After we have dropped the configurations locally they are also removed from
* pg_dist_object, so it is important to do all distribution checks before the change is
* made locally.
*/
List *
PreprocessDropTextSearchConfigurationStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
Assert(stmt->removeType == OBJECT_TSCONFIGURATION);
if (!ShouldPropagate())
{
return NIL;
}
List *distributedObjects = GetDistributedTextSearchConfigurationNames(stmt);
if (list_length(distributedObjects) == 0)
{
/* no distributed objects to remove */
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSCONFIGURATION);
/*
* Temporarily replace the list of objects being dropped with only the list
* containing the distributed objects. After we have created the sql statement we
* restore the original list of objects to execute locally.
*
* Because searchpaths on coordinator and workers might not be in sync we fully
* qualify the list before deparsing. This is safe because qualification doesn't
* change the original names in place, but instead creates new ones.
*/
List *originalObjects = stmt->objects;
stmt->objects = distributedObjects;
QualifyTreeNode((Node *) stmt);
const char *dropStmtSql = DeparseTreeNode((Node *) stmt);
stmt->objects = originalObjects;
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) dropStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessDropTextSearchDictionaryStmt prepares the statements we need to send to
* the workers. After we have dropped the dictionaries locally they are also removed from
* pg_dist_object, so it is important to do all distribution checks before the change is
* made locally.
*/
List *
PreprocessDropTextSearchDictionaryStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
Assert(stmt->removeType == OBJECT_TSDICTIONARY);
if (!ShouldPropagate())
{
return NIL;
}
List *distributedObjects = GetDistributedTextSearchDictionaryNames(stmt);
if (list_length(distributedObjects) == 0)
{
/* no distributed objects to remove */
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSDICTIONARY);
/*
* Temporarily replace the list of objects being dropped with only the list
* containing the distributed objects. After we have created the sql statement we
* restore the original list of objects to execute locally.
*
* Because searchpaths on coordinator and workers might not be in sync we fully
* qualify the list before deparsing. This is safe because qualification doesn't
* change the original names in place, but instead creates new ones.
*/
List *originalObjects = stmt->objects;
stmt->objects = distributedObjects;
QualifyTreeNode((Node *) stmt);
const char *dropStmtSql = DeparseTreeNode((Node *) stmt);
stmt->objects = originalObjects;
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) dropStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* GetDistributedTextSearchConfigurationNames iterates over all text search configurations
* dropped, and creates a list containing all configurations that are distributed.
*/
static List *
GetDistributedTextSearchConfigurationNames(DropStmt *stmt)
{
List *objName = NULL;
List *distributedObjects = NIL;
foreach_ptr(objName, stmt->objects)
{
Oid tsconfigOid = get_ts_config_oid(objName, stmt->missing_ok);
if (!OidIsValid(tsconfigOid))
{
/* skip missing configuration names, they can't be distributed */
continue;
}
ObjectAddress address = { 0 };
ObjectAddressSet(address, TSConfigRelationId, tsconfigOid);
if (!IsObjectDistributed(&address))
{
continue;
}
distributedObjects = lappend(distributedObjects, objName);
}
return distributedObjects;
}
/*
* GetDistributedTextSearchDictionaryNames iterates over all text search dictionaries
* dropped, and creates a list containing all dictionaries that are distributed.
*/
static List *
GetDistributedTextSearchDictionaryNames(DropStmt *stmt)
{
List *objName = NULL;
List *distributedObjects = NIL;
foreach_ptr(objName, stmt->objects)
{
Oid tsdictOid = get_ts_dict_oid(objName, stmt->missing_ok);
if (!OidIsValid(tsdictOid))
{
/* skip missing dictionary names, they can't be distributed */
continue;
}
ObjectAddress address = { 0 };
ObjectAddressSet(address, TSDictionaryRelationId, tsdictOid);
if (!IsObjectDistributed(&address))
{
continue;
}
distributedObjects = lappend(distributedObjects, objName);
}
return distributedObjects;
}
/*
* PreprocessAlterTextSearchConfigurationStmt verifies if the configuration being altered
* is distributed in the cluster. If that is the case it will prepare the list of commands
* to send to the workers to apply the same changes remotely.
*/
List *
PreprocessAlterTextSearchConfigurationStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterTSConfigurationStmt *stmt = castNode(AlterTSConfigurationStmt, node);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSCONFIGURATION);
QualifyTreeNode((Node *) stmt);
const char *alterStmtSql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) alterStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterTextSearchDictionaryStmt verifies if the dictionary being altered is
* distributed in the cluster. If that is the case it will prepare the list of commands to
* send to the workers to apply the same changes remotely.
*/
List *
PreprocessAlterTextSearchDictionaryStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterTSDictionaryStmt *stmt = castNode(AlterTSDictionaryStmt, node);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSDICTIONARY);
QualifyTreeNode((Node *) stmt);
const char *alterStmtSql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) alterStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessRenameTextSearchConfigurationStmt verifies if the configuration being altered
* is distributed in the cluster. If that is the case it will prepare the list of commands
* to send to the workers to apply the same changes remotely.
*/
List *
PreprocessRenameTextSearchConfigurationStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
RenameStmt *stmt = castNode(RenameStmt, node);
Assert(stmt->renameType == OBJECT_TSCONFIGURATION);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSCONFIGURATION);
QualifyTreeNode((Node *) stmt);
char *ddlCommand = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) ddlCommand,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessRenameTextSearchDictionaryStmt verifies if the dictionary being altered
* is distributed in the cluster. If that is the case it will prepare the list of commands
* to send to the workers to apply the same changes remotely.
*/
List *
PreprocessRenameTextSearchDictionaryStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
RenameStmt *stmt = castNode(RenameStmt, node);
Assert(stmt->renameType == OBJECT_TSDICTIONARY);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSDICTIONARY);
QualifyTreeNode((Node *) stmt);
char *ddlCommand = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) ddlCommand,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterTextSearchConfigurationSchemaStmt verifies if the configuration being
* altered is distributed in the cluster. If that is the case it will prepare the list of
* commands to send to the workers to apply the same changes remotely.
*/
List *
PreprocessAlterTextSearchConfigurationSchemaStmt(Node *node, const char *queryString,
ProcessUtilityContext
processUtilityContext)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_TSCONFIGURATION);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt,
stmt->missing_ok);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSCONFIGURATION);
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterTextSearchDictionarySchemaStmt verifies if the dictionary being
* altered is distributed in the cluster. If that is the case it will prepare the list of
* commands to send to the workers to apply the same changes remotely.
*/
List *
PreprocessAlterTextSearchDictionarySchemaStmt(Node *node, const char *queryString,
ProcessUtilityContext
processUtilityContext)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_TSDICTIONARY);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt,
stmt->missing_ok);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSDICTIONARY);
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessAlterTextSearchConfigurationSchemaStmt is invoked after the schema has been
* changed locally. Since changing the schema could result in new dependencies being found
* for this object we re-ensure all the dependencies for the configuration do exist. This
* is solely to propagate the new schema (and all its dependencies) if it was not already
* distributed in the cluster.
*/
List *
PostprocessAlterTextSearchConfigurationSchemaStmt(Node *node, const char *queryString)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_TSCONFIGURATION);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt,
stmt->missing_ok);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
/* dependencies have changed (schema) let's ensure they exist */
EnsureDependenciesExistOnAllNodes(&address);
return NIL;
}
/*
* PostprocessAlterTextSearchDictionarySchemaStmt is invoked after the schema has been
* changed locally. Since changing the schema could result in new dependencies being found
* for this object we re-ensure all the dependencies for the dictionary do exist. This
* is solely to propagate the new schema (and all its dependencies) if it was not already
* distributed in the cluster.
*/
List *
PostprocessAlterTextSearchDictionarySchemaStmt(Node *node, const char *queryString)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_TSDICTIONARY);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt,
stmt->missing_ok);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
/* dependencies have changed (schema) let's ensure they exist */
EnsureDependenciesExistOnAllNodes(&address);
return NIL;
}
/*
* PreprocessTextSearchConfigurationCommentStmt propagates any comment on a distributed
* configuration to the workers. Since comments for configurations are prominently shown
* when listing all text search configurations this is purely a cosmetic thing when
* running in MX.
*/
List *
PreprocessTextSearchConfigurationCommentStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
CommentStmt *stmt = castNode(CommentStmt, node);
Assert(stmt->objtype == OBJECT_TSCONFIGURATION);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSCONFIGURATION);
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessTextSearchDictionaryCommentStmt propagates any comment on a distributed
* dictionary to the workers. Since comments for dictionaries are prominently shown
* when listing all text search dictionaries this is purely a cosmetic thing when
* running in MX.
*/
List *
PreprocessTextSearchDictionaryCommentStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
CommentStmt *stmt = castNode(CommentStmt, node);
Assert(stmt->objtype == OBJECT_TSDICTIONARY);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSDICTIONARY);
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterTextSearchConfigurationOwnerStmt verifies if the configuration being
* altered is distributed in the cluster. If that is the case it will prepare the list of
* commands to send to the workers to apply the same changes remotely.
*/
List *
PreprocessAlterTextSearchConfigurationOwnerStmt(Node *node, const char *queryString,
ProcessUtilityContext
processUtilityContext)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_TSCONFIGURATION);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSCONFIGURATION);
QualifyTreeNode((Node *) stmt);
char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterTextSearchDictionaryOwnerStmt verifies if the dictionary being
* altered is distributed in the cluster. If that is the case it will prepare the list of
* commands to send to the workers to apply the same changes remotely.
*/
List *
PreprocessAlterTextSearchDictionaryOwnerStmt(Node *node, const char *queryString,
ProcessUtilityContext
processUtilityContext)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_TSDICTIONARY);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_TSDICTIONARY);
QualifyTreeNode((Node *) stmt);
char *sql = DeparseTreeNode((Node *) stmt);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessAlterTextSearchConfigurationOwnerStmt is invoked after the owner has been
* changed locally. Since changing the owner could result in new dependencies being found
* for this object we re-ensure all the dependencies for the configuration do exist. This
* is solely to propagate the new owner (and all its dependencies) if it was not already
* distributed in the cluster.
*/
List *
PostprocessAlterTextSearchConfigurationOwnerStmt(Node *node, const char *queryString)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_TSCONFIGURATION);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
/* dependencies have changed (owner) let's ensure they exist */
EnsureDependenciesExistOnAllNodes(&address);
return NIL;
}
/*
* PostprocessAlterTextSearchDictionaryOwnerStmt is invoked after the owner has been
* changed locally. Since changing the owner could result in new dependencies being found
* for this object we re-ensure all the dependencies for the dictionary do exist. This
* is solely to propagate the new owner (and all its dependencies) if it was not already
* distributed in the cluster.
*/
List *
PostprocessAlterTextSearchDictionaryOwnerStmt(Node *node, const char *queryString)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_TSDICTIONARY);
ObjectAddress address = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&address))
{
return NIL;
}
/* dependencies have changed (owner) let's ensure they exist */
EnsureDependenciesExistOnAllNodes(&address);
return NIL;
}
/*
* GetTextSearchConfigDefineStmt returns the DefineStmt for a TEXT SEARCH CONFIGURATION
* based on the configuration as defined in the catalog identified by tsconfigOid.

View File

@ -43,6 +43,7 @@
/* local function forward declarations */
static char * GetAlterTriggerStateCommand(Oid triggerId);
static bool IsCreateCitusTruncateTriggerStmt(CreateTrigStmt *createTriggerStmt);
static Value * GetAlterTriggerDependsTriggerNameValue(AlterObjectDependsStmt *
alterTriggerDependsStmt);
@ -96,6 +97,18 @@ GetExplicitTriggerCommandList(Oid relationId)
createTriggerCommandList = lappend(
createTriggerCommandList,
makeTableDDLCommandString(createTriggerCommand));
/*
* Appends the commands for the trigger settings that are not covered
* by the CREATE TRIGGER command, such as ALTER TABLE ENABLE/DISABLE <trigger>.
*/
char *alterTriggerStateCommand =
GetAlterTriggerStateCommand(triggerId);
createTriggerCommandList = lappend(
createTriggerCommandList,
makeTableDDLCommandString(alterTriggerStateCommand));
}
/* revert back to original search_path */
@ -105,6 +118,72 @@ GetExplicitTriggerCommandList(Oid relationId)
}
/*
* GetAlterTriggerStateCommand returns the DDL command to set enable/disable
* state for the given trigger. Throws an error if no such trigger exists.
*/
static char *
GetAlterTriggerStateCommand(Oid triggerId)
{
StringInfo alterTriggerStateCommand = makeStringInfo();
bool missingOk = false;
HeapTuple triggerTuple = GetTriggerTupleById(triggerId, missingOk);
Form_pg_trigger triggerForm = (Form_pg_trigger) GETSTRUCT(triggerTuple);
char *qualifiedRelName = generate_qualified_relation_name(triggerForm->tgrelid);
const char *quotedTrigName = quote_identifier(NameStr(triggerForm->tgname));
char enableDisableState = triggerForm->tgenabled;
const char *alterTriggerStateStr = NULL;
switch (enableDisableState)
{
case TRIGGER_FIRES_ON_ORIGIN:
{
/* default mode */
alterTriggerStateStr = "ENABLE";
break;
}
case TRIGGER_FIRES_ALWAYS:
{
alterTriggerStateStr = "ENABLE ALWAYS";
break;
}
case TRIGGER_FIRES_ON_REPLICA:
{
alterTriggerStateStr = "ENABLE REPLICA";
break;
}
case TRIGGER_DISABLED:
{
alterTriggerStateStr = "DISABLE";
break;
}
default:
{
elog(ERROR, "unexpected trigger state");
}
}
appendStringInfo(alterTriggerStateCommand, "ALTER TABLE %s %s TRIGGER %s;",
qualifiedRelName, alterTriggerStateStr, quotedTrigName);
/*
* Free triggerTuple at the end since quote_identifier() might not return
* a palloc'd string if the given identifier doesn't need to be quoted, and in
* that case quotedTrigName would still be bound to triggerTuple.
*/
heap_freetuple(triggerTuple);
return alterTriggerStateCommand->data;
}
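/*
 * Illustration (hypothetical names): for a trigger my_trigger on
 * my_schema.my_table that has been disabled, this function returns
 *
 *   ALTER TABLE my_schema.my_table DISABLE TRIGGER my_trigger;
 */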
/*
* GetTriggerTupleById returns a copy of the heap tuple from pg_trigger for
* the trigger with triggerId. If no such trigger exists, this function returns
@ -712,7 +791,7 @@ CitusCreateTriggerCommandDDLJob(Oid relationId, char *triggerName,
const char *queryString)
{
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ddlJob->targetRelationId = relationId;
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->metadataSyncCommand = queryString;
if (!triggerName)

View File

@ -40,15 +40,10 @@
#include "utils/rel.h"
#define LOCK_RELATION_IF_EXISTS "SELECT lock_relation_if_exists(%s, '%s');"
/* Local functions forward declarations for unsupported command checks */
static void ErrorIfUnsupportedTruncateStmt(TruncateStmt *truncateStatement);
static void ExecuteTruncateStmtSequentialIfNecessary(TruncateStmt *command);
static void EnsurePartitionTableNotReplicatedForTruncate(TruncateStmt *truncateStatement);
static void LockTruncatedRelationMetadataInWorkers(TruncateStmt *truncateStatement);
static void AcquireDistributedLockOnRelations(List *relationIdList, LOCKMODE lockMode);
static List * TruncateTaskList(Oid relationId);
@ -248,7 +243,13 @@ PreprocessTruncateStatement(TruncateStmt *truncateStatement)
ErrorIfUnsupportedTruncateStmt(truncateStatement);
EnsurePartitionTableNotReplicatedForTruncate(truncateStatement);
ExecuteTruncateStmtSequentialIfNecessary(truncateStatement);
LockTruncatedRelationMetadataInWorkers(truncateStatement);
uint32 lockAcquiringMode = truncateStatement->behavior == DROP_CASCADE ?
DIST_LOCK_REFERENCING_TABLES :
DIST_LOCK_DEFAULT;
AcquireDistributedLockOnRelations(truncateStatement->relations, AccessExclusiveLock,
lockAcquiringMode);
}
@ -345,131 +346,3 @@ ExecuteTruncateStmtSequentialIfNecessary(TruncateStmt *command)
}
}
}
/*
* LockTruncatedRelationMetadataInWorkers determines if a distributed
* lock is necessary for truncated relations, and acquires the locks.
*
* LockTruncatedRelationMetadataInWorkers handles distributed locking
* of truncated tables before standard utility takes over.
*
* Actual distributed truncation occurs inside truncate trigger.
*
* This is only for distributed serialization of truncate commands.
* The function assumes that there is no foreign key relation between
* non-distributed and distributed relations.
*/
static void
LockTruncatedRelationMetadataInWorkers(TruncateStmt *truncateStatement)
{
List *distributedRelationList = NIL;
/* nothing to do if there is no metadata at worker nodes */
if (!ClusterHasKnownMetadataWorkers())
{
return;
}
RangeVar *rangeVar = NULL;
foreach_ptr(rangeVar, truncateStatement->relations)
{
Oid relationId = RangeVarGetRelid(rangeVar, NoLock, false);
Oid referencingRelationId = InvalidOid;
if (!IsCitusTable(relationId))
{
continue;
}
if (list_member_oid(distributedRelationList, relationId))
{
continue;
}
distributedRelationList = lappend_oid(distributedRelationList, relationId);
CitusTableCacheEntry *cacheEntry = GetCitusTableCacheEntry(relationId);
Assert(cacheEntry != NULL);
List *referencingTableList = cacheEntry->referencingRelationsViaForeignKey;
foreach_oid(referencingRelationId, referencingTableList)
{
distributedRelationList = list_append_unique_oid(distributedRelationList,
referencingRelationId);
}
}
if (distributedRelationList != NIL)
{
AcquireDistributedLockOnRelations(distributedRelationList, AccessExclusiveLock);
}
}
/*
* AcquireDistributedLockOnRelations acquires a distributed lock on worker nodes
* for the given list of relation ids. The relation id list and worker node list
* are sorted so that the locks are acquired in the same order regardless of which
* node the command was run on. Notice that no lock is acquired on the coordinator node.
*
* Notice that the locking function is sent to all workers regardless of whether
* they have metadata or not. This is because a worker node only knows itself
* and the workers that had metadata sync turned on before it; it does not
* know about other nodes that have metadata sync turned on afterwards.
*/
static void
AcquireDistributedLockOnRelations(List *relationIdList, LOCKMODE lockMode)
{
Oid relationId = InvalidOid;
List *workerNodeList = ActivePrimaryNodeList(NoLock);
const char *lockModeText = LockModeToLockModeText(lockMode);
/*
* We want to acquire locks in the same order across the nodes.
* Although relation ids may change, their ordering will not.
*/
relationIdList = SortList(relationIdList, CompareOids);
workerNodeList = SortList(workerNodeList, CompareWorkerNodes);
UseCoordinatedTransaction();
int32 localGroupId = GetLocalGroupId();
foreach_oid(relationId, relationIdList)
{
/*
* We only acquire a distributed lock on a relation if
* the relation is sync'ed between mx nodes.
*
* Even if users disable metadata sync, we cannot
* allow them to skip acquiring the remote locks.
* Hence, we have the !IsCoordinator() check.
*/
if (ShouldSyncTableMetadata(relationId) || !IsCoordinator())
{
char *qualifiedRelationName = generate_qualified_relation_name(relationId);
StringInfo lockRelationCommand = makeStringInfo();
appendStringInfo(lockRelationCommand, LOCK_RELATION_IF_EXISTS,
quote_literal_cstr(qualifiedRelationName),
lockModeText);
WorkerNode *workerNode = NULL;
foreach_ptr(workerNode, workerNodeList)
{
const char *nodeName = workerNode->workerName;
int nodePort = workerNode->workerPort;
/* if local node is one of the targets, acquire the lock locally */
if (workerNode->groupId == localGroupId)
{
LockRelationOid(relationId, lockMode);
continue;
}
SendCommandToWorker(nodeName, nodePort, lockRelationCommand->data);
}
}
}
}
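/*
 * Illustration (hypothetical names): for my_schema.my_table and
 * AccessExclusiveLock, each remote node receives
 *
 *   SELECT lock_relation_if_exists('my_schema.my_table', 'ACCESS EXCLUSIVE');
 *
 * assuming LockModeToLockModeText() renders lock modes the way LOCK TABLE
 * spells them; the local node takes the lock via LockRelationOid() instead.
 */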

View File

@ -90,8 +90,6 @@
bool EnableCreateTypePropagation = true;
/* forward declaration for helper functions*/
static List * FilterNameListForDistributedTypes(List *objects, bool missing_ok);
static List * TypeNameListToObjectAddresses(List *objects);
static TypeName * MakeTypeNameFromRangeVar(const RangeVar *relation);
static Oid GetTypeOwner(Oid typeOid);
static Oid LookupNonAssociatedArrayTypeNameOid(ParseState *pstate,
@ -104,365 +102,6 @@ static List * CompositeTypeColumnDefList(Oid typeOid);
static CreateEnumStmt * RecreateEnumStmt(Oid typeOid);
static List * EnumValsList(Oid typeOid);
static bool ShouldPropagateTypeCreate(void);
/*
* PreprocessCompositeTypeStmt is called during the creation of a composite type. It is executed
* before the statement is applied locally.
*
* We decide if the composite type needs to be replicated to the workers, and if that is
* the case return a list of DDLJobs that describe how and where the type needs to be
* created.
*
* Since the planning happens before the statement has been applied locally we do not have
* access to the ObjectAddress of the new type.
*/
List *
PreprocessCompositeTypeStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
if (!ShouldPropagateTypeCreate())
{
return NIL;
}
/*
* managing types can only be done on the coordinator if ddl propagation is on. when
* it is off we will never get here
*/
EnsureCoordinator();
/* fully qualify before lookup and later deparsing */
QualifyTreeNode(node);
return NIL;
}
/*
* PostprocessCompositeTypeStmt is executed after the type has been created locally and before
* we create it on the remote servers. Here we have access to the ObjectAddress of the new
* type which we use to make sure the type's dependencies are on all nodes.
*/
List *
PostprocessCompositeTypeStmt(Node *node, const char *queryString)
{
/* same check we perform during planning of the statement */
if (!ShouldPropagateTypeCreate())
{
return NIL;
}
/*
* find the object address of the just created object; because the type has been created
* locally it can't be missing
*/
ObjectAddress typeAddress = GetObjectAddressFromParseTree(node, false);
/* If the type has any unsupported dependency, create it locally */
DeferredErrorMessage *errMsg = DeferErrorIfHasUnsupportedDependency(&typeAddress);
if (errMsg != NULL)
{
RaiseDeferredError(errMsg, WARNING);
return NIL;
}
/*
* when we allow propagation within a transaction block we should make sure to only
* allow this in sequential mode
*/
EnsureSequentialMode(OBJECT_TYPE);
EnsureDependenciesExistOnAllNodes(&typeAddress);
/*
* reconstruct creation statement in a portable fashion. The create_or_replace helper
* function will be used to create the type in an idempotent manner on the workers.
*
* Types could exist on the worker prior to being created on the coordinator when a
* previous attempt to create the type happened in a transaction that did not
* commit on the coordinator.
*/
const char *compositeTypeStmtSql = DeparseCompositeTypeStmt(node);
compositeTypeStmtSql = WrapCreateOrReplace(compositeTypeStmtSql);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) compositeTypeStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
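/*
 * Illustration (hypothetical type name): for CREATE TYPE my_schema.pair AS
 * (a int, b int) the command shipped to the other nodes looks like
 *
 *   SELECT worker_create_or_replace_object('CREATE TYPE my_schema.pair AS (a integer, b integer);');
 *
 * so the statement can be applied even when the type already exists on a
 * worker from an earlier uncommitted attempt.
 */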
/*
* PreprocessAlterTypeStmt is invoked for alter type statements for composite types.
*
* Normally we would have a process step as well to re-ensure dependencies exist; however,
* this is already implemented by the post processing for adding columns to tables.
*/
List *
PreprocessAlterTypeStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterTableStmt *stmt = castNode(AlterTableStmt, node);
Assert(AlterTableStmtObjType_compat(stmt) == OBJECT_TYPE);
ObjectAddress typeAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&typeAddress))
{
return NIL;
}
EnsureCoordinator();
/* reconstruct alter statement in a portable fashion */
QualifyTreeNode((Node *) stmt);
const char *alterTypeStmtSql = DeparseTreeNode((Node *) stmt);
/*
* all types that are distributed will need their alter statements propagated
* regardless of whether we are in a transaction or not. If we did not propagate the
* alter statement the types would be different on the workers and the coordinator.
*/
EnsureSequentialMode(OBJECT_TYPE);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) alterTypeStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessCreateEnumStmt is called before the statement gets applied locally.
*
* It decides if the create statement will be applied to the workers and if that is the
* case returns a list of DDLJobs that will be executed _after_ the statement has been
* applied locally.
*
* Since planning is done before we have created the object locally we do not have an
* ObjectAddress for the new type just yet.
*/
List *
PreprocessCreateEnumStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
if (!ShouldPropagateTypeCreate())
{
return NIL;
}
/* managing types can only be done on the coordinator */
EnsureCoordinator();
/* enforce fully qualified typeName for correct deparsing and lookup */
QualifyTreeNode(node);
return NIL;
}
/*
* PostprocessCreateEnumStmt is called after the statement has been applied locally, but
* before the plan on how to create the types on the workers has been executed.
*
* We apply the same checks to verify if the type should be distributed, if that is the
* case we resolve the ObjectAddress for the just created object, distribute its
* dependencies to all the nodes, and mark the object as distributed.
*/
List *
PostprocessCreateEnumStmt(Node *node, const char *queryString)
{
if (!ShouldPropagateTypeCreate())
{
return NIL;
}
/* lookup type address of just created type */
ObjectAddress typeAddress = GetObjectAddressFromParseTree(node, false);
DeferredErrorMessage *errMsg = DeferErrorIfHasUnsupportedDependency(&typeAddress);
if (errMsg != NULL)
{
RaiseDeferredError(errMsg, WARNING);
return NIL;
}
/*
* when we allow propagation within a transaction block we should make sure to only
* allow this in sequential mode
*/
EnsureSequentialMode(OBJECT_TYPE);
EnsureDependenciesExistOnAllNodes(&typeAddress);
/* reconstruct creation statement in a portable fashion */
const char *createEnumStmtSql = DeparseCreateEnumStmt(node);
createEnumStmtSql = WrapCreateOrReplace(createEnumStmtSql);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) createEnumStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessAlterEnumStmt handles ALTER TYPE ... ADD VALUE for enum based types. Planning
* happens before the statement has been applied locally.
*
* Since it is an alter of an existing type we actually have the ObjectAddress. This is
* used to check if the type is distributed; if so, the alter will be executed on the
* workers directly to keep the types in sync across the cluster.
*/
List *
PreprocessAlterEnumStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
ObjectAddress typeAddress = GetObjectAddressFromParseTree(node, false);
if (!ShouldPropagateObject(&typeAddress))
{
return NIL;
}
/*
* alter enum will run for all distributed enums, regardless of whether we are in a
* transaction or not, since otherwise the enum would be different on the coordinator
* and the workers. (adding values to an enum cannot run in a transaction anyway and
* would already error out in postgres).
*/
EnsureSequentialMode(OBJECT_TYPE);
/*
* managing types can only be done on the coordinator if ddl propagation is on. when
* it is off we will never get here
*/
EnsureCoordinator();
QualifyTreeNode(node);
const char *alterEnumStmtSql = DeparseTreeNode(node);
/*
* Before pg12 ALTER ENUM ... ADD VALUE could not be within a xact block. Instead of
* creating a DDLTaskList we won't return anything here. During the processing phase
* we directly connect to workers and execute the commands remotely.
*/
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) alterEnumStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessDropTypeStmt is called for all DROP TYPE statements. For all types in the list that
* citus has distributed to the workers it will drop the type on the workers as well. If
* no types in the drop list are distributed no calls will be made to the workers.
*/
List *
PreprocessDropTypeStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
/*
* We swap the list of objects to remove during deparsing, so we need a reference back
* to the old list to restore afterwards
*/
List *oldTypes = stmt->objects;
if (!ShouldPropagate())
{
return NIL;
}
List *distributedTypes = FilterNameListForDistributedTypes(oldTypes,
stmt->missing_ok);
if (list_length(distributedTypes) <= 0)
{
/* no distributed types to drop */
return NIL;
}
/*
* managing types can only be done on the coordinator if ddl propagation is on. when
* it is off we will never get here. MX workers don't have a notion of distributed
* types, so we block the call.
*/
EnsureCoordinator();
/*
* remove the entries for the distributed objects on dropping
*/
List *distributedTypeAddresses = TypeNameListToObjectAddresses(distributedTypes);
ObjectAddress *address = NULL;
foreach_ptr(address, distributedTypeAddresses)
{
UnmarkObjectDistributed(address);
}
/*
* temporarily swap the lists of objects to delete with the distributed objects and
* deparse to an executable sql statement for the workers
*/
stmt->objects = distributedTypes;
char *dropStmtSql = DeparseTreeNode((Node *) stmt);
stmt->objects = oldTypes;
EnsureSequentialMode(OBJECT_TYPE);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
dropStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessRenameTypeStmt is called when the user is renaming the type. The invocation happens
* before the statement is applied locally.
*
* As the type already exists we have access to the ObjectAddress for the type; this is
* used to check if the type is distributed. If the type is distributed the rename is
* executed on all the workers to keep the types in sync across the cluster.
*/
List *
PreprocessRenameTypeStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
ObjectAddress typeAddress = GetObjectAddressFromParseTree(node, false);
if (!ShouldPropagateObject(&typeAddress))
{
return NIL;
}
EnsureCoordinator();
/* fully qualify */
QualifyTreeNode(node);
/* deparse sql */
const char *renameStmtSql = DeparseTreeNode(node);
EnsureSequentialMode(OBJECT_TYPE);
/* to prevent recursion with mx we disable ddl propagation */
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) renameStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PreprocessRenameTypeAttributeStmt is called for changes of attribute names for composite
* types. Planning is called before the statement is applied locally.
@ -499,98 +138,6 @@ PreprocessRenameTypeAttributeStmt(Node *node, const char *queryString,
}
/*
* PreprocessAlterTypeSchemaStmt is executed before the statement is applied to the local
* postgres instance.
*
* In this stage we can prepare the commands that need to be run on all workers.
*/
List *
PreprocessAlterTypeSchemaStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_TYPE);
ObjectAddress typeAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&typeAddress))
{
return NIL;
}
EnsureCoordinator();
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
EnsureSequentialMode(OBJECT_TYPE);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* PostprocessAlterTypeSchemaStmt is executed after the change has been applied locally; we
* can now use the new dependencies of the type to ensure all its dependencies exist on
* the workers before we apply the commands remotely.
*/
List *
PostprocessAlterTypeSchemaStmt(Node *node, const char *queryString)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_TYPE);
ObjectAddress typeAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&typeAddress))
{
return NIL;
}
/* dependencies have changed (schema) let's ensure they exist */
EnsureDependenciesExistOnAllNodes(&typeAddress);
return NIL;
}
/*
* PreprocessAlterTypeOwnerStmt is called for change of ownership of types before the
* ownership is changed on the local instance.
*
* If the type for which the owner is changed is distributed we execute the change on all
* the workers to keep the type in sync across the cluster.
*/
List *
PreprocessAlterTypeOwnerStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_TYPE);
ObjectAddress typeAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (!ShouldPropagateObject(&typeAddress))
{
return NIL;
}
EnsureCoordinator();
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
EnsureSequentialMode(OBJECT_TYPE);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) sql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
/*
* CreateTypeStmtByObjectAddress returns a parsetree for the CREATE TYPE statement to
* recreate the type by its object address.
@ -612,6 +159,11 @@ CreateTypeStmtByObjectAddress(const ObjectAddress *address)
return (Node *) RecreateCompositeTypeStmt(address->objectId);
}
case TYPTYPE_DOMAIN:
{
return (Node *) RecreateDomainStmt(address->objectId);
}
default:
{
ereport(ERROR, (errmsg("unsupported type to generate create statement for"),
@ -854,7 +406,7 @@ ObjectAddress
AlterTypeSchemaStmtObjectAddress(Node *node, bool missing_ok)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_TYPE);
Assert(stmt->objectType == OBJECT_TYPE || stmt->objectType == OBJECT_DOMAIN);
List *names = (List *) stmt->object;
@ -1046,60 +598,6 @@ GenerateBackupNameForTypeCollision(const ObjectAddress *address)
}
/*
* FilterNameListForDistributedTypes takes a list of objects to delete, for Types this
* will be a list of TypeName. This list is filtered against the types that are
* distributed.
*
* The original list will not be touched; a new list will be created containing only
* the distributed types.
*/
static List *
FilterNameListForDistributedTypes(List *objects, bool missing_ok)
{
List *result = NIL;
TypeName *typeName = NULL;
foreach_ptr(typeName, objects)
{
Oid typeOid = LookupTypeNameOid(NULL, typeName, missing_ok);
ObjectAddress typeAddress = { 0 };
if (!OidIsValid(typeOid))
{
continue;
}
ObjectAddressSet(typeAddress, TypeRelationId, typeOid);
if (IsObjectDistributed(&typeAddress))
{
result = lappend(result, typeName);
}
}
return result;
}
/*
* TypeNameListToObjectAddresses transforms a List * of TypeName *'s into a List * of
* ObjectAddress *'s. For this to succeed all Types identified by the TypeName *'s should
* exist on this postgres instance; otherwise an error will be thrown.
*/
static List *
TypeNameListToObjectAddresses(List *objects)
{
List *result = NIL;
TypeName *typeName = NULL;
foreach_ptr(typeName, objects)
{
Oid typeOid = LookupTypeNameOid(NULL, typeName, false);
ObjectAddress *typeAddress = palloc0(sizeof(ObjectAddress));
ObjectAddressSet(*typeAddress, TypeRelationId, typeOid);
result = lappend(result, typeAddress);
}
return result;
}
/*
* GetTypeOwner
*
@ -1140,47 +638,6 @@ MakeTypeNameFromRangeVar(const RangeVar *relation)
}
/*
* ShouldPropagateTypeCreate returns if we should propagate the creation of a type.
*
* There are two situations in which we decide not to directly propagate the creation of a type.
* - During the creation of an Extension; we assume the type will be created by creating
* the extension on the worker
* - During a transaction block; if types are used in a distributed table in the same
* block we can only provide parallelism on the table if we do not change to sequential
* mode. Types will be propagated outside of this transaction to the workers so that
* the transaction can use 1 connection per shard and fully utilize citus' parallelism
*/
static bool
ShouldPropagateTypeCreate()
{
if (!ShouldPropagate())
{
return false;
}
if (!EnableCreateTypePropagation)
{
/*
* Administrator has turned off type creation propagation
*/
return false;
}
/*
* by not propagating in a transaction block we allow for parallelism to be used when
* this type will be used as a column in a table that will be created and distributed
* in this same transaction.
*/
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return false;
}
return true;
}
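/*
 * Illustration: assuming EnableCreateTypePropagation is backed by the
 * citus.enable_create_type_propagation GUC, an administrator can opt out with
 *
 *   SET citus.enable_create_type_propagation TO off;
 *   CREATE TYPE local_only AS (a int);  -- created only on the current node
 */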
/*
* LookupNonAssociatedArrayTypeNameOid returns the oid of the type with the given type name
* that is not an array type that is associated to another user defined type.

View File

@ -42,6 +42,7 @@
#include "commands/defrem.h"
#include "commands/tablecmds.h"
#include "distributed/adaptive_executor.h"
#include "distributed/backend_data.h"
#include "distributed/colocation_utils.h"
#include "distributed/commands.h"
#include "distributed/commands/multi_copy.h"
@ -53,6 +54,7 @@
#include "distributed/listutils.h"
#include "distributed/local_executor.h"
#include "distributed/maintenanced.h"
#include "distributed/multi_logical_replication.h"
#include "distributed/multi_partitioning_utils.h"
#if PG_VERSION_NUM < 140000
#include "distributed/metadata_cache.h"
@ -64,6 +66,7 @@
#include "distributed/multi_physical_planner.h"
#include "distributed/reference_table_utils.h"
#include "distributed/resource_lock.h"
#include "distributed/string_utils.h"
#include "distributed/transmit.h"
#include "distributed/version_compat.h"
#include "distributed/worker_shard_visibility.h"
@ -77,6 +80,7 @@
#include "utils/lsyscache.h"
#include "utils/syscache.h"
bool EnableDDLPropagation = true; /* ddl propagation is enabled */
int CreateObjectPropagationMode = CREATE_OBJECT_PROPAGATION_IMMEDIATE;
PropSetCmdBehavior PropagateSetCommands = PROPSETCMD_NONE; /* SET prop off */
@ -108,9 +112,6 @@ static void DecrementUtilityHookCountersIfNecessary(Node *parsetree);
static bool IsDropSchemaOrDB(Node *parsetree);
static bool ShouldCheckUndistributeCitusLocalTables(void);
static bool ShouldAddNewTableToMetadata(Node *parsetree);
static bool ServerUsesPostgresFDW(char *serverName);
static void ErrorIfOptionListHasNoTableName(List *optionList);
/*
* ProcessUtilityParseTree is a convenience method to create a PlannedStmt out of
@ -164,7 +165,6 @@ multi_ProcessUtility(PlannedStmt *pstmt,
parsetree = pstmt->utilityStmt;
if (IsA(parsetree, TransactionStmt) ||
IsA(parsetree, LockStmt) ||
IsA(parsetree, ListenStmt) ||
IsA(parsetree, NotifyStmt) ||
IsA(parsetree, ExecuteStmt) ||
@ -409,6 +409,31 @@ ProcessUtilityInternal(PlannedStmt *pstmt,
parsetree = ProcessCreateSubscriptionStmt(createSubStmt);
}
if (IsA(parsetree, AlterSubscriptionStmt))
{
AlterSubscriptionStmt *alterSubStmt = (AlterSubscriptionStmt *) parsetree;
if (!superuser() &&
StringStartsWith(alterSubStmt->subname,
SHARD_MOVE_SUBSCRIPTION_PREFIX))
{
ereport(ERROR, (
errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("Only superusers can alter shard move subscriptions")));
}
}
if (IsA(parsetree, DropSubscriptionStmt))
{
DropSubscriptionStmt *dropSubStmt = (DropSubscriptionStmt *) parsetree;
if (!superuser() &&
StringStartsWith(dropSubStmt->subname, SHARD_MOVE_SUBSCRIPTION_PREFIX))
{
ereport(ERROR, (
errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("Only superusers can drop shard move subscriptions")));
}
}
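/*
 * Illustration (hypothetical subscription name): assuming
 * SHARD_MOVE_SUBSCRIPTION_PREFIX expands to "citus_shard_move_subscription_",
 * a non-superuser running
 *
 *   DROP SUBSCRIPTION citus_shard_move_subscription_10;
 *
 * hits the guard above, while regular subscriptions are unaffected.
 */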
/*
* Process SET LOCAL and SET TRANSACTION statements in multi-statement
* transactions.
@ -505,6 +530,18 @@ ProcessUtilityInternal(PlannedStmt *pstmt,
PreprocessTruncateStatement((TruncateStmt *) parsetree);
}
if (IsA(parsetree, LockStmt))
{
/*
* PreprocessLockStatement might lock the relations locally if the
* node executing the command is in pg_dist_node. Even though standard process
* utility will re-acquire the locks on the same relations if the node
* is in the metadata (in the pg_dist_node table), that should not be a problem;
* it also ensures a consistent locking order between the nodes.
*/
PreprocessLockStatement((LockStmt *) parsetree, context);
}
/*
* We only process ALTER TABLE ... ATTACH PARTITION commands in the function below
* and distribute the partition if necessary.
@ -525,6 +562,20 @@ ProcessUtilityInternal(PlannedStmt *pstmt,
parsetree = pstmt->utilityStmt;
ops = GetDistributeObjectOps(parsetree);
/*
* For some statements Citus defines a Qualify function. The goal of this function
* is to remove from the statement any ambiguity that depends on either the
* search_path or current settings.
* Instead of relying on the search_path and settings we replace any deduced bits
* and fill them out the way postgres would resolve them. This makes subsequent
* deserialize calls for the statement portable to other postgres servers, the
* workers in our case.
*/
if (ops && ops->qualify)
{
ops->qualify(parsetree);
}
if (ops && ops->preprocess)
{
ddlJobs = ops->preprocess(parsetree, queryString, context);
@ -575,7 +626,7 @@ ProcessUtilityInternal(PlannedStmt *pstmt,
errhint("You can manually create a database and its "
"extensions on workers.")));
}
else if (IsA(parsetree, CreateRoleStmt))
else if (IsA(parsetree, CreateRoleStmt) && !EnableCreateRolePropagation)
{
ereport(NOTICE, (errmsg("not propagating CREATE ROLE/USER commands to worker"
" nodes"),
@ -605,6 +656,24 @@ ProcessUtilityInternal(PlannedStmt *pstmt,
StopMaintenanceDaemon(MyDatabaseId);
}
/*
* Make sure that dropping the role deletes the pg_dist_object entries. There is
* separate logic for roles, since roles are not included as dropped objects in the
* drop event trigger. To handle it on both worker and coordinator nodes, it is not
* implemented as a part of the process functions but here.
*/
if (IsA(parsetree, DropRoleStmt))
{
DropRoleStmt *stmt = castNode(DropRoleStmt, parsetree);
List *allDropRoles = stmt->roles;
List *distributedDropRoles = FilterDistributedRoles(allDropRoles);
if (list_length(distributedDropRoles) > 0)
{
UnmarkRolesDistributed(distributedDropRoles);
}
}
pstmt->utilityStmt = parsetree;
PG_TRY();
@ -691,18 +760,6 @@ ProcessUtilityInternal(PlannedStmt *pstmt,
CreateStmt *createTableStmt = (CreateStmt *) (&createForeignTableStmt->base);
/*
* Error out with a hint if the foreign table is using postgres_fdw and
* the option table_name is not provided.
* Citus relays all the Citus local foreign table logic to the placement of the
* Citus local table. If table_name is NOT provided, Citus would try to talk to
* the foreign postgres table over the shard's table name, which would not exist
* on the remote server.
*/
if (ServerUsesPostgresFDW(createForeignTableStmt->servername))
{
ErrorIfOptionListHasNoTableName(createForeignTableStmt->options);
}
PostprocessCreateTableStmt(createTableStmt, queryString);
}
@ -714,6 +771,21 @@ ProcessUtilityInternal(PlannedStmt *pstmt,
{
PostprocessAlterTableStmt(castNode(AlterTableStmt, parsetree));
}
if (IsA(parsetree, GrantStmt))
{
GrantStmt *grantStmt = (GrantStmt *) parsetree;
if (grantStmt->targtype == ACL_TARGET_ALL_IN_SCHEMA)
{
/*
* Grant .. IN SCHEMA causes a deadlock if we don't use local execution
* because standard process utility processes the shard placements as well
* and the row-level locks in pg_class will not be released until the current
* transaction commits. We could skip the local shard placements after standard
* process utility, but for simplicity we just prefer using local execution.
*/
SetLocalExecutionStatus(LOCAL_EXECUTION_REQUIRED);
}
}
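/*
 * Illustration (app_user is a hypothetical role): a statement such as
 *
 *   GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_user;
 *
 * has targtype == ACL_TARGET_ALL_IN_SCHEMA and therefore forces local
 * execution for the shard placements on this node.
 */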
DDLJob *ddlJob = NULL;
foreach_ptr(ddlJob, ddlJobs)
@ -979,50 +1051,6 @@ ShouldAddNewTableToMetadata(Node *parsetree)
}
/*
* ServerUsesPostgresFDW gets a foreign server name and returns true if the FDW that
* the server depends on is postgres_fdw. Returns false otherwise.
*/
static bool
ServerUsesPostgresFDW(char *serverName)
{
ForeignServer *server = GetForeignServerByName(serverName, false);
ForeignDataWrapper *fdw = GetForeignDataWrapper(server->fdwid);
if (strcmp(fdw->fdwname, "postgres_fdw") == 0)
{
return true;
}
return false;
}
/*
* ErrorIfOptionListHasNoTableName gets an option list (DefElem) and errors out
* if the list does not contain a table_name element.
*/
static void
ErrorIfOptionListHasNoTableName(List *optionList)
{
char *table_nameString = "table_name";
DefElem *option = NULL;
foreach_ptr(option, optionList)
{
char *optionName = option->defname;
if (strcmp(optionName, table_nameString) == 0)
{
return;
}
}
ereport(ERROR, (errmsg(
"table_name option must be provided when using postgres_fdw with Citus"),
errhint("Provide the option \"table_name\" with value target table's"
" name")));
}
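/*
 * Illustration (hypothetical names): the check above rejects
 *
 *   CREATE FOREIGN TABLE points (x int, y int) SERVER pgfdw_server
 *       OPTIONS (schema_name 'public');
 *
 * but accepts it once OPTIONS also contains table_name 'points', since the
 * shard's name would otherwise be used as the remote table name.
 */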
/*
* NotifyUtilityHookConstraintDropped sets ConstraintDropped to true to tell us
* the last command dropped a table constraint.
@ -1081,16 +1109,20 @@ ExecuteDistributedDDLJob(DDLJob *ddlJob)
EnsureCoordinator();
Oid targetRelationId = ddlJob->targetRelationId;
ObjectAddress targetObjectAddress = ddlJob->targetObjectAddress;
if (OidIsValid(targetRelationId))
if (OidIsValid(targetObjectAddress.classId))
{
/*
* Only for ddlJobs that are targeting a relation (table) we want to sync
* its metadata and verify some properties around the table.
* Only for ddlJobs that are targeting an object we want to sync
* its metadata.
*/
shouldSyncMetadata = ShouldSyncTableMetadata(targetRelationId);
EnsurePartitionTableNotReplicated(targetRelationId);
shouldSyncMetadata = ShouldSyncUserCommandForObject(targetObjectAddress);
if (targetObjectAddress.classId == RelationRelationId)
{
EnsurePartitionTableNotReplicated(targetObjectAddress.objectId);
}
}
bool localExecutionSupported = true;
@ -1341,7 +1373,7 @@ CreateCustomDDLTaskList(Oid relationId, TableDDLCommand *command)
}
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ddlJob->targetRelationId = relationId;
ObjectAddressSet(ddlJob->targetObjectAddress, RelationRelationId, relationId);
ddlJob->metadataSyncCommand = GetTableDDLCommand(command);
ddlJob->taskList = taskList;
@ -1592,10 +1624,9 @@ NodeDDLTaskList(TargetWorkerSet targets, List *commands)
}
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ddlJob->targetRelationId = InvalidOid;
ddlJob->targetObjectAddress = InvalidObjectAddress;
ddlJob->metadataSyncCommand = NULL;
ddlJob->taskList = list_make1(task);
return list_make1(ddlJob);
}

View File

@ -0,0 +1,706 @@
/*-------------------------------------------------------------------------
*
* view.c
* Commands for distributing CREATE OR REPLACE VIEW statements.
*
* Copyright (c) Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "fmgr.h"
#include "access/genam.h"
#include "catalog/objectaddress.h"
#include "commands/extension.h"
#include "distributed/commands.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/deparser.h"
#include "distributed/errormessage.h"
#include "distributed/listutils.h"
#include "distributed/metadata_sync.h"
#include "distributed/metadata/dependency.h"
#include "distributed/metadata/distobject.h"
#include "distributed/multi_executor.h"
#include "distributed/namespace_utils.h"
#include "distributed/worker_transaction.h"
#include "executor/spi.h"
#include "nodes/nodes.h"
#include "nodes/pg_list.h"
#include "tcop/utility.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
static List * FilterNameListForDistributedViews(List *viewNamesList, bool missing_ok);
static void AppendQualifiedViewNameToCreateViewCommand(StringInfo buf, Oid viewOid);
static void AppendViewDefinitionToCreateViewCommand(StringInfo buf, Oid viewOid);
static void AppendAliasesToCreateViewCommand(StringInfo createViewCommand, Oid viewOid);
static void AppendOptionsToCreateViewCommand(StringInfo createViewCommand, Oid viewOid);
/*
* PreprocessViewStmt is called during the planning phase for CREATE OR REPLACE VIEW
* before it is created on the local node internally.
*/
List *
PreprocessViewStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
if (!ShouldPropagate())
{
return NIL;
}
/* check creation against multi-statement transaction policy */
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return NIL;
}
EnsureCoordinator();
return NIL;
}
/*
* PostprocessViewStmt actually creates the commands we need to run on workers to
* propagate views.
*
* If the view depends on any undistributable object, Citus cannot distribute it. In order
* not to prevent users from creating local views on the coordinator, a WARNING message will
* be sent to the user about the case instead of erroring out. If no worker nodes exist
* at all, the view will be created locally without any WARNING message.
*
* Besides creating the plan we also make sure all (new) dependencies of the view are
* created on all nodes.
*/
List *
PostprocessViewStmt(Node *node, const char *queryString)
{
ViewStmt *stmt = castNode(ViewStmt, node);
if (!ShouldPropagate())
{
return NIL;
}
/* check creation against multi-statement transaction policy */
if (!ShouldPropagateCreateInCoordinatedTransction())
{
return NIL;
}
ObjectAddress viewAddress = GetObjectAddressFromParseTree((Node *) stmt, false);
if (IsObjectAddressOwnedByExtension(&viewAddress, NULL))
{
return NIL;
}
/* If the view has any unsupported dependency, create it locally */
if (ErrorOrWarnIfObjectHasUnsupportedDependency(&viewAddress))
{
return NIL;
}
EnsureDependenciesExistOnAllNodes(&viewAddress);
char *command = CreateViewDDLCommand(viewAddress.objectId);
/*
* We'd typically use NodeDDLTaskList() for generating node-level DDL commands,
* such as when creating a type. However, views are different in the sense that
* citus tables do not depend on views; instead, views depend on citus tables.
*
* When NodeDDLTaskList() is used, it must be accompanied by sequential execution.
* Here, we do something equivalent to NodeDDLTaskList(), but using metadataSyncCommand
* field. This hack allows us to use the metadata connection
* (see `REQUIRE_METADATA_CONNECTION` flag). Meaning that, view creation is treated as
* a metadata operation.
*
* We do this mostly for performance reasons, because we cannot afford to switch to
* sequential execution, for instance when we are altering or creating distributed
* tables -- which may require significant resources.
*
* The downside of using this hack is that if a view is re-used in the same transaction
* that creates the view on the workers, we might get errors such as the one
* below, which we currently consider a decent trade-off:
*
* BEGIN;
* CREATE VIEW dist_view ..
* CREATE TABLE t2(id int, val dist_view);
*
* -- shard creation fails on one of the connections
* SELECT create_distributed_table('t2', 'id');
* ERROR: type "public.dist_view" does not exist
*
*/
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ddlJob->targetObjectAddress = viewAddress;
ddlJob->metadataSyncCommand = command;
ddlJob->taskList = NIL;
return list_make1(ddlJob);
}
/*
* ViewStmtObjectAddress returns the ObjectAddress for the subject of the
* CREATE [OR REPLACE] VIEW statement.
*/
ObjectAddress
ViewStmtObjectAddress(Node *node, bool missing_ok)
{
ViewStmt *stmt = castNode(ViewStmt, node);
Oid viewOid = RangeVarGetRelid(stmt->view, NoLock, missing_ok);
ObjectAddress viewAddress = { 0 };
ObjectAddressSet(viewAddress, RelationRelationId, viewOid);
return viewAddress;
}
/*
* PreprocessDropViewStmt gets called during the planning phase of a DROP VIEW statement
* and returns a list of DDLJob's that will drop any distributed view from the
* workers.
*
* The DropStmt could have multiple objects to drop, the list of objects will be filtered
* to only keep the distributed views for deletion on the workers. Non-distributed
* views will still be dropped locally but not on the workers.
*/
List *
PreprocessDropViewStmt(Node *node, const char *queryString, ProcessUtilityContext
processUtilityContext)
{
DropStmt *stmt = castNode(DropStmt, node);
if (!ShouldPropagate())
{
return NIL;
}
List *distributedViewNames = FilterNameListForDistributedViews(stmt->objects,
stmt->missing_ok);
if (list_length(distributedViewNames) < 1)
{
/* no distributed view to drop */
return NIL;
}
EnsureCoordinator();
EnsureSequentialMode(OBJECT_VIEW);
/*
* Deparse a copy of the statement whose object list contains only the distributed
* views. This ensures the deparsed DROP statement targets those views exclusively
* while the original statement is left untouched.
*/
DropStmt *stmtCopy = copyObject(stmt);
stmtCopy->objects = distributedViewNames;
QualifyTreeNode((Node *) stmtCopy);
const char *dropStmtSql = DeparseTreeNode((Node *) stmtCopy);
List *commands = list_make3(DISABLE_DDL_PROPAGATION,
(void *) dropStmtSql,
ENABLE_DDL_PROPAGATION);
return NodeDDLTaskList(NON_COORDINATOR_NODES, commands);
}
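/*
* Illustrative sketch (not part of the original source): assuming two
* distributed views, the command list sent to each worker looks like
*
*   SET citus.enable_ddl_propagation TO 'off';
*   DROP VIEW public.dist_view1, public.dist_view2;
*   SET citus.enable_ddl_propagation TO 'on';
*
* so the DROP executed on the workers is not propagated again.
*/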
/*
* FilterNameListForDistributedViews takes a list of view names and filters it down
* to the views that are distributed.
*
* The original list is not modified; a new list containing only the distributed
* views is returned.
*/
static List *
FilterNameListForDistributedViews(List *viewNamesList, bool missing_ok)
{
List *distributedViewNames = NIL;
List *possiblyQualifiedViewName = NULL;
foreach_ptr(possiblyQualifiedViewName, viewNamesList)
{
char *viewName = NULL;
char *schemaName = NULL;
DeconstructQualifiedName(possiblyQualifiedViewName, &schemaName, &viewName);
if (schemaName == NULL)
{
char *objName = NULL;
Oid schemaOid = QualifiedNameGetCreationNamespace(possiblyQualifiedViewName,
&objName);
schemaName = get_namespace_name(schemaOid);
}
Oid schemaId = get_namespace_oid(schemaName, missing_ok);
Oid viewOid = get_relname_relid(viewName, schemaId);
if (!OidIsValid(viewOid))
{
continue;
}
if (IsViewDistributed(viewOid))
{
distributedViewNames = lappend(distributedViewNames,
possiblyQualifiedViewName);
}
}
return distributedViewNames;
}
/*
* CreateViewDDLCommand returns the DDL command to create the view addressed by
* the viewAddress.
*/
char *
CreateViewDDLCommand(Oid viewOid)
{
StringInfo createViewCommand = makeStringInfo();
appendStringInfoString(createViewCommand, "CREATE OR REPLACE VIEW ");
AppendQualifiedViewNameToCreateViewCommand(createViewCommand, viewOid);
AppendAliasesToCreateViewCommand(createViewCommand, viewOid);
AppendOptionsToCreateViewCommand(createViewCommand, viewOid);
AppendViewDefinitionToCreateViewCommand(createViewCommand, viewOid);
return createViewCommand->data;
}
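/*
* Illustrative example (assumed names, not from the original source): for a
* view public.active_users with columns (id, name) and the security_barrier
* reloption, the helpers below assemble a command of the form
*
*   CREATE OR REPLACE VIEW public.active_users (id,name)
*   WITH (security_barrier='true') AS SELECT ...;
*/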
/*
* AppendQualifiedViewNameToCreateViewCommand adds the qualified name of the given
* view oid to the given create view command.
*/
static void
AppendQualifiedViewNameToCreateViewCommand(StringInfo buf, Oid viewOid)
{
char *viewName = get_rel_name(viewOid);
char *schemaName = get_namespace_name(get_rel_namespace(viewOid));
char *qualifiedViewName = quote_qualified_identifier(schemaName, viewName);
appendStringInfo(buf, "%s ", qualifiedViewName);
}
/*
* AppendAliasesToCreateViewCommand appends aliases to the create view
* command for the existing view.
*/
static void
AppendAliasesToCreateViewCommand(StringInfo createViewCommand, Oid viewOid)
{
/* Get column name aliases from pg_attribute */
ScanKeyData key[1];
ScanKeyInit(&key[0],
Anum_pg_attribute_attrelid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(viewOid));
Relation maprel = table_open(AttributeRelationId, AccessShareLock);
Relation mapidx = index_open(AttributeRelidNumIndexId, AccessShareLock);
SysScanDesc pgAttributeScan = systable_beginscan_ordered(maprel, mapidx, NULL, 1,
key);
bool isInitialAlias = true;
bool hasAlias = false;
HeapTuple attributeTuple;
while (HeapTupleIsValid(attributeTuple = systable_getnext_ordered(pgAttributeScan,
ForwardScanDirection)))
{
Form_pg_attribute att = (Form_pg_attribute) GETSTRUCT(attributeTuple);
const char *aliasName = quote_identifier(NameStr(att->attname));
if (isInitialAlias)
{
appendStringInfoString(createViewCommand, "(");
}
else
{
appendStringInfoString(createViewCommand, ",");
}
appendStringInfoString(createViewCommand, aliasName);
hasAlias = true;
isInitialAlias = false;
}
if (hasAlias)
{
appendStringInfoString(createViewCommand, ") ");
}
systable_endscan_ordered(pgAttributeScan);
index_close(mapidx, AccessShareLock);
table_close(maprel, AccessShareLock);
}
/*
* AppendOptionsToCreateViewCommand adds relation options to the create view command
* for an existing view.
*/
static void
AppendOptionsToCreateViewCommand(StringInfo createViewCommand, Oid viewOid)
{
/* Add rel options to create view command */
char *relOptions = flatten_reloptions(viewOid);
if (relOptions != NULL)
{
appendStringInfo(createViewCommand, "WITH (%s) ", relOptions);
}
}
/*
* AppendViewDefinitionToCreateViewCommand adds the definition of the given view to the
* given create view command.
*/
static void
AppendViewDefinitionToCreateViewCommand(StringInfo buf, Oid viewOid)
{
/*
* Set search_path to NIL so that all objects outside of pg_catalog will be
* schema-prefixed.
*/
OverrideSearchPath *overridePath = GetOverrideSearchPath(CurrentMemoryContext);
overridePath->schemas = NIL;
overridePath->addCatalog = true;
PushOverrideSearchPath(overridePath);
/*
* Push the transaction snapshot to be able to get the view definition with pg_get_viewdef.
*/
PushActiveSnapshot(GetTransactionSnapshot());
Datum viewDefinitionDatum = DirectFunctionCall1(pg_get_viewdef,
ObjectIdGetDatum(viewOid));
char *viewDefinition = TextDatumGetCString(viewDefinitionDatum);
PopActiveSnapshot();
PopOverrideSearchPath();
appendStringInfo(buf, "AS %s ", viewDefinition);
}
/*
* AlterViewOwnerCommand returns the ALTER ... OWNER command for the
* given view or materialized view oid.
*/
char *
AlterViewOwnerCommand(Oid viewOid)
{
/* Add alter owner command */
StringInfo alterOwnerCommand = makeStringInfo();
char *viewName = get_rel_name(viewOid);
Oid schemaOid = get_rel_namespace(viewOid);
char *schemaName = get_namespace_name(schemaOid);
char *viewOwnerName = TableOwner(viewOid);
char *qualifiedViewName = NameListToQuotedString(list_make2(makeString(schemaName),
makeString(viewName)));
if (get_rel_relkind(viewOid) == RELKIND_MATVIEW)
{
appendStringInfo(alterOwnerCommand, "ALTER MATERIALIZED VIEW %s ",
qualifiedViewName);
}
else
{
appendStringInfo(alterOwnerCommand, "ALTER VIEW %s ", qualifiedViewName);
}
appendStringInfo(alterOwnerCommand, "OWNER TO %s", quote_identifier(viewOwnerName));
return alterOwnerCommand->data;
}
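/*
* Illustrative example (assumed names): for a materialized view public.matv
* owned by alice, AlterViewOwnerCommand above produces
*
*   ALTER MATERIALIZED VIEW public.matv OWNER TO alice
*/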
/*
* IsViewDistributed checks if a view is distributed
*/
bool
IsViewDistributed(Oid viewOid)
{
Assert(get_rel_relkind(viewOid) == RELKIND_VIEW ||
get_rel_relkind(viewOid) == RELKIND_MATVIEW);
ObjectAddress viewAddress = { 0 };
ObjectAddressSet(viewAddress, RelationRelationId, viewOid);
return IsObjectDistributed(&viewAddress);
}
/*
* PreprocessAlterViewStmt is invoked for alter view statements.
*/
List *
PreprocessAlterViewStmt(Node *node, const char *queryString, ProcessUtilityContext
processUtilityContext)
{
AlterTableStmt *stmt = castNode(AlterTableStmt, node);
ObjectAddress viewAddress = GetObjectAddressFromParseTree((Node *) stmt, true);
if (!ShouldPropagateObject(&viewAddress))
{
return NIL;
}
QualifyTreeNode((Node *) stmt);
EnsureCoordinator();
/* reconstruct alter statement in a portable fashion */
const char *alterViewStmtSql = DeparseTreeNode((Node *) stmt);
/*
* To avoid sequential mode, we are using metadata connection. For the
* detailed explanation, please check the comment on PostprocessViewStmt.
*/
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ddlJob->targetObjectAddress = viewAddress;
ddlJob->metadataSyncCommand = alterViewStmtSql;
ddlJob->taskList = NIL;
return list_make1(ddlJob);
}
/*
* PostprocessAlterViewStmt is invoked for alter view statements.
*/
List *
PostprocessAlterViewStmt(Node *node, const char *queryString)
{
AlterTableStmt *stmt = castNode(AlterTableStmt, node);
Assert(AlterTableStmtObjType_compat(stmt) == OBJECT_VIEW);
ObjectAddress viewAddress = GetObjectAddressFromParseTree((Node *) stmt, true);
if (!ShouldPropagateObject(&viewAddress))
{
return NIL;
}
if (IsObjectAddressOwnedByExtension(&viewAddress, NULL))
{
return NIL;
}
/* If the view has any unsupported dependency, create it locally */
if (ErrorOrWarnIfObjectHasUnsupportedDependency(&viewAddress))
{
return NIL;
}
EnsureDependenciesExistOnAllNodes(&viewAddress);
return NIL;
}
/*
* AlterViewStmtObjectAddress returns the ObjectAddress for the subject of the
* ALTER VIEW statement.
*/
ObjectAddress
AlterViewStmtObjectAddress(Node *node, bool missing_ok)
{
AlterTableStmt *stmt = castNode(AlterTableStmt, node);
Oid viewOid = RangeVarGetRelid(stmt->relation, NoLock, missing_ok);
ObjectAddress viewAddress = { 0 };
ObjectAddressSet(viewAddress, RelationRelationId, viewOid);
return viewAddress;
}
/*
* PreprocessRenameViewStmt is called when the user is renaming the view or the column of
* the view.
*/
List *
PreprocessRenameViewStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
ObjectAddress viewAddress = GetObjectAddressFromParseTree(node, true);
if (!ShouldPropagateObject(&viewAddress))
{
return NIL;
}
EnsureCoordinator();
/* fully qualify */
QualifyTreeNode(node);
/* deparse sql */
const char *renameStmtSql = DeparseTreeNode(node);
/*
* To avoid sequential mode, we are using metadata connection. For the
* detailed explanation, please check the comment on PostprocessViewStmt.
*/
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ddlJob->targetObjectAddress = viewAddress;
ddlJob->metadataSyncCommand = renameStmtSql;
ddlJob->taskList = NIL;
return list_make1(ddlJob);
}
/*
* RenameViewStmtObjectAddress returns the ObjectAddress of the view that is the object
* of the RenameStmt. Errors out if the view does not exist and missing_ok is false.
*/
ObjectAddress
RenameViewStmtObjectAddress(Node *node, bool missing_ok)
{
RenameStmt *stmt = castNode(RenameStmt, node);
Oid viewOid = RangeVarGetRelid(stmt->relation, NoLock, missing_ok);
ObjectAddress viewAddress = { 0 };
ObjectAddressSet(viewAddress, RelationRelationId, viewOid);
return viewAddress;
}
/*
* PreprocessAlterViewSchemaStmt is executed before the statement is applied to the local
* postgres instance.
*/
List *
PreprocessAlterViewSchemaStmt(Node *node, const char *queryString,
ProcessUtilityContext processUtilityContext)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
ObjectAddress viewAddress = GetObjectAddressFromParseTree((Node *) stmt, true);
if (!ShouldPropagateObject(&viewAddress))
{
return NIL;
}
EnsureCoordinator();
QualifyTreeNode((Node *) stmt);
const char *sql = DeparseTreeNode((Node *) stmt);
/*
* To avoid sequential mode, we are using metadata connection. For the
* detailed explanation, please check the comment on PostprocessViewStmt.
*/
DDLJob *ddlJob = palloc0(sizeof(DDLJob));
ddlJob->targetObjectAddress = viewAddress;
ddlJob->metadataSyncCommand = sql;
ddlJob->taskList = NIL;
return list_make1(ddlJob);
}
/*
* PostprocessAlterViewSchemaStmt is executed after the change has been applied locally;
* at that point we can use the new dependencies of the view to ensure they all exist on
* the workers before we apply the commands remotely.
*/
List *
PostprocessAlterViewSchemaStmt(Node *node, const char *queryString)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
ObjectAddress viewAddress = GetObjectAddressFromParseTree((Node *) stmt, true);
if (!ShouldPropagateObject(&viewAddress))
{
return NIL;
}
/* dependencies have changed (schema) let's ensure they exist */
EnsureDependenciesExistOnAllNodes(&viewAddress);
return NIL;
}
/*
* AlterViewSchemaStmtObjectAddress returns the ObjectAddress of the view that is the object
* of the alter schema statement.
*/
ObjectAddress
AlterViewSchemaStmtObjectAddress(Node *node, bool missing_ok)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Oid viewOid = RangeVarGetRelid(stmt->relation, NoLock, true);
/*
* Since this can be called both before and after standard ProcessUtility has executed,
* we need to check both the old and the new schema.
*/
if (viewOid == InvalidOid)
{
Oid schemaId = get_namespace_oid(stmt->newschema, missing_ok);
viewOid = get_relname_relid(stmt->relation->relname, schemaId);
/*
* If the view oid is still invalid we could not find the view; error with the same
* message postgres would use when missing_ok is false (not ok to miss).
*/
if (!missing_ok && viewOid == InvalidOid)
{
ereport(ERROR, (errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("view \"%s\" does not exist",
stmt->relation->relname)));
}
}
ObjectAddress viewAddress = { 0 };
ObjectAddressSet(viewAddress, RelationRelationId, viewOid);
return viewAddress;
}
/*
* IsViewRenameStmt returns whether the passed-in RenameStmt is one of the
* following forms:
*
* - ALTER VIEW RENAME
* - ALTER VIEW RENAME COLUMN
*/
bool
IsViewRenameStmt(RenameStmt *renameStmt)
{
bool isViewRenameStmt = false;
if (renameStmt->renameType == OBJECT_VIEW ||
(renameStmt->renameType == OBJECT_COLUMN &&
renameStmt->relationType == OBJECT_VIEW))
{
isViewRenameStmt = true;
}
return isViewRenameStmt;
}

View File

@ -10,9 +10,12 @@
#include "postgres.h"
#include "access/transam.h"
#include "access/xact.h"
#include "distributed/backend_data.h"
#include "distributed/citus_safe_lib.h"
#include "distributed/connection_management.h"
#include "distributed/intermediate_result_pruning.h"
#include "distributed/metadata_cache.h"
#include "distributed/worker_manager.h"
@ -40,6 +43,7 @@ typedef struct ConnParamsInfo
static ConnParamsInfo ConnParams;
/* helper functions for processing connection info */
static ConnectionHashKey * GetEffectiveConnKey(ConnectionHashKey *key);
static Size CalculateMaxSize(void);
static int uri_prefix_length(const char *connstr);
@ -232,6 +236,7 @@ GetConnParams(ConnectionHashKey *key, char ***keywords, char ***values,
* already we can add a pointer to the runtimeValues.
*/
char nodePortString[12] = "";
ConnectionHashKey *effectiveKey = GetEffectiveConnKey(key);
StringInfo applicationName = makeStringInfo();
appendStringInfo(applicationName, "%s%ld", CITUS_APPLICATION_NAME_PREFIX,
@ -260,10 +265,10 @@ GetConnParams(ConnectionHashKey *key, char ***keywords, char ***values,
"application_name"
};
const char *runtimeValues[] = {
key->hostname,
effectiveKey->hostname,
nodePortString,
key->database,
key->user,
effectiveKey->database,
effectiveKey->user,
GetDatabaseEncodingName(),
applicationName->data
};
@ -300,7 +305,7 @@ GetConnParams(ConnectionHashKey *key, char ***keywords, char ***values,
errmsg("too many connParams entries")));
}
pg_ltoa(key->port, nodePortString); /* populate node port string with port */
pg_ltoa(effectiveKey->port, nodePortString); /* populate node port string with port */
/* first step: copy global parameters to beginning of array */
for (Size paramIndex = 0; paramIndex < ConnParams.size; paramIndex++)
@ -322,6 +327,58 @@ GetConnParams(ConnectionHashKey *key, char ***keywords, char ***values,
MemoryContextStrdup(context, runtimeValues[runtimeParamIndex]);
}
/* we look up authinfo by original key, not effective one */
char *authinfo = GetAuthinfo(key->hostname, key->port, key->user);
char *pqerr = NULL;
PQconninfoOption *optionArray = PQconninfoParse(authinfo, &pqerr);
if (optionArray == NULL)
{
/* PQconninfoParse failed, it's unsafe to continue as this has caused segfaults in production */
if (pqerr == NULL)
{
/* parse failed without an error message, treat as OOM error */
ereport(ERROR,
(errcode(ERRCODE_OUT_OF_MEMORY),
errmsg("out of memory"),
errdetail("Failed to parse authentication information via libpq")));
}
else
{
/*
* Parse error, should not be possible as the validity is checked upon insert into pg_dist_authinfo,
* however, better safe than sorry
*/
/*
* errmsg is populated by PQconninfoParse which requires us to free the message. Since we want to
* incorporate the parse error into the detail of our message we need to copy the error message before
* freeing it. Not freeing the message will leak memory.
*/
char *pqerrcopy = pstrdup(pqerr);
PQfreemem(pqerr);
ereport(ERROR, (errmsg(
"failed to parse node authentication information for %s@%s:%d",
key->user, key->hostname, key->port),
errdetail("%s", pqerrcopy)));
}
}
for (PQconninfoOption *option = optionArray; option->keyword != NULL; option++)
{
if (option->val == NULL || option->val[0] == '\0')
{
continue;
}
connKeywords[authParamsIdx] = MemoryContextStrdup(context, option->keyword);
connValues[authParamsIdx] = MemoryContextStrdup(context, option->val);
authParamsIdx++;
}
PQconninfoFree(optionArray);
/* final step: add terminal NULL, required by libpq */
connKeywords[authParamsIdx] = connValues[authParamsIdx] = NULL;
}
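/*
* Illustrative example (assumption, not from the original source): a
* pg_dist_authinfo row can carry a conninfo-style string such as
*
*   password=secret sslcert=/path/to/cert
*
* which PQconninfoParse above splits into (keyword, value) pairs that are
* appended to connKeywords/connValues.
*/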
@ -346,6 +403,116 @@ GetConnParam(const char *keyword)
}
/*
* GetEffectiveConnKey checks whether there is any pooler configuration for the
* provided key (host/port combination). The one case where this logic is not
* applied is for loopback connections originating within the task tracker. If
* a corresponding row is found in the poolinfo table, a modified (effective)
* key is returned with the node, port, and dbname overridden, as applicable;
* otherwise, the original key is returned unmodified.
*/
ConnectionHashKey *
GetEffectiveConnKey(ConnectionHashKey *key)
{
PQconninfoOption *option = NULL, *optionArray = NULL;
if (!IsTransactionState())
{
/* we're in the task tracker, so should only see loopback */
Assert(strncmp(LOCAL_HOST_NAME, key->hostname, MAX_NODE_LENGTH) == 0 &&
PostPortNumber == key->port);
return key;
}
WorkerNode *worker = FindWorkerNode(key->hostname, key->port);
if (worker == NULL)
{
/* this can be hit when the key references an unknown node */
return key;
}
char *poolinfo = GetPoolinfoViaCatalog(worker->nodeId);
if (poolinfo == NULL)
{
return key;
}
/* copy the key to provide defaults for all fields */
ConnectionHashKey *effectiveKey = palloc(sizeof(ConnectionHashKey));
*effectiveKey = *key;
optionArray = PQconninfoParse(poolinfo, NULL);
for (option = optionArray; option->keyword != NULL; option++)
{
if (option->val == NULL || option->val[0] == '\0')
{
continue;
}
if (strcmp(option->keyword, "host") == 0)
{
strlcpy(effectiveKey->hostname, option->val, MAX_NODE_LENGTH);
}
else if (strcmp(option->keyword, "port") == 0)
{
effectiveKey->port = pg_atoi(option->val, 4, 0);
}
else if (strcmp(option->keyword, "dbname") == 0)
{
/* permit dbname for poolers which can key pools based on dbname */
strlcpy(effectiveKey->database, option->val, NAMEDATALEN);
}
else
{
ereport(FATAL, (errmsg("unrecognized poolinfo keyword")));
}
}
PQconninfoFree(optionArray);
return effectiveKey;
}
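/*
* Illustrative example (assumed values): a pg_dist_poolinfo row such as
*
*   host=10.0.0.5 port=6432
*
* makes GetEffectiveConnKey return a key pointing at the pooler instead of
* the worker itself, while authentication lookups keep using the original
* host/port.
*/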
/*
* GetAuthinfo simply returns the string representation of authentication info
* for a specified hostname/port/user combination. If the current transaction
* is valid, then we use the catalog, otherwise a shared memory hash is used,
* a mode that is currently only useful for getting authentication information
* to the Task Tracker, which lacks a database connection and transaction.
*/
char *
GetAuthinfo(char *hostname, int32 port, char *user)
{
char *authinfo = NULL;
bool isLoopback = (strncmp(LOCAL_HOST_NAME, hostname, MAX_NODE_LENGTH) == 0 &&
PostPortNumber == port);
if (IsTransactionState())
{
int64 nodeId = WILDCARD_NODE_ID;
/* -1 is a special value for loopback connections (task tracker) */
if (isLoopback)
{
nodeId = LOCALHOST_NODE_ID;
}
else
{
WorkerNode *worker = FindWorkerNode(hostname, port);
if (worker != NULL)
{
nodeId = worker->nodeId;
}
}
authinfo = GetAuthinfoViaCatalog(user, nodeId);
}
return (authinfo != NULL) ? authinfo : "";
}
/*
* CalculateMaxSize simply counts the number of elements returned by
* PQconnDefaults, including the final NULL. This helps us know how space would

View File

@ -1466,28 +1466,6 @@ ShouldShutdownConnection(MultiConnection *connection, const int cachedConnection
}
/*
* IsRebalancerInitiatedBackend returns true if we are in a backend that citus
* rebalancer initiated.
*/
bool
IsRebalancerInternalBackend(void)
{
return application_name && strcmp(application_name, CITUS_REBALANCER_NAME) == 0;
}
/*
* IsCitusInitiatedRemoteBackend returns true if we are in a backend that citus
* initiated via remote connection.
*/
bool
IsCitusInternalBackend(void)
{
return ExtractGlobalPID(application_name) != INVALID_CITUS_INTERNAL_BACKEND_GPID;
}
/*
* ResetConnection preserves the given connection for later usage by
* resetting its states.

View File

@ -971,7 +971,6 @@ ResetPlacementConnectionManagement(void)
hash_delete_all(ConnectionPlacementHash);
hash_delete_all(ConnectionShardHash);
hash_delete_all(ColocatedPlacementsHash);
ResetRelationAccessHash();
/*
* NB: memory for ConnectionReference structs and subordinate data is
@ -1091,9 +1090,6 @@ InitPlacementConnectionManagement(void)
ConnectionShardHash = hash_create("citus connection cache (shardid)",
64, &info, hashFlags);
/* (relationId) = [relationAccessMode] hash */
AllocateRelationAccessHash();
}

View File

@ -18,6 +18,7 @@
#include "distributed/listutils.h"
#include "distributed/log_utils.h"
#include "distributed/remote_commands.h"
#include "distributed/errormessage.h"
#include "distributed/cancel_utils.h"
#include "lib/stringinfo.h"
#include "miscadmin.h"
@ -636,14 +637,14 @@ PutRemoteCopyData(MultiConnection *connection, const char *buffer, int nbytes)
Assert(PQisnonblocking(pgConn));
int copyState = PQputCopyData(pgConn, buffer, nbytes);
if (copyState == -1)
if (copyState <= 0)
{
return false;
}
/*
* PQputCopyData may have queued up part of the data even if it managed
* to send some of it succesfully. We provide back pressure by waiting
* to send some of it successfully. We provide back pressure by waiting
* until the socket is writable to prevent the internal libpq buffers
* from growing excessively.
*
@ -1115,3 +1116,92 @@ SendCancelationRequest(MultiConnection *connection)
return cancelSent;
}
/*
* EvaluateSingleQueryResult gets the query result from connection and returns
* true if the query is executed successfully, false otherwise. A query result
* or an error message is returned in queryResultString. The function requires
* that the query return a single-column, single-row result; otherwise an error
* message is stored instead.
*/
bool
EvaluateSingleQueryResult(MultiConnection *connection, PGresult *queryResult,
StringInfo queryResultString)
{
bool success = false;
ExecStatusType resultStatus = PQresultStatus(queryResult);
if (resultStatus == PGRES_COMMAND_OK)
{
char *commandStatus = PQcmdStatus(queryResult);
appendStringInfo(queryResultString, "%s", commandStatus);
success = true;
}
else if (resultStatus == PGRES_TUPLES_OK)
{
int ntuples = PQntuples(queryResult);
int nfields = PQnfields(queryResult);
/* error if the query returns more than one row, or more than one field */
if (nfields != 1)
{
appendStringInfo(queryResultString,
"expected a single column in query target");
}
else if (ntuples > 1)
{
appendStringInfo(queryResultString,
"expected a single row in query result");
}
else
{
int row = 0;
int column = 0;
if (!PQgetisnull(queryResult, row, column))
{
char *queryResultValue = PQgetvalue(queryResult, row, column);
appendStringInfo(queryResultString, "%s", queryResultValue);
}
success = true;
}
}
else
{
StoreErrorMessage(connection, queryResultString);
}
return success;
}
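/*
* Usage sketch (illustrative, not from the original source; assumes an
* established MultiConnection over which a query was already sent):
*
*   StringInfo resultString = makeStringInfo();
*   PGresult *result = GetRemoteCommandResult(connection, true);
*   bool success = EvaluateSingleQueryResult(connection, result, resultString);
*   PQclear(result);
*
* On failure, resultString holds the error message.
*/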
/*
* StoreErrorMessage gets the error message from connection and stores it
* in queryResultString. It should be called only when an error is present;
* otherwise it stores a default error message.
*/
void
StoreErrorMessage(MultiConnection *connection, StringInfo queryResultString)
{
char *errorMessage = PQerrorMessage(connection->pgConn);
if (errorMessage != NULL)
{
/* copy the error message to writable memory */
errorMessage = pnstrdup(errorMessage, strlen(errorMessage));
char *firstNewlineIndex = strchr(errorMessage, '\n');
/* trim the error message at the line break */
if (firstNewlineIndex != NULL)
{
*firstNewlineIndex = '\0';
}
}
else
{
/* put a default error message if no error message is reported */
errorMessage = "An error occurred while running the query";
}
appendStringInfo(queryResultString, "%s", errorMessage);
}

View File

@ -427,7 +427,7 @@ IncrementSharedConnectionCounter(const char *hostname, int port)
{
SharedConnStatsHashKey connKey;
if (GetMaxSharedPoolSize() == DISABLE_CONNECTION_THROTTLING)
if (MaxSharedPoolSize == DISABLE_CONNECTION_THROTTLING)
{
/* connection throttling disabled */
return;
@ -491,7 +491,11 @@ DecrementSharedConnectionCounter(const char *hostname, int port)
{
SharedConnStatsHashKey connKey;
if (GetMaxSharedPoolSize() == DISABLE_CONNECTION_THROTTLING)
/*
* Do not call GetMaxSharedPoolSize() here, since it may read from
* the catalog and we may be in the process exit handler.
*/
if (MaxSharedPoolSize == DISABLE_CONNECTION_THROTTLING)
{
/* connection throttling disabled */
return;

View File

@ -79,8 +79,8 @@ static void deparse_index_columns(StringInfo buffer, List *indexParameterList,
List *deparseContext);
static void AppendStorageParametersToString(StringInfo stringBuffer,
List *optionList);
static const char * convert_aclright_to_string(int aclright);
static void simple_quote_literal(StringInfo buf, const char *val);
static char * flatten_reloptions(Oid relid);
static void AddVacuumParams(ReindexStmt *reindexStmt, StringInfo buffer);
@ -377,6 +377,14 @@ pg_get_tableschemadef_string(Oid tableRelationId, IncludeSequenceDefaults
atttypmod);
appendStringInfoString(&buffer, attributeTypeName);
#if PG_VERSION_NUM >= PG_VERSION_14
if (CompressionMethodIsValid(attributeForm->attcompression))
{
appendStringInfo(&buffer, " COMPRESSION %s",
GetCompressionMethodName(attributeForm->attcompression));
}
#endif
/* if this column has a default value, append the default value */
if (attributeForm->atthasdef)
{
@ -448,14 +456,6 @@ pg_get_tableschemadef_string(Oid tableRelationId, IncludeSequenceDefaults
appendStringInfoString(&buffer, " NOT NULL");
}
#if PG_VERSION_NUM >= PG_VERSION_14
if (CompressionMethodIsValid(attributeForm->attcompression))
{
appendStringInfo(&buffer, " COMPRESSION %s",
GetCompressionMethodName(attributeForm->attcompression));
}
#endif
if (attributeForm->attcollation != InvalidOid &&
attributeForm->attcollation != DEFAULT_COLLATION_OID)
{
@ -1063,6 +1063,138 @@ pg_get_indexclusterdef_string(Oid indexRelationId)
}
/*
* pg_get_table_grants returns a list of sql statements which recreate the
* permissions for a specific table.
*
* This function is modeled after aclexplode(), don't change too heavily.
*/
List *
pg_get_table_grants(Oid relationId)
{
/* *INDENT-OFF* */
StringInfoData buffer;
List *defs = NIL;
bool isNull = false;
Relation relation = relation_open(relationId, AccessShareLock);
char *relationName = generate_relation_name(relationId, NIL);
initStringInfo(&buffer);
/* lookup all table level grants */
HeapTuple classTuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relationId));
if (!HeapTupleIsValid(classTuple))
{
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_TABLE),
errmsg("relation with OID %u does not exist",
relationId)));
}
Datum aclDatum = SysCacheGetAttr(RELOID, classTuple, Anum_pg_class_relacl,
&isNull);
ReleaseSysCache(classTuple);
if (!isNull)
{
/*
* First revoke all default permissions, so we can start adding the
* exact permissions from the master. Note that we only do so if there
* are any actual grants; an empty grant set signals default
* permissions.
*
* Note: This doesn't work correctly if default permissions have been
* changed with ALTER DEFAULT PRIVILEGES - but that's hard to fix
* properly currently.
*/
appendStringInfo(&buffer, "REVOKE ALL ON %s FROM PUBLIC",
relationName);
defs = lappend(defs, pstrdup(buffer.data));
resetStringInfo(&buffer);
/* iterate through the acl datastructure, emit GRANTs */
Acl *acl = DatumGetAclP(aclDatum);
AclItem *aidat = ACL_DAT(acl);
int offtype = -1;
int i = 0;
while (i < ACL_NUM(acl))
{
AclItem *aidata = NULL;
AclMode priv_bit = 0;
offtype++;
if (offtype == N_ACL_RIGHTS)
{
offtype = 0;
i++;
if (i >= ACL_NUM(acl)) /* done */
{
break;
}
}
aidata = &aidat[i];
priv_bit = 1 << offtype;
if (ACLITEM_GET_PRIVS(*aidata) & priv_bit)
{
const char *roleName = NULL;
const char *withGrant = "";
if (aidata->ai_grantee != 0)
{
HeapTuple htup = SearchSysCache1(AUTHOID, ObjectIdGetDatum(aidata->ai_grantee));
if (HeapTupleIsValid(htup))
{
Form_pg_authid authForm = ((Form_pg_authid) GETSTRUCT(htup));
roleName = quote_identifier(NameStr(authForm->rolname));
ReleaseSysCache(htup);
}
else
{
elog(ERROR, "cache lookup failed for role %u", aidata->ai_grantee);
}
}
else
{
roleName = "PUBLIC";
}
if ((ACLITEM_GET_GOPTIONS(*aidata) & priv_bit) != 0)
{
withGrant = " WITH GRANT OPTION";
}
appendStringInfo(&buffer, "GRANT %s ON %s TO %s%s",
convert_aclright_to_string(priv_bit),
relationName,
roleName,
withGrant);
defs = lappend(defs, pstrdup(buffer.data));
resetStringInfo(&buffer);
}
}
}
resetStringInfo(&buffer);
relation_close(relation, NoLock);
return defs;
/* *INDENT-ON* */
}
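/*
* Illustrative example (assumed grants): for a table public.t where bob was
* granted SELECT WITH GRANT OPTION, the function above emits
*
*   REVOKE ALL ON public.t FROM PUBLIC
*   GRANT SELECT ON public.t TO bob WITH GRANT OPTION
*/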
/*
* generate_qualified_relation_name computes the schema-qualified name to display for a
* relation specified by OID.
@ -1157,6 +1289,45 @@ AppendStorageParametersToString(StringInfo stringBuffer, List *optionList)
}
/* copy of postgresql's function, which is static as well */
static const char *
convert_aclright_to_string(int aclright)
{
/* *INDENT-OFF* */
switch (aclright)
{
case ACL_INSERT:
return "INSERT";
case ACL_SELECT:
return "SELECT";
case ACL_UPDATE:
return "UPDATE";
case ACL_DELETE:
return "DELETE";
case ACL_TRUNCATE:
return "TRUNCATE";
case ACL_REFERENCES:
return "REFERENCES";
case ACL_TRIGGER:
return "TRIGGER";
case ACL_EXECUTE:
return "EXECUTE";
case ACL_USAGE:
return "USAGE";
case ACL_CREATE:
return "CREATE";
case ACL_CREATE_TEMP:
return "TEMPORARY";
case ACL_CONNECT:
return "CONNECT";
default:
elog(ERROR, "unrecognized aclright: %d", aclright);
return NULL;
}
/* *INDENT-ON* */
}
/*
* contain_nextval_expression_walker walks over expression tree and returns
* true if it contains call to 'nextval' function.
@ -1225,13 +1396,53 @@ pg_get_replica_identity_command(Oid tableRelationId)
}
/*
* pg_get_row_level_security_commands function returns the required ALTER .. TABLE
* commands to define the row level security settings for a relation.
*/
List *
pg_get_row_level_security_commands(Oid relationId)
{
StringInfoData buffer;
List *commands = NIL;
initStringInfo(&buffer);
Relation relation = table_open(relationId, AccessShareLock);
if (relation->rd_rel->relrowsecurity)
{
char *relationName = generate_qualified_relation_name(relationId);
appendStringInfo(&buffer, "ALTER TABLE %s ENABLE ROW LEVEL SECURITY",
relationName);
commands = lappend(commands, pstrdup(buffer.data));
resetStringInfo(&buffer);
}
if (relation->rd_rel->relforcerowsecurity)
{
char *relationName = generate_qualified_relation_name(relationId);
appendStringInfo(&buffer, "ALTER TABLE %s FORCE ROW LEVEL SECURITY",
relationName);
commands = lappend(commands, pstrdup(buffer.data));
resetStringInfo(&buffer);
}
table_close(relation, AccessShareLock);
return commands;
}
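/*
* Illustrative example (assumed relation): for public.t with both flags set,
* the function above returns
*
*   ALTER TABLE public.t ENABLE ROW LEVEL SECURITY
*   ALTER TABLE public.t FORCE ROW LEVEL SECURITY
*/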
/*
* Generate a C string representing a relation's reloptions, or NULL if none.
*
* This function comes from PostgreSQL source code in
* src/backend/utils/adt/ruleutils.c
*/
static char *
char *
flatten_reloptions(Oid relid)
{
char *result = NULL;

View File

@ -0,0 +1,626 @@
/*-------------------------------------------------------------------------
*
* deparse_domain_stmts.c
* Functions to turn all Statement structures related to domains back
* into sql.
*
* Copyright (c) Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/htup_details.h"
#include "catalog/heap.h"
#include "catalog/namespace.h"
#include "catalog/pg_type.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "nodes/parsenodes.h"
#include "parser/parse_coerce.h"
#include "parser/parse_collate.h"
#include "parser/parse_expr.h"
#include "parser/parse_node.h"
#include "parser/parse_type.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
#include "utils/ruleutils.h"
#include "utils/syscache.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "distributed/namespace_utils.h"
static void AppendConstraint(StringInfo buf, Constraint *constraint, List *domainName,
TypeName *typeName);
static Node * replace_domain_constraint_value(ParseState *pstate, ColumnRef *cref);
static Node * TransformDefaultExpr(Node *expr, List *domainName, TypeName *typeName);
static Node * TransformConstraintExpr(Node *expr, TypeName *typeName);
static CoerceToDomainValue * GetCoerceDomainValue(TypeName *typeName);
static char * TypeNameAsIdentifier(TypeName *typeName);
static Oid DomainGetBaseTypeOid(List *names, int32 *baseTypeMod);
static void AppendAlterDomainStmtSetDefault(StringInfo buf, AlterDomainStmt *stmt);
static void AppendAlterDomainStmtAddConstraint(StringInfo buf, AlterDomainStmt *stmt);
static void AppendAlterDomainStmtDropConstraint(StringInfo buf, AlterDomainStmt *stmt);
/*
* DeparseCreateDomainStmt returns the sql representation for the CREATE DOMAIN statement.
*/
char *
DeparseCreateDomainStmt(Node *node)
{
CreateDomainStmt *stmt = castNode(CreateDomainStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
const char *domainIdentifier = NameListToQuotedString(stmt->domainname);
const char *typeIdentifier = TypeNameAsIdentifier(stmt->typeName);
appendStringInfo(&buf, "CREATE DOMAIN %s AS %s", domainIdentifier, typeIdentifier);
if (stmt->collClause)
{
const char *collateIdentifier =
NameListToQuotedString(stmt->collClause->collname);
appendStringInfo(&buf, " COLLATE %s", collateIdentifier);
}
Constraint *constraint = NULL;
foreach_ptr(constraint, stmt->constraints)
{
AppendConstraint(&buf, constraint, stmt->domainname, stmt->typeName);
}
appendStringInfoString(&buf, ";");
return buf.data;
}
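/*
* Illustrative example (assumed domain): a statement like
*
*   CREATE DOMAIN public.positive_int AS integer CHECK (VALUE > 0);
*
* deparses into a fully qualified form, roughly
*
*   CREATE DOMAIN public.positive_int AS pg_catalog.int4 CHECK (VALUE > 0);
*
* after the CHECK expression has been cooked so it can be serialized.
*/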
/*
* TypeNameAsIdentifier returns the sql identifier of a TypeName. This is more complex
* than concatenating the schema name and typename since certain types contain modifiers
* that need to be correctly represented.
*/
static char *
TypeNameAsIdentifier(TypeName *typeName)
{
int32 typmod = 0;
Oid typeOid = InvalidOid;
bits16 formatFlags = FORMAT_TYPE_TYPEMOD_GIVEN | FORMAT_TYPE_FORCE_QUALIFY;
typenameTypeIdAndMod(NULL, typeName, &typeOid, &typmod);
return format_type_extended(typeOid, typmod, formatFlags);
}
/*
* DeparseDropDomainStmt returns the sql for the DROP DOMAIN statement.
*/
char *
DeparseDropDomainStmt(Node *node)
{
DropStmt *stmt = castNode(DropStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
appendStringInfoString(&buf, "DROP DOMAIN ");
if (stmt->missing_ok)
{
appendStringInfoString(&buf, "IF EXISTS ");
}
TypeName *domainName = NULL;
bool first = true;
foreach_ptr(domainName, stmt->objects)
{
if (!first)
{
appendStringInfoString(&buf, ", ");
}
first = false;
const char *identifier = NameListToQuotedString(domainName->names);
appendStringInfoString(&buf, identifier);
}
if (stmt->behavior == DROP_CASCADE)
{
appendStringInfoString(&buf, " CASCADE");
}
appendStringInfoString(&buf, ";");
return buf.data;
}
/*
* DeparseAlterDomainStmt returns the sql representation of the DOMAIN specific ALTER
* statements.
*/
char *
DeparseAlterDomainStmt(Node *node)
{
AlterDomainStmt *stmt = castNode(AlterDomainStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
appendStringInfo(&buf, "ALTER DOMAIN %s ", NameListToQuotedString(stmt->typeName));
switch (stmt->subtype)
{
case 'T': /* SET DEFAULT */
{
AppendAlterDomainStmtSetDefault(&buf, stmt);
break;
}
case 'N': /* DROP NOT NULL */
{
appendStringInfoString(&buf, "DROP NOT NULL");
break;
}
case 'O': /* SET NOT NULL */
{
appendStringInfoString(&buf, "SET NOT NULL");
break;
}
case 'C': /* ADD [CONSTRAINT name] */
{
AppendAlterDomainStmtAddConstraint(&buf, stmt);
break;
}
case 'X': /* DROP CONSTRAINT */
{
AppendAlterDomainStmtDropConstraint(&buf, stmt);
break;
}
case 'V': /* VALIDATE CONSTRAINT */
{
appendStringInfo(&buf, "VALIDATE CONSTRAINT %s",
quote_identifier(stmt->name));
break;
}
default:
{
elog(ERROR, "unsupported alter domain statement for distribution");
}
}
appendStringInfoChar(&buf, ';');
return buf.data;
}
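/*
* Illustrative examples of the subtype mapping above (assumed names):
*
*   'T' -> ALTER DOMAIN public.d SET DEFAULT 0;
*   'O' -> ALTER DOMAIN public.d SET NOT NULL;
*   'X' -> ALTER DOMAIN public.d DROP CONSTRAINT d_check;
*/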
/*
* DeparseDomainRenameConstraintStmt returns the sql representation of the domain
* constraint renaming.
*/
char *
DeparseDomainRenameConstraintStmt(Node *node)
{
RenameStmt *stmt = castNode(RenameStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
char *domainIdentifier = NameListToQuotedString(castNode(List, stmt->object));
appendStringInfo(&buf, "ALTER DOMAIN %s RENAME CONSTRAINT %s TO %s;",
domainIdentifier,
quote_identifier(stmt->subname),
quote_identifier(stmt->newname));
return buf.data;
}
/*
* DeparseAlterDomainOwnerStmt returns the sql representation of the ALTER DOMAIN OWNER
* statement.
*/
char *
DeparseAlterDomainOwnerStmt(Node *node)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
List *domainName = castNode(List, stmt->object);
char *domainIdentifier = NameListToQuotedString(domainName);
appendStringInfo(&buf, "ALTER DOMAIN %s OWNER TO %s;",
domainIdentifier,
RoleSpecString(stmt->newowner, true));
return buf.data;
}
/*
* DeparseRenameDomainStmt returns the sql representation of the ALTER DOMAIN RENAME
* statement.
*/
char *
DeparseRenameDomainStmt(Node *node)
{
RenameStmt *stmt = castNode(RenameStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
List *domainName = castNode(List, stmt->object);
char *domainIdentifier = NameListToQuotedString(domainName);
appendStringInfo(&buf, "ALTER DOMAIN %s RENAME TO %s;",
domainIdentifier,
quote_identifier(stmt->newname));
return buf.data;
}
/*
* DeparseAlterDomainSchemaStmt returns the sql representation of the ALTER DOMAIN SET
* SCHEMA statement.
*/
char *
DeparseAlterDomainSchemaStmt(Node *node)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
List *domainName = castNode(List, stmt->object);
char *domainIdentifier = NameListToQuotedString(domainName);
appendStringInfo(&buf, "ALTER DOMAIN %s SET SCHEMA %s;",
domainIdentifier,
quote_identifier(stmt->newschema));
return buf.data;
}
/*
* DomainGetBaseTypeOid returns the type Oid and the type modifier of the type underlying
* a domain addressed by the namelist provided as the names argument. The type modifier is
* only written if the baseTypeMod pointer is valid (not NULL).
*
* If the type cannot be found this function raises a non-user-facing error, so the
* caller needs to make sure the domain actually exists.
*/
static Oid
DomainGetBaseTypeOid(List *names, int32 *baseTypeMod)
{
TypeName *domainName = makeTypeNameFromNameList(names);
Oid domainoid = typenameTypeId(NULL, domainName);
HeapTuple tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(domainoid));
if (!HeapTupleIsValid(tup))
{
elog(ERROR, "cache lookup failed for type %u", domainoid);
}
Form_pg_type typTup = (Form_pg_type) GETSTRUCT(tup);
Oid baseTypeOid = typTup->typbasetype;
if (baseTypeMod)
{
*baseTypeMod = typTup->typtypmod;
}
ReleaseSysCache(tup);
return baseTypeOid;
}
/*
* AppendAlterDomainStmtSetDefault is a helper function that appends the default value
* portion of an ALTER DOMAIN statement that is changing the default value of the domain.
*/
static void
AppendAlterDomainStmtSetDefault(StringInfo buf, AlterDomainStmt *stmt)
{
if (stmt->def == NULL)
{
/* no default expression means this is a DROP DEFAULT statement */
appendStringInfoString(buf, "DROP DEFAULT");
return;
}
int32 baseTypMod = 0;
Oid baseOid = DomainGetBaseTypeOid(stmt->typeName, &baseTypMod);
TypeName *baseTypeName = makeTypeNameFromOid(baseOid, baseTypMod);
/* cook the default expression, without cooking we can't deparse */
Node *expr = stmt->def;
expr = TransformDefaultExpr(expr, stmt->typeName, baseTypeName);
/* deparse while the searchpath is cleared to force qualification of identifiers */
PushOverrideEmptySearchPath(CurrentMemoryContext);
char *exprSql = deparse_expression(expr, NIL, true, true);
PopOverrideSearchPath();
appendStringInfo(buf, "SET DEFAULT %s", exprSql);
}
/*
* AppendAlterDomainStmtAddConstraint is a helper function that appends the constraint
* specification for an ALTER DOMAIN statement that adds a constraint to the domain.
*/
static void
AppendAlterDomainStmtAddConstraint(StringInfo buf, AlterDomainStmt *stmt)
{
if (stmt->def == NULL || !IsA(stmt->def, Constraint))
{
ereport(ERROR, (errmsg("unable to deparse ALTER DOMAIN statement due to "
"unexpected contents")));
}
Constraint *constraint = castNode(Constraint, stmt->def);
appendStringInfoString(buf, "ADD");
int32 baseTypMod = 0;
Oid baseOid = DomainGetBaseTypeOid(stmt->typeName, &baseTypMod);
TypeName *baseTypeName = makeTypeNameFromOid(baseOid, baseTypMod);
AppendConstraint(buf, constraint, stmt->typeName, baseTypeName);
if (!constraint->initially_valid)
{
appendStringInfoString(buf, " NOT VALID");
}
}
/*
* AppendAlterDomainStmtDropConstraint is a helper function that appends the DROP
* CONSTRAINT part of an ALTER DOMAIN statement for an alter statement that drops a
* constraint.
*/
static void
AppendAlterDomainStmtDropConstraint(StringInfo buf, AlterDomainStmt *stmt)
{
appendStringInfoString(buf, "DROP CONSTRAINT ");
if (stmt->missing_ok)
{
appendStringInfoString(buf, "IF EXISTS ");
}
appendStringInfoString(buf, quote_identifier(stmt->name));
if (stmt->behavior == DROP_CASCADE)
{
appendStringInfoString(buf, " CASCADE");
}
}
/*
* AppendConstraint is a helper function that appends a constraint specification to a sql
* string that is adding a constraint.
*
* There are multiple places where a constraint specification is appended to sql strings.
*
* Given the complexity of serializing a constraint, they all use this routine.
*/
static void
AppendConstraint(StringInfo buf, Constraint *constraint, List *domainName,
TypeName *typeName)
{
if (constraint->conname)
{
appendStringInfo(buf, " CONSTRAINT %s", quote_identifier(constraint->conname));
}
switch (constraint->contype)
{
case CONSTR_CHECK:
{
Node *expr = NULL;
if (constraint->raw_expr)
{
/* the expression was parsed from sql, still needs to transform */
expr = TransformConstraintExpr(constraint->raw_expr, typeName);
}
else if (constraint->cooked_expr)
{
/* expression was read from the catalog, no cooking required just parse */
expr = stringToNode(constraint->cooked_expr);
}
else
{
elog(ERROR, "missing expression for domain constraint");
}
PushOverrideEmptySearchPath(CurrentMemoryContext);
char *exprSql = deparse_expression(expr, NIL, true, true);
PopOverrideSearchPath();
appendStringInfo(buf, " CHECK (%s)", exprSql);
return;
}
case CONSTR_DEFAULT:
{
Node *expr = NULL;
if (constraint->raw_expr)
{
/* the expression was parsed from sql, still needs to transform */
expr = TransformDefaultExpr(constraint->raw_expr, domainName, typeName);
}
else if (constraint->cooked_expr)
{
/* expression was read from the catalog, no cooking required just parse */
expr = stringToNode(constraint->cooked_expr);
}
else
{
elog(ERROR, "missing expression for domain default");
}
PushOverrideEmptySearchPath(CurrentMemoryContext);
char *exprSql = deparse_expression(expr, NIL, true, true);
PopOverrideSearchPath();
appendStringInfo(buf, " DEFAULT %s", exprSql);
return;
}
case CONSTR_NOTNULL:
{
appendStringInfoString(buf, " NOT NULL");
return;
}
case CONSTR_NULL:
{
appendStringInfoString(buf, " NULL");
return;
}
default:
{
ereport(ERROR, (errmsg("unsupported constraint for distributed domain")));
}
}
}
/*
* TransformDefaultExpr transforms a default expression from the expression passed on the
* AST to a cooked version that postgres uses internally.
*
* Only the cooked version can be easily turned back into a sql string, hence its use in
* the deparser. This is only called for default expressions that don't have a cooked
* variant stored.
*/
static Node *
TransformDefaultExpr(Node *expr, List *domainName, TypeName *typeName)
{
const char *domainNameStr = NameListToQuotedString(domainName);
int32 basetypeMod = 0; /* capture typeMod during lookup */
Type tup = typenameType(NULL, typeName, &basetypeMod);
Oid basetypeoid = typeTypeId(tup);
ReleaseSysCache(tup);
ParseState *pstate = make_parsestate(NULL);
Node *defaultExpr = cookDefault(pstate, expr,
basetypeoid,
basetypeMod,
domainNameStr,
0);
return defaultExpr;
}
/*
* TransformConstraintExpr transforms a constraint expression from the expression passed
* on the AST to a cooked version that postgres uses internally.
*
* Only the cooked version can be easily turned back into a sql string, hence its use in
* the deparser. This is only called for constraint expressions that don't have a cooked
* variant stored.
*/
static Node *
TransformConstraintExpr(Node *expr, TypeName *typeName)
{
/*
* Convert the A_EXPR in raw_expr into an EXPR
*/
ParseState *pstate = make_parsestate(NULL);
/*
* Set up a CoerceToDomainValue to represent the occurrence of VALUE in
* the expression. Note that it will appear to have the type of the base
* type, not the domain. This seems correct since within the check
* expression, we should not assume the input value can be considered a
* member of the domain.
*/
CoerceToDomainValue *domVal = GetCoerceDomainValue(typeName);
pstate->p_pre_columnref_hook = replace_domain_constraint_value;
pstate->p_ref_hook_state = (void *) domVal;
expr = transformExpr(pstate, expr, EXPR_KIND_DOMAIN_CHECK);
/*
* Make sure it yields a boolean result.
*/
expr = coerce_to_boolean(pstate, expr, "CHECK");
/*
* Fix up collation information.
*/
assign_expr_collations(pstate, expr);
return expr;
}
/*
* GetCoerceDomainValue creates a stub CoerceToDomainValue struct representing the type
* referenced by the typeName.
*/
static CoerceToDomainValue *
GetCoerceDomainValue(TypeName *typeName)
{
int32 typMod = 0; /* capture typeMod during lookup */
Type tup = LookupTypeName(NULL, typeName, &typMod, false);
if (tup == NULL)
{
elog(ERROR, "unable to lookup type information for %s",
NameListToQuotedString(typeName->names));
}
CoerceToDomainValue *domVal = makeNode(CoerceToDomainValue);
domVal->typeId = typeTypeId(tup);
domVal->typeMod = typMod;
domVal->collation = typeTypeCollation(tup);
domVal->location = -1;
ReleaseSysCache(tup);
return domVal;
}
/* Parser pre_columnref_hook for domain CHECK constraint parsing */
static Node *
replace_domain_constraint_value(ParseState *pstate, ColumnRef *cref)
{
/*
* Check for a reference to "value", and if that's what it is, replace
* with a CoerceToDomainValue as prepared for us by domainAddConstraint.
* (We handle VALUE as a name, not a keyword, to avoid breaking a lot of
* applications that have used VALUE as a column name in the past.)
*/
if (list_length(cref->fields) == 1)
{
Node *field1 = (Node *) linitial(cref->fields);
Assert(IsA(field1, String));
char *colname = strVal(field1);
if (strcmp(colname, "value") == 0)
{
CoerceToDomainValue *domVal = copyObject(pstate->p_ref_hook_state);
/* Propagate location knowledge, if any */
domVal->location = cref->location;
return (Node *) domVal;
}
}
return NULL;
}
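/*
* Illustrative example: while parsing CHECK (VALUE > 0) for a domain over
* integer, the hook above replaces the "value" column reference with a
* CoerceToDomainValue of type integer, so the expression can be transformed
* and later deparsed without referencing a real column.
*/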

View File

@ -0,0 +1,93 @@
/*-------------------------------------------------------------------------
*
* deparse_foreign_data_wrapper_stmts.c
* All routines to deparse foreign data wrapper statements.
*
* Copyright (c) Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "commands/defrem.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "distributed/relay_utility.h"
#include "lib/stringinfo.h"
#include "nodes/nodes.h"
#include "utils/builtins.h"
static void AppendGrantOnFDWStmt(StringInfo buf, GrantStmt *stmt);
static void AppendGrantOnFDWNames(StringInfo buf, GrantStmt *stmt);
char *
DeparseGrantOnFDWStmt(Node *node)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(stmt->objtype == OBJECT_FDW);
StringInfoData str = { 0 };
initStringInfo(&str);
AppendGrantOnFDWStmt(&str, stmt);
return str.data;
}
static void
AppendGrantOnFDWStmt(StringInfo buf, GrantStmt *stmt)
{
Assert(stmt->objtype == OBJECT_FDW);
appendStringInfo(buf, "%s ", stmt->is_grant ? "GRANT" : "REVOKE");
if (!stmt->is_grant && stmt->grant_option)
{
appendStringInfo(buf, "GRANT OPTION FOR ");
}
AppendGrantPrivileges(buf, stmt);
AppendGrantOnFDWNames(buf, stmt);
AppendGrantGrantees(buf, stmt);
if (stmt->is_grant && stmt->grant_option)
{
appendStringInfo(buf, " WITH GRANT OPTION");
}
if (!stmt->is_grant)
{
if (stmt->behavior == DROP_RESTRICT)
{
appendStringInfo(buf, " RESTRICT");
}
else if (stmt->behavior == DROP_CASCADE)
{
appendStringInfo(buf, " CASCADE");
}
}
appendStringInfo(buf, ";");
}
static void
AppendGrantOnFDWNames(StringInfo buf, GrantStmt *stmt)
{
ListCell *cell = NULL;
appendStringInfo(buf, " ON FOREIGN DATA WRAPPER ");
foreach(cell, stmt->objects)
{
char *fdwname = strVal(lfirst(cell));
appendStringInfoString(buf, quote_identifier(fdwname));
if (cell != list_tail(stmt->objects))
{
appendStringInfo(buf, ", ");
}
}
}
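/*
* Illustrative example (assumed names): a GrantStmt for
*
*   GRANT USAGE ON FOREIGN DATA WRAPPER fdw_one, fdw_two TO app_role WITH GRANT OPTION;
*
* round-trips through DeparseGrantOnFDWStmt into the same quoted form.
*/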

View File

@ -27,6 +27,8 @@ static void AppendDropForeignServerStmt(StringInfo buf, DropStmt *stmt);
static void AppendServerNames(StringInfo buf, DropStmt *stmt);
static void AppendBehavior(StringInfo buf, DropStmt *stmt);
static char * GetDefElemActionString(DefElemAction action);
static void AppendGrantOnForeignServerStmt(StringInfo buf, GrantStmt *stmt);
static void AppendGrantOnForeignServerServers(StringInfo buf, GrantStmt *stmt);
char *
DeparseCreateForeignServerStmt(Node *node)
@ -104,6 +106,21 @@ DeparseDropForeignServerStmt(Node *node)
}
char *
DeparseGrantOnForeignServerStmt(Node *node)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(stmt->objtype == OBJECT_FOREIGN_SERVER);
StringInfoData str = { 0 };
initStringInfo(&str);
AppendGrantOnForeignServerStmt(&str, stmt);
return str.data;
}
static void
AppendCreateForeignServerStmt(StringInfo buf, CreateForeignServerStmt *stmt)
{
@ -275,3 +292,58 @@ GetDefElemActionString(DefElemAction action)
return "";
}
}
static void
AppendGrantOnForeignServerStmt(StringInfo buf, GrantStmt *stmt)
{
Assert(stmt->objtype == OBJECT_FOREIGN_SERVER);
appendStringInfo(buf, "%s ", stmt->is_grant ? "GRANT" : "REVOKE");
if (!stmt->is_grant && stmt->grant_option)
{
appendStringInfo(buf, "GRANT OPTION FOR ");
}
AppendGrantPrivileges(buf, stmt);
AppendGrantOnForeignServerServers(buf, stmt);
AppendGrantGrantees(buf, stmt);
if (stmt->is_grant && stmt->grant_option)
{
appendStringInfo(buf, " WITH GRANT OPTION");
}
if (!stmt->is_grant)
{
if (stmt->behavior == DROP_RESTRICT)
{
appendStringInfo(buf, " RESTRICT");
}
else if (stmt->behavior == DROP_CASCADE)
{
appendStringInfo(buf, " CASCADE");
}
}
appendStringInfo(buf, ";");
}
static void
AppendGrantOnForeignServerServers(StringInfo buf, GrantStmt *stmt)
{
ListCell *cell = NULL;
appendStringInfo(buf, " ON FOREIGN SERVER ");
foreach(cell, stmt->objects)
{
char *servername = strVal(lfirst(cell));
appendStringInfoString(buf, quote_identifier(servername));
if (cell != list_tail(stmt->objects))
{
appendStringInfo(buf, ", ");
}
}
}

View File

@ -67,6 +67,9 @@ static void AppendAlterFunctionSchemaStmt(StringInfo buf, AlterObjectSchemaStmt
static void AppendAlterFunctionOwnerStmt(StringInfo buf, AlterOwnerStmt *stmt);
static void AppendAlterFunctionDependsStmt(StringInfo buf, AlterObjectDependsStmt *stmt);
static void AppendGrantOnFunctionStmt(StringInfo buf, GrantStmt *stmt);
static void AppendGrantOnFunctionFunctions(StringInfo buf, GrantStmt *stmt);
static char * CopyAndConvertToUpperCase(const char *str);
/*
@ -711,3 +714,113 @@ CopyAndConvertToUpperCase(const char *str)
return result;
}
/*
* DeparseGrantOnFunctionStmt builds and returns a string representing the GrantOnFunctionStmt
*/
char *
DeparseGrantOnFunctionStmt(Node *node)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(isFunction(stmt->objtype));
StringInfoData str = { 0 };
initStringInfo(&str);
AppendGrantOnFunctionStmt(&str, stmt);
return str.data;
}
/*
* AppendGrantOnFunctionStmt appends an SQL command representing a
* GRANT .. ON FUNCTION command built from the given GrantStmt object.
*/
static void
AppendGrantOnFunctionStmt(StringInfo buf, GrantStmt *stmt)
{
Assert(isFunction(stmt->objtype));
if (stmt->targtype == ACL_TARGET_ALL_IN_SCHEMA)
{
elog(ERROR,
"GRANT .. ALL FUNCTIONS/PROCEDURES IN SCHEMA is not supported for formatting.");
}
appendStringInfoString(buf, stmt->is_grant ? "GRANT " : "REVOKE ");
if (!stmt->is_grant && stmt->grant_option)
{
appendStringInfoString(buf, "GRANT OPTION FOR ");
}
AppendGrantPrivileges(buf, stmt);
AppendGrantOnFunctionFunctions(buf, stmt);
AppendGrantGrantees(buf, stmt);
if (stmt->is_grant && stmt->grant_option)
{
appendStringInfoString(buf, " WITH GRANT OPTION");
}
if (!stmt->is_grant)
{
if (stmt->behavior == DROP_RESTRICT)
{
appendStringInfoString(buf, " RESTRICT");
}
else if (stmt->behavior == DROP_CASCADE)
{
appendStringInfoString(buf, " CASCADE");
}
}
appendStringInfoString(buf, ";");
}
/*
* AppendGrantOnFunctionFunctions appends the function names along with their arguments
* to the given StringInfo from the given GrantStmt
*/
static void
AppendGrantOnFunctionFunctions(StringInfo buf, GrantStmt *stmt)
{
ListCell *cell = NULL;
appendStringInfo(buf, " ON %s ", ObjectTypeToKeyword(stmt->objtype));
foreach(cell, stmt->objects)
{
/*
* GrantOnFunction statement keeps its objects (functions) as
* a list of ObjectWithArgs
*/
ObjectWithArgs *function = (ObjectWithArgs *) lfirst(cell);
appendStringInfoString(buf, NameListToString(function->objname));
if (!function->args_unspecified)
{
/* if args are specified, we should append "(arg1, arg2, ...)" to the function name */
const char *args = TypeNameListToString(function->objargs);
appendStringInfo(buf, "(%s)", args);
}
if (cell != list_tail(stmt->objects))
{
appendStringInfoString(buf, ", ");
}
}
}
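/*
* Illustrative example (assumed names): a GrantStmt for
*
*   GRANT EXECUTE ON FUNCTION public.add(int, int) TO app_role;
*
* deparses with the argument list preserved, since functions are stored as
* ObjectWithArgs in stmt->objects.
*/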
/*
* isFunction returns true if the given ObjectType is a function, a procedure or a routine
* otherwise returns false
*/
bool
isFunction(ObjectType objectType)
{
return (objectType == OBJECT_FUNCTION || objectType == OBJECT_PROCEDURE ||
objectType == OBJECT_ROUTINE);
}

View File

@ -21,7 +21,11 @@
static void AppendAlterRoleStmt(StringInfo buf, AlterRoleStmt *stmt);
static void AppendAlterRoleSetStmt(StringInfo buf, AlterRoleSetStmt *stmt);
static void AppendCreateRoleStmt(StringInfo buf, CreateRoleStmt *stmt);
static void AppendRoleOption(StringInfo buf, ListCell *optionCell);
static void AppendRoleList(StringInfo buf, List *roleList);
static void AppendDropRoleStmt(StringInfo buf, DropRoleStmt *stmt);
static void AppendGrantRoleStmt(StringInfo buf, GrantRoleStmt *stmt);
/*
@ -173,6 +177,213 @@ AppendRoleOption(StringInfo buf, ListCell *optionCell)
}
/*
* DeparseCreateRoleStmt builds and returns a string representation of the
* CreateRoleStmt for application on a remote server.
*/
char *
DeparseCreateRoleStmt(Node *node)
{
CreateRoleStmt *stmt = castNode(CreateRoleStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
AppendCreateRoleStmt(&buf, stmt);
return buf.data;
}
/*
* AppendCreateRoleStmt generates the string representation of the
* CreateRoleStmt and appends it to the buffer.
*/
static void
AppendCreateRoleStmt(StringInfo buf, CreateRoleStmt *stmt)
{
ListCell *optionCell = NULL;
appendStringInfo(buf, "CREATE ");
switch (stmt->stmt_type)
{
case ROLESTMT_ROLE:
{
appendStringInfo(buf, "ROLE ");
break;
}
case ROLESTMT_USER:
{
appendStringInfo(buf, "USER ");
break;
}
case ROLESTMT_GROUP:
{
appendStringInfo(buf, "GROUP ");
break;
}
}
appendStringInfo(buf, "%s", quote_identifier(stmt->role));
foreach(optionCell, stmt->options)
{
AppendRoleOption(buf, optionCell);
DefElem *option = (DefElem *) lfirst(optionCell);
if (strcmp(option->defname, "sysid") == 0)
{
appendStringInfo(buf, " SYSID %d", intVal(option->arg));
}
else if (strcmp(option->defname, "adminmembers") == 0)
{
appendStringInfo(buf, " ADMIN ");
AppendRoleList(buf, (List *) option->arg);
}
else if (strcmp(option->defname, "rolemembers") == 0)
{
appendStringInfo(buf, " ROLE ");
AppendRoleList(buf, (List *) option->arg);
}
else if (strcmp(option->defname, "addroleto") == 0)
{
appendStringInfo(buf, " IN ROLE ");
AppendRoleList(buf, (List *) option->arg);
}
}
}
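A hedged sketch of the output shape (the exact option rendering comes from AppendRoleOption, which is not shown here):
/* a CreateRoleStmt parsed from
* CREATE ROLE alice LOGIN IN ROLE admins
* deparses to roughly the same text: LOGIN is emitted by
* AppendRoleOption, the IN ROLE list by the addroleto branch above */
char *createRoleSql = DeparseCreateRoleStmt((Node *) stmt);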
/*
* DeparseDropRoleStmt builds and returns a string representation of the
* DropRoleStmt for application on a remote server.
*/
char *
DeparseDropRoleStmt(Node *node)
{
DropRoleStmt *stmt = castNode(DropRoleStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
AppendDropRoleStmt(&buf, stmt);
return buf.data;
}
/*
* AppendDropRoleStmt generates the string representation of the
* DropRoleStmt and appends it to the buffer.
*/
static void
AppendDropRoleStmt(StringInfo buf, DropRoleStmt *stmt)
{
appendStringInfo(buf, "DROP ROLE ");
if (stmt->missing_ok)
{
appendStringInfo(buf, "IF EXISTS ");
}
AppendRoleList(buf, stmt->roles);
}
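/*
* AppendRoleList appends a comma separated list of the given role names to the
* buffer. The list may contain RoleSpec or AccessPriv nodes.
*/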
static void
AppendRoleList(StringInfo buf, List *roleList)
{
ListCell *cell = NULL;
foreach(cell, roleList)
{
Node *roleNode = (Node *) lfirst(cell);
Assert(IsA(roleNode, RoleSpec) || IsA(roleNode, AccessPriv));
char const *rolename = NULL;
if (IsA(roleNode, RoleSpec))
{
rolename = RoleSpecString((RoleSpec *) roleNode, true);
}
if (IsA(roleNode, AccessPriv))
{
rolename = quote_identifier(((AccessPriv *) roleNode)->priv_name);
}
appendStringInfoString(buf, rolename);
if (cell != list_tail(roleList))
{
appendStringInfo(buf, ", ");
}
}
}
/*
* DeparseGrantRoleStmt builds and returns a string representation of the
* GrantRoleStmt for application on a remote server.
*/
char *
DeparseGrantRoleStmt(Node *node)
{
GrantRoleStmt *stmt = castNode(GrantRoleStmt, node);
StringInfoData buf = { 0 };
initStringInfo(&buf);
AppendGrantRoleStmt(&buf, stmt);
return buf.data;
}
/*
* AppendGrantRoleStmt generates the string representation of the
* GrantRoleStmt and appends it to the buffer.
*/
static void
AppendGrantRoleStmt(StringInfo buf, GrantRoleStmt *stmt)
{
appendStringInfo(buf, "%s ", stmt->is_grant ? "GRANT" : "REVOKE");
if (!stmt->is_grant && stmt->admin_opt)
{
appendStringInfo(buf, "ADMIN OPTION FOR ");
}
AppendRoleList(buf, stmt->granted_roles);
appendStringInfo(buf, "%s ", stmt->is_grant ? " TO " : " FROM ");
AppendRoleList(buf, stmt->grantee_roles);
if (stmt->is_grant)
{
if (stmt->admin_opt)
{
appendStringInfo(buf, " WITH ADMIN OPTION");
}
if (stmt->grantor)
{
appendStringInfo(buf, " GRANTED BY %s", RoleSpecString(stmt->grantor, true));
}
}
else
{
if (stmt->behavior == DROP_RESTRICT)
{
appendStringInfo(buf, " RESTRICT");
}
else if (stmt->behavior == DROP_CASCADE)
{
appendStringInfo(buf, " CASCADE");
}
}
}
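Both directions of this deparser, sketched under the assumption of an already parsed statement:
/* GRANT admins TO alice WITH ADMIN OPTION GRANTED BY bob
* exercises the is_grant branches above, while
* REVOKE ADMIN OPTION FOR admins FROM alice CASCADE
* exercises the admin_opt and DROP_CASCADE branches */
char *grantRoleSql = DeparseGrantRoleStmt((Node *) stmt);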
/*
* AppendAlterRoleSetStmt generates the string representation of the
* AlterRoleSetStmt and appends it to the buffer.


@ -22,9 +22,7 @@
static void AppendCreateSchemaStmt(StringInfo buf, CreateSchemaStmt *stmt);
static void AppendDropSchemaStmt(StringInfo buf, DropStmt *stmt);
static void AppendGrantOnSchemaStmt(StringInfo buf, GrantStmt *stmt);
static void AppendGrantOnSchemaPrivileges(StringInfo buf, GrantStmt *stmt);
static void AppendGrantOnSchemaSchemas(StringInfo buf, GrantStmt *stmt);
static void AppendGrantOnSchemaGrantees(StringInfo buf, GrantStmt *stmt);
static void AppendAlterSchemaRenameStmt(StringInfo buf, RenameStmt *stmt);
char *
@ -161,11 +159,11 @@ AppendGrantOnSchemaStmt(StringInfo buf, GrantStmt *stmt)
appendStringInfo(buf, "GRANT OPTION FOR ");
}
AppendGrantOnSchemaPrivileges(buf, stmt);
AppendGrantPrivileges(buf, stmt);
AppendGrantOnSchemaSchemas(buf, stmt);
AppendGrantOnSchemaGrantees(buf, stmt);
AppendGrantGrantees(buf, stmt);
if (stmt->is_grant && stmt->grant_option)
{
@ -186,8 +184,8 @@ AppendGrantOnSchemaStmt(StringInfo buf, GrantStmt *stmt)
}
static void
AppendGrantOnSchemaPrivileges(StringInfo buf, GrantStmt *stmt)
void
AppendGrantPrivileges(StringInfo buf, GrantStmt *stmt)
{
if (list_length(stmt->privileges) == 0)
{
@ -227,8 +225,8 @@ AppendGrantOnSchemaSchemas(StringInfo buf, GrantStmt *stmt)
}
static void
AppendGrantOnSchemaGrantees(StringInfo buf, GrantStmt *stmt)
void
AppendGrantGrantees(StringInfo buf, GrantStmt *stmt)
{
ListCell *cell = NULL;
appendStringInfo(buf, " %s ", stmt->is_grant ? "TO" : "FROM");


@ -27,6 +27,8 @@ static void AppendSequenceNameList(StringInfo buf, List *objects, ObjectType obj
static void AppendRenameSequenceStmt(StringInfo buf, RenameStmt *stmt);
static void AppendAlterSequenceSchemaStmt(StringInfo buf, AlterObjectSchemaStmt *stmt);
static void AppendAlterSequenceOwnerStmt(StringInfo buf, AlterTableStmt *stmt);
static void AppendGrantOnSequenceStmt(StringInfo buf, GrantStmt *stmt);
static void AppendGrantOnSequenceSequences(StringInfo buf, GrantStmt *stmt);
/*
* DeparseDropSequenceStmt builds and returns a string representing the DropStmt
@ -86,12 +88,6 @@ AppendSequenceNameList(StringInfo buf, List *objects, ObjectType objtype)
RangeVar *seq = makeRangeVarFromNameList((List *) lfirst(objectCell));
if (seq->schemaname == NULL)
{
Oid schemaOid = RangeVarGetCreationNamespace(seq);
seq->schemaname = get_namespace_name(schemaOid);
}
char *qualifiedSequenceName = quote_qualified_identifier(seq->schemaname,
seq->relname);
appendStringInfoString(buf, qualifiedSequenceName);
@ -260,3 +256,107 @@ AppendAlterSequenceOwnerStmt(StringInfo buf, AlterTableStmt *stmt)
}
}
}
/*
* DeparseGrantOnSequenceStmt builds and returns a string representing the GrantOnSequenceStmt
*/
char *
DeparseGrantOnSequenceStmt(Node *node)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(stmt->objtype == OBJECT_SEQUENCE);
StringInfoData str = { 0 };
initStringInfo(&str);
AppendGrantOnSequenceStmt(&str, stmt);
return str.data;
}
/*
* AppendGrantOnSequenceStmt builds an SQL command representing a
* GRANT .. ON SEQUENCE command from the given GrantStmt object and appends
* it to the given buffer.
*/
static void
AppendGrantOnSequenceStmt(StringInfo buf, GrantStmt *stmt)
{
Assert(stmt->objtype == OBJECT_SEQUENCE);
if (stmt->targtype == ACL_TARGET_ALL_IN_SCHEMA)
{
/*
* Normally we shouldn't reach this. We deparse a GrantStmt with
* OBJECT_SEQUENCE only after setting targtype to ACL_TARGET_OBJECT.
*/
elog(ERROR,
"GRANT .. ALL SEQUENCES IN SCHEMA is not supported for formatting.");
}
appendStringInfoString(buf, stmt->is_grant ? "GRANT " : "REVOKE ");
if (!stmt->is_grant && stmt->grant_option)
{
appendStringInfoString(buf, "GRANT OPTION FOR ");
}
AppendGrantPrivileges(buf, stmt);
AppendGrantOnSequenceSequences(buf, stmt);
AppendGrantGrantees(buf, stmt);
if (stmt->is_grant && stmt->grant_option)
{
appendStringInfoString(buf, " WITH GRANT OPTION");
}
if (!stmt->is_grant)
{
if (stmt->behavior == DROP_RESTRICT)
{
appendStringInfoString(buf, " RESTRICT");
}
else if (stmt->behavior == DROP_CASCADE)
{
appendStringInfoString(buf, " CASCADE");
}
}
appendStringInfoString(buf, ";");
}
/*
* AppendGrantOnSequenceSequences appends the sequence names from the given
* GrantStmt to the given StringInfo
*/
static void
AppendGrantOnSequenceSequences(StringInfo buf, GrantStmt *stmt)
{
Assert(stmt->objtype == OBJECT_SEQUENCE);
appendStringInfoString(buf, " ON SEQUENCE ");
ListCell *cell = NULL;
foreach(cell, stmt->objects)
{
/*
* GrantOnSequence statement keeps its objects (sequences) as
* a list of RangeVar-s
*/
RangeVar *sequence = (RangeVar *) lfirst(cell);
/*
* We have qualified the statement beforehand
*/
appendStringInfoString(buf, quote_qualified_identifier(sequence->schemaname,
sequence->relname));
if (cell != list_tail(stmt->objects))
{
appendStringInfoString(buf, ", ");
}
}
}
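A short sketch of the expected output, assuming the statement was qualified beforehand as the comment above requires:
/* a GrantStmt parsed from
* GRANT USAGE ON SEQUENCE order_id_seq TO app_user
* deparses, after qualification, to roughly:
* GRANT USAGE ON SEQUENCE public.order_id_seq TO app_user; */
char *grantSequenceSql = DeparseGrantOnSequenceStmt((Node *) stmt);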


@ -0,0 +1,310 @@
/*-------------------------------------------------------------------------
*
* deparse_view_stmts.c
*
* All routines to deparse view statements.
*
* Copyright (c) Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "catalog/namespace.h"
#include "commands/defrem.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/commands.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
static void AppendDropViewStmt(StringInfo buf, DropStmt *stmt);
static void AppendViewNameList(StringInfo buf, List *objects);
static void AppendAlterViewStmt(StringInfo buf, AlterTableStmt *stmt);
static void AppendAlterViewCmd(StringInfo buf, AlterTableCmd *alterTableCmd);
static void AppendAlterViewOwnerStmt(StringInfo buf, AlterTableCmd *alterTableCmd);
static void AppendAlterViewSetOptionsStmt(StringInfo buf, AlterTableCmd *alterTableCmd);
static void AppendAlterViewResetOptionsStmt(StringInfo buf, AlterTableCmd *alterTableCmd);
static void AppendRenameViewStmt(StringInfo buf, RenameStmt *stmt);
static void AppendAlterViewSchemaStmt(StringInfo buf, AlterObjectSchemaStmt *stmt);
/*
* DeparseDropViewStmt deparses the given DROP VIEW statement.
*/
char *
DeparseDropViewStmt(Node *node)
{
DropStmt *stmt = castNode(DropStmt, node);
StringInfoData str = { 0 };
initStringInfo(&str);
Assert(stmt->removeType == OBJECT_VIEW);
AppendDropViewStmt(&str, stmt);
return str.data;
}
/*
* AppendDropViewStmt appends the deparsed representation of given drop stmt
* to the given string info buffer.
*/
static void
AppendDropViewStmt(StringInfo buf, DropStmt *stmt)
{
/*
* already tested at the call site, but in the future this might be
* collapsed into a DeparseDropStmt, so be safe and check again
*/
Assert(stmt->removeType == OBJECT_VIEW);
appendStringInfo(buf, "DROP VIEW ");
if (stmt->missing_ok)
{
appendStringInfoString(buf, "IF EXISTS ");
}
AppendViewNameList(buf, stmt->objects);
if (stmt->behavior == DROP_CASCADE)
{
appendStringInfoString(buf, " CASCADE");
}
appendStringInfoString(buf, ";");
}
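A hedged sketch of the resulting command text:
/* a DropStmt parsed from
* DROP VIEW IF EXISTS v1, reports.v2 CASCADE
* deparses, once the names are schema qualified, to roughly:
* DROP VIEW IF EXISTS public.v1, reports.v2 CASCADE; */
char *dropViewSql = DeparseDropViewStmt((Node *) stmt);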
/*
* AppendViewNameList appends the qualified view names by constructing them from the given
* objects list to the given string info buffer. Note that the objects list must
* hold schema qualified view names as its members.
*/
static void
AppendViewNameList(StringInfo buf, List *viewNamesList)
{
bool isFirstView = true;
List *qualifiedViewName = NULL;
foreach_ptr(qualifiedViewName, viewNamesList)
{
char *quotedQualifiedViewName = NameListToQuotedString(qualifiedViewName);
if (!isFirstView)
{
appendStringInfo(buf, ", ");
}
appendStringInfoString(buf, quotedQualifiedViewName);
isFirstView = false;
}
}
/*
* DeparseAlterViewStmt deparses the given ALTER VIEW statement.
*/
char *
DeparseAlterViewStmt(Node *node)
{
AlterTableStmt *stmt = castNode(AlterTableStmt, node);
StringInfoData str = { 0 };
initStringInfo(&str);
AppendAlterViewStmt(&str, stmt);
return str.data;
}
static void
AppendAlterViewStmt(StringInfo buf, AlterTableStmt *stmt)
{
const char *identifier = quote_qualified_identifier(stmt->relation->schemaname,
stmt->relation->relname);
appendStringInfo(buf, "ALTER VIEW %s ", identifier);
AlterTableCmd *alterTableCmd = castNode(AlterTableCmd, lfirst(list_head(stmt->cmds)));
AppendAlterViewCmd(buf, alterTableCmd);
appendStringInfoString(buf, ";");
}
static void
AppendAlterViewCmd(StringInfo buf, AlterTableCmd *alterTableCmd)
{
switch (alterTableCmd->subtype)
{
case AT_ChangeOwner:
{
AppendAlterViewOwnerStmt(buf, alterTableCmd);
break;
}
case AT_SetRelOptions:
{
AppendAlterViewSetOptionsStmt(buf, alterTableCmd);
break;
}
case AT_ResetRelOptions:
{
AppendAlterViewResetOptionsStmt(buf, alterTableCmd);
break;
}
case AT_ColumnDefault:
{
elog(ERROR, "Citus doesn't support setting or resetting default values for a "
"column of view");
break;
}
default:
{
/*
* The ALTER VIEW command only supports the cases checked above, but ALTER
* TABLE commands targeting views may carry other subcommands. To let
* PG throw the right error locally, we don't throw any error here
*/
break;
}
}
}
static void
AppendAlterViewOwnerStmt(StringInfo buf, AlterTableCmd *alterTableCmd)
{
appendStringInfo(buf, "OWNER TO %s", RoleSpecString(alterTableCmd->newowner, true));
}
static void
AppendAlterViewSetOptionsStmt(StringInfo buf, AlterTableCmd *alterTableCmd)
{
ListCell *lc = NULL;
bool initialOption = true;
foreach(lc, (List *) alterTableCmd->def)
{
DefElem *def = (DefElem *) lfirst(lc);
if (initialOption)
{
appendStringInfo(buf, "SET (");
initialOption = false;
}
else
{
appendStringInfo(buf, ",");
}
appendStringInfo(buf, "%s", def->defname);
if (def->arg != NULL)
{
appendStringInfo(buf, "=");
appendStringInfo(buf, "%s", defGetString(def));
}
}
appendStringInfo(buf, ")");
}
static void
AppendAlterViewResetOptionsStmt(StringInfo buf, AlterTableCmd *alterTableCmd)
{
ListCell *lc = NULL;
bool initialOption = true;
foreach(lc, (List *) alterTableCmd->def)
{
DefElem *def = (DefElem *) lfirst(lc);
if (initialOption)
{
appendStringInfo(buf, "RESET (");
initialOption = false;
}
else
{
appendStringInfo(buf, ",");
}
appendStringInfo(buf, "%s", def->defname);
}
appendStringInfo(buf, ")");
}
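The two option forms handled above produce, sketched:
/* AT_SetRelOptions with (security_barrier=true, check_option=local):
* ALTER VIEW public.v SET (security_barrier=true,check_option=local);
* AT_ResetRelOptions with (security_barrier):
* ALTER VIEW public.v RESET (security_barrier); */
char *alterViewSql = DeparseAlterViewStmt((Node *) stmt);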
char *
DeparseRenameViewStmt(Node *node)
{
RenameStmt *stmt = castNode(RenameStmt, node);
StringInfoData str = { 0 };
initStringInfo(&str);
AppendRenameViewStmt(&str, stmt);
return str.data;
}
static void
AppendRenameViewStmt(StringInfo buf, RenameStmt *stmt)
{
switch (stmt->renameType)
{
case OBJECT_COLUMN:
{
const char *identifier =
quote_qualified_identifier(stmt->relation->schemaname,
stmt->relation->relname);
appendStringInfo(buf, "ALTER VIEW %s RENAME COLUMN %s TO %s;", identifier,
quote_identifier(stmt->subname), quote_identifier(
stmt->newname));
break;
}
case OBJECT_VIEW:
{
const char *identifier =
quote_qualified_identifier(stmt->relation->schemaname,
stmt->relation->relname);
appendStringInfo(buf, "ALTER VIEW %s RENAME TO %s;", identifier,
quote_identifier(stmt->newname));
break;
}
default:
{
ereport(ERROR, (errmsg("unsupported subtype for alter view rename command"),
errdetail("sub command type: %d", stmt->renameType)));
}
}
}
char *
DeparseAlterViewSchemaStmt(Node *node)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
StringInfoData str = { 0 };
initStringInfo(&str);
AppendAlterViewSchemaStmt(&str, stmt);
return str.data;
}
static void
AppendAlterViewSchemaStmt(StringInfo buf, AlterObjectSchemaStmt *stmt)
{
const char *identifier = quote_qualified_identifier(stmt->relation->schemaname,
stmt->relation->relname);
appendStringInfo(buf, "ALTER VIEW %s SET SCHEMA %s;", identifier, quote_identifier(
stmt->newschema));
}
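The rename and schema variants, sketched for hypothetical, already qualified statements:
char *renameViewSql = DeparseRenameViewStmt((Node *) renameStmt);
char *alterSchemaSql = DeparseAlterViewSchemaStmt((Node *) schemaStmt);
/* renameStmt and schemaStmt are assumed parsed and qualified:
* renameViewSql: ALTER VIEW public.v RENAME TO v_new;
* alterSchemaSql: ALTER VIEW public.v SET SCHEMA archive; */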


@ -0,0 +1,260 @@
/*-------------------------------------------------------------------------
*
* qualify_domain.c
* Functions to fully qualify all domain related statements, i.e. make
* them independent of search_path settings. This mostly consists of
* adding the schema name to all the names referencing domains.
*
* Copyright (c) Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
#include "nodes/makefuncs.h"
#include "parser/parse_type.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
static void QualifyTypeName(TypeName *typeName, bool missing_ok);
static void QualifyCollate(CollateClause *collClause, bool missing_ok);
/*
* QualifyCreateDomainStmt modifies the CreateDomainStmt passed to become search_path
* independent.
*/
void
QualifyCreateDomainStmt(Node *node)
{
CreateDomainStmt *stmt = castNode(CreateDomainStmt, node);
char *schemaName = NULL;
char *domainName = NULL;
/* fully qualify domain name */
DeconstructQualifiedName(stmt->domainname, &schemaName, &domainName);
if (!schemaName)
{
RangeVar *var = makeRangeVarFromNameList(stmt->domainname);
Oid creationSchema = RangeVarGetCreationNamespace(var);
schemaName = get_namespace_name(creationSchema);
stmt->domainname = list_make2(makeString(schemaName), makeString(domainName));
}
/* referenced types should be fully qualified */
QualifyTypeName(stmt->typeName, false);
QualifyCollate(stmt->collClause, false);
}
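To make the in-place rewrite concrete, a hedged example:
/* with search_path = app, the statement
* CREATE DOMAIN code AS text COLLATE "C" CHECK (VALUE <> '')
* is rewritten so that stmt->domainname becomes (app, code), the base
* type name is qualified to pg_catalog.text, and the collation name to
* pg_catalog."C" */
QualifyCreateDomainStmt((Node *) stmt);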
/*
* QualifyDropDomainStmt modifies the DropStmt for DOMAIN's to be search_path independent.
*/
void
QualifyDropDomainStmt(Node *node)
{
DropStmt *stmt = castNode(DropStmt, node);
TypeName *domainName = NULL;
foreach_ptr(domainName, stmt->objects)
{
QualifyTypeName(domainName, stmt->missing_ok);
}
}
/*
* QualifyAlterDomainStmt modifies the AlterDomainStmt to be search_path independent.
*/
void
QualifyAlterDomainStmt(Node *node)
{
AlterDomainStmt *stmt = castNode(AlterDomainStmt, node);
if (list_length(stmt->typeName) == 1)
{
TypeName *typeName = makeTypeNameFromNameList(stmt->typeName);
QualifyTypeName(typeName, false);
stmt->typeName = typeName->names;
}
}
/*
* QualifyDomainRenameConstraintStmt modifies the RenameStmt for domain constraints to be
* search_path independent.
*/
void
QualifyDomainRenameConstraintStmt(Node *node)
{
RenameStmt *stmt = castNode(RenameStmt, node);
Assert(stmt->renameType == OBJECT_DOMCONSTRAINT);
List *domainName = castNode(List, stmt->object);
if (list_length(domainName) == 1)
{
TypeName *typeName = makeTypeNameFromNameList(domainName);
QualifyTypeName(typeName, false);
stmt->object = (Node *) typeName->names;
}
}
/*
* QualifyAlterDomainOwnerStmt modifies the AlterOwnerStmt for DOMAIN's to be search_path
* independent.
*/
void
QualifyAlterDomainOwnerStmt(Node *node)
{
AlterOwnerStmt *stmt = castNode(AlterOwnerStmt, node);
Assert(stmt->objectType == OBJECT_DOMAIN);
List *domainName = castNode(List, stmt->object);
if (list_length(domainName) == 1)
{
TypeName *typeName = makeTypeNameFromNameList(domainName);
QualifyTypeName(typeName, false);
stmt->object = (Node *) typeName->names;
}
}
/*
* QualifyRenameDomainStmt modifies the RenameStmt for the Domain to be search_path
* independent.
*/
void
QualifyRenameDomainStmt(Node *node)
{
RenameStmt *stmt = castNode(RenameStmt, node);
Assert(stmt->renameType == OBJECT_DOMAIN);
List *domainName = castNode(List, stmt->object);
if (list_length(domainName) == 1)
{
TypeName *typeName = makeTypeNameFromNameList(domainName);
QualifyTypeName(typeName, false);
stmt->object = (Node *) typeName->names;
}
}
/*
* QualifyAlterDomainSchemaStmt modifies the AlterObjectSchemaStmt to be search_path
* independent.
*/
void
QualifyAlterDomainSchemaStmt(Node *node)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
Assert(stmt->objectType == OBJECT_DOMAIN);
List *domainName = castNode(List, stmt->object);
if (list_length(domainName) == 1)
{
TypeName *typeName = makeTypeNameFromNameList(domainName);
QualifyTypeName(typeName, false);
stmt->object = (Node *) typeName->names;
}
}
/*
* QualifyTypeName qualifies a TypeName object in place. When missing_ok is false, it
* might throw an error if the type can't be found based on its name. When an oid is
* provided, missing_ok is ignored and treated as false; even if missing_ok is true,
* the function might raise an error for a non-existing type if the oid can't be found.
*/
static void
QualifyTypeName(TypeName *typeName, bool missing_ok)
{
if (OidIsValid(typeName->typeOid))
{
/*
* When the typeName is provided as oid, fill in the names.
* missing_ok is ignored for oid's
*/
Type typeTup = typeidType(typeName->typeOid);
char *name = typeTypeName(typeTup);
Oid namespaceOid = TypeOidGetNamespaceOid(typeName->typeOid);
char *schemaName = get_namespace_name(namespaceOid);
typeName->names = list_make2(makeString(schemaName), makeString(name));
ReleaseSysCache(typeTup);
}
else
{
char *name = NULL;
char *schemaName = NULL;
DeconstructQualifiedName(typeName->names, &schemaName, &name);
if (!schemaName)
{
Oid typeOid = LookupTypeNameOid(NULL, typeName, missing_ok);
if (OidIsValid(typeOid))
{
Oid namespaceOid = TypeOidGetNamespaceOid(typeOid);
schemaName = get_namespace_name(namespaceOid);
typeName->names = list_make2(makeString(schemaName), makeString(name));
}
}
}
}
/*
* QualifyCollate qualifies any given CollateClause by adding any missing schema name to
* the collation being identified.
*
* If collClause is a NULL pointer this function is a no-op.
*/
static void
QualifyCollate(CollateClause *collClause, bool missing_ok)
{
if (collClause == NULL)
{
/* no collate clause, nothing to qualify */
return;
}
if (list_length(collClause->collname) != 1)
{
/* already qualified */
return;
}
Oid collOid = get_collation_oid(collClause->collname, missing_ok);
ObjectAddress collationAddress = { 0 };
ObjectAddressSet(collationAddress, CollationRelationId, collOid);
List *objName = NIL;
List *objArgs = NIL;
#if PG_VERSION_NUM >= PG_VERSION_14
getObjectIdentityParts(&collationAddress, &objName, &objArgs, false);
#else
getObjectIdentityParts(&collationAddress, &objName, &objArgs);
#endif
collClause->collname = NIL;
char *name = NULL;
foreach_ptr(name, objName)
{
collClause->collname = lappend(collClause->collname, makeString(name));
}
}


@ -17,7 +17,9 @@
#include "postgres.h"
#include "distributed/commands.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "distributed/version_compat.h"
#include "parser/parse_func.h"
#include "utils/lsyscache.h"
@ -38,8 +40,13 @@ QualifyAlterSequenceOwnerStmt(Node *node)
if (seq->schemaname == NULL)
{
Oid schemaOid = RangeVarGetCreationNamespace(seq);
seq->schemaname = get_namespace_name(schemaOid);
Oid seqOid = RangeVarGetRelid(seq, NoLock, stmt->missing_ok);
if (OidIsValid(seqOid))
{
Oid schemaOid = get_rel_namespace(seqOid);
seq->schemaname = get_namespace_name(schemaOid);
}
}
}
@ -59,12 +66,53 @@ QualifyAlterSequenceSchemaStmt(Node *node)
if (seq->schemaname == NULL)
{
Oid schemaOid = RangeVarGetCreationNamespace(seq);
seq->schemaname = get_namespace_name(schemaOid);
Oid seqOid = RangeVarGetRelid(seq, NoLock, stmt->missing_ok);
if (OidIsValid(seqOid))
{
Oid schemaOid = get_rel_namespace(seqOid);
seq->schemaname = get_namespace_name(schemaOid);
}
}
}
/*
* QualifyDropSequenceStmt transforms a DROP SEQUENCE
* statement in place and makes the sequence name fully qualified.
*/
void
QualifyDropSequenceStmt(Node *node)
{
DropStmt *stmt = castNode(DropStmt, node);
Assert(stmt->removeType == OBJECT_SEQUENCE);
List *objectNameListWithSchema = NIL;
List *objectNameList = NULL;
foreach_ptr(objectNameList, stmt->objects)
{
RangeVar *seq = makeRangeVarFromNameList(objectNameList);
if (seq->schemaname == NULL)
{
Oid seqOid = RangeVarGetRelid(seq, NoLock, stmt->missing_ok);
if (OidIsValid(seqOid))
{
Oid schemaOid = get_rel_namespace(seqOid);
seq->schemaname = get_namespace_name(schemaOid);
}
}
objectNameListWithSchema = lappend(objectNameListWithSchema,
MakeNameListFromRangeVar(seq));
}
stmt->objects = objectNameListWithSchema;
}
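A sketch of the rewrite performed here (names and schemas hypothetical):
/* DROP SEQUENCE seq1, app.seq2
* becomes, assuming seq1 resolves to schema app via search_path,
* objects = ((app, seq1), (app, seq2));
* a sequence missing under IF EXISTS keeps its unqualified name because
* its oid cannot be resolved */
QualifyDropSequenceStmt((Node *) stmt);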
/*
* QualifyRenameSequenceStmt transforms a
* ALTER SEQUENCE .. RENAME TO ..
@ -84,3 +132,41 @@ QualifyRenameSequenceStmt(Node *node)
seq->schemaname = get_namespace_name(schemaOid);
}
}
/*
* QualifyGrantOnSequenceStmt transforms a
* GRANT ON SEQUENCE ...
* statement in place and makes the sequence names fully qualified.
*/
void
QualifyGrantOnSequenceStmt(Node *node)
{
GrantStmt *stmt = castNode(GrantStmt, node);
Assert(stmt->objtype == OBJECT_SEQUENCE);
/*
* The other option would be GRANT .. ON ALL SEQUENCES IN SCHEMA ...
* For that we don't need to qualify
*/
if (stmt->targtype != ACL_TARGET_OBJECT)
{
return;
}
List *qualifiedSequenceRangeVars = NIL;
RangeVar *sequenceRangeVar = NULL;
foreach_ptr(sequenceRangeVar, stmt->objects)
{
if (sequenceRangeVar->schemaname == NULL)
{
Oid seqOid = RangeVarGetRelid(sequenceRangeVar, NoLock, false);
Oid schemaOid = get_rel_namespace(seqOid);
sequenceRangeVar->schemaname = get_namespace_name(schemaOid);
}
qualifiedSequenceRangeVars = lappend(qualifiedSequenceRangeVars,
sequenceRangeVar);
}
stmt->objects = qualifiedSequenceRangeVars;
}


@ -15,15 +15,19 @@
#include "postgres.h"
#include "catalog/namespace.h"
#include "catalog/pg_statistic_ext.h"
#include "distributed/commands.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "nodes/parsenodes.h"
#include "nodes/value.h"
#include "utils/syscache.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/relcache.h"
static Oid GetStatsNamespaceOid(Oid statsOid);
void
QualifyCreateStatisticsStmt(Node *node)
{
@ -38,6 +42,12 @@ QualifyCreateStatisticsStmt(Node *node)
relation->schemaname = get_namespace_name(schemaOid);
}
if (list_length(stmt->defnames) < 1)
{
/* no name to qualify */
return;
}
RangeVar *stat = makeRangeVarFromNameList(stmt->defnames);
if (stat->schemaname == NULL)
@ -68,8 +78,14 @@ QualifyDropStatisticsStmt(Node *node)
if (stat->schemaname == NULL)
{
Oid schemaOid = RangeVarGetCreationNamespace(stat);
stat->schemaname = get_namespace_name(schemaOid);
Oid statsOid = get_statistics_object_oid(objectNameList,
dropStatisticsStmt->missing_ok);
if (OidIsValid(statsOid))
{
Oid schemaOid = GetStatsNamespaceOid(statsOid);
stat->schemaname = get_namespace_name(schemaOid);
}
}
objectNameListWithSchema = lappend(objectNameListWithSchema,
@ -94,7 +110,14 @@ QualifyAlterStatisticsRenameStmt(Node *node)
if (list_length(nameList) == 1)
{
RangeVar *stat = makeRangeVarFromNameList(nameList);
Oid schemaOid = RangeVarGetCreationNamespace(stat);
Oid statsOid = get_statistics_object_oid(nameList, renameStmt->missing_ok);
if (!OidIsValid(statsOid))
{
return;
}
Oid schemaOid = GetStatsNamespaceOid(statsOid);
stat->schemaname = get_namespace_name(schemaOid);
renameStmt->object = (Node *) MakeNameListFromRangeVar(stat);
}
@ -115,7 +138,14 @@ QualifyAlterStatisticsSchemaStmt(Node *node)
if (list_length(nameList) == 1)
{
RangeVar *stat = makeRangeVarFromNameList(nameList);
Oid schemaOid = RangeVarGetCreationNamespace(stat);
Oid statsOid = get_statistics_object_oid(nameList, stmt->missing_ok);
if (!OidIsValid(statsOid))
{
return;
}
Oid schemaOid = GetStatsNamespaceOid(statsOid);
stat->schemaname = get_namespace_name(schemaOid);
stmt->object = (Node *) MakeNameListFromRangeVar(stat);
}
@ -136,7 +166,14 @@ QualifyAlterStatisticsStmt(Node *node)
if (list_length(stmt->defnames) == 1)
{
RangeVar *stat = makeRangeVarFromNameList(stmt->defnames);
Oid schemaOid = RangeVarGetCreationNamespace(stat);
Oid statsOid = get_statistics_object_oid(stmt->defnames, stmt->missing_ok);
if (!OidIsValid(statsOid))
{
return;
}
Oid schemaOid = GetStatsNamespaceOid(statsOid);
stat->schemaname = get_namespace_name(schemaOid);
stmt->defnames = MakeNameListFromRangeVar(stat);
}
@ -159,8 +196,40 @@ QualifyAlterStatisticsOwnerStmt(Node *node)
if (list_length(nameList) == 1)
{
RangeVar *stat = makeRangeVarFromNameList(nameList);
Oid schemaOid = RangeVarGetCreationNamespace(stat);
Oid statsOid = get_statistics_object_oid(nameList, /* missing_ok */ true);
if (!OidIsValid(statsOid))
{
return;
}
Oid schemaOid = GetStatsNamespaceOid(statsOid);
stat->schemaname = get_namespace_name(schemaOid);
stmt->object = (Node *) MakeNameListFromRangeVar(stat);
}
}
/*
* GetStatsNamespaceOid takes the id of a Statistics object and returns
* the id of the schema that the statistics object belongs to.
* Errors out if the stats object is not found.
*/
static Oid
GetStatsNamespaceOid(Oid statsOid)
{
HeapTuple heapTuple = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statsOid));
if (!HeapTupleIsValid(heapTuple))
{
ereport(ERROR, (errmsg("cache lookup failed for statistics "
"object with oid %u", statsOid)));
}
FormData_pg_statistic_ext *statisticsForm =
(FormData_pg_statistic_ext *) GETSTRUCT(heapTuple);
Oid result = statisticsForm->stxnamespace;
ReleaseSysCache(heapTuple);
return result;
}


@ -31,13 +31,10 @@
#include "utils/syscache.h"
#include "utils/lsyscache.h"
static char * GetTypeNamespaceNameByNameList(List *names);
static Oid TypeOidGetNamespaceOid(Oid typeOid);
/*
* GetTypeNamespaceNameByNameList resolves the schema name of a type by its namelist.
*/
static char *
char *
GetTypeNamespaceNameByNameList(List *names)
{
TypeName *typeName = makeTypeNameFromNameList(names);
@ -51,7 +48,7 @@ GetTypeNamespaceNameByNameList(List *names)
/*
* TypeOidGetNamespaceOid resolves the namespace oid for a type identified by its type oid
*/
static Oid
Oid
TypeOidGetNamespaceOid(Oid typeOid)
{
HeapTuple typeTuple = SearchSysCache1(TYPEOID, typeOid);


@ -0,0 +1,116 @@
/*-------------------------------------------------------------------------
*
* qualify_view_stmt.c
* Functions specialized in fully qualifying all view statements. These
* functions are dispatched from qualify.c
*
* Copyright (c), Citus Data, Inc.
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "catalog/namespace.h"
#include "distributed/deparser.h"
#include "distributed/listutils.h"
#include "nodes/nodes.h"
#include "utils/guc.h"
#include "utils/lsyscache.h"
static void QualifyViewRangeVar(RangeVar *view);
/*
* QualifyDropViewStmt qualifies the view names of the DROP VIEW statement.
*/
void
QualifyDropViewStmt(Node *node)
{
DropStmt *stmt = castNode(DropStmt, node);
List *qualifiedViewNames = NIL;
List *possiblyQualifiedViewName = NULL;
foreach_ptr(possiblyQualifiedViewName, stmt->objects)
{
char *viewName = NULL;
char *schemaName = NULL;
List *viewNameToAdd = possiblyQualifiedViewName;
DeconstructQualifiedName(possiblyQualifiedViewName, &schemaName, &viewName);
if (schemaName == NULL)
{
RangeVar *viewRangeVar = makeRangeVarFromNameList(possiblyQualifiedViewName);
Oid viewOid = RangeVarGetRelid(viewRangeVar, AccessExclusiveLock,
stmt->missing_ok);
/*
* If DROP VIEW IF EXISTS is called and the view doesn't exist, the oid can be invalid.
* Do not try to qualify it.
*/
if (OidIsValid(viewOid))
{
Oid schemaOid = get_rel_namespace(viewOid);
schemaName = get_namespace_name(schemaOid);
List *qualifiedViewName = list_make2(makeString(schemaName),
makeString(viewName));
viewNameToAdd = qualifiedViewName;
}
}
qualifiedViewNames = lappend(qualifiedViewNames, viewNameToAdd);
}
stmt->objects = qualifiedViewNames;
}
/*
* QualifyAlterViewStmt qualifies the view name of the ALTER VIEW statement.
*/
void
QualifyAlterViewStmt(Node *node)
{
AlterTableStmt *stmt = castNode(AlterTableStmt, node);
RangeVar *view = stmt->relation;
QualifyViewRangeVar(view);
}
/*
* QualifyRenameViewStmt qualifies the view name of the ALTER VIEW ... RENAME statement.
*/
void
QualifyRenameViewStmt(Node *node)
{
RenameStmt *stmt = castNode(RenameStmt, node);
RangeVar *view = stmt->relation;
QualifyViewRangeVar(view);
}
/*
* QualifyAlterViewSchemaStmt qualifies the view name of the ALTER VIEW ... SET SCHEMA statement.
*/
void
QualifyAlterViewSchemaStmt(Node *node)
{
AlterObjectSchemaStmt *stmt = castNode(AlterObjectSchemaStmt, node);
RangeVar *view = stmt->relation;
QualifyViewRangeVar(view);
}
/*
* QualifyViewRangeVar qualifies the given view RangeVar if it is not qualified.
*/
static void
QualifyViewRangeVar(RangeVar *view)
{
if (view->schemaname == NULL)
{
Oid viewOid = RelnameGetRelid(view->relname);
Oid schemaOid = get_rel_namespace(viewOid);
view->schemaname = get_namespace_name(schemaOid);
}
}
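For example (hedged; the resolution happens through the caller's search_path via RelnameGetRelid):
/* for ALTER VIEW v RENAME TO v2, if v resolves to app.v, this sets
* view->schemaname = "app", so the deparsed command no longer depends
* on the caller's search_path */
QualifyRenameViewStmt((Node *) stmt);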


@ -1388,8 +1388,15 @@ set_join_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
/* Assert we processed the right number of columns */
#ifdef USE_ASSERT_CHECKING
while (i < colinfo->num_cols && colinfo->colnames[i] == NULL)
i++;
for (int col_index = 0; col_index < colinfo->num_cols; col_index++)
{
/*
* In the processing loops above, "i" advances only when the
* column is not new; advance it here once per new column.
*/
if (colinfo->is_new_col[col_index])
i++;
}
Assert(i == colinfo->num_cols);
Assert(j == nnewcolumns);
#endif
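A small worked example of why the adjusted check holds (hedged):
/* with num_cols = 4 and is_new_col = {false, true, false, true}, the
* processing loops leave i == 2; the loop above advances i once per new
* column, ending at i == 4 and satisfying Assert(i == colinfo->num_cols) */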


@ -1405,8 +1405,15 @@ set_join_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
/* Assert we processed the right number of columns */
#ifdef USE_ASSERT_CHECKING
while (i < colinfo->num_cols && colinfo->colnames[i] == NULL)
i++;
for (int col_index = 0; col_index < colinfo->num_cols; col_index++)
{
/*
* In the processing loops above, "i" advances only when the
* column is not new; advance it here once per new column.
*/
if (colinfo->is_new_col[col_index])
i++;
}
Assert(i == colinfo->num_cols);
Assert(j == nnewcolumns);
#endif


@ -482,7 +482,7 @@ get_merged_argument_list(CallStmt *stmt, List **mergedNamedArgList,
Oid functionOid = stmt->funcexpr->funcid;
List *namedArgList = NIL;
List *finalArgumentList = NIL;
Oid finalArgTypes[FUNC_MAX_ARGS];
Oid *finalArgTypes;
Oid *argTypes = NULL;
char *argModes = NULL;
char **argNames = NULL;
@ -519,6 +519,7 @@ get_merged_argument_list(CallStmt *stmt, List **mergedNamedArgList,
/* Remove the duplicate INOUT counting */
numberOfArgs = numberOfArgs - totalInoutArgs;
finalArgTypes = palloc0(sizeof(Oid) * numberOfArgs);
ListCell *inArgCell = list_head(stmt->funcexpr->args);
ListCell *outArgCell = list_head(stmt->outargs);
@ -1527,8 +1528,15 @@ set_join_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
/* Assert we processed the right number of columns */
#ifdef USE_ASSERT_CHECKING
while (i < colinfo->num_cols && colinfo->colnames[i] == NULL)
i++;
for (int col_index = 0; col_index < colinfo->num_cols; col_index++)
{
/*
* In the processing loops above, "i" advances only when the
* column is not new; advance it here once per new column.
*/
if (colinfo->is_new_col[col_index])
i++;
}
Assert(i == colinfo->num_cols);
Assert(j == nnewcolumns);
#endif


@ -511,7 +511,9 @@ typedef enum TaskExecutionState
/*
* PlacementExecutionOrder indicates whether a command should be executed
* on any replica, on all replicas sequentially (in order), or on all
* replicas in parallel.
* replicas in parallel. In other words, EXECUTION_ORDER_ANY is used for
* SELECTs, while EXECUTION_ORDER_SEQUENTIAL and EXECUTION_ORDER_PARALLEL
* are used for DML/DDL.
*/
typedef enum PlacementExecutionOrder
{
@ -1321,7 +1323,8 @@ StartDistributedExecution(DistributedExecution *execution)
/* make sure we are not doing remote execution from within a task */
if (execution->remoteTaskList != NIL)
{
EnsureRemoteTaskExecutionAllowed();
bool isRemote = true;
EnsureTaskExecutionAllowed(isRemote);
}
}
@ -4562,6 +4565,7 @@ ReceiveResults(WorkerSession *session, bool storeRows)
TupleDesc tupleDescriptor = tupleDest->tupleDescForQuery(tupleDest, queryIndex);
if (tupleDescriptor == NULL)
{
PQclear(result);
continue;
}
@ -5294,6 +5298,10 @@ TaskExecutionStateMachine(ShardCommandExecution *shardCommandExecution)
{
currentTaskExecutionState = TASK_EXECUTION_FAILED;
}
else if (executionOrder != EXECUTION_ORDER_ANY && failedPlacementCount > 0)
{
currentTaskExecutionState = TASK_EXECUTION_FAILED;
}
else if (executionOrder == EXECUTION_ORDER_ANY && donePlacementCount > 0)
{
currentTaskExecutionState = TASK_EXECUTION_FINISHED;
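A worked example of the added failure branch (hedged):
/* placements = 2, failedPlacementCount = 1, donePlacementCount = 1:
* SELECT with EXECUTION_ORDER_ANY -> TASK_EXECUTION_FINISHED
* UPDATE with EXECUTION_ORDER_SEQUENTIAL -> TASK_EXECUTION_FAILED */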


@ -40,6 +40,7 @@
#include "nodes/makefuncs.h"
#include "optimizer/optimizer.h"
#include "optimizer/clauses.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/datum.h"
@ -674,7 +675,10 @@ CitusEndScan(CustomScanState *node)
partitionKeyConst = workerJob->partitionKeyValue;
}
/* queryId is not set if pg_stat_statements is not installed */
/*
* queryId is not set if pg_stat_statements is not installed. As of PG14,
* it can also be enabled with: SET compute_query_id TO on;
*/
if (queryId != 0)
{
if (partitionKeyConst != NULL && executorType == MULTI_EXECUTOR_ADAPTIVE)
@ -701,19 +705,7 @@ CitusEndScan(CustomScanState *node)
*/
static void
CitusReScan(CustomScanState *node)
{
CitusScanState *scanState = (CitusScanState *) node;
Job *workerJob = scanState->distributedPlan->workerJob;
EState *executorState = ScanStateGetExecutorState(scanState);
ParamListInfo paramListInfo = executorState->es_param_list_info;
if (paramListInfo != NULL && !workerJob->parametersInJobQueryResolved)
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("Cursors for queries on distributed tables with "
"parameters are currently unsupported")));
}
}
{ }
/*


@ -55,7 +55,6 @@
bool EnableRepartitionedInsertSelect = true;
static Query * WrapSubquery(Query *subquery);
static List * TwoPhaseInsertSelectTaskList(Oid targetRelationId, Query *insertSelectQuery,
char *resultIdPrefix);
static void ExecutePlanIntoRelation(Oid targetRelationId, List *insertTargetList,
@ -299,100 +298,6 @@ NonPushableInsertSelectExecScan(CustomScanState *node)
}
/*
* BuildSelectForInsertSelect extracts the SELECT part from an INSERT...SELECT query.
* If the INSERT...SELECT has CTEs then these are added to the resulting SELECT instead.
*/
Query *
BuildSelectForInsertSelect(Query *insertSelectQuery)
{
RangeTblEntry *selectRte = ExtractSelectRangeTableEntry(insertSelectQuery);
Query *selectQuery = selectRte->subquery;
/*
* Wrap the SELECT as a subquery if the INSERT...SELECT has CTEs or the SELECT
* has top-level set operations.
*
* We could simply wrap all queries, but that might create a subquery that is
* not supported by the logical planner. Since the logical planner also does
* not support CTEs and top-level set operations, we can wrap queries containing
* those without breaking anything.
*/
if (list_length(insertSelectQuery->cteList) > 0)
{
selectQuery = WrapSubquery(selectRte->subquery);
/* copy CTEs from the INSERT ... SELECT statement into outer SELECT */
selectQuery->cteList = copyObject(insertSelectQuery->cteList);
selectQuery->hasModifyingCTE = insertSelectQuery->hasModifyingCTE;
}
else if (selectQuery->setOperations != NULL)
{
/* top-level set operations confuse the ReorderInsertSelectTargetLists logic */
selectQuery = WrapSubquery(selectRte->subquery);
}
return selectQuery;
}
/*
* WrapSubquery wraps the given query as a subquery in a newly constructed
* "SELECT * FROM (...subquery...) citus_insert_select_subquery" query.
*/
static Query *
WrapSubquery(Query *subquery)
{
ParseState *pstate = make_parsestate(NULL);
List *newTargetList = NIL;
Query *outerQuery = makeNode(Query);
outerQuery->commandType = CMD_SELECT;
/* create range table entries */
Alias *selectAlias = makeAlias("citus_insert_select_subquery", NIL);
RangeTblEntry *newRangeTableEntry = RangeTableEntryFromNSItem(
addRangeTableEntryForSubquery(
pstate, subquery,
selectAlias, false, true));
outerQuery->rtable = list_make1(newRangeTableEntry);
/* set the FROM expression to the subquery */
RangeTblRef *newRangeTableRef = makeNode(RangeTblRef);
newRangeTableRef->rtindex = 1;
outerQuery->jointree = makeFromExpr(list_make1(newRangeTableRef), NULL);
/* create a target list that matches the SELECT */
TargetEntry *selectTargetEntry = NULL;
foreach_ptr(selectTargetEntry, subquery->targetList)
{
/* exactly 1 entry in FROM */
int indexInRangeTable = 1;
if (selectTargetEntry->resjunk)
{
continue;
}
Var *newSelectVar = makeVar(indexInRangeTable, selectTargetEntry->resno,
exprType((Node *) selectTargetEntry->expr),
exprTypmod((Node *) selectTargetEntry->expr),
exprCollation((Node *) selectTargetEntry->expr), 0);
TargetEntry *newSelectTargetEntry = makeTargetEntry((Expr *) newSelectVar,
selectTargetEntry->resno,
selectTargetEntry->resname,
selectTargetEntry->resjunk);
newTargetList = lappend(newTargetList, newSelectTargetEntry);
}
outerQuery->targetList = newTargetList;
return outerQuery;
}
/*
* TwoPhaseInsertSelectTaskList generates a list of tasks for a query that
* inserts into a target relation and selects from a set of co-located


@ -45,7 +45,7 @@
#include "utils/syscache.h"
static bool CreatedResultsDirectory = false;
static List *CreatedResultsDirectories = NIL;
/* CopyDestReceiver can be used to stream results into a distributed table */
@ -594,26 +594,28 @@ CreateIntermediateResultsDirectory(void)
{
char *resultDirectory = IntermediateResultsDirectory();
if (!CreatedResultsDirectory)
int makeOK = mkdir(resultDirectory, S_IRWXU);
if (makeOK != 0)
{
int makeOK = mkdir(resultDirectory, S_IRWXU);
if (makeOK != 0)
if (errno == EEXIST)
{
if (errno == EEXIST)
{
/* someone else beat us to it, that's ok */
return resultDirectory;
}
ereport(ERROR, (errcode_for_file_access(),
errmsg("could not create intermediate results directory "
"\"%s\": %m",
resultDirectory)));
/* someone else beat us to it, that's ok */
return resultDirectory;
}
CreatedResultsDirectory = true;
ereport(ERROR, (errcode_for_file_access(),
errmsg("could not create intermediate results directory "
"\"%s\": %m",
resultDirectory)));
}
MemoryContext oldContext = MemoryContextSwitchTo(TopTransactionContext);
CreatedResultsDirectories =
lappend(CreatedResultsDirectories, pstrdup(resultDirectory));
MemoryContextSwitchTo(oldContext);
return resultDirectory;
}
@ -693,13 +695,14 @@ IntermediateResultsDirectory(void)
/*
* RemoveIntermediateResultsDirectory removes the intermediate result directory
* RemoveIntermediateResultsDirectories removes the intermediate result directory
* for the current distributed transaction, if any was created.
*/
void
RemoveIntermediateResultsDirectory(void)
RemoveIntermediateResultsDirectories(void)
{
if (CreatedResultsDirectory)
char *directoryElement = NULL;
foreach_ptr(directoryElement, CreatedResultsDirectories)
{
/*
* The shared directory is renamed before deleting it. Otherwise it
@ -708,7 +711,7 @@ RemoveIntermediateResultsDirectory(void)
* that's not possible. The current PID is included in the new
* filename, so there can be no collisions with other backends.
*/
char *sharedName = IntermediateResultsDirectory();
char *sharedName = directoryElement;
StringInfo privateName = makeStringInfo();
appendStringInfo(privateName, "%s.removed-by-%d", sharedName, MyProcPid);
if (rename(sharedName, privateName->data))
@ -728,9 +731,12 @@ RemoveIntermediateResultsDirectory(void)
{
PathNameDeleteTemporaryDir(privateName->data);
}
CreatedResultsDirectory = false;
}
/* cleanup */
list_free_deep(CreatedResultsDirectories);
CreatedResultsDirectories = NIL;
}


@ -108,26 +108,26 @@
bool EnableLocalExecution = true;
bool LogLocalCommands = false;
int LocalExecutorLevel = 0;
/* global variable that tracks whether the local execution is on a shard */
uint64 LocalExecutorShardId = INVALID_SHARD_ID;
static LocalExecutionStatus CurrentLocalExecutionStatus = LOCAL_EXECUTION_OPTIONAL;
static uint64 ExecuteLocalTaskListInternal(List *taskList,
ParamListInfo paramListInfo,
DistributedPlan *distributedPlan,
TupleDestination *defaultTupleDest,
bool isUtilityCommand);
static void SplitLocalAndRemotePlacements(List *taskPlacementList,
List **localTaskPlacementList,
List **remoteTaskPlacementList);
static uint64 ExecuteLocalTaskPlan(PlannedStmt *taskPlan, char *queryString,
TupleDestination *tupleDest, Task *task,
ParamListInfo paramListInfo);
static uint64 LocallyExecuteTaskPlan(PlannedStmt *taskPlan, char *queryString,
TupleDestination *tupleDest, Task *task,
ParamListInfo paramListInfo);
static uint64 ExecuteTaskPlan(PlannedStmt *taskPlan, char *queryString,
TupleDestination *tupleDest, Task *task,
ParamListInfo paramListInfo);
static void RecordNonDistTableAccessesForTask(Task *task);
static void LogLocalCommand(Task *task);
static uint64 LocallyPlanAndExecuteMultipleQueries(List *queryStrings,
TupleDestination *tupleDest,
Task *task);
static void LocallyExecuteUtilityTask(Task *task);
static void ExecuteUdfTaskQuery(Query *localUdfCommandQuery);
static void EnsureTransitionPossible(LocalExecutionStatus from,
LocalExecutionStatus to);
@ -204,50 +204,7 @@ ExecuteLocalTaskListExtended(List *taskList,
TupleDestination *defaultTupleDest,
bool isUtilityCommand)
{
uint64 totalRowsProcessed = 0;
ParamListInfo paramListInfo = copyParamList(orig_paramListInfo);
/*
* Even if we are executing local tasks, we still enable
* coordinated transaction. This is because
* (a) we might be in a transaction, and the next commands may
* require coordinated transaction
* (b) we might be executing some tasks locally and the others
* via remote execution
*
* Also, there is no harm enabling coordinated transaction even if
* we only deal with local tasks in the transaction.
*/
UseCoordinatedTransaction();
LocalExecutorLevel++;
PG_TRY();
{
totalRowsProcessed = ExecuteLocalTaskListInternal(taskList, paramListInfo,
distributedPlan,
defaultTupleDest,
isUtilityCommand);
}
PG_CATCH();
{
LocalExecutorLevel--;
PG_RE_THROW();
}
PG_END_TRY();
LocalExecutorLevel--;
return totalRowsProcessed;
}
static uint64
ExecuteLocalTaskListInternal(List *taskList,
ParamListInfo paramListInfo,
DistributedPlan *distributedPlan,
TupleDestination *defaultTupleDest,
bool isUtilityCommand)
{
uint64 totalRowsProcessed = 0;
int numParams = 0;
Oid *parameterTypes = NULL;
@ -263,6 +220,12 @@ ExecuteLocalTaskListInternal(List *taskList,
numParams = paramListInfo->numParams;
}
if (taskList != NIL)
{
bool isRemote = false;
EnsureTaskExecutionAllowed(isRemote);
}
/*
* Use a new memory context that gets reset after every task to free
* the deparsed query string and query plan.
@ -304,7 +267,7 @@ ExecuteLocalTaskListInternal(List *taskList,
if (isUtilityCommand)
{
ExecuteUtilityCommand(TaskQueryString(task));
LocallyExecuteUtilityTask(task);
MemoryContextSwitchTo(oldContext);
MemoryContextReset(loopContext);
@ -391,8 +354,8 @@ ExecuteLocalTaskListInternal(List *taskList,
}
totalRowsProcessed +=
ExecuteLocalTaskPlan(localPlan, shardQueryString,
tupleDest, task, paramListInfo);
LocallyExecuteTaskPlan(localPlan, shardQueryString,
tupleDest, task, paramListInfo);
MemoryContextSwitchTo(oldContext);
MemoryContextReset(loopContext);
@ -421,9 +384,9 @@ LocallyPlanAndExecuteMultipleQueries(List *queryStrings, TupleDestination *tuple
ParamListInfo paramListInfo = NULL;
PlannedStmt *localPlan = planner_compat(shardQuery, cursorOptions,
paramListInfo);
totalProcessedRows += ExecuteLocalTaskPlan(localPlan, queryString,
tupleDest, task,
paramListInfo);
totalProcessedRows += LocallyExecuteTaskPlan(localPlan, queryString,
tupleDest, task,
paramListInfo);
}
return totalProcessedRows;
}
@ -444,6 +407,39 @@ ExtractParametersForLocalExecution(ParamListInfo paramListInfo, Oid **parameterT
}
/*
* LocallyExecuteUtilityTask runs a utility command via local execution.
*/
static void
LocallyExecuteUtilityTask(Task *task)
{
/*
* If we roll back to a savepoint, we may no longer be in a query on
* a shard. Reset the value as we go back up the stack.
*/
uint64 prevLocalExecutorShardId = LocalExecutorShardId;
if (task->anchorShardId != INVALID_SHARD_ID)
{
LocalExecutorShardId = task->anchorShardId;
}
PG_TRY();
{
ExecuteUtilityCommand(TaskQueryString(task));
}
PG_CATCH();
{
LocalExecutorShardId = prevLocalExecutorShardId;
PG_RE_THROW();
}
PG_END_TRY();
LocalExecutorShardId = prevLocalExecutorShardId;
}
/*
* ExecuteUtilityCommand executes the given task query in the current
* session.
@ -569,9 +565,8 @@ ExtractLocalAndRemoteTasks(bool readOnly, List *taskList, List **localTaskList,
* At this point, we're dealing with a task that has placements on both
* local and remote nodes.
*/
task->partiallyLocalOrRemote = true;
Task *localTask = copyObject(task);
localTask->partiallyLocalOrRemote = true;
localTask->taskPlacementList = localTaskPlacementList;
*localTaskList = lappend(*localTaskList, localTask);
@ -585,6 +580,7 @@ ExtractLocalAndRemoteTasks(bool readOnly, List *taskList, List **localTaskList,
/* since shard replication factor > 1, we should have at least 1 remote task */
Assert(remoteTaskPlacementList != NIL);
Task *remoteTask = copyObject(task);
remoteTask->partiallyLocalOrRemote = true;
remoteTask->taskPlacementList = remoteTaskPlacementList;
*remoteTaskList = lappend(*remoteTaskList, remoteTask);
@ -630,9 +626,50 @@ SplitLocalAndRemotePlacements(List *taskPlacementList, List **localTaskPlacement
* case of DML.
*/
static uint64
ExecuteLocalTaskPlan(PlannedStmt *taskPlan, char *queryString,
TupleDestination *tupleDest, Task *task,
ParamListInfo paramListInfo)
LocallyExecuteTaskPlan(PlannedStmt *taskPlan, char *queryString,
TupleDestination *tupleDest, Task *task,
ParamListInfo paramListInfo)
{
volatile uint64 processedRows = 0;
/*
* If we roll back to a savepoint, we may no longer be in a query on
* a shard. Reset the value as we go back up the stack.
*/
uint64 prevLocalExecutorShardId = LocalExecutorShardId;
if (task->anchorShardId != INVALID_SHARD_ID)
{
LocalExecutorShardId = task->anchorShardId;
}
PG_TRY();
{
processedRows = ExecuteTaskPlan(taskPlan, queryString, tupleDest, task,
paramListInfo);
}
PG_CATCH();
{
LocalExecutorShardId = prevLocalExecutorShardId;
PG_RE_THROW();
}
PG_END_TRY();
LocalExecutorShardId = prevLocalExecutorShardId;
return processedRows;
}
/*
* ExecuteTaskPlan executes the given planned statement and writes the results
* to tupleDest.
*/
static uint64
ExecuteTaskPlan(PlannedStmt *taskPlan, char *queryString,
TupleDestination *tupleDest, Task *task,
ParamListInfo paramListInfo)
{
ScanDirection scanDirection = ForwardScanDirection;
QueryEnvironment *queryEnv = create_queryEnv();
@ -642,7 +679,7 @@ ExecuteLocalTaskPlan(PlannedStmt *taskPlan, char *queryString,
RecordNonDistTableAccessesForTask(task);
MemoryContext localContext = AllocSetContextCreate(CurrentMemoryContext,
"ExecuteLocalTaskPlan",
"ExecuteTaskPlan",
ALLOCSET_DEFAULT_SIZES);
MemoryContext oldContext = MemoryContextSwitchTo(localContext);


@ -18,6 +18,7 @@
#include "catalog/dependency.h"
#include "catalog/pg_class.h"
#include "catalog/namespace.h"
#include "distributed/backend_data.h"
#include "distributed/citus_custom_scan.h"
#include "distributed/commands/multi_copy.h"
#include "distributed/commands/utility_hook.h"
@ -50,6 +51,7 @@
#include "tcop/dest.h"
#include "tcop/pquery.h"
#include "tcop/utility.h"
#include "utils/fmgrprotos.h"
#include "utils/snapmgr.h"
#include "utils/memutils.h"
@ -62,6 +64,12 @@ int MultiShardConnectionType = PARALLEL_CONNECTION;
bool WritableStandbyCoordinator = false;
bool AllowModificationsFromWorkersToReplicatedTables = true;
/*
* Setting that controls whether distributed queries should be
* allowed within a task execution.
*/
bool AllowNestedDistributedExecution = false;
/*
* Pointer to bound parameters of the current ongoing call to ExecutorRun.
* If executor is not running, then this value is meaningless.
@ -87,6 +95,11 @@ static bool AlterTableConstraintCheck(QueryDesc *queryDesc);
static List * FindCitusCustomScanStates(PlanState *planState);
static bool CitusCustomScanStateWalker(PlanState *planState,
List **citusCustomScanStates);
static bool IsTaskExecutionAllowed(bool isRemote);
static bool InLocalTaskExecutionOnShard(void);
static bool MaybeInRemoteTaskExecution(void);
static bool InTrigger(void);
/*
* CitusExecutorStart is the ExecutorStart_hook that gets called when
@ -763,6 +776,11 @@ GetObjectTypeString(ObjectType objType)
return "database";
}
case OBJECT_DOMAIN:
{
return "domain";
}
case OBJECT_EXTENSION:
{
return "extension";
@ -798,6 +816,11 @@ GetObjectTypeString(ObjectType objType)
return "type";
}
case OBJECT_VIEW:
{
return "view";
}
default:
{
ereport(DEBUG1, (errmsg("unsupported object type"),
@ -860,43 +883,146 @@ ExecutorBoundParams(void)
/*
* EnsureRemoteTaskExecutionAllowed ensures that we do not perform remote
* EnsureTaskExecutionAllowed ensures that we do not perform remote
* execution from within a task. That could happen when the user calls
* a function in a query that gets pushed down to the worker, and the
* function performs a query on a distributed table.
*/
void
EnsureRemoteTaskExecutionAllowed(void)
EnsureTaskExecutionAllowed(bool isRemote)
{
if (!InTaskExecution())
if (IsTaskExecutionAllowed(isRemote))
{
/* we are not within a task, distributed execution is allowed */
return;
}
ereport(ERROR, (errmsg("cannot execute a distributed query from a query on a "
"shard")));
"shard"),
errdetail("Executing a distributed query in a function call that "
"may be pushed to a remote node can lead to incorrect "
"results."),
errhint("Avoid nesting of distributed queries or use alter user "
"current_user set citus.allow_nested_distributed_execution "
"to on to allow it with possible incorrectness.")));
}
/*
* InTaskExecution determines whether we are currently in a task execution.
* IsTaskExecutionAllowed determines whether task execution is currently allowed.
* In general, nested distributed execution is not allowed, except in a few cases
* (forced function call delegation, triggers).
*
* We distinguish between local and remote tasks because triggers only disallow
* remote task execution.
*/
bool
InTaskExecution(void)
static bool
IsTaskExecutionAllowed(bool isRemote)
{
if (LocalExecutorLevel > 0)
if (AllowNestedDistributedExecution)
{
/* in a local task */
/* user explicitly allows nested execution */
return true;
}
/*
* Normally, any query execution within a citus-initiated backend
* is considered a task execution, but an exception is when we
* are in a delegated function/procedure call.
*/
return IsCitusInternalBackend() &&
!InTopLevelDelegatedFunctionCall &&
!InDelegatedProcedureCall;
if (!isRemote)
{
if (AllowedDistributionColumnValue.isActive)
{
/*
* When we are in a forced delegated function call, we explicitly check
* whether local tasks use the same distribution column value in
* EnsureForceDelegationDistributionKey.
*/
return true;
}
if (InTrigger())
{
/*
* In triggers on shards we only disallow remote tasks. This has a few
* reasons:
*
* - We want to enable access to co-located shards, but do not have additional
* checks yet.
* - Users need to explicitly set enable_unsafe_triggers in order to create
* triggers on distributed tables.
* - Triggers on Citus local tables should be able to access other Citus local
* tables.
*/
return true;
}
}
return !InLocalTaskExecutionOnShard() && !MaybeInRemoteTaskExecution();
}
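The escape hatch named in the error hint can be sketched as follows; as the hint warns, correctness of nested results then becomes the user's responsibility:
/* SQL run by the user (sketch):
* ALTER USER current_user
* SET citus.allow_nested_distributed_execution TO on;
* afterwards the AllowNestedDistributedExecution branch above returns
* true unconditionally */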
/*
* InLocalTaskExecutionOnShard returns whether we are currently in the local executor
* and it is working on a shard of a distributed table.
*
* In general, we can allow distributed queries inside of local executor, because
* we can correctly assign tasks to connections. However, we preemptively protect
* against distributed queries inside of queries on shards of a distributed table,
* because those might start failing after a shard move.
*/
static bool
InLocalTaskExecutionOnShard(void)
{
if (LocalExecutorShardId == INVALID_SHARD_ID)
{
/* local executor is not active or is processing a task without shards */
return false;
}
if (!DistributedTableShardId(LocalExecutorShardId))
{
/*
* Local executor is processing a query on a shard, but the shard belongs
* to a reference table or Citus local table. We do not expect those to
* move.
*/
return false;
}
return true;
}
/*
* MaybeInRemoteTaskExecution returns whether we could be in a remote task execution.
*
* We consider anything that happens in a Citus-internal backend, except
* delegated function or procedure calls, as a potential task execution.
*
* This function will also return true in other scenarios, such as during metadata
* syncing. However, since this function is mainly used for restricting (dangerous)
* nested executions, it is good to be pessimistic.
*/
static bool
MaybeInRemoteTaskExecution(void)
{
if (!IsCitusInternalBackend())
{
/* in a regular, client-initiated backend doing a regular task */
return false;
}
if (InTopLevelDelegatedFunctionCall || InDelegatedProcedureCall)
{
/* in a citus-initiated backend, but in a delegated function or procedure call */
return false;
}
return true;
}
/*
* InTrigger returns whether the execution is currently in a trigger.
*/
static bool
InTrigger(void)
{
return DatumGetInt32(pg_trigger_depth(NULL)) > 0;
}

File diff suppressed because it is too large


@ -10,6 +10,7 @@
#include "postgres.h"
#include "distributed/commands.h"
#include "distributed/pg_version_constants.h"
#include "access/genam.h"
@ -20,8 +21,13 @@
#include "catalog/catalog.h"
#include "catalog/dependency.h"
#include "catalog/indexing.h"
#include "catalog/pg_auth_members.h"
#include "catalog/pg_authid_d.h"
#include "catalog/pg_class.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_depend.h"
#include "catalog/pg_extension_d.h"
#include "catalog/pg_foreign_data_wrapper_d.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_proc_d.h"
#include "catalog/pg_rewrite.h"
@ -43,6 +49,7 @@
#include "utils/fmgroids.h"
#include "utils/hsearch.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
/*
* ObjectAddressCollector keeps track of collected ObjectAddresses. This can be used
@ -127,6 +134,7 @@ static List * GetRelationTriggerFunctionDependencyList(Oid relationId);
static List * GetRelationStatsSchemaDependencyList(Oid relationId);
static List * GetRelationIndicesDependencyList(Oid relationId);
static DependencyDefinition * CreateObjectAddressDependencyDef(Oid classId, Oid objectId);
static List * GetTypeConstraintDependencyDefinition(Oid typeId);
static List * CreateObjectAddressDependencyDefList(Oid classId, List *objectIdList);
static ObjectAddress DependencyDefinitionObjectAddress(DependencyDefinition *definition);
@ -162,10 +170,12 @@ static bool FollowAllDependencies(ObjectAddressCollector *collector,
DependencyDefinition *definition);
static void ApplyAddToDependencyList(ObjectAddressCollector *collector,
DependencyDefinition *definition);
static List * GetViewRuleReferenceDependencyList(Oid relationId);
static List * ExpandCitusSupportedTypes(ObjectAddressCollector *collector,
ObjectAddress target);
static List * GetDependentRoleIdsFDW(Oid FDWOid);
static List * ExpandRolesToGroups(Oid roleid);
static ViewDependencyNode * BuildViewDependencyGraph(Oid relationId, HTAB *nodeMap);
static Oid GetDependingView(Form_pg_depend pg_depend);
/*
@ -422,7 +432,7 @@ DependencyDefinitionFromPgDepend(ObjectAddress target)
/*
* DependencyDefinitionFromPgDepend loads all pg_shdepend records describing the
* DependencyDefinitionFromPgShDepend loads all pg_shdepend records describing the
* dependencies of target.
*/
static List *
@ -630,6 +640,15 @@ SupportedDependencyByCitus(const ObjectAddress *address)
return IsObjectAddressOwnedByExtension(address, NULL);
}
case OCLASS_CONSTRAINT:
{
/*
* Constraints are only supported when on domain types. Other constraints have
* their typid set to InvalidOid.
*/
return OidIsValid(get_constraint_typid(address->objectId));
}
case OCLASS_COLLATION:
{
return true;
@ -658,16 +677,13 @@ SupportedDependencyByCitus(const ObjectAddress *address)
case OCLASS_ROLE:
{
/*
* Community only supports the extension owner as a distributed object to
* propagate alter statements for this user
*/
if (address->objectId == CitusExtensionOwner())
/* if it is a reserved role do not propagate */
if (IsReservedName(GetUserNameFromId(address->objectId, false)))
{
return true;
return false;
}
return false;
return true;
}
case OCLASS_EXTENSION:
@ -691,6 +707,7 @@ SupportedDependencyByCitus(const ObjectAddress *address)
{
case TYPTYPE_ENUM:
case TYPTYPE_COMPOSITE:
case TYPTYPE_DOMAIN:
{
return true;
}
@ -734,7 +751,8 @@ SupportedDependencyByCitus(const ObjectAddress *address)
relKind == RELKIND_FOREIGN_TABLE ||
relKind == RELKIND_SEQUENCE ||
relKind == RELKIND_INDEX ||
relKind == RELKIND_PARTITIONED_INDEX)
relKind == RELKIND_PARTITIONED_INDEX ||
relKind == RELKIND_VIEW)
{
return true;
}
@ -751,6 +769,58 @@ SupportedDependencyByCitus(const ObjectAddress *address)
}
/*
* ErrorOrWarnIfObjectHasUnsupportedDependency returns false, without throwing
* any message, if the object doesn't have any unsupported dependency; otherwise
* it throws a message at the proper level (unless the cluster has no nodes) and returns true.
*/
bool
ErrorOrWarnIfObjectHasUnsupportedDependency(ObjectAddress *objectAddress)
{
DeferredErrorMessage *errMsg = DeferErrorIfHasUnsupportedDependency(objectAddress);
if (errMsg != NULL)
{
/*
* No need to give any message if there are no worker nodes in
* the cluster, as the user's experience on a single node won't be affected
* even if the object isn't distributed.
*/
if (!HasAnyNodes())
{
return true;
}
/*
* Since Citus drops and recreates some objects while converting a table type,
* giving a DEBUG1 message is enough if the process is in a table type
* conversion function call.
*/
if (InTableTypeConversionFunctionCall)
{
RaiseDeferredError(errMsg, DEBUG1);
}
/*
* If the object is distributed, we should raise an error so that we don't
* end up with different definitions of the object on the coordinator and
* worker nodes. If the object is not distributed yet, we can create it
* locally without affecting the user's local usage experience.
*/
else if (IsObjectDistributed(objectAddress))
{
RaiseDeferredError(errMsg, ERROR);
}
else
{
RaiseDeferredError(errMsg, WARNING);
}
return true;
}
return false;
}
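The branches above encode a small severity ladder; restated compactly (elevel names as in PostgreSQL's elog.h, the helper itself hypothetical):

/* hypothetical restatement of the severity ladder above */
static int
UnsupportedDependencyMessageLevel(bool inTableTypeConversion,
								  bool objectIsDistributed)
{
	if (inTableTypeConversion)
	{
		return DEBUG1;   /* objects are dropped and recreated during conversion */
	}
	if (objectIsDistributed)
	{
		return ERROR;    /* avoid divergent definitions across nodes */
	}
	return WARNING;      /* the object will only be created locally */
}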
/*
* DeferErrorIfHasUnsupportedDependency returns a deferred error message if the given
* object has any undistributable dependency.
@ -788,8 +858,11 @@ DeferErrorIfHasUnsupportedDependency(const ObjectAddress *objectAddress)
* Otherwise, callers are expected to throw the error returned from this
* function as a hard one by ignoring the detail part.
*/
appendStringInfo(detailInfo, "\"%s\" will be created only locally",
objectDescription);
if (!IsObjectDistributed(objectAddress))
{
appendStringInfo(detailInfo, "\"%s\" will be created only locally",
objectDescription);
}
if (SupportedDependencyByCitus(undistributableDependency))
{
@ -800,9 +873,19 @@ DeferErrorIfHasUnsupportedDependency(const ObjectAddress *objectAddress)
objectDescription,
dependencyDescription);
appendStringInfo(hintInfo, "Distribute \"%s\" first to distribute \"%s\"",
dependencyDescription,
objectDescription);
if (IsObjectDistributed(objectAddress))
{
appendStringInfo(hintInfo,
"Distribute \"%s\" first to modify \"%s\" on worker nodes",
dependencyDescription,
objectDescription);
}
else
{
appendStringInfo(hintInfo, "Distribute \"%s\" first to distribute \"%s\"",
dependencyDescription,
objectDescription);
}
return DeferredError(ERRCODE_FEATURE_NOT_SUPPORTED,
errorInfo->data, detailInfo->data, hintInfo->data);
@ -880,7 +963,9 @@ GetUndistributableDependency(const ObjectAddress *objectAddress)
{
char relKind = get_rel_relkind(dependency->objectId);
if (relKind == RELKIND_SEQUENCE || relKind == RELKIND_COMPOSITE_TYPE)
if (relKind == RELKIND_SEQUENCE ||
relKind == RELKIND_COMPOSITE_TYPE ||
relKind == RELKIND_VIEW)
{
/* citus knows how to auto-distribute these dependencies */
continue;
@ -1194,19 +1279,74 @@ ExpandCitusSupportedTypes(ObjectAddressCollector *collector, ObjectAddress targe
switch (target.classId)
{
case TypeRelationId:
case AuthIdRelationId:
{
/*
* types depending on other types are not captured in pg_depend, instead they
* are described with their dependencies by the relation that describes the
* composite type.
* Roles are members of other roles. These relations are not recorded directly
* but can be deduced from pg_auth_members
*/
if (get_typtype(target.objectId) == TYPTYPE_COMPOSITE)
return ExpandRolesToGroups(target.objectId);
}
case ExtensionRelationId:
{
/*
* FDWs get propagated along with the extensions they belong to.
* In case there are GRANTed privileges on FDWs to roles, those
* GRANT statements will be propagated too. In order to make sure
* that those GRANT statements work, the privileged roles should
* exist on the worker nodes. Hence, here we find these dependent
* roles and add them as dependencies.
*/
Oid extensionId = target.objectId;
List *FDWOids = GetDependentFDWsToExtension(extensionId);
Oid FDWOid = InvalidOid;
foreach_oid(FDWOid, FDWOids)
{
Oid typeRelationId = get_typ_typrelid(target.objectId);
DependencyDefinition *dependency =
CreateObjectAddressDependencyDef(RelationRelationId, typeRelationId);
result = lappend(result, dependency);
List *dependentRoleIds = GetDependentRoleIdsFDW(FDWOid);
List *dependencies =
CreateObjectAddressDependencyDefList(AuthIdRelationId,
dependentRoleIds);
result = list_concat(result, dependencies);
}
break;
}
case TypeRelationId:
{
switch (get_typtype(target.objectId))
{
/*
* types depending on other types are not captured in pg_depend, instead
* they are described with their dependencies by the relation that
* describes the composite type.
*/
case TYPTYPE_COMPOSITE:
{
Oid typeRelationId = get_typ_typrelid(target.objectId);
DependencyDefinition *dependency =
CreateObjectAddressDependencyDef(RelationRelationId,
typeRelationId);
result = lappend(result, dependency);
break;
}
/*
* Domains can have constraints associated with them. Constraints themselves
* can depend on things like functions. To support the propagation of
* these functions we will add the constraints to the list of objects to
* be created.
*/
case TYPTYPE_DOMAIN:
{
List *dependencies =
GetTypeConstraintDependencyDefinition(target.objectId);
result = list_concat(result, dependencies);
break;
}
}
/*
@ -1275,9 +1415,26 @@ ExpandCitusSupportedTypes(ObjectAddressCollector *collector, ObjectAddress targe
* create all objects required by the indices before we create the table
* including indices.
*/
List *indexDependencyList = GetRelationIndicesDependencyList(relationId);
result = list_concat(result, indexDependencyList);
/*
* Get the dependencies of the rule for the given view. PG keeps internal
* dependency between view and rule. As stated in the PG docs, if
* there is an internal dependency, dependencies of the dependent object
* behave much like they were dependencies of the referenced object.
*
* We need to expand dependencies by including dependencies of the rule
* internally dependent on the view. PG doesn't keep any dependencies
* from the view to any object, but it keeps an internal dependency to the
* rule, and that rule has dependencies on other objects.
*/
char relKind = get_rel_relkind(relationId);
if (relKind == RELKIND_VIEW || relKind == RELKIND_MATVIEW)
{
List *ruleRefDepList = GetViewRuleReferenceDependencyList(relationId);
result = list_concat(result, ruleRefDepList);
}
}
default:
@ -1290,6 +1447,131 @@ ExpandCitusSupportedTypes(ObjectAddressCollector *collector, ObjectAddress targe
}
/*
* GetDependentRoleIdsFDW returns a list of role oids that has privileges on the
* FDW with the given object id.
*/
static List *
GetDependentRoleIdsFDW(Oid FDWOid)
{
List *roleIds = NIL;
Acl *aclEntry = GetPrivilegesForFDW(FDWOid);
if (aclEntry == NULL)
{
return NIL;
}
AclItem *privileges = ACL_DAT(aclEntry);
int numberOfPrivsGranted = ACL_NUM(aclEntry);
for (int i = 0; i < numberOfPrivsGranted; i++)
{
roleIds = lappend_oid(roleIds, privileges[i].ai_grantee);
}
return roleIds;
}
/*
* ExpandRolesToGroups returns a list of object addresses pointing to roles that roleid
* depends on.
*/
static List *
ExpandRolesToGroups(Oid roleid)
{
Relation pgAuthMembers = table_open(AuthMemRelationId, AccessShareLock);
HeapTuple tuple = NULL;
ScanKeyData scanKey[1];
const int scanKeyCount = 1;
/* scan pg_auth_members for member = $1 via index pg_auth_members_member_role_index */
ScanKeyInit(&scanKey[0], Anum_pg_auth_members_member, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(roleid));
SysScanDesc scanDescriptor = systable_beginscan(pgAuthMembers, AuthMemMemRoleIndexId,
true, NULL, scanKeyCount, scanKey);
List *roles = NIL;
while ((tuple = systable_getnext(scanDescriptor)) != NULL)
{
Form_pg_auth_members membership = (Form_pg_auth_members) GETSTRUCT(tuple);
DependencyDefinition *definition = palloc0(sizeof(DependencyDefinition));
definition->mode = DependencyObjectAddress;
ObjectAddressSet(definition->data.address, AuthIdRelationId, membership->roleid);
roles = lappend(roles, definition);
}
systable_endscan(scanDescriptor);
table_close(pgAuthMembers, AccessShareLock);
return roles;
}
/*
* GetViewRuleReferenceDependencyList returns the dependencies of the view's
* internal rule dependencies.
*/
static List *
GetViewRuleReferenceDependencyList(Oid viewId)
{
List *dependencyTupleList = GetPgDependTuplesForDependingObjects(RelationRelationId,
viewId);
List *nonInternalDependenciesOfDependingRules = NIL;
HeapTuple depTup = NULL;
foreach_ptr(depTup, dependencyTupleList)
{
Form_pg_depend pg_depend = (Form_pg_depend) GETSTRUCT(depTup);
/*
* Dependencies of the internal rule dependency should be handled as the dependency
* of referenced view object.
*
* PG doesn't keep dependency relation between views and dependent objects directly
* but it keeps an internal dependency relation between the view and the rule, then
* keeps the dependent objects of the view as non-internal dependencies of the
* internally dependent rule object.
*/
if (pg_depend->deptype == DEPENDENCY_INTERNAL && pg_depend->classid ==
RewriteRelationId)
{
ObjectAddress ruleAddress = { 0 };
ObjectAddressSet(ruleAddress, RewriteRelationId, pg_depend->objid);
/* expand the results with the non-internal dependencies of the rule */
List *ruleDependencies = DependencyDefinitionFromPgDepend(ruleAddress);
DependencyDefinition *dependencyDef = NULL;
foreach_ptr(dependencyDef, ruleDependencies)
{
/*
* Follow all dependencies of the internally dependent rule, except
* internal dependencies and the dependency on the view itself.
*/
if (dependencyDef->data.pg_depend.deptype == DEPENDENCY_INTERNAL ||
(dependencyDef->data.pg_depend.refclassid == RelationRelationId &&
dependencyDef->data.pg_depend.refobjid == viewId))
{
continue;
}
nonInternalDependenciesOfDependingRules =
lappend(nonInternalDependenciesOfDependingRules, dependencyDef);
}
}
}
return nonInternalDependenciesOfDependingRules;
}
/*
* GetRelationSequenceDependencyList returns the sequence dependency definition
* list for the given relation.
@ -1297,12 +1579,18 @@ ExpandCitusSupportedTypes(ObjectAddressCollector *collector, ObjectAddress targe
static List *
GetRelationSequenceDependencyList(Oid relationId)
{
List *attnumList = NIL;
List *dependentSequenceList = NIL;
List *seqInfoList = NIL;
GetDependentSequencesWithRelation(relationId, &seqInfoList, 0);
List *seqIdList = NIL;
SequenceInfo *seqInfo = NULL;
foreach_ptr(seqInfo, seqInfoList)
{
seqIdList = lappend_oid(seqIdList, seqInfo->sequenceOid);
}
GetDependentSequencesWithRelation(relationId, &attnumList, &dependentSequenceList, 0);
List *sequenceDependencyDefList =
CreateObjectAddressDependencyDefList(RelationRelationId, dependentSequenceList);
CreateObjectAddressDependencyDefList(RelationRelationId, seqIdList);
return sequenceDependencyDefList;
}
@ -1381,6 +1669,49 @@ GetRelationTriggerFunctionDependencyList(Oid relationId)
}
/*
* GetTypeConstraintDependencyDefinition creates a list of constraint dependency
* definitions for a given type
*/
static List *
GetTypeConstraintDependencyDefinition(Oid typeId)
{
/* look up all constraints on the type so they can be added as dependencies */
Relation conRel = table_open(ConstraintRelationId, AccessShareLock);
/* Look for CHECK Constraints on this domain */
ScanKeyData key[1];
ScanKeyInit(&key[0],
Anum_pg_constraint_contypid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(typeId));
SysScanDesc scan = systable_beginscan(conRel, ConstraintTypidIndexId, true, NULL, 1,
key);
List *dependencies = NIL;
HeapTuple conTup = NULL;
while (HeapTupleIsValid(conTup = systable_getnext(scan)))
{
Form_pg_constraint c = (Form_pg_constraint) GETSTRUCT(conTup);
if (c->contype != CONSTRAINT_CHECK)
{
/* Ignore non-CHECK constraints, shouldn't be any */
continue;
}
dependencies = lappend(dependencies, CreateObjectAddressDependencyDef(
ConstraintRelationId, c->oid));
}
systable_endscan(scan);
table_close(conRel, NoLock);
return dependencies;
}
/*
* CreateObjectAddressDependencyDef returns DependencyDefinition object that
* stores the ObjectAddress for the database object identified by classId and
@ -1566,6 +1897,23 @@ GetDependingViews(Oid relationId)
ViewDependencyNode *dependingNode = NULL;
foreach_ptr(dependingNode, node->dependingNodes)
{
ObjectAddress relationAddress = { 0 };
ObjectAddressSet(relationAddress, RelationRelationId, dependingNode->id);
/*
* This function does not catch views with circular dependencies,
* because of the remaining dependency count check below.
* Here we check if the view has a circular dependency or not.
* If yes, we error out with a message that tells the user that
* Citus does not handle circular dependencies.
*/
DeferredErrorMessage *depError =
DeferErrorIfCircularDependencyExists(&relationAddress);
if (depError != NULL)
{
RaiseDeferredError(depError, ERROR);
}
dependingNode->remainingDependencyCount--;
if (dependingNode->remainingDependencyCount == 0)
{
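The remainingDependencyCount bookkeeping above is a Kahn-style topological pass: a view is emitted only once all of its dependencies have been processed, so a view on a dependency cycle never reaches a count of zero, which is exactly why the explicit circular-dependency check is needed. A self-contained illustration of that property (plain C, not Citus code):

#include <stdbool.h>
#include <stdio.h>

#define NODE_COUNT 3

/* edge[i][j] means node i must be processed before node j */
static bool
AllNodesReachZero(bool edge[NODE_COUNT][NODE_COUNT])
{
	int remaining[NODE_COUNT] = { 0 };
	bool done[NODE_COUNT] = { false };

	for (int j = 0; j < NODE_COUNT; j++)
	{
		for (int i = 0; i < NODE_COUNT; i++)
		{
			remaining[j] += edge[i][j] ? 1 : 0;
		}
	}

	bool progress = true;
	while (progress)
	{
		progress = false;
		for (int i = 0; i < NODE_COUNT; i++)
		{
			if (done[i] || remaining[i] != 0)
			{
				continue;
			}
			done[i] = true;
			progress = true;
			for (int j = 0; j < NODE_COUNT; j++)
			{
				if (edge[i][j])
				{
					remaining[j]--; /* the remainingDependencyCount-- step */
				}
			}
		}
	}

	for (int i = 0; i < NODE_COUNT; i++)
	{
		if (!done[i])
		{
			return false; /* sits on a cycle: its count never reaches zero */
		}
	}
	return true;
}

int
main(void)
{
	/* v0 -> v1 -> v2 -> v0 is a cycle, so the pass cannot finish */
	bool cyclic[NODE_COUNT][NODE_COUNT] = {
		{ false, true, false },
		{ false, false, true },
		{ true, false, false },
	};
	printf("%d\n", AllNodesReachZero(cyclic)); /* prints 0 */
	return 0;
}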


@ -28,8 +28,11 @@
#include "catalog/pg_type.h"
#include "citus_version.h"
#include "commands/extension.h"
#include "distributed/listutils.h"
#include "distributed/colocation_utils.h"
#include "distributed/commands.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/metadata/dependency.h"
#include "distributed/metadata/distobject.h"
#include "distributed/metadata/pg_dist_object.h"
#include "distributed/metadata_cache.h"
@ -42,11 +45,11 @@
#include "parser/parse_type.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/regproc.h"
#include "utils/rel.h"
static void MarkObjectDistributedLocally(const ObjectAddress *distAddress);
static char * CreatePgDistObjectEntryCommand(const ObjectAddress *objectAddress);
static int ExecuteCommandAsSuperuser(char *query, int paramCount, Oid *paramTypes,
Datum *paramValues);
@ -195,7 +198,7 @@ MarkObjectDistributedViaSuperUser(const ObjectAddress *distAddress)
* This function should never be called on its own; call MarkObjectDistributed()
* or MarkObjectDistributedViaSuperUser() instead.
*/
static void
void
MarkObjectDistributedLocally(const ObjectAddress *distAddress)
{
int paramCount = 3;
@ -221,6 +224,52 @@ MarkObjectDistributedLocally(const ObjectAddress *distAddress)
}
/*
* ShouldMarkRelationDistributed is a helper function that
* decides whether the input relation should be marked as distributed.
*/
bool
ShouldMarkRelationDistributed(Oid relationId)
{
if (!EnableMetadataSync)
{
/*
* Just in case anything goes wrong, we should still be able
* to continue with the version upgrade.
*/
return false;
}
ObjectAddress relationAddress = { 0 };
ObjectAddressSet(relationAddress, RelationRelationId, relationId);
bool pgObject = (relationId < FirstNormalObjectId);
bool isObjectSupported = SupportedDependencyByCitus(&relationAddress);
bool ownedByExtension = IsTableOwnedByExtension(relationId);
bool alreadyDistributed = IsObjectDistributed(&relationAddress);
bool hasUnsupportedDependency =
DeferErrorIfHasUnsupportedDependency(&relationAddress) != NULL;
bool hasCircularDependency =
DeferErrorIfCircularDependencyExists(&relationAddress) != NULL;
/*
* pgObject: Citus never marks pg objects as distributed
* isObjectSupported: Citus does not support propagation of some objects
* ownedByExtension: let extensions manage their own objects
* alreadyDistributed: most likely via earlier versions
* hasUnsupportedDependency: Citus doesn't know how to distribute its dependencies
* hasCircularDependency: Citus cannot handle circular dependencies
*/
if (pgObject || !isObjectSupported || ownedByExtension || alreadyDistributed ||
hasUnsupportedDependency || hasCircularDependency)
{
return false;
}
return true;
}
/*
* CreatePgDistObjectEntryCommand creates command to insert pg_dist_object tuple
* for the given object address.
@ -472,3 +521,82 @@ UpdateDistributedObjectColocationId(uint32 oldColocationId,
table_close(pgDistObjectRel, NoLock);
CommandCounterIncrement();
}
/*
* DistributedFunctionList returns the list of ObjectAddress-es of all the
* distributed functions found in pg_dist_object
*/
List *
DistributedFunctionList(void)
{
List *distributedFunctionList = NIL;
ScanKeyData key[1];
Relation pgDistObjectRel = table_open(DistObjectRelationId(), AccessShareLock);
/* scan pg_dist_object for classid = ProcedureRelationId via index */
ScanKeyInit(&key[0], Anum_pg_dist_object_classid, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(ProcedureRelationId));
SysScanDesc pgDistObjectScan = systable_beginscan(pgDistObjectRel,
DistObjectPrimaryKeyIndexId(),
true, NULL, 1, key);
HeapTuple pgDistObjectTup = NULL;
while (HeapTupleIsValid(pgDistObjectTup = systable_getnext(pgDistObjectScan)))
{
Form_pg_dist_object pg_dist_object =
(Form_pg_dist_object) GETSTRUCT(pgDistObjectTup);
ObjectAddress *functionAddress = palloc0(sizeof(ObjectAddress));
functionAddress->classId = ProcedureRelationId;
functionAddress->objectId = pg_dist_object->objid;
functionAddress->objectSubId = pg_dist_object->objsubid;
distributedFunctionList = lappend(distributedFunctionList, functionAddress);
}
systable_endscan(pgDistObjectScan);
relation_close(pgDistObjectRel, AccessShareLock);
return distributedFunctionList;
}
/*
* DistributedSequenceList returns the list of ObjectAddress-es of all the
* distributed sequences found in pg_dist_object
*/
List *
DistributedSequenceList(void)
{
List *distributedSequenceList = NIL;
ScanKeyData key[1];
Relation pgDistObjectRel = table_open(DistObjectRelationId(), AccessShareLock);
/* scan pg_dist_object for classid = RelationRelationId via index */
ScanKeyInit(&key[0], Anum_pg_dist_object_classid, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(RelationRelationId));
SysScanDesc pgDistObjectScan = systable_beginscan(pgDistObjectRel,
DistObjectPrimaryKeyIndexId(),
true, NULL, 1, key);
HeapTuple pgDistObjectTup = NULL;
while (HeapTupleIsValid(pgDistObjectTup = systable_getnext(pgDistObjectScan)))
{
Form_pg_dist_object pg_dist_object =
(Form_pg_dist_object) GETSTRUCT(pgDistObjectTup);
if (get_rel_relkind(pg_dist_object->objid) == RELKIND_SEQUENCE)
{
ObjectAddress *sequenceAddress = palloc0(sizeof(ObjectAddress));
sequenceAddress->classId = RelationRelationId;
sequenceAddress->objectId = pg_dist_object->objid;
sequenceAddress->objectSubId = pg_dist_object->objsubid;
distributedSequenceList = lappend(distributedSequenceList, sequenceAddress);
}
}
systable_endscan(pgDistObjectScan);
relation_close(pgDistObjectRel, AccessShareLock);
return distributedSequenceList;
}


@ -20,6 +20,7 @@
#include "access/nbtree.h"
#include "access/xact.h"
#include "access/sysattr.h"
#include "catalog/index.h"
#include "catalog/indexing.h"
#include "catalog/pg_am.h"
#include "catalog/pg_collation.h"
@ -37,6 +38,7 @@
#include "distributed/citus_ruleutils.h"
#include "distributed/multi_executor.h"
#include "distributed/function_utils.h"
#include "distributed/listutils.h"
#include "distributed/foreign_key_relationship.h"
#include "distributed/listutils.h"
#include "distributed/metadata_utility.h"
@ -62,11 +64,13 @@
#include "parser/parse_func.h"
#include "parser/parse_type.h"
#include "storage/lmgr.h"
#include "utils/array.h"
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/datum.h"
#include "utils/elog.h"
#include "utils/hsearch.h"
#include "utils/jsonb.h"
#if PG_VERSION_NUM >= PG_VERSION_13
#include "common/hashfn.h"
#endif
@ -149,6 +153,7 @@ typedef struct MetadataCacheData
Oid distShardShardidIndexId;
Oid distPlacementShardidIndexId;
Oid distPlacementPlacementidIndexId;
Oid distColocationidIndexId;
Oid distPlacementGroupidIndexId;
Oid distTransactionRelationId;
Oid distTransactionGroupIndexId;
@ -160,6 +165,7 @@ typedef struct MetadataCacheData
Oid workerHashFunctionId;
Oid anyValueFunctionId;
Oid textSendAsJsonbFunctionId;
Oid textoutFunctionId;
Oid extensionOwner;
Oid binaryCopyFormatId;
Oid textCopyFormatId;
@ -167,6 +173,10 @@ typedef struct MetadataCacheData
Oid secondaryNodeRoleId;
Oid pgTableIsVisibleFuncId;
Oid citusTableIsVisibleFuncId;
Oid distAuthinfoRelationId;
Oid distAuthinfoIndexId;
Oid distPoolinfoRelationId;
Oid distPoolinfoIndexId;
Oid relationIsAKnownShardFuncId;
Oid jsonbExtractPathFuncId;
Oid jsonbExtractPathTextFuncId;
@ -229,6 +239,7 @@ static void InitializeWorkerNodeCache(void);
static void RegisterForeignKeyGraphCacheCallbacks(void);
static void RegisterWorkerNodeCacheCallbacks(void);
static void RegisterLocalGroupIdCacheCallbacks(void);
static void RegisterAuthinfoCacheCallbacks(void);
static void RegisterCitusTableCacheEntryReleaseCallbacks(void);
static uint32 WorkerNodeHashCode(const void *key, Size keySize);
static void ResetCitusTableCacheEntry(CitusTableCacheEntry *cacheEntry);
@ -240,6 +251,7 @@ static void InvalidateForeignRelationGraphCacheCallback(Datum argument, Oid rela
static void InvalidateDistRelationCacheCallback(Datum argument, Oid relationId);
static void InvalidateNodeRelationCacheCallback(Datum argument, Oid relationId);
static void InvalidateLocalGroupIdRelationCacheCallback(Datum argument, Oid relationId);
static void InvalidateConnParamsCacheCallback(Datum argument, Oid relationId);
static void CitusTableCacheEntryReleaseCallback(ResourceReleasePhase phase, bool isCommit,
bool isTopLevel, void *arg);
static HeapTuple LookupDistPartitionTuple(Relation pgDistPartition, Oid relationId);
@ -267,6 +279,10 @@ static bool IsCitusTableTypeInternal(char partitionMethod, char replicationModel
CitusTableType tableType);
static bool RefreshTableCacheEntryIfInvalid(ShardIdCacheEntry *shardEntry);
static Oid DistAuthinfoRelationId(void);
static Oid DistAuthinfoIndexId(void);
static Oid DistPoolinfoRelationId(void);
static Oid DistPoolinfoIndexId(void);
/* exports for SQL callable functions */
PG_FUNCTION_INFO_V1(citus_dist_partition_cache_invalidate);
@ -719,6 +735,24 @@ ReferenceTableShardId(uint64 shardId)
}
/*
* DistributedTableShardId returns true if the given shardId belongs to
* a distributed table.
*/
bool
DistributedTableShardId(uint64 shardId)
{
if (shardId == INVALID_SHARD_ID)
{
return false;
}
ShardIdCacheEntry *shardIdEntry = LookupShardIdCacheEntry(shardId);
CitusTableCacheEntry *tableEntry = shardIdEntry->tableEntry;
return IsCitusTableTypeCacheEntry(tableEntry, DISTRIBUTED_TABLE);
}
/*
* LoadGroupShardPlacement returns the cached shard placement metadata
*
@ -2503,6 +2537,17 @@ DistPlacementPlacementidIndexId(void)
}
/* return oid of pg_dist_colocation_pkey */
Oid
DistColocationIndexId(void)
{
CachedRelationLookup("pg_dist_colocation_pkey",
&MetadataCache.distColocationidIndexId);
return MetadataCache.distColocationidIndexId;
}
/* return oid of pg_dist_transaction relation */
Oid
DistTransactionRelationId(void)
@ -2536,6 +2581,50 @@ DistPlacementGroupidIndexId(void)
}
/* return oid of pg_dist_authinfo relation */
static Oid
DistAuthinfoRelationId(void)
{
CachedRelationLookup("pg_dist_authinfo",
&MetadataCache.distAuthinfoRelationId);
return MetadataCache.distAuthinfoRelationId;
}
/* return oid of pg_dist_authinfo identification index */
static Oid
DistAuthinfoIndexId(void)
{
CachedRelationLookup("pg_dist_authinfo_identification_index",
&MetadataCache.distAuthinfoIndexId);
return MetadataCache.distAuthinfoIndexId;
}
/* return oid of pg_dist_poolinfo relation */
static Oid
DistPoolinfoRelationId(void)
{
CachedRelationLookup("pg_dist_poolinfo",
&MetadataCache.distPoolinfoRelationId);
return MetadataCache.distPoolinfoRelationId;
}
/* return oid of pg_dist_poolinfo primary key index */
static Oid
DistPoolinfoIndexId(void)
{
CachedRelationLookup("pg_dist_poolinfo_pkey",
&MetadataCache.distPoolinfoIndexId);
return MetadataCache.distPoolinfoIndexId;
}
/* return oid of the read_intermediate_result(text,citus_copy_format) function */
Oid
CitusReadIntermediateResultFuncId(void)
@ -2655,6 +2744,42 @@ CitusAnyValueFunctionId(void)
}
/* return oid of the citus_text_send_as_jsonb(text) function */
Oid
CitusTextSendAsJsonbFunctionId(void)
{
if (MetadataCache.textSendAsJsonbFunctionId == InvalidOid)
{
List *nameList = list_make2(makeString("pg_catalog"),
makeString("citus_text_send_as_jsonb"));
Oid paramOids[1] = { TEXTOID };
MetadataCache.textSendAsJsonbFunctionId =
LookupFuncName(nameList, 1, paramOids, false);
}
return MetadataCache.textSendAsJsonbFunctionId;
}
/* return oid of the textout(text) function */
Oid
TextOutFunctionId(void)
{
if (MetadataCache.textoutFunctionId == InvalidOid)
{
List *nameList = list_make2(makeString("pg_catalog"),
makeString("textout"));
Oid paramOids[1] = { TEXTOID };
MetadataCache.textoutFunctionId =
LookupFuncName(nameList, 1, paramOids, false);
}
return MetadataCache.textoutFunctionId;
}
/*
* PgTableVisibleFuncId returns oid of the pg_table_is_visible function.
*/
@ -3243,7 +3368,7 @@ citus_conninfo_cache_invalidate(PG_FUNCTION_ARGS)
errmsg("must be called as trigger")));
}
/* no-op in community edition */
CitusInvalidateRelcacheByRelid(DistAuthinfoRelationId());
PG_RETURN_DATUM(PointerGetDatum(NULL));
}
@ -3371,6 +3496,7 @@ InitializeCaches(void)
RegisterForeignKeyGraphCacheCallbacks();
RegisterWorkerNodeCacheCallbacks();
RegisterLocalGroupIdCacheCallbacks();
RegisterAuthinfoCacheCallbacks();
RegisterCitusTableCacheEntryReleaseCallbacks();
}
PG_CATCH();
@ -3776,6 +3902,18 @@ RegisterLocalGroupIdCacheCallbacks(void)
}
/*
* RegisterAuthinfoCacheCallbacks registers the callbacks required to
* keep cached connection parameters fresh.
*/
static void
RegisterAuthinfoCacheCallbacks(void)
{
/* Watch for invalidation events. */
CacheRegisterRelcacheCallback(InvalidateConnParamsCacheCallback, (Datum) 0);
}
/*
* WorkerNodeHashCode computes the hash code for a worker node from the node's
* host name and port number. Nodes that only differ by their rack locations
@ -4274,6 +4412,30 @@ InvalidateLocalGroupIdRelationCacheCallback(Datum argument, Oid relationId)
}
/*
* InvalidateConnParamsCacheCallback sets isValid flag to false for all entries
* in ConnParamsHash, a cache used during connection establishment.
*/
static void
InvalidateConnParamsCacheCallback(Datum argument, Oid relationId)
{
if (relationId == MetadataCache.distAuthinfoRelationId ||
relationId == MetadataCache.distPoolinfoRelationId ||
relationId == InvalidOid)
{
ConnParamsHashEntry *entry = NULL;
HASH_SEQ_STATUS status;
hash_seq_init(&status, ConnParamsHash);
while ((entry = (ConnParamsHashEntry *) hash_seq_search(&status)) != NULL)
{
entry->isValid = false;
}
}
}
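One subtlety above: relcache callbacks are also invoked with InvalidOid when any relation may have been invalidated (for instance after an invalidation-queue overflow), so the cache must be swept conservatively in that case. A stripped-down sketch of the sweep with stand-in types (not the actual Citus hash table):

#include <stdbool.h>
#include <stddef.h>

typedef unsigned int SketchOid;
#define SKETCH_INVALID_OID ((SketchOid) 0)

typedef struct ConnParamsEntrySketch
{
	bool isValid;
} ConnParamsEntrySketch;

static void
SweepConnParams(ConnParamsEntrySketch *entries, size_t count,
				SketchOid relationId, SketchOid authinfoRelId,
				SketchOid poolinfoRelId)
{
	if (relationId != authinfoRelId && relationId != poolinfoRelId &&
		relationId != SKETCH_INVALID_OID)
	{
		return; /* the invalidation doesn't concern connection parameters */
	}

	for (size_t i = 0; i < count; i++)
	{
		/* entries are re-resolved lazily the next time they are needed */
		entries[i].isValid = false;
	}
}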
/*
* CitusTableCacheFlushInvalidatedEntries frees invalidated cache entries.
* Invalidated entries aren't freed immediately as callers expect their lifetime
@ -4853,6 +5015,12 @@ DistNodeMetadata(void)
"could not find any entries in pg_dist_metadata")));
}
/*
* Copy the jsonb result before closing the table
* since that memory can be freed.
*/
metadata = JsonbPGetDatum(DatumGetJsonbPCopy(metadata));
systable_endscan(scanDescriptor);
table_close(pgDistNodeMetadata, AccessShareLock);
@ -4875,37 +5043,164 @@ role_exists(PG_FUNCTION_ARGS)
/*
* authinfo_valid is a check constraint which errors on all rows, intended for
* use in prohibiting writes to pg_dist_authinfo in Citus Community.
* GetPoolinfoViaCatalog searches the pg_dist_poolinfo table for a row matching
* the provided nodeId and returns the poolinfo field of this row if found.
* Otherwise, this function returns NULL.
*/
Datum
authinfo_valid(PG_FUNCTION_ARGS)
char *
GetPoolinfoViaCatalog(int64 nodeId)
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot write to pg_dist_authinfo"),
errdetail(
"Citus Community Edition does not support the use of "
"custom authentication options."),
errhint(
"To learn more about using advanced authentication schemes "
"with Citus, please contact us at "
"https://citusdata.com/about/contact_us")));
ScanKeyData scanKey[1];
const int scanKeyCount = 1;
const AttrNumber nodeIdIdx = 1, poolinfoIdx = 2;
Relation pgDistPoolinfo = table_open(DistPoolinfoRelationId(), AccessShareLock);
bool indexOK = true;
char *poolinfo = NULL;
/* set scan arguments */
ScanKeyInit(&scanKey[0], nodeIdIdx, BTEqualStrategyNumber, F_INT4EQ,
Int32GetDatum(nodeId));
SysScanDesc scanDescriptor = systable_beginscan(pgDistPoolinfo, DistPoolinfoIndexId(),
indexOK,
NULL, scanKeyCount, scanKey);
HeapTuple heapTuple = systable_getnext(scanDescriptor);
if (HeapTupleIsValid(heapTuple))
{
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPoolinfo);
bool isNull = false;
Datum poolinfoDatum = heap_getattr(heapTuple, poolinfoIdx, tupleDescriptor,
&isNull);
Assert(!isNull);
poolinfo = TextDatumGetCString(poolinfoDatum);
}
systable_endscan(scanDescriptor);
table_close(pgDistPoolinfo, AccessShareLock);
return poolinfo;
}
/*
* poolinfo_valid is a check constraint which errors on all rows, intended for
* use in prohibiting writes to pg_dist_poolinfo in Citus Community.
* GetAuthinfoViaCatalog searches pg_dist_authinfo for a row matching a pro-
* vided role and node id. Three types of rules are currently permitted: those
* matching a specific node (non-zero nodeid), those matching all nodes (a
* nodeid of zero), and those denoting a loopback connection (nodeid of -1).
* Rolename must always be specified. If both types of rules exist for a given
* user/host, the more specific (host-specific) rule wins. This means that when
* both a zero and non-zero row exist for a given rolename, the non-zero row
* has precedence.
*
* In short, this function will return a rule matching nodeId, or if that's
* absent the rule for 0, or if that's absent, an empty string. Callers can
* just use the returned authinfo and know the precedence has been honored.
*/
char *
GetAuthinfoViaCatalog(const char *roleName, int64 nodeId)
{
char *authinfo = "";
Datum nodeIdDatumArray[2] = {
Int32GetDatum(nodeId),
Int32GetDatum(WILDCARD_NODE_ID)
};
ArrayType *nodeIdArrayType = DatumArrayToArrayType(nodeIdDatumArray,
lengthof(nodeIdDatumArray),
INT4OID);
ScanKeyData scanKey[2];
const AttrNumber nodeIdIdx = 1, roleIdx = 2, authinfoIdx = 3;
/*
* Our index's definition ensures correct precedence for positive nodeIds,
* but when handling a negative value we need to traverse backwards to keep
* the invariant that the zero rule has lowest precedence.
*/
ScanDirection direction = (nodeId < 0) ? BackwardScanDirection : ForwardScanDirection;
if (ReindexIsProcessingIndex(DistAuthinfoIndexId()))
{
ereport(ERROR, (errmsg("authinfo is being reindexed; try again")));
}
memset(&scanKey, 0, sizeof(scanKey));
/* first column in index is rolename, need exact match there ... */
ScanKeyInit(&scanKey[0], roleIdx, BTEqualStrategyNumber,
F_NAMEEQ, CStringGetDatum(roleName));
/* second column is nodeId, match against array of nodeid and zero (any node) ... */
ScanKeyInit(&scanKey[1], nodeIdIdx, BTEqualStrategyNumber,
F_INT4EQ, PointerGetDatum(nodeIdArrayType));
scanKey[1].sk_flags |= SK_SEARCHARRAY;
/*
* It's important that we traverse the index in order: we need to ensure
* that rules with nodeid 0 are encountered last. We'll use the first tuple
* we find. This ordering defines the precedence order of authinfo rules.
*/
Relation pgDistAuthinfo = table_open(DistAuthinfoRelationId(), AccessShareLock);
Relation pgDistAuthinfoIdx = index_open(DistAuthinfoIndexId(), AccessShareLock);
SysScanDesc scanDescriptor = systable_beginscan_ordered(pgDistAuthinfo,
pgDistAuthinfoIdx,
NULL, lengthof(scanKey),
scanKey);
/* first tuple represents highest-precedence rule for this node */
HeapTuple authinfoTuple = systable_getnext_ordered(scanDescriptor, direction);
if (HeapTupleIsValid(authinfoTuple))
{
TupleDesc tupleDescriptor = RelationGetDescr(pgDistAuthinfo);
bool isNull = false;
Datum authinfoDatum = heap_getattr(authinfoTuple, authinfoIdx,
tupleDescriptor, &isNull);
Assert(!isNull);
authinfo = TextDatumGetCString(authinfoDatum);
}
systable_endscan_ordered(scanDescriptor);
index_close(pgDistAuthinfoIdx, AccessShareLock);
table_close(pgDistAuthinfo, AccessShareLock);
return authinfo;
}
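The precedence rules described above can be modeled in isolation: an exact nodeid match always beats the wildcard (0) rule, and the empty string is the fallback. A standalone sketch with hypothetical data (not the catalog scan itself):

#include <stdio.h>
#include <string.h>

typedef struct AuthRuleSketch
{
	const char *role;
	int nodeId; /* -1 loopback, 0 wildcard, > 0 a specific node */
	const char *authinfo;
} AuthRuleSketch;

/* pick the most specific matching rule; the wildcard has lowest precedence */
static const char *
ResolveAuthinfo(const AuthRuleSketch *rules, int count,
				const char *role, int nodeId)
{
	const char *wildcard = "";

	for (int i = 0; i < count; i++)
	{
		if (strcmp(rules[i].role, role) != 0)
		{
			continue;
		}
		if (rules[i].nodeId == nodeId)
		{
			return rules[i].authinfo; /* exact node match wins immediately */
		}
		if (rules[i].nodeId == 0)
		{
			wildcard = rules[i].authinfo; /* remember the fallback */
		}
	}

	return wildcard; /* "" when no rule matches at all */
}

int
main(void)
{
	AuthRuleSketch rules[] = {
		{ "app", 0, "password=generic" },
		{ "app", 3, "password=node3" },
	};

	printf("%s\n", ResolveAuthinfo(rules, 2, "app", 3)); /* password=node3 */
	printf("%s\n", ResolveAuthinfo(rules, 2, "app", 7)); /* password=generic */
	return 0;
}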
/*
* authinfo_valid is a check constraint to verify that an inserted authinfo row
* uses only permitted libpq parameters.
*/
Datum
authinfo_valid(PG_FUNCTION_ARGS)
{
char *authinfo = TextDatumGetCString(PG_GETARG_DATUM(0));
/* this array _must_ be kept in an order usable by bsearch */
const char *allowList[] = { "password", "sslcert", "sslkey" };
bool authinfoValid = CheckConninfo(authinfo, allowList, lengthof(allowList), NULL);
PG_RETURN_BOOL(authinfoValid);
}
/*
* poolinfo_valid is a check constraint to verify that an inserted poolinfo row
* uses only permitted libpq parameters.
*/
Datum
poolinfo_valid(PG_FUNCTION_ARGS)
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot write to pg_dist_poolinfo"),
errdetail(
"Citus Community Edition does not support the use of "
"pooler options."),
errhint("To learn more about using advanced pooling schemes "
"with Citus, please contact us at "
"https://citusdata.com/about/contact_us")));
char *poolinfo = TextDatumGetCString(PG_GETARG_DATUM(0));
/* this array _must_ be kept in an order usable by bsearch */
const char *allowList[] = { "dbname", "host", "port" };
bool poolinfoValid = CheckConninfo(poolinfo, allowList, lengthof(allowList), NULL);
PG_RETURN_BOOL(poolinfoValid);
}
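Both constraints delegate to CheckConninfo, which searches each candidate keyword in the allowlist with bsearch; that is why the arrays above must stay lexicographically sorted. A standalone illustration of the lookup (illustrative names, not the Citus implementation):

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int
CompareKeyword(const void *lhs, const void *rhs)
{
	return strcmp(*(const char *const *) lhs, *(const char *const *) rhs);
}

/* returns true when the keyword appears in the sorted allowlist */
static bool
KeywordAllowed(const char *keyword, const char **allowList, size_t count)
{
	/* bsearch only works because allowList is sorted lexicographically */
	return bsearch(&keyword, allowList, count, sizeof(char *),
				   CompareKeyword) != NULL;
}

int
main(void)
{
	const char *allowList[] = { "password", "sslcert", "sslkey" };

	printf("%d\n", KeywordAllowed("sslcert", allowList, 3));    /* 1 */
	printf("%d\n", KeywordAllowed("gssencmode", allowList, 3)); /* 0 */
	return 0;
}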


@ -50,6 +50,7 @@
#include "distributed/metadata_cache.h"
#include "distributed/metadata_sync.h"
#include "distributed/metadata_utility.h"
#include "distributed/metadata/dependency.h"
#include "distributed/metadata/distobject.h"
#include "distributed/metadata/pg_dist_object.h"
#include "distributed/multi_executor.h"
@ -70,6 +71,7 @@
#include "executor/spi.h"
#include "foreign/foreign.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
#include "nodes/pg_list.h"
#include "pgstat.h"
#include "postmaster/bgworker.h"
@ -95,6 +97,8 @@ static char * SchemaOwnerName(Oid objectId);
static bool HasMetadataWorkers(void);
static void CreateShellTableOnWorkers(Oid relationId);
static void CreateTableMetadataOnWorkers(Oid relationId);
static void CreateDependingViewsOnWorkers(Oid relationId);
static NodeMetadataSyncResult SyncNodeMetadataToNodesOptional(void);
static bool ShouldSyncTableMetadataInternal(bool hashDistributed,
bool citusTableWithNoDistKey);
static bool SyncNodeMetadataSnapshotToNode(WorkerNode *workerNode, bool raiseOnError);
@ -110,6 +114,11 @@ static List * GetObjectsForGrantStmt(ObjectType objectType, Oid objectId);
static AccessPriv * GetAccessPrivObjectForGrantStmt(char *permission);
static List * GenerateGrantOnSchemaQueriesFromAclItem(Oid schemaOid,
AclItem *aclItem);
static List * GenerateGrantOnFunctionQueriesFromAclItem(Oid schemaOid,
AclItem *aclItem);
static List * GrantOnSequenceDDLCommands(Oid sequenceOid);
static List * GenerateGrantOnSequenceQueriesFromAclItem(Oid sequenceOid,
AclItem *aclItem);
static void SetLocalReplicateReferenceTablesOnActivate(bool state);
static char * GenerateSetRoleQuery(Oid roleOid);
static void MetadataSyncSigTermHandler(SIGNAL_ARGS);
@ -136,6 +145,7 @@ static char * RemoteTypeIdExpression(Oid typeId);
static char * RemoteCollationIdExpression(Oid colocationId);
PG_FUNCTION_INFO_V1(start_metadata_sync_to_all_nodes);
PG_FUNCTION_INFO_V1(start_metadata_sync_to_node);
PG_FUNCTION_INFO_V1(stop_metadata_sync_to_node);
PG_FUNCTION_INFO_V1(worker_record_sequence_dependency);
@ -192,6 +202,33 @@ start_metadata_sync_to_node(PG_FUNCTION_ARGS)
}
/*
* start_metadata_sync_to_all_nodes sets the hasmetadata column of
* all the primary worker nodes to true, and then activates those nodes
* without replicating reference tables.
*/
Datum
start_metadata_sync_to_all_nodes(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
EnsureSuperUser();
EnsureCoordinator();
List *workerNodes = ActivePrimaryNonCoordinatorNodeList(RowShareLock);
bool prevReplicateRefTablesOnActivate = ReplicateReferenceTablesOnActivate;
SetLocalReplicateReferenceTablesOnActivate(false);
ActivateNodeList(workerNodes);
TransactionModifiedNodeMetadata = true;
SetLocalReplicateReferenceTablesOnActivate(prevReplicateRefTablesOnActivate);
PG_RETURN_BOOL(true);
}
/*
* SyncNodeMetadataToNode is the internal API for
* start_metadata_sync_to_node().
@ -271,7 +308,8 @@ SyncNodeMetadataToNode(const char *nodeNameString, int32 nodePort)
* SyncCitusTableMetadata syncs citus table metadata to worker nodes with metadata.
* Our definition of metadata includes the shell table and its interrelations with
* other shell tables, corresponding pg_dist_object, pg_dist_partition, pg_dist_shard
* and pg_dist_shard placement entries.
* and pg_dist_shard placement entries. This function also propagates the views that
* depend on the given relation to the metadata workers.
*/
void
SyncCitusTableMetadata(Oid relationId)
@ -286,6 +324,51 @@ SyncCitusTableMetadata(Oid relationId)
ObjectAddressSet(relationAddress, RelationRelationId, relationId);
MarkObjectDistributed(&relationAddress);
}
CreateDependingViewsOnWorkers(relationId);
}
/*
* CreateDependingViewsOnWorkers takes a relationId and creates the views that depend on
* that relation on workers with metadata. Propagated views are marked as distributed.
*/
static void
CreateDependingViewsOnWorkers(Oid relationId)
{
List *views = GetDependingViews(relationId);
if (list_length(views) < 1)
{
/* no view to propagate */
return;
}
SendCommandToWorkersWithMetadata(DISABLE_DDL_PROPAGATION);
Oid viewOid = InvalidOid;
foreach_oid(viewOid, views)
{
if (!ShouldMarkRelationDistributed(viewOid))
{
continue;
}
ObjectAddress viewAddress = { 0 };
ObjectAddressSet(viewAddress, RelationRelationId, viewOid);
EnsureDependenciesExistOnAllNodes(&viewAddress);
char *createViewCommand = CreateViewDDLCommand(viewOid);
char *alterViewOwnerCommand = AlterViewOwnerCommand(viewOid);
SendCommandToWorkersWithMetadata(createViewCommand);
SendCommandToWorkersWithMetadata(alterViewOwnerCommand);
MarkObjectDistributed(&viewAddress);
}
SendCommandToWorkersWithMetadata(ENABLE_DDL_PROPAGATION);
}
@ -423,6 +506,25 @@ ClusterHasKnownMetadataWorkers()
}
/*
* ShouldSyncUserCommandForObject checks if the user command should be synced to the
* worker nodes for the given object.
*/
bool
ShouldSyncUserCommandForObject(ObjectAddress objectAddress)
{
if (objectAddress.classId == RelationRelationId)
{
Oid relOid = objectAddress.objectId;
return ShouldSyncTableMetadata(relOid) ||
ShouldSyncSequenceMetadata(relOid) ||
get_rel_relkind(relOid) == RELKIND_VIEW;
}
return false;
}
/*
* ShouldSyncTableMetadata checks if the metadata of a distributed table should be
* propagated to metadata workers, i.e. the table is a hash distributed table or
@ -488,6 +590,26 @@ ShouldSyncTableMetadataInternal(bool hashDistributed, bool citusTableWithNoDistK
/*
* ShouldSyncSequenceMetadata checks if the metadata of a sequence should be
* propagated to metadata workers, i.e. the sequence is marked as distributed
*/
bool
ShouldSyncSequenceMetadata(Oid relationId)
{
if (!OidIsValid(relationId) || !(get_rel_relkind(relationId) == RELKIND_SEQUENCE))
{
return false;
}
ObjectAddress sequenceAddress = { 0 };
ObjectAddressSet(sequenceAddress, RelationRelationId, relationId);
return IsObjectDistributed(&sequenceAddress);
}
/*
* SyncMetadataSnapshotToNode does the following:
* SyncNodeMetadataSnapshotToNode does the following:
* 1. Sets the localGroupId on the worker so the worker knows which tuple in
* pg_dist_node represents itself.
@ -522,10 +644,10 @@ SyncNodeMetadataSnapshotToNode(WorkerNode *workerNode, bool raiseOnError)
*/
if (raiseOnError)
{
SendMetadataCommandListToWorkerInCoordinatedTransaction(workerNode->workerName,
workerNode->workerPort,
currentUser,
recreateMetadataSnapshotCommandList);
SendMetadataCommandListToWorkerListInCoordinatedTransaction(list_make1(
workerNode),
currentUser,
recreateMetadataSnapshotCommandList);
return true;
}
else
@ -1249,6 +1371,23 @@ ShardListInsertCommand(List *shardIntervalList)
}
/*
* ShardDeleteCommandList generates a command list that can be executed to delete
* shard and shard placement metadata for the given shard.
*/
List *
ShardDeleteCommandList(ShardInterval *shardInterval)
{
uint64 shardId = shardInterval->shardId;
StringInfo deleteShardCommand = makeStringInfo();
appendStringInfo(deleteShardCommand,
"SELECT citus_internal_delete_shard_metadata(%ld);", shardId);
return list_make1(deleteShardCommand->data);
}
/*
* NodeDeleteCommand generate a command that can be
* executed to delete the metadata for a worker node.
@ -1383,6 +1522,8 @@ DDLCommandsForSequence(Oid sequenceOid, char *ownerName)
sequenceDDLList = lappend(sequenceDDLList, wrappedSequenceDef->data);
sequenceDDLList = lappend(sequenceDDLList, sequenceGrantStmt->data);
sequenceDDLList = list_concat(sequenceDDLList, GrantOnSequenceDDLCommands(
sequenceOid));
return sequenceDDLList;
}
@ -1438,10 +1579,10 @@ GetAttributeTypeOid(Oid relationId, AttrNumber attnum)
* attribute of the relationId.
*/
void
GetDependentSequencesWithRelation(Oid relationId, List **attnumList,
List **dependentSequenceList, AttrNumber attnum)
GetDependentSequencesWithRelation(Oid relationId, List **seqInfoList,
AttrNumber attnum)
{
Assert(*attnumList == NIL && *dependentSequenceList == NIL);
Assert(*seqInfoList == NIL);
List *attrdefResult = NIL;
List *attrdefAttnumResult = NIL;
@ -1478,9 +1619,26 @@ GetDependentSequencesWithRelation(Oid relationId, List **attnumList,
deprec->refobjsubid != 0 &&
deprec->deptype == DEPENDENCY_AUTO)
{
/*
* We are going to generate corresponding SequenceInfo
* in the following loop.
*/
attrdefResult = lappend_oid(attrdefResult, deprec->objid);
attrdefAttnumResult = lappend_int(attrdefAttnumResult, deprec->refobjsubid);
}
else if (deprec->deptype == DEPENDENCY_AUTO &&
deprec->refobjsubid != 0 &&
deprec->classid == RelationRelationId &&
get_rel_relkind(deprec->objid) == RELKIND_SEQUENCE)
{
SequenceInfo *seqInfo = (SequenceInfo *) palloc(sizeof(SequenceInfo));
seqInfo->sequenceOid = deprec->objid;
seqInfo->attributeNumber = deprec->refobjsubid;
seqInfo->isNextValDefault = false;
*seqInfoList = lappend(*seqInfoList, seqInfo);
}
}
systable_endscan(scan);
@ -1504,9 +1662,13 @@ GetDependentSequencesWithRelation(Oid relationId, List **attnumList,
if (list_length(sequencesFromAttrDef) == 1)
{
*dependentSequenceList = list_concat(*dependentSequenceList,
sequencesFromAttrDef);
*attnumList = lappend_int(*attnumList, attrdefAttnum);
SequenceInfo *seqInfo = (SequenceInfo *) palloc(sizeof(SequenceInfo));
seqInfo->sequenceOid = linitial_oid(sequencesFromAttrDef);
seqInfo->attributeNumber = attrdefAttnum;
seqInfo->isNextValDefault = true;
*seqInfoList = lappend(*seqInfoList, seqInfo);
}
}
}
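The hunk above builds SequenceInfo records whose definition is not part of this diff; judging from the fields used, it presumably looks like the sketch below (the actual definition lives in a Citus header):

/* assumed shape of the SequenceInfo record used above */
typedef struct SequenceInfo
{
	Oid sequenceOid;            /* sequence the column depends on */
	AttrNumber attributeNumber; /* attribute number of the depending column */

	/*
	 * true when the dependency comes from a nextval() column default,
	 * false when it was found as an AUTO dependency in pg_depend.
	 */
	bool isNextValDefault;
} SequenceInfo;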
@ -1842,7 +2004,7 @@ GrantOnSchemaDDLCommands(Oid schemaOid)
/*
* GenerateGrantOnSchemaQueryFromACL generates a query string for replicating a user's permissions
* GenerateGrantOnSchemaQueryFromACLItem generates a query string for replicating a user's permissions
* on a schema.
*/
List *
@ -1926,6 +2088,34 @@ GetObjectsForGrantStmt(ObjectType objectType, Oid objectId)
return list_make1(makeString(get_namespace_name(objectId)));
}
/* enterprise supported object types */
case OBJECT_FUNCTION:
case OBJECT_PROCEDURE:
{
ObjectWithArgs *owa = ObjectWithArgsFromOid(objectId);
return list_make1(owa);
}
case OBJECT_FDW:
{
ForeignDataWrapper *fdw = GetForeignDataWrapper(objectId);
return list_make1(makeString(fdw->fdwname));
}
case OBJECT_FOREIGN_SERVER:
{
ForeignServer *server = GetForeignServer(objectId);
return list_make1(makeString(server->servername));
}
case OBJECT_SEQUENCE:
{
Oid namespaceOid = get_rel_namespace(objectId);
RangeVar *sequence = makeRangeVar(get_namespace_name(namespaceOid),
get_rel_name(objectId), -1);
return list_make1(sequence);
}
default:
{
elog(ERROR, "unsupported object type for GRANT");
@ -1936,6 +2126,211 @@ GetObjectsForGrantStmt(ObjectType objectType, Oid objectId)
}
/*
* GrantOnFunctionDDLCommands creates a list of DDL commands for replicating the permissions
* of roles on distributed functions.
*/
List *
GrantOnFunctionDDLCommands(Oid functionOid)
{
HeapTuple proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(functionOid));
bool isNull = true;
Datum aclDatum = SysCacheGetAttr(PROCOID, proctup, Anum_pg_proc_proacl,
&isNull);
if (isNull)
{
ReleaseSysCache(proctup);
return NIL;
}
Acl *acl = DatumGetAclPCopy(aclDatum);
AclItem *aclDat = ACL_DAT(acl);
int aclNum = ACL_NUM(acl);
List *commands = NIL;
ReleaseSysCache(proctup);
for (int i = 0; i < aclNum; i++)
{
commands = list_concat(commands,
GenerateGrantOnFunctionQueriesFromAclItem(
functionOid,
&aclDat[i]));
}
return commands;
}
/*
* GrantOnForeignServerDDLCommands creates a list of DDL commands for replicating the
* permissions of roles on distributed foreign servers.
*/
List *
GrantOnForeignServerDDLCommands(Oid serverId)
{
HeapTuple servertup = SearchSysCache1(FOREIGNSERVEROID, ObjectIdGetDatum(serverId));
bool isNull = true;
Datum aclDatum = SysCacheGetAttr(FOREIGNSERVEROID, servertup,
Anum_pg_foreign_server_srvacl, &isNull);
if (isNull)
{
ReleaseSysCache(servertup);
return NIL;
}
Acl *aclEntry = DatumGetAclPCopy(aclDatum);
AclItem *privileges = ACL_DAT(aclEntry);
int numberOfPrivsGranted = ACL_NUM(aclEntry);
List *commands = NIL;
ReleaseSysCache(servertup);
for (int i = 0; i < numberOfPrivsGranted; i++)
{
commands = list_concat(commands,
GenerateGrantOnForeignServerQueriesFromAclItem(
serverId,
&privileges[i]));
}
return commands;
}
/*
* GenerateGrantOnForeignServerQueriesFromAclItem generates a query string for
* replicating a user's permissions on a foreign server.
*/
List *
GenerateGrantOnForeignServerQueriesFromAclItem(Oid serverId, AclItem *aclItem)
{
/* privileges to be granted */
AclMode permissions = ACLITEM_GET_PRIVS(*aclItem) & ACL_ALL_RIGHTS_FOREIGN_SERVER;
/* WITH GRANT OPTION clause */
AclMode grants = ACLITEM_GET_GOPTIONS(*aclItem) & ACL_ALL_RIGHTS_FOREIGN_SERVER;
/*
* seems unlikely but we check if there is a grant option in the list without the actual permission
*/
Assert(!(grants & ACL_USAGE) || (permissions & ACL_USAGE));
Oid granteeOid = aclItem->ai_grantee;
List *queries = NIL;
/* switch to the role which had granted acl */
queries = lappend(queries, GenerateSetRoleQuery(aclItem->ai_grantor));
/* generate the GRANT stmt that will be executed by the grantor role */
if (permissions & ACL_USAGE)
{
char *query = DeparseTreeNode((Node *) GenerateGrantStmtForRights(
OBJECT_FOREIGN_SERVER, granteeOid, serverId,
"USAGE", grants & ACL_USAGE));
queries = lappend(queries, query);
}
/* reset the role back */
queries = lappend(queries, "RESET ROLE");
return queries;
}
/*
* GenerateGrantOnFunctionQueriesFromAclItem generates a query string for replicating a user's permissions
* on a distributed function.
*/
List *
GenerateGrantOnFunctionQueriesFromAclItem(Oid functionOid, AclItem *aclItem)
{
AclMode permissions = ACLITEM_GET_PRIVS(*aclItem) & ACL_ALL_RIGHTS_FUNCTION;
AclMode grants = ACLITEM_GET_GOPTIONS(*aclItem) & ACL_ALL_RIGHTS_FUNCTION;
/*
* seems unlikely but we check if there is a grant option in the list without the actual permission
*/
Assert(!(grants & ACL_EXECUTE) || (permissions & ACL_EXECUTE));
Oid granteeOid = aclItem->ai_grantee;
List *queries = NIL;
queries = lappend(queries, GenerateSetRoleQuery(aclItem->ai_grantor));
if (permissions & ACL_EXECUTE)
{
char prokind = get_func_prokind(functionOid);
ObjectType objectType;
if (prokind == PROKIND_FUNCTION)
{
objectType = OBJECT_FUNCTION;
}
else if (prokind == PROKIND_PROCEDURE)
{
objectType = OBJECT_PROCEDURE;
}
else
{
ereport(ERROR, (errmsg("unsupported prokind"),
errdetail("GRANT commands are propagated only "
"for functions and procedures.")));
}
char *query = DeparseTreeNode((Node *) GenerateGrantStmtForRights(
objectType, granteeOid, functionOid, "EXECUTE",
grants & ACL_EXECUTE));
queries = lappend(queries, query);
}
queries = lappend(queries, "RESET ROLE");
return queries;
}
/*
* GenerateGrantOnFDWQueriesFromAclItem generates a query string for
* replicating a user's permissions on a foreign data wrapper.
*/
List *
GenerateGrantOnFDWQueriesFromAclItem(Oid FDWId, AclItem *aclItem)
{
/* privileges to be granted */
AclMode permissions = ACLITEM_GET_PRIVS(*aclItem) & ACL_ALL_RIGHTS_FDW;
/* WITH GRANT OPTION clause */
AclMode grants = ACLITEM_GET_GOPTIONS(*aclItem) & ACL_ALL_RIGHTS_FDW;
/*
* seems unlikely but we check if there is a grant option in the list without the actual permission
*/
Assert(!(grants & ACL_USAGE) || (permissions & ACL_USAGE));
Oid granteeOid = aclItem->ai_grantee;
List *queries = NIL;
/* switch to the role which had granted acl */
queries = lappend(queries, GenerateSetRoleQuery(aclItem->ai_grantor));
/* generate the GRANT stmt that will be executed by the grantor role */
if (permissions & ACL_USAGE)
{
char *query = DeparseTreeNode((Node *) GenerateGrantStmtForRights(
OBJECT_FDW, granteeOid, FDWId, "USAGE",
grants & ACL_USAGE));
queries = lappend(queries, query);
}
/* reset the role back */
queries = lappend(queries, "RESET ROLE");
return queries;
}
/*
* GetAccessPrivObjectForGrantStmt creates an AccessPriv object for the given permission.
* It will be used when creating GrantStmt objects.
@ -1951,6 +2346,93 @@ GetAccessPrivObjectForGrantStmt(char *permission)
}
/*
* GrantOnSequenceDDLCommands creates a list of DDL commands for replicating the permissions
* of roles on distributed sequences.
*/
static List *
GrantOnSequenceDDLCommands(Oid sequenceOid)
{
HeapTuple seqtup = SearchSysCache1(RELOID, ObjectIdGetDatum(sequenceOid));
bool isNull = false;
Datum aclDatum = SysCacheGetAttr(RELOID, seqtup, Anum_pg_class_relacl,
&isNull);
if (isNull)
{
ReleaseSysCache(seqtup);
return NIL;
}
Acl *acl = DatumGetAclPCopy(aclDatum);
AclItem *aclDat = ACL_DAT(acl);
int aclNum = ACL_NUM(acl);
List *commands = NIL;
ReleaseSysCache(seqtup);
for (int i = 0; i < aclNum; i++)
{
commands = list_concat(commands,
GenerateGrantOnSequenceQueriesFromAclItem(
sequenceOid,
&aclDat[i]));
}
return commands;
}
/*
* GenerateGrantOnSequenceQueriesFromAclItem generates a query string for replicating a user's permissions
* on a distributed sequence.
*/
static List *
GenerateGrantOnSequenceQueriesFromAclItem(Oid sequenceOid, AclItem *aclItem)
{
AclMode permissions = ACLITEM_GET_PRIVS(*aclItem) & ACL_ALL_RIGHTS_SEQUENCE;
AclMode grants = ACLITEM_GET_GOPTIONS(*aclItem) & ACL_ALL_RIGHTS_SEQUENCE;
/*
* seems unlikely but we check if there is a grant option in the list without the actual permission
*/
Assert(!(grants & ACL_USAGE) || (permissions & ACL_USAGE));
Assert(!(grants & ACL_SELECT) || (permissions & ACL_SELECT));
Assert(!(grants & ACL_UPDATE) || (permissions & ACL_UPDATE));
Oid granteeOid = aclItem->ai_grantee;
List *queries = NIL;
queries = lappend(queries, GenerateSetRoleQuery(aclItem->ai_grantor));
if (permissions & ACL_USAGE)
{
char *query = DeparseTreeNode((Node *) GenerateGrantStmtForRights(
OBJECT_SEQUENCE, granteeOid, sequenceOid,
"USAGE", grants & ACL_USAGE));
queries = lappend(queries, query);
}
if (permissions & ACL_SELECT)
{
char *query = DeparseTreeNode((Node *) GenerateGrantStmtForRights(
OBJECT_SEQUENCE, granteeOid, sequenceOid,
"SELECT", grants & ACL_SELECT));
queries = lappend(queries, query);
}
if (permissions & ACL_UPDATE)
{
char *query = DeparseTreeNode((Node *) GenerateGrantStmtForRights(
OBJECT_SEQUENCE, granteeOid, sequenceOid,
"UPDATE", grants & ACL_UPDATE));
queries = lappend(queries, query);
}
queries = lappend(queries, "RESET ROLE");
return queries;
}
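Each AclItem carries two bitmasks, the granted privileges and the WITH GRANT OPTION subset, and the function expands them into one GRANT per set bit, bracketed by SET ROLE / RESET ROLE so each statement runs as the original grantor. A standalone sketch of that expansion (the bit values and names are illustrative, not PostgreSQL's AclMode constants):

#include <stdio.h>

/* illustrative privilege bits; the real AclMode constants live in PostgreSQL */
#define SK_USAGE  (1u << 0)
#define SK_SELECT (1u << 1)
#define SK_UPDATE (1u << 2)

static void
EmitSequenceGrants(unsigned permissions, unsigned grantOptions,
				   const char *grantee, const char *sequence)
{
	struct { unsigned bit; const char *name; } privs[] = {
		{ SK_USAGE, "USAGE" }, { SK_SELECT, "SELECT" }, { SK_UPDATE, "UPDATE" },
	};

	printf("SET ROLE grantor;\n");
	for (int i = 0; i < 3; i++)
	{
		if (permissions & privs[i].bit)
		{
			printf("GRANT %s ON SEQUENCE %s TO %s%s;\n",
				   privs[i].name, sequence, grantee,
				   (grantOptions & privs[i].bit) ? " WITH GRANT OPTION" : "");
		}
	}
	printf("RESET ROLE;\n");
}

int
main(void)
{
	/* USAGE and SELECT granted; only SELECT carries the grant option */
	EmitSequenceGrants(SK_USAGE | SK_SELECT, SK_SELECT, "appuser", "s1");
	return 0;
}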
/*
* SetLocalEnableMetadataSync sets the enable_metadata_sync locally
*/
@ -2042,7 +2524,7 @@ SchemaOwnerName(Oid objectId)
static bool
HasMetadataWorkers(void)
{
List *workerNodeList = ActivePrimaryNonCoordinatorNodeList(NoLock);
List *workerNodeList = ActiveReadableNonCoordinatorNodeList();
WorkerNode *workerNode = NULL;
foreach_ptr(workerNode, workerNodeList)
@ -2217,16 +2699,16 @@ DetachPartitionCommandList(void)
/*
* SyncNodeMetadataToNodes tries recreating the metadata snapshot in the
* metadata workers that are out of sync. Returns the result of
* synchronization.
* SyncNodeMetadataToNodesOptional tries recreating the metadata
* snapshot in the metadata workers that are out of sync.
* Returns the result of synchronization.
*
* This function must be called within a coordinated transaction,
* since updates on the pg_dist_node metadata must be rolled back if anything
* goes wrong.
*/
static NodeMetadataSyncResult
SyncNodeMetadataToNodes(void)
SyncNodeMetadataToNodesOptional(void)
{
NodeMetadataSyncResult result = NODE_METADATA_SYNC_SUCCESS;
if (!IsCoordinator())
@ -2286,6 +2768,46 @@ SyncNodeMetadataToNodes(void)
}
/*
* SyncNodeMetadataToNodes recreates the node metadata snapshot in all the
* metadata workers.
*
* This function runs within a coordinated transaction since updates on
* the pg_dist_node metadata must be rolled back if anything
* goes wrong.
*/
void
SyncNodeMetadataToNodes(void)
{
EnsureCoordinator();
/*
* Request a RowExclusiveLock so we don't run concurrently with other
* functions updating pg_dist_node, but allow concurrency with functions
* which are just reading from pg_dist_node.
*/
if (!ConditionalLockRelationOid(DistNodeRelationId(), RowExclusiveLock))
{
ereport(ERROR, (errmsg("cannot sync metadata because a concurrent "
"metadata syncing operation is in progress")));
}
List *workerList = ActivePrimaryNonCoordinatorNodeList(NoLock);
WorkerNode *workerNode = NULL;
foreach_ptr(workerNode, workerList)
{
if (workerNode->hasMetadata)
{
SetWorkerColumnLocalOnly(workerNode, Anum_pg_dist_node_metadatasynced,
BoolGetDatum(true));
bool raiseOnError = true;
SyncNodeMetadataSnapshotToNode(workerNode, raiseOnError);
}
}
}
/*
* SyncNodeMetadataToNodesMain is the main function for syncing node metadata to
* MX nodes. It retries until success and then exits.
@ -2332,7 +2854,7 @@ SyncNodeMetadataToNodesMain(Datum main_arg)
{
UseCoordinatedTransaction();
NodeMetadataSyncResult result = SyncNodeMetadataToNodes();
NodeMetadataSyncResult result = SyncNodeMetadataToNodesOptional();
syncedAllNodes = (result == NODE_METADATA_SYNC_SUCCESS);
/* we use LISTEN/NOTIFY to wait for metadata syncing in tests */
@ -3391,12 +3913,19 @@ ColocationGroupCreateCommandList(void)
"distributioncolumncollationschema) AS (VALUES ");
Relation pgDistColocation = table_open(DistColocationRelationId(), AccessShareLock);
Relation colocationIdIndexRel = index_open(DistColocationIndexId(), AccessShareLock);
- bool indexOK = false;
- SysScanDesc scanDescriptor = systable_beginscan(pgDistColocation, InvalidOid, indexOK,
- NULL, 0, NULL);
/*
* It is not strictly necessary to read the tuples in order.
* However, it is useful to get consistent behavior, both for regression
* tests and in production systems.
*/
SysScanDesc scanDescriptor =
systable_beginscan_ordered(pgDistColocation, colocationIdIndexRel,
NULL, 0, NULL);
- HeapTuple colocationTuple = systable_getnext(scanDescriptor);
HeapTuple colocationTuple = systable_getnext_ordered(scanDescriptor,
ForwardScanDirection);
while (HeapTupleIsValid(colocationTuple))
{
@@ -3454,10 +3983,11 @@ ColocationGroupCreateCommandList(void)
"NULL, NULL)");
}
- colocationTuple = systable_getnext(scanDescriptor);
colocationTuple = systable_getnext_ordered(scanDescriptor, ForwardScanDirection);
}
- systable_endscan(scanDescriptor);
systable_endscan_ordered(scanDescriptor);
index_close(colocationIdIndexRel, AccessShareLock);
table_close(pgDistColocation, AccessShareLock);
if (!hasColocations)


@@ -66,6 +66,9 @@
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/syscache.h"
#if PG_VERSION_NUM < 120000
#include "utils/tqual.h"
#endif
#define DISK_SPACE_FIELDS 2
@@ -913,7 +916,7 @@ AppendShardSizeQuery(StringInfo selectQuery, ShardInterval *shardInterval)
appendStringInfo(selectQuery, "SELECT " UINT64_FORMAT " AS shard_id, ", shardId);
appendStringInfo(selectQuery, "%s AS shard_name, ", quotedShardName);
- appendStringInfo(selectQuery, PG_RELATION_SIZE_FUNCTION, quotedShardName);
appendStringInfo(selectQuery, PG_TOTAL_RELATION_SIZE_FUNCTION, quotedShardName);
}
@@ -2175,11 +2178,8 @@ EnsureSuperUser(void)
}
/*
- * Return a table's owner as a string.
* Return a table's owner as an Oid.
*/
- char *
- TableOwner(Oid relationId)
Oid
TableOwnerOid(Oid relationId)
{
HeapTuple tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relationId));
if (!HeapTupleIsValid(tuple))
@@ -2191,8 +2191,17 @@ TableOwner(Oid relationId)
Oid userId = ((Form_pg_class) GETSTRUCT(tuple))->relowner;
ReleaseSysCache(tuple);
return userId;
}
- return GetUserNameFromId(userId, false);
/*
* Return a table's owner as a string.
*/
char *
TableOwner(Oid relationId)
{
return GetUserNameFromId(TableOwnerOid(relationId), false);
}
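/*
 * A minimal usage sketch for the split above (variable names illustrative):
 * callers that only compare ownership can stay with Oids and skip the name
 * lookup, while display paths keep the string form:
 *
 *   Oid ownerId = TableOwnerOid(relationId);
 *   char *ownerName = TableOwner(relationId);
 */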


@@ -106,17 +106,18 @@ static void InsertPlaceholderCoordinatorRecord(void);
static void InsertNodeRow(int nodeid, char *nodename, int32 nodeport, NodeMetadata
*nodeMetadata);
static void DeleteNodeRow(char *nodename, int32 nodeport);
- static void SyncDistributedObjectsToNode(WorkerNode *workerNode);
static void SyncDistributedObjectsToNodeList(List *workerNodeList);
static void UpdateLocalGroupIdOnNode(WorkerNode *workerNode);
- static void SyncPgDistTableMetadataToNode(WorkerNode *workerNode);
static void SyncPgDistTableMetadataToNodeList(List *nodeList);
static List * InterTableRelationshipCommandList();
static void BlockDistributedQueriesOnMetadataNodes(void);
static WorkerNode * TupleToWorkerNode(TupleDesc tupleDescriptor, HeapTuple heapTuple);
static List * PropagateNodeWideObjectsCommandList();
static WorkerNode * ModifiableWorkerNode(const char *nodeName, int32 nodePort);
static bool NodeIsLocal(WorkerNode *worker);
static void SetLockTimeoutLocally(int32 lock_cooldown);
static void UpdateNodeLocation(int32 nodeId, char *newNodeName, int32 newNodePort);
- static bool UnsetMetadataSyncedForAll(void);
static bool UnsetMetadataSyncedForAllWorkers(void);
static char * GetMetadataSyncCommandToSetNodeColumn(WorkerNode *workerNode,
int columnIndex,
Datum value);
@@ -150,6 +151,7 @@ PG_FUNCTION_INFO_V1(get_shard_id_for_distribution_column);
PG_FUNCTION_INFO_V1(citus_nodename_for_nodeid);
PG_FUNCTION_INFO_V1(citus_nodeport_for_nodeid);
PG_FUNCTION_INFO_V1(citus_coordinator_nodeid);
PG_FUNCTION_INFO_V1(citus_is_coordinator);
/*
@@ -451,7 +453,7 @@ citus_disable_node(PG_FUNCTION_ARGS)
{
text *nodeNameText = PG_GETARG_TEXT_P(0);
int32 nodePort = PG_GETARG_INT32(1);
- bool forceDisableNode = PG_GETARG_BOOL(2);
bool synchronousDisableNode = PG_GETARG_BOOL(2);
char *nodeName = text_to_cstring(nodeNameText);
WorkerNode *workerNode = ModifiableWorkerNode(nodeName, nodePort);
@@ -462,8 +464,10 @@ citus_disable_node(PG_FUNCTION_ARGS)
"isactive");
WorkerNode *firstWorkerNode = GetFirstPrimaryWorkerNode();
- if (!forceDisableNode && firstWorkerNode &&
- firstWorkerNode->nodeId == workerNode->nodeId)
bool disablingFirstNode =
(firstWorkerNode && firstWorkerNode->nodeId == workerNode->nodeId);
if (disablingFirstNode && !synchronousDisableNode)
{
/*
* We sync metadata async and optionally in the background worker,
@@ -477,16 +481,21 @@ citus_disable_node(PG_FUNCTION_ARGS)
* possibility of diverged shard placements for the same shard.
*
* To prevent that, we currently do not allow disabling the first
- * worker node.
* worker node unless synchronous mode is explicitly requested.
*/
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("disabling the first worker node in the "
"metadata is not allowed"),
errhint("You can force disabling node, but this operation "
"might cause replicated shards to diverge: SELECT "
"citus_disable_node('%s', %d, force:=true);",
workerNode->workerName,
nodePort)));
errhint("You can force disabling node, SELECT "
"citus_disable_node('%s', %d, "
"synchronous:=true);", workerNode->workerName,
nodePort),
errdetail("Citus uses the first worker node in the "
"metadata for certain internal operations when "
"replicated tables are modified. Synchronous mode "
"ensures that all nodes have the same view of the "
"first worker node, which is used for certain "
"locking operations.")));
}
/*
@@ -505,45 +514,89 @@ citus_disable_node(PG_FUNCTION_ARGS)
* for any given shard.
*/
ErrorIfNodeContainsNonRemovablePlacements(workerNode);
- bool onlyConsiderActivePlacements = false;
- if (NodeGroupHasShardPlacements(workerNode->groupId,
- onlyConsiderActivePlacements))
- {
- ereport(NOTICE, (errmsg(
- "Node %s:%d has active shard placements. Some queries "
- "may fail after this operation. Use "
- "SELECT citus_activate_node('%s', %d) to activate this "
- "node back.",
- workerNode->workerName, nodePort,
- workerNode->workerName,
- nodePort)));
- }
}
TransactionModifiedNodeMetadata = true;
- /*
- * We have not propagated the metadata changes yet, make sure that all the
- * active nodes get the metadata updates. We defer this operation to the
- * background worker to make it possible disabling nodes when multiple nodes
- * are down.
- *
- * Note that the active placements reside on the active nodes. Hence, when
- * Citus finds active placements, it filters out the placements that are on
- * the disabled nodes. That's why, we don't have to change/sync placement
- * metadata at this point. Instead, we defer that to citus_activate_node()
- * where we expect all nodes up and running.
- */
- if (UnsetMetadataSyncedForAll())
if (synchronousDisableNode)
{
- TriggerMetadataSyncOnCommit();
/*
* The user might pick between sync vs async options.
* - Pros for the sync option:
* (a) the changes become visible on the cluster immediately
* (b) even if the first worker node is disabled, there is no
* risk of divergence of the placements of replicated shards
* - Cons for the sync option:
* (a) Does not work within 2PC transaction (e.g., BEGIN;
* citus_disable_node(); PREPARE TRANSACTION ...);
* (b) If there are multiple node failures (e.g., a node other than
* the one being disabled is also down), the sync option would
* fail because it'd try to sync the metadata changes to a node
* that is not up and running.
*/
if (firstWorkerNode && firstWorkerNode->nodeId == workerNode->nodeId)
{
/*
* We cannot let any modification query on a replicated table run
* concurrently with citus_disable_node() on the first worker node. If
* we let that happen, some worker nodes might calculate FirstWorkerNode()
* differently than others. See LockShardListResourcesOnFirstWorker()
* for the details.
*/
BlockDistributedQueriesOnMetadataNodes();
}
SyncNodeMetadataToNodes();
}
else if (UnsetMetadataSyncedForAllWorkers())
{
/*
* We have not propagated the node metadata changes yet, make sure that all the
* active nodes get the metadata updates. We defer this operation to the
* background worker to make it possible to disable nodes when multiple nodes
* are down.
*
* Note that the active placements reside on the active nodes. Hence, when
* Citus finds active placements, it filters out the placements that are on
* the disabled nodes. That's why we don't have to change/sync placement
* metadata at this point. Instead, we defer that to citus_activate_node()
* where we expect all nodes up and running.
*/
TriggerNodeMetadataSyncOnCommit();
}
PG_RETURN_VOID();
}
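/*
 * A hedged usage sketch (host and port are illustrative). The default call
 * defers metadata propagation to the background worker; the synchronous
 * form is what the hint above suggests for the first worker node:
 *
 *   SELECT citus_disable_node('worker-1', 5432);
 *   SELECT citus_disable_node('worker-1', 5432, synchronous:=true);
 */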
/*
* BlockDistributedQueriesOnMetadataNodes blocks all the modification queries on
* all nodes. Hence, it should be used with caution.
*/
static void
BlockDistributedQueriesOnMetadataNodes(void)
{
/* first, block on the coordinator */
LockRelationOid(DistNodeRelationId(), ExclusiveLock);
/*
* Note that we might re-design this lock to be more granular than
* pg_dist_node, scoping it only to modifications on the replicated
* tables. However, we currently do not have any such mechanism and
* given that citus_disable_node() runs instantly, it seems acceptable
* to block reads (or modifications on non-replicated tables) for
* a while.
*/
/* only superuser can disable node */
Assert(superuser());
SendCommandToWorkersWithMetadata(
"LOCK TABLE pg_catalog.pg_dist_node IN EXCLUSIVE MODE;");
}
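/*
 * A hedged note, assuming standard PostgreSQL lock conflict rules: EXCLUSIVE
 * conflicts with every lock mode except ACCESS SHARE, so plain SELECTs from
 * pg_dist_node still proceed while writers queue up behind the lock:
 *
 *   SELECT * FROM pg_catalog.pg_dist_node;                      -- allowed
 *   LOCK TABLE pg_catalog.pg_dist_node IN ROW EXCLUSIVE MODE;   -- blocked
 */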
/*
* master_disable_node is a wrapper function for old UDF name.
*/
@@ -693,8 +746,6 @@ PgDistTableMetadataSyncCommandList(void)
metadataSnapshotCommandList = list_concat(metadataSnapshotCommandList,
colocationGroupSyncCommandList);
/* As the last step, propagate the pg_dist_object entities */
- Assert(ShouldPropagate());
List *distributedObjectSyncCommandList = DistributedObjectMetadataSyncCommandList();
metadataSnapshotCommandList = list_concat(metadataSnapshotCommandList,
distributedObjectSyncCommandList);
@@ -790,7 +841,7 @@ SyncDistributedObjectsCommandList(WorkerNode *workerNode)
/*
- * SyncDistributedObjectsToNode sync the distributed objects to the node. It includes
* SyncDistributedObjectsToNodeList syncs the distributed objects to the given nodes. It includes
* - All dependencies (e.g., types, schemas, sequences)
* - All shell distributed tables
* - Inter-relations between those shell tables
@@ -799,17 +850,29 @@ SyncDistributedObjectsCommandList(WorkerNode *workerNode)
* since all the dependencies should be present in the coordinator already.
*/
static void
- SyncDistributedObjectsToNode(WorkerNode *workerNode)
SyncDistributedObjectsToNodeList(List *workerNodeList)
{
- if (NodeIsCoordinator(workerNode))
List *workerNodesToSync = NIL;
WorkerNode *workerNode = NULL;
foreach_ptr(workerNode, workerNodeList)
{
- /* coordinator has all the objects */
- return;
if (NodeIsCoordinator(workerNode))
{
/* coordinator has all the objects */
continue;
}
if (!NodeIsPrimary(workerNode))
{
/* secondary nodes get the objects from their primaries via replication */
continue;
}
workerNodesToSync = lappend(workerNodesToSync, workerNode);
}
- if (!NodeIsPrimary(workerNode))
if (workerNodesToSync == NIL)
{
- /* secondary nodes gets the objects from their primaries via replication */
return;
}
@@ -821,9 +884,8 @@ SyncDistributedObjectsToNode(WorkerNode *workerNode)
/* send commands to new workers, the current user should be a superuser */
Assert(superuser());
- SendMetadataCommandListToWorkerInCoordinatedTransaction(
- workerNode->workerName,
- workerNode->workerPort,
SendMetadataCommandListToWorkerListInCoordinatedTransaction(
workerNodesToSync,
CurrentUserName(),
commandList);
}
@@ -841,9 +903,8 @@ UpdateLocalGroupIdOnNode(WorkerNode *workerNode)
/* send commands to new workers, the current user should be a superuser */
Assert(superuser());
- SendMetadataCommandListToWorkerInCoordinatedTransaction(
- workerNode->workerName,
- workerNode->workerPort,
SendMetadataCommandListToWorkerListInCoordinatedTransaction(
list_make1(workerNode),
CurrentUserName(),
commandList);
}
@@ -851,25 +912,36 @@ UpdateLocalGroupIdOnNode(WorkerNode *workerNode)
/*
- * SyncPgDistTableMetadataToNode syncs the pg_dist_partition, pg_dist_shard
* SyncPgDistTableMetadataToNodeList syncs the pg_dist_partition, pg_dist_shard
* pg_dist_placement and pg_dist_object metadata entries.
*
*/
static void
- SyncPgDistTableMetadataToNode(WorkerNode *workerNode)
SyncPgDistTableMetadataToNodeList(List *nodeList)
{
- if (NodeIsPrimary(workerNode) && !NodeIsCoordinator(workerNode))
- {
- List *syncPgDistMetadataCommandList = PgDistTableMetadataSyncCommandList();
- /* send commands to new workers, the current user should be a superuser */
- Assert(superuser());
/* send commands to new workers, the current user should be a superuser */
Assert(superuser());
- SendMetadataCommandListToWorkerInCoordinatedTransaction(
- workerNode->workerName,
- workerNode->workerPort,
- CurrentUserName(),
- syncPgDistMetadataCommandList);
List *nodesWithMetadata = NIL;
WorkerNode *workerNode = NULL;
foreach_ptr(workerNode, nodeList)
{
if (NodeIsPrimary(workerNode) && !NodeIsCoordinator(workerNode))
{
nodesWithMetadata = lappend(nodesWithMetadata, workerNode);
}
}
if (nodesWithMetadata == NIL)
{
return;
}
List *syncPgDistMetadataCommandList = PgDistTableMetadataSyncCommandList();
SendMetadataCommandListToWorkerListInCoordinatedTransaction(
nodesWithMetadata,
CurrentUserName(),
syncPgDistMetadataCommandList);
}
@@ -1065,15 +1137,14 @@ PrimaryNodeForGroup(int32 groupId, bool *groupContainsNodes)
/*
- * ActivateNode activates the node with nodeName and nodePort. Currently, activation
- * includes only replicating the reference tables and setting isactive column of the
- * given node.
* ActivateNodeList iterates over the nodeList and activates the nodes.
* Some part of the node activation is done in parallel across the nodes,
* such as syncing the metadata. However, reference table replication is
* done one by one across nodes.
*/
- int
- ActivateNode(char *nodeName, int nodePort)
void
ActivateNodeList(List *nodeList)
{
- bool isActive = true;
/*
* We currently require the object propagation to happen via superuser,
* see #5139. While activating a node, we sync both metadata and object
@@ -1090,86 +1161,130 @@ ActivateNode(char *nodeName, int nodePort)
/* take an exclusive lock on pg_dist_node to serialize pg_dist_node changes */
LockRelationOid(DistNodeRelationId(), ExclusiveLock);
- /*
- * First, locally mark the node is active, if everything goes well,
- * we are going to sync this information to all the metadata nodes.
- */
- WorkerNode *workerNode = FindWorkerNodeAnyCluster(nodeName, nodePort);
- if (workerNode == NULL)
- {
- ereport(ERROR, (errmsg("node at \"%s:%u\" does not exist", nodeName, nodePort)));
- }
- /*
- * Delete existing reference and replicated table placements on the
- * given groupId if the group has been disabled earlier (e.g., isActive
- * set to false).
- *
- * Sync the metadata changes to all existing metadata nodes irrespective
- * of the current nodes' metadata sync state. We expect all nodes up
- * and running when another node is activated.
- */
- if (!workerNode->isActive && NodeIsPrimary(workerNode))
- {
- bool localOnly = false;
- DeleteAllReplicatedTablePlacementsFromNodeGroup(workerNode->groupId,
- localOnly);
- }
- workerNode =
- SetWorkerColumnLocalOnly(workerNode, Anum_pg_dist_node_isactive,
- BoolGetDatum(isActive));
- /* TODO: Once all tests will be enabled for MX, we can remove sync by default check */
- bool syncMetadata = EnableMetadataSync && NodeIsPrimary(workerNode);
- if (syncMetadata)
List *nodeToSyncMetadata = NIL;
WorkerNode *node = NULL;
foreach_ptr(node, nodeList)
{
/*
- * We are going to sync the metadata anyway in this transaction, so do
- * not fail just because the current metadata is not synced.
* First, locally mark the node is active, if everything goes well,
* we are going to sync this information to all the metadata nodes.
*/
- SetWorkerColumn(workerNode, Anum_pg_dist_node_metadatasynced,
- BoolGetDatum(true));
- /*
- * Update local group id first, as object dependency logic requires to have
- * updated local group id.
- */
- UpdateLocalGroupIdOnNode(workerNode);
- /*
- * Sync distributed objects first. We must sync distributed objects before
- * replicating reference tables to the remote node, as reference tables may
- * need such objects.
- */
- SyncDistributedObjectsToNode(workerNode);
- /*
- * We need to replicate reference tables before syncing node metadata, otherwise
- * reference table replication logic would try to get lock on the new node before
- * having the shard placement on it
- */
- if (ReplicateReferenceTablesOnActivate)
WorkerNode *workerNode =
FindWorkerNodeAnyCluster(node->workerName, node->workerPort);
if (workerNode == NULL)
{
- ReplicateAllReferenceTablesToNode(workerNode);
ereport(ERROR, (errmsg("node at \"%s:%u\" does not exist", node->workerName,
node->workerPort)));
}
- /*
- * Sync node metadata. We must sync node metadata before syncing table
- * related pg_dist_xxx metadata. Since table related metadata requires
- * to have right pg_dist_node entries.
- */
- SyncNodeMetadataToNode(nodeName, nodePort);
/* both nodes should be the same */
Assert(workerNode->nodeId == node->nodeId);
/*
- * As the last step, sync the table related metadata to the remote node.
- * We must handle it as the last step because of limitations shared with
- * above comments.
* Delete existing reference and replicated table placements on the
* given groupId if the group has been disabled earlier (e.g., isActive
* set to false).
*
* Sync the metadata changes to all existing metadata nodes irrespective
* of the current nodes' metadata sync state. We expect all nodes up
* and running when another node is activated.
*/
- SyncPgDistTableMetadataToNode(workerNode);
if (!workerNode->isActive && NodeIsPrimary(workerNode))
{
bool localOnly = false;
DeleteAllReplicatedTablePlacementsFromNodeGroup(workerNode->groupId,
localOnly);
}
workerNode =
SetWorkerColumnLocalOnly(workerNode, Anum_pg_dist_node_isactive,
BoolGetDatum(true));
/* TODO: Once all tests will be enabled for MX, we can remove sync by default check */
bool syncMetadata = EnableMetadataSync && NodeIsPrimary(workerNode);
if (syncMetadata)
{
/*
* We are going to sync the metadata anyway in this transaction, so do
* not fail just because the current metadata is not synced.
*/
SetWorkerColumn(workerNode, Anum_pg_dist_node_metadatasynced,
BoolGetDatum(true));
/*
* Update local group id first, as object dependency logic requires to have
* updated local group id.
*/
UpdateLocalGroupIdOnNode(workerNode);
nodeToSyncMetadata = lappend(nodeToSyncMetadata, workerNode);
}
}
/*
* Sync distributed objects first. We must sync distributed objects before
* replicating reference tables to the remote node, as reference tables may
* need such objects.
*/
SyncDistributedObjectsToNodeList(nodeToSyncMetadata);
if (ReplicateReferenceTablesOnActivate)
{
foreach_ptr(node, nodeList)
{
/*
* We need to replicate reference tables before syncing node metadata, otherwise
* reference table replication logic would try to get lock on the new node before
* having the shard placement on it
*/
if (NodeIsPrimary(node))
{
ReplicateAllReferenceTablesToNode(node);
}
}
}
/*
* Sync node metadata. We must sync node metadata before syncing table
* related pg_dist_xxx metadata, since the table related metadata requires
* the right pg_dist_node entries to be in place.
*/
foreach_ptr(node, nodeToSyncMetadata)
{
SyncNodeMetadataToNode(node->workerName, node->workerPort);
}
/*
* As the last step, sync the table related metadata to the remote node.
* We must handle it as the last step because of limitations shared with
* above comments.
*/
SyncPgDistTableMetadataToNodeList(nodeToSyncMetadata);
foreach_ptr(node, nodeList)
{
bool isActive = true;
/* finally, let all other active metadata nodes learn about this change */
SetNodeState(node->workerName, node->workerPort, isActive);
}
}
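/*
 * A hedged usage sketch: activating several nodes through one call sends the
 * metadata command lists to all of them within a single coordinated
 * transaction (node names and ports are illustrative):
 *
 *   List *nodes = list_make2(ModifiableWorkerNode("worker-1", 5432),
 *                            ModifiableWorkerNode("worker-2", 5432));
 *   ActivateNodeList(nodes);
 */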
/*
* ActivateNode activates the node with nodeName and nodePort. Currently, activation
* includes only replicating the reference tables and setting isactive column of the
* given node.
*/
int
ActivateNode(char *nodeName, int nodePort)
{
bool isActive = true;
WorkerNode *workerNode = ModifiableWorkerNode(nodeName, nodePort);
ActivateNodeList(list_make1(workerNode));
/* finally, let all other active metadata nodes learn about this change */
WorkerNode *newWorkerNode = SetNodeState(nodeName, nodePort, isActive);
Assert(newWorkerNode->nodeId == workerNode->nodeId);
@@ -1319,9 +1434,9 @@ citus_update_node(PG_FUNCTION_ARGS)
* early, but that's fine, since this will start a retry loop with
* 5 second intervals until sync is complete.
*/
- if (UnsetMetadataSyncedForAll())
if (UnsetMetadataSyncedForAllWorkers())
{
- TriggerMetadataSyncOnCommit();
TriggerNodeMetadataSyncOnCommit();
}
if (handle != NULL)
@@ -1558,6 +1673,29 @@ citus_coordinator_nodeid(PG_FUNCTION_ARGS)
}
/*
* citus_is_coordinator returns whether the current node is a coordinator.
* We consider the node a coordinator if its group ID is 0 and it has
* pg_dist_node entries (a node with group ID 0 but no pg_dist_node entries
* could merely be a worker that has not received metadata).
*/
Datum
citus_is_coordinator(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
bool isCoordinator = false;
if (GetLocalGroupId() == COORDINATOR_GROUP_ID &&
ActiveReadableNodeCount() > 0)
{
isCoordinator = true;
}
PG_RETURN_BOOL(isCoordinator);
}
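/*
 * A hedged usage sketch; on a worker with metadata the local group ID is
 * non-zero, so the check above returns false there:
 *
 *   SELECT citus_is_coordinator();
 */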
/*
* FindWorkerNode searches over the worker nodes and returns the workerNode
* if it already exists. Else, the function returns NULL.
@@ -1761,12 +1899,15 @@ RemoveNodeFromCluster(char *nodeName, int32 nodePort)
RemoveOldShardPlacementForNodeGroup(workerNode->groupId);
- char *nodeDeleteCommand = NodeDeleteCommand(workerNode->nodeId);
/* make sure we don't have any lingering session lifespan connections */
CloseNodeConnectionsAfterTransaction(workerNode->workerName, nodePort);
- SendCommandToWorkersWithMetadata(nodeDeleteCommand);
if (EnableMetadataSync)
{
char *nodeDeleteCommand = NodeDeleteCommand(workerNode->nodeId);
SendCommandToWorkersWithMetadata(nodeDeleteCommand);
}
}
@@ -2017,18 +2158,21 @@ AddNodeMetadata(char *nodeName, int32 nodePort,
workerNode = FindWorkerNodeAnyCluster(nodeName, nodePort);
- /* send the delete command to all primary nodes with metadata */
- char *nodeDeleteCommand = NodeDeleteCommand(workerNode->nodeId);
- SendCommandToWorkersWithMetadata(nodeDeleteCommand);
- /* finally prepare the insert command and send it to all primary nodes */
- uint32 primariesWithMetadata = CountPrimariesWithMetadata();
- if (primariesWithMetadata != 0)
if (EnableMetadataSync)
{
- List *workerNodeList = list_make1(workerNode);
- char *nodeInsertCommand = NodeListInsertCommand(workerNodeList);
/* send the delete command to all primary nodes with metadata */
char *nodeDeleteCommand = NodeDeleteCommand(workerNode->nodeId);
SendCommandToWorkersWithMetadata(nodeDeleteCommand);
- SendCommandToWorkersWithMetadata(nodeInsertCommand);
/* finally prepare the insert command and send it to all primary nodes */
uint32 primariesWithMetadata = CountPrimariesWithMetadata();
if (primariesWithMetadata != 0)
{
List *workerNodeList = list_make1(workerNode);
char *nodeInsertCommand = NodeListInsertCommand(workerNodeList);
SendCommandToWorkersWithMetadata(nodeInsertCommand);
}
}
return workerNode->nodeId;
@@ -2047,11 +2191,13 @@ SetWorkerColumn(WorkerNode *workerNode, int columnIndex, Datum value)
{
workerNode = SetWorkerColumnLocalOnly(workerNode, columnIndex, value);
- char *metadataSyncCommand = GetMetadataSyncCommandToSetNodeColumn(workerNode,
- columnIndex,
- value);
if (EnableMetadataSync)
{
char *metadataSyncCommand =
GetMetadataSyncCommandToSetNodeColumn(workerNode, columnIndex, value);
- SendCommandToWorkersWithMetadata(metadataSyncCommand);
SendCommandToWorkersWithMetadata(metadataSyncCommand);
}
return workerNode;
}
@@ -2513,9 +2659,16 @@ DeleteNodeRow(char *nodeName, int32 nodePort)
/*
* simple_heap_delete() expects that the caller has at least an
- * AccessShareLock on replica identity index.
* AccessShareLock on primary key index.
*
* XXX: This does not seem required, do we really need to acquire this lock?
* Postgres doesn't acquire such locks on indexes before deleting catalog tuples.
* Linking here the reasons we added this lock acquisition:
* https://github.com/citusdata/citus/pull/2851#discussion_r306569462
* https://github.com/citusdata/citus/pull/2855#discussion_r313628554
* https://github.com/citusdata/citus/issues/1890
*/
- Relation replicaIndex = index_open(RelationGetReplicaIndex(pgDistNode),
Relation replicaIndex = index_open(RelationGetPrimaryKeyIndex(pgDistNode),
AccessShareLock);
ScanKeyInit(&scanKey[0], Anum_pg_dist_node_nodename,
@@ -2646,15 +2799,15 @@ DatumToString(Datum datum, Oid dataType)
/*
- * UnsetMetadataSyncedForAll sets the metadatasynced column of all metadata
- * nodes to false. It returns true if it updated at least a node.
* UnsetMetadataSyncedForAllWorkers sets the metadatasynced column of all metadata
* worker nodes to false. It returns true if it updated at least one node.
*/
static bool
- UnsetMetadataSyncedForAll(void)
UnsetMetadataSyncedForAllWorkers(void)
{
bool updatedAtLeastOne = false;
- ScanKeyData scanKey[2];
- int scanKeyCount = 2;
ScanKeyData scanKey[3];
int scanKeyCount = 3;
bool indexOK = false;
/*
@@ -2669,6 +2822,11 @@ UnsetMetadataSyncedForAll(void)
ScanKeyInit(&scanKey[1], Anum_pg_dist_node_metadatasynced,
BTEqualStrategyNumber, F_BOOLEQ, BoolGetDatum(true));
/* coordinator always has up-to-date metadata */
ScanKeyInit(&scanKey[2], Anum_pg_dist_node_groupid,
BTGreaterStrategyNumber, F_INT4GT,
Int32GetDatum(COORDINATOR_GROUP_ID));
CatalogIndexState indstate = CatalogOpenIndexes(relation);
SysScanDesc scanDescriptor = systable_beginscan(relation,

Some files were not shown because too many files have changed in this diff.