This patch includes the username in the reported error message.
This makes debugging easier when certain commands open connections
as other users than the user that is executing the command.
```
monitora_snapshot=# SELECT citus_move_shard_placement(102030, 'monitora.db-dev-worker-a', 6005, 'monitora.db-dev-worker-a', 6017);
ERROR: connection to the remote node monitora_user@monitora.db-dev-worker-a:6017 failed with the following error: fe_sendauth: no password supplied
Time: 40,198 ms
```
(cherry picked from commit 8b48d6ab02)
Test isolation_update_node fails on some systems with the following error:
```
-s2: WARNING: connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Name or service not known
+s2: WARNING: connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Temporary failure in name resolution
```
This slightly modifies an already existing [normalization
rule](739c6d26df/src/test/regress/bin/normalize.sed (L217-L218))
to fix it.
Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
(cherry picked from commit 21464adfec)
This PR changes the order in which the locks are acquired (for the
target and reference tables), when a modify request is initiated from a
worker node that is not the "FirstWorkerNode".
To prevent concurrent writes, locks are acquired on the first worker
node for the replicated tables. When the update statement originates
from the first worker node, it acquires the lock on the reference
table(s) first, followed by the target table(s). However, if the update
statement is initiated in another worker node, the lock requests are
sent to the first worker in a different order. This PR unifies the
modification order on the first worker node. With the third commit,
independent of the node that received the request, the locks are
acquired for the modified table and then the reference tables on the
first node.
The first commit shows a sample output for the test prior to the fix.
Fixes#7477
---------
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
(cherry picked from commit 8afa2d0386)
And when that is the case, directly use it as "host" parameter for the
connections between nodes and use the "hostname" provided in
pg_dist_node / pg_dist_poolinfo as "hostaddr" to avoid host name lookup.
This is to avoid allowing dns resolution (and / or setting up DNS names
for each host in the cluster). This already works currently when using
IPs in the hostname. The only use of setting host is that you can then
use sslmode=verify-full and it will validate that the hostname matches
the certificate provided by the node you're connecting too.
It would be more flexible to make this a per-node setting, but that
requires SQL changes. And we'd like to backport this change, and
backporting such a sql change would be quite hard while backporting this
change would be very easy. And in many setups, a different hostname for
TLS validation is actually not needed. The reason for that is
query-from-any node: With query-from-any-node all nodes usually have a
certificate that is valid for the same "cluster hostname", either using
a wildcard cert or a Subject Alternative Name (SAN). Because if you load
balance across nodes you don't know which node you're connecting to, but
you still want TLS validation to do it's job. So with this change you
can use this same "cluster hostname" for TLS validation within the
cluster. Obviously this means you don't validate that you're connecting
to a particular node, just that you're connecting to one of the nodes in
the cluster, but that should be fine from a security perspective (in
most cases).
Note to self: This change requires updating
https://docs.citusdata.com/en/latest/develop/api_guc.html#citus-node-conninfo-text.
DESCRIPTION: Allows overwriting host name for all inter-node connections
by supporting "host" parameter in citus.node_conninfo
(cherry picked from commit 3586aab17a)
When using a CASE WHEN expression in the body
of the function that is used in the DO block, a segmentation
fault occured. This fixes that.
Fixes#7381
---------
Co-authored-by: Konstantin Morozov <vzbdryn@yahoo.com>
(cherry picked from commit 12f56438fc)
DESCRIPTION: Fixes a crash caused by some form of ALTER TABLE ADD COLUMN
statements. When adding multiple columns, if one of the ADD COLUMN
statements contains a FOREIGN constraint ommitting the referenced
columns in the statement, a SEGFAULT occurs.
For instance, the following statement results in a crash:
```
ALTER TABLE lt ADD COLUMN new_col1 bool,
ADD COLUMN new_col2 int references rt;
```
Fixes#7520.
(cherry picked from commit fdd658acec)
Fix check-arbitrary-configs tests failure with current REL_16_STABLE.
This is the same problem as described in #7573. I missed pg_regress call
in _run_pg_regress() in that PR.
Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
(cherry picked from commit 41e2af8ff5)
In PostgreSQL 16 a new option expecteddir was introduced to pg_regress.
Together with fix in
[196eeb6b](https://github.com/postgres/postgres/commit/196eeb6b) it
causes check-vanilla failure if expecteddir is not specified.
Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
(cherry picked from commit 41d99249d9)
DESCRIPTION: Adds ALTER DATABASE WITH ... and REFRESH COLLATION VERSION
support
This PR adds supports for basic ALTER DATABASE statements propagation
support. Below statements are supported:
ALTER DATABASE <database_name> with IS_TEMPLATE <true/false>;
ALTER DATABASE <database_name> with CONNECTION LIMIT <integer_value>;
ALTER DATABASE <database_name> REFRESH COLLATION VERSION;
---------
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
We currently don't support propagating these options in Citus
Relevant PG commits:
https://github.com/postgres/postgres/commit/e3ce2dehttps://github.com/postgres/postgres/commit/3d14e17
Limitation:
We also need to take care of generated GRANT statements by dependencies
in attempt to distribute something else. Specifically, this part of the
code in `GenerateGrantRoleStmtsOfRole`:
```
grantRoleStmt->admin_opt = membership->admin_option;
```
In PG16, membership also has `inherit_option` and `set_option` which
need to properly be part of the `grantRoleStmt`. We can skip for now
since #7164 will take care of this soon, and also this is not an
expected use-case.
Add citus_schema_move() that can be used to move tenant tables within a distributed
schema to another node. The function has two variations as simple wrappers around
citus_move_shard_placement() and citus_move_shard_placement_with_nodeid() respectively.
They pick a shard that belongs to the given tenant schema and resolve the source node
that contain the shards under given tenant schema. Hence their signatures are quite
similar to underlying functions:
```sql
-- citus_schema_move(), using target node name and node port
CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move(
schema_id regnamespace,
target_node_name text,
target_node_port integer,
shard_transfer_mode citus.shard_transfer_mode default 'auto')
RETURNS void
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_schema_move$$;
-- citus_schema_move(), using target node id
CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move(
schema_id regnamespace,
target_node_id integer,
shard_transfer_mode citus.shard_transfer_mode default 'auto')
RETURNS void
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_schema_move_with_nodeid$$;
```
Since in PG16, truncate triggers are supported on foreign tables, we add
the citus_truncate_trigger to Citus foreign tables as well, such that the TRUNCATE
command is propagated to the table's single local shard as well.
Note that TRUNCATE command was working for foreign tables even before this
commit: see https://github.com/citusdata/citus/pull/7170#issuecomment-1706240593 for details
This commit also adds tests with user-enabled truncate triggers on Citus foreign tables:
both trigger on the shell table and on its single foreign local shard.
Relevant PG commit:
https://github.com/postgres/postgres/commit/3b00a94
**Problem:**
Previously we always used an outside superuser connection to overcome
permission issues for the current user while propagating dependencies.
That has mainly 2 problems:
1. Visibility issues during dependency propagation, (metadata connection
propagates some objects like a schema, and outside transaction does not
see it and tries to create it again)
2. Security issues (it is preferrable to use current user's connection
instead of extension superuser)
**Solution (high level):**
Now, we try to make a smarter decision on whether should we use an
outside superuser connection or current user's metadata connection. We
prefer using current user's connection if any of the objects, which is
already propagated in the current transaction, is a dependency for a
target object. We do that since we assume if current user has
permissions to create the dependency, then it can most probably
propagate the target as well.
Our assumption is expected to hold most of the times but it can still be
wrong. In those cases, transaction would fail and user should set the
GUC `citus.create_object_propagation` to `deferred` to work around it.
**Solution:**
1. We track all objects propagated in the current transaction (we can
handle subtransactions),
2. We propagate dependencies via the current user's metadata connection
if any dependency is created in the current transaction to address
issues listed above. Otherwise, we still use an outside superuser
connection.
DESCRIPTION: Fixes some object propagation errors seen with transaction
blocks.
Fixes https://github.com/citusdata/citus/issues/6614
---------
Co-authored-by: Nils Dijk <nils@citusdata.com>
For a database that does not create the citus extension by running
` CREATE EXTENSION citus;`
`CitusHasBeenLoaded ` function ends up querying the `pg_extension` table
every time it is invoked. This is not an ideal situation for a such a
database.
The idea in this PR is as follows:
### A new field in MetadataCache.
Add a new variable `extensionCreatedState `of the following type:
```
typedef enum ExtensionCreatedState
{
UNKNOWN = 0,
CREATED = 1,
NOTCREATED = 2,
} ExtensionCreatedState;
```
When the MetadataCache is invalidated, `ExtensionCreatedState` will be
set to UNKNOWN.
### Invalidate MetadataCache when CREATE/DROP/ALTER EXTENSION citus
commands are run.
- Register a callback function, named
`InvalidateDistRelationCacheCallback`, for relcache invalidation during
the shared library initialization for `citus.so`. This callback function
is invoked in all the backends whenever the relcache is invalidated in
one of the backends. (This could be caused many DDLs operations).
- In the cache invalidation callback,`
InvalidateDistRelationCacheCallback`, invalidate `MetadataCache` zeroing
it out.
- In `CitusHasBeenLoaded`, perform the costly citus is loaded check only
if the `MetadataCache` is not valid.
### Downsides
Any relcache invalidation (caused by various DDL operations) will case
Citus MetadataCache to get invalidated. Most of the time it will be
unnecessary. But we rely on that DDL operations on relations will not be
too frequent.
1. Adds an `sql_row` function, for when a query returns a single row
with multiple columns.
2. Include a `notice_handler` for easier debugging
3. Retry dropping replication slots when they are "in use", this is
often an ephemeral state and can cause flaky tests
In PG16, REINDEX DATABASE/SYSTEM name is optional.
We already don't propagate these commands automatically.
Testing here with run_command_on_workers.
Relevant PG commit:
https://github.com/postgres/postgres/commit/2cbc3c1
When we create a database, it already needs to be manually created in
the workers as well.
This new icu_rules option should work as the other options as well.
Added a test for that.
Relevant PG commit:
https://github.com/postgres/postgres/commit/30a53b7
DESCRIPTION: Presenting citus_pause_node UDF enabling pausing by
node_id.
citus_pause_node takes a node_id parameter and fetches all the shards in
that node and puts AccessExclusiveLock on all the shards inside that
node. With this lock, insert is disabled, until citus_pause_node
transaction is closed.
---------
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
DESCRIPTION: Adds grant/revoke propagation support for database
privileges
Following the implementation of support for granting and revoking
database privileges, certain tests that issued grants for worker nodes
experienced failures. These ones are fixed in this PR as well.
DESCRIPTION: Adds PG16Beta3 support
This is the final commit that adds
PG16 compatibility with Citus's current features.
You can use Citus community with PG16Beta3. This commit:
- Enables PG16 in the configure script.
- Adds PG16 tests to CI using test images that have 16beta3
- Skips wal2json cdc test since wal2json package is not available for PG16 yet
- Fixes an isolation test
Several PG16 Compatibility commits have been merged before this final one.
All these subtasks are done https://github.com/citusdata/citus/issues/7017
See the list below:
1 - 42d956888d
Resolve compilation issues
2 - 0d503dd5ac
Ruleutils and successful CREATE EXTENSION
3 - 907d72e60d
Some test outputs
4 - 7c6b4ce103
Outer join checks, subscription password, crash fixes
5 - 6056cb2c29
get_relation_info hook to avoid crash from adjusted partitioning
6 - b36c431abb
Rework PlannedStmt and Query's Permission Info
7 - ee3153fe50
More test output fixes
8 - 2c50b5f7ff
varnullingrels additions
9 - b2291374b4
More test output fixes
10- a2315fdc67
New options to vacuum and analyze
11- 9fa72545e2
Fix AM dependency and grant's admin option
12- 2d6cf8e79a
One more outer join check
Stay tuned for PG16 new features in Citus :)
Postgres got minor updates on Aug10, this commit starts using the
images with the latest version for our tests, namely 14.9 and 15.4.
Depends on https://github.com/citusdata/the-process/pull/147
For CI images, we needed to regenerate Pipfile.lock, mainly because of an issue
with pyyaml version: https://github.com/yaml/pyyaml/issues/601
We also needed to remove a failing test in subquery_local_tables.sql.
Relevant PG commit:
b0e390e6d1
b0e390e6d1d68b92e9983840941f8f6d9e083fe0
Issue: https://github.com/citusdata/citus/issues/7119
For joins where consider_join_pushdown is false, we cannot get the
information that we used to get, which prevents doing the distributed planning.
Team already contacted PG committers for this.
Until then, we remove the test from the schedule.
PG16 compatibility - part 10
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
part 8 2c50b5f7ff
part 9 b2291374b4
This commit is in the series of PG16 compatibility commits. It:
- Adds buffer_usage_limit to vacuum and analyze
- Adds process_main, skip_database_stats, only_database_stats to vacuum
Important Note: adding these options is actually required for check-vanilla tests to succeed.
However, in concept, this PR belongs to "PG16 new features",
rather than "PG16 regression tests sanity"
Relevant PG commits:
1cbbee0338
1cbbee03385763b066ae3961fc61f2cd01a0d0d7
4211fbd841
4211fbd8413b26e0abedbe4338aa7cda2cd469b4
a46a7011b2
a46a7011b27188af526047a111969f257aaf4db8
More PG16 compatibility commits are coming soon ...
PG16 compatibility - part 9
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
part 8 2c50b5f7ff
This commit is in the series of PG16 compatibility commits. It makes some changes
to our tests in order to be compatible with the following in PG16:
- Fix multi_subquery_in_where_reference_clause test
somehow PG got rid of the outer join
(e.g., explain doesn't show outer joins),
hence we can pushdown the subquery.
Changing to users_reference_table
- Fix unqualified column names for views in PG16
Relevant PG commit:
47bb9db759
47bb9db75996232ea71fc1e1888ffb0e70579b54
- Fix global_cancel test
Error wording and detail changed
Relevant PG commit:
2631ebab7b
2631ebab7b18bdc079fd86107c47d6104a6b3c6e
- Fix local_table_join_test with lateral subquery
Possible relevant PG commit:
ae89129aa3
ae89129aa3555c263b8c3ccc4c0f1ef7e46201aa
I removed the where clause and the limit count error was hit again.
With the where clause the query unexpectedly works.
- Fix test outputs
Relevant PG commits:
-- 1349d2790b
-- f4c7c410ee
For multi_explain and multi_complex_count_distinct there were too many places
touched so I just added an alternative test output.
For the other tests I modified the problematic parts.
More PG16 compatibility commits are coming soon ...
PG16 compatibility - part 7
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
This commit is in the series of PG16 compatibility commits. PG16 introduced a new entry
varnnullingrels to Var, which represents our partkey in pg_dist_partition.
This commit does the necessary changes in Citus to support this.
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d
More PG16 compatibility commits are coming soon ...
PG16 compatibility - part 7
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
This commit is in the series of PG16 compatibility commits. It makes some changes
to our tests in order to be compatible with the following in PG16:
- PG16 removed logic for converting a table to a view
Relevant PG commit:
b23cd185fd
b23cd185fd5410e5204683933f848d4583e34b35
- Fix changed error message in certificate verification
Relevant PG commit:
8eda731465
8eda7314652703a2ae30d6c4a69c378f6813a7f2
- Fix backend type order in tests
Relevant PG commit:
0c679464a8
0c679464a837079acc75ff1d45eaa83f79e05690
- Reduce log level to omit extra NOTICE in create collation in PG16
Relevant PG commit:
a14e75eb0b
a14e75eb0b6a73821e0d66c0d407372ec8376105
That commit made LOCALE parameter apply regardless of the
provider used, and it printed the following notice:
NOTICE: using standard form "und-u-ks-level2" for ICU locale "@colStrength=secondary"
We omit this notice to omit output change between pg versions.
- Fix columnar_memory test
TopMemoryContext now has more children contexts
Possible relevant PG commit:
9d3ebba729
9d3ebba729ebaf5882a92f0f5f662a3312037605
memusage is now around 8.5 MB, whereas it was less than 8MB before.
To avoid differences between PG versions, I changed the test to compare
to less than 9 MB. It still reflects very well the improvement from
28MB.
- Alternative test output for GRANTOR values in pg_auth_members
grantor changed in PG16
Relevant PG commit:
ce6b672e44
ce6b672e4455820a0348214be0da1a024c3f619f
- Remove redundant grouping columns from our tests
Relevant PG commit:
8d83a5d0a2
8d83a5d0a2673174dc478e707de1f502935391a5
- Fix tests with different order in Filters
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d
More PG16 compatibility commits are coming soon ...
PG16 compatibility - Part 4
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
This commit is in the series of PG16 compatibility commits.
It adds some outer join checks to the planner,
the new password_required option to the subscription,
and a crash fix related to PGIOAlignedBlock, see below for more details:
- Fix PGIOAlignedBlock Assert crash in PG16
Relevant PG commit:
faeedbcefd
faeedbcefd40bfdf314e048c425b6d9208896d90
- Pass planner info as argument to make_simple_restrictinfo
Pre PG16 passing plannerInfo to make_simple_restrictinfo
was only needed for placeholder Vars, which is not the case
in this part of the codebase because we are building the
expression from shard intervals which don't have placeholder
vars.
However, PG16 is counting baserels appearing in clause_relids
and is deleting the rels mentioned in plannerinfo->outer_join_rels
Hence directly accessing plannerinfo.
We will crash if we leave it as NULL.
For reference
2489d76c49 (diff-e045c41eda9686451a7993e91518e40056b3739365e39eb1b70ae438dc1f7c76R207)
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d
- Add outer join checks, root->simple_rel_array
- fix rebalancer to include passwork_required option
Relevant PG commit:
c3afe8cf5a
c3afe8cf5a1e465bd71e48e4bc717f5bfdc7a7d6
More PG16 compatibility commits are coming soon ...
PG16 compatibility - Part 3
Check out part 1 42d956888d
and part 2 0d503dd5ac
This commit is in the series of PG compatibility. It makes some changes
to our tests in order to be compatible with the following in PG16:
Use debug_parallel_query in PG16+, force_parallel_mode otherwise
Relevant PG commit
5352ca22e0
5352ca22e0012d48055453ca9992a9515d811291
HINT changed to DETAIL in PG16
Relevant PG commit:
56d0ed3b75
56d0ed3b756b2e3799a7bbc0ac89bc7657ca2c33
Fix removed read-only server setting lc_collate
Relevant PG commit:
b0f6c43716
b0f6c437160db640d4ea3e49398ebc3ba39d1982
Fix unsupported join alias expression in sqlancer_failures
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d
More PG16 compatibility commits are coming soon ...