PostgreSQL 17 seems to have introduced improvements in how correlated
subqueries are handled during plan generation. Instead of generating a
trivial subplan with WHERE true, it now applies more specific filtering
(WHERE (key = 5)), which makes the execution plan more efficient.
https://github.com/postgres/postgres/commit/b262ad44
```
diff -dU10 -w /__w/citus/citus/src/test/regress/expected/local_table_join.out /__w/citus/citus/src/test/regress/results/local_table_join.out
--- /__w/citus/citus/src/test/regress/expected/local_table_join.out.modified 2024-11-05 09:53:50.423970699 +0000
+++ /__w/citus/citus/src/test/regress/results/local_table_join.out.modified 2024-11-05 09:53:50.463971296 +0000
@@ -1420,32 +1420,32 @@
) as subq_1
) as subq_2;
DEBUG: Wrapping relation "custom_pg_type" to a subquery
DEBUG: generating subplan 204_1 for subquery SELECT typdefault FROM local_table_join.custom_pg_type WHERE true
ERROR: direct joins between distributed and local tables are not supported
HINT: Use CTE's or subqueries to select from local tables and use them in joins
-- correlated sublinks are not yet supported because of #4470, unless we convert not-correlated table
SELECT COUNT(*) FROM distributed_table d1 JOIN postgres_table using(key)
WHERE d1.key IN (SELECT key FROM distributed_table WHERE d1.key = key and key = 5);
DEBUG: Wrapping relation "postgres_table" to a subquery
-DEBUG: generating subplan XXX_1 for subquery SELECT key FROM local_table_join.postgres_table WHERE true
+DEBUG: generating subplan 206_1 for subquery SELECT key FROM local_table_join.postgres_table WHERE (key OPERATOR(pg_catalog.=) 5)
```
Co-authored-by: Naisila Puka <37271756+naisila@users.noreply.github.com>
PG17 added support for identity columns in partitioned tables:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=699586315
A consequence is that a table with an identity column cannot be attached
as a partition. But Citus on Postgres 17 will generate identity column
for the partitions if the parent table has one (or more) identity
columns when propagating distributed table DDL to worker nodes, as
happens in the `generated_identity` regress test in #7768:
```
CREATE TABLE partitioned_table (
a bigint CONSTRAINT myconname GENERATED BY DEFAULT AS IDENTITY (START WITH 10 INCREMENT BY 10),
b bigint GENERATED ALWAYS AS IDENTITY (START WITH 10 INCREMENT BY 10),
c int
)
PARTITION BY RANGE (c);
CREATE TABLE partitioned_table_1_50 PARTITION OF partitioned_table FOR VALUES FROM (1) TO (50);
CREATE TABLE partitioned_table_50_500 PARTITION OF partitioned_table FOR VALUES FROM (50) TO (1000);
SELECT create_distributed_table('partitioned_table', 'a');
- create_distributed_table
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR: table "partitioned_table_1_50" being attached contains an identity column "a"
+DETAIL: The new partition may not contain an identity column.
```
It is the Citus-generated ATTACH PARTITION statement that errors out,
because the Citus-generated CREATE TABLE for the partitions included
identity column definitions. The fix is straightforward - when
propagating the CREATE TABLE ddl for a partition of a table with an
identity column, don't include the identity column(s), they will be
inherited on attaching the partition. In Citus on Postgres 16 (or less)
partitions do not inherit identity; the partitions in the example would
not have any identity columns so it was not an issue previously.
This PR addresses regress tests impacted by the introduction of [the
MAINTAIN privilege in
PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ecb0fd337).
The impacted tests include `generated_identity`,
`create_single_shard_table`, `grant_on_sequence_propagation`,
`grant_on_foreign_server_propagation`, `single_node_enterprise`,
`multi_multiuser_master_protocol`,
`multi_alter_table_row_level_security`, `shard_move_constraints` which
show the following error:
```
SELECT start_metadata_sync_to_node('localhost', :worker_2_port);
- start_metadata_sync_to_node
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR: unrecognized aclright: 16384
```
and `multi_multiuser_master_protocol`, where the `pg_class.relacl`
column has 'm' for MAINTAIN if applicable:
```
relname | rolname | relacl
---------------------+-------------+------------------------------------------------------------
trivial_full_access | full_access |
- trivial_postgres | postgres | {postgres=arwdDxt/postgres,full_access=arwdDxt/postgres}
+ trivial_postgres | postgres | {postgres=arwdDxtm/postgres,full_access=arwdDxtm/postgres}
```
The PR updates function `convert_aclright_to_string()` in
citus_ruleutils.c to include a case for `ACL_MAINTAIN`. Per the comment
on `convert_aclright_to_string()` in citus_ruleutils.c, it is a copy of
`convert_aclright_to_string()` in Postgres (where it is in
`src/backend/utils/adt/acl.c`), so requires updating to be consistent
with Postgres. With this change Citus can recognize the MAINTAIN
privilege, and will not emit the `unrecognized aclright` error. The PR
also adds an alternative goldfile for `multi_multiuser_master_protocol`.
Note that `convert_aclright_to_string()` in Postgres includes access
types SET and ALTER SYSTEM on system parameters (aka GUCs), added by
[this PG16
commit](https://github.com/postgres/postgres/commit/a0ffa885e). If Citus
were to have a requirement to support granting SET and ALTER SYSTEM we
would need to update `convert_aclright_to_string()` in citus_ruleutils.c
with SET and ALTER SYSTEM.
Test `tableam` expects that this CREATE TABLE statement: `CREATE TABLE
test_partitioned(id int, p int, val int) PARTITION BY RANGE (p) USING
fake_am;`
will produce this error:
`specifying a table access method is not supported on a partitioned
table`
but as of [this PG
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=374c7a229)
it is possible to specify an access method on a partitioned table. This
fix moves the CREATE TABLE statement to pg17, and adds an additional
test to show parent access method is inherited.
Fix Test Failure in subquery_in_where, set_operations, dml_recursive in
PG17 #7741
The test failures are caused by[ this commit in
PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639),
which enables correlated subqueries to be pulled up to a join. Prior to
this, the correlated subquery was implemented as a subplan. In citus, it
is not possible to pushdown a correlated subplan, but with a different
plan in PG17 the query can be executed, per the test diff from
`subquery_in_where`:
```
37,39c37,41
< DEBUG: generating subplan XXX_1 for CTE event_id: SELECT user_id AS events_user_id, "time" AS events_time, event_type FROM public.events_table
< DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ...
< ERROR: correlated subqueries are not supported when the FROM clause contains a CTE or subquery
---
> count
> ---------------------------------------------------------------------
> 0
> (1 row)
>
```
This is because with pg17 `= ANY subquery` in the queries can be
implemented as a join, instead of as a subplan filter on a table scan.
For example, `SELECT * FROM test a WHERE x IN (SELECT x FROM test b
UNION SELECT y FROM test c WHERE a.x = c.x) ORDER BY 1,2` (from
set_operations) has this plan in pg17; note that the subquery is the
inner side of a nested loop join:
```
┌───────────────────────────────────────────────────┐
│ QUERY PLAN │
├───────────────────────────────────────────────────┤
│ Sort │
│ Sort Key: a.x, a.y │
│ -> Nested Loop │
│ -> Seq Scan on test a │
│ -> Subquery Scan on "ANY_subquery" │
│ Filter: (a.x = "ANY_subquery".x) │
│ -> HashAggregate │
│ Group Key: b.x │
│ -> Append │
│ -> Seq Scan on test b │
│ -> Seq Scan on test c │
│ Filter: (a.x = x) │
└───────────────────────────────────────────────────┘
```
and this plan in pg16 (and previous pg versions); the subquery is a
correlated subplan filter on a table scan:
```
┌───────────────────────────────────────────────┐
│ QUERY PLAN │
├───────────────────────────────────────────────┤
│ Sort │
│ Sort Key: a.x, a.y │
│ -> Seq Scan on test a │
│ Filter: (SubPlan 1) │
│ SubPlan 1 │
│ -> HashAggregate │
│ Group Key: b.x │
│ -> Append │
│ -> Seq Scan on test b │
│ -> Seq Scan on test c │
│ Filter: (a.x = x) │
└───────────────────────────────────────────────┘
```
The fix Modifies the queries causing the test failures so that an ANY
subquery is not folded to a join, preserving the expected output of the
tests. A similar approach was taken for existing regress tests in the[
postgres
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639).
See the `join `regress test, for example.
We also add pg17 specific tests that leverage this improvement in Postgres
with Citus distributed planning as well.