Commit Graph

13 Commits (8a8b2f9a6d1ce86be9171e74983ed14b18e52024)

Author SHA1 Message Date
Naisila Puka 8a8b2f9a6d
Add tests for inserting with AT LOCAL operator (#7815)
PG17 has added support for AT LOCAL operator
it converts the given time type to
time stamp with the session's TimeZone value as time zone. Here we add
tests that validate that we can use AT LOCAL at INSERT commands

Relevant PG commit:
https://github.com/postgres/postgres/commit/97957fdba

With the tests, we verify that we evaluate AT LOCAL at the coordinator
and then perform the insert remotely.
2024-12-30 12:54:21 +03:00
Mehmet YILMAZ 1a3316281c
Error out for ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION (#7814)
PG17 added support for
ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION.
Relevant PG commit: https://github.com/postgres/postgres/commit/5d06e99a3

We currently don't support propagating this command for Citus tables.
It is added to future work.

This PR disallows `ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION` on
all Citus table types (local, distributed, and partitioned distributed)
by adding an error check in `ErrorIfUnsupportedAlterTableStmt`. A new
regression test verifies that each table type fails with a consistent
error message when attempting to set an expression.
2024-12-27 16:02:12 +03:00
Mehmet YILMAZ 62a00b1b34
Error out for ALTER TABLE ... SET ACCESS METHOD DEFAULT (#7803)
PG17 introduced ALTER TABLE ... SET ACCESS METHOD DEFAULT

This PR introduces and enforces an error check preventing ALTER TABLE
... SET ACCESS METHOD DEFAULT on both Citus local tables (added via
citus_add_local_table_to_metadata) and distributed/partitioned
distributed tables. The regression tests now demonstrate that each table
type raises an error advising users to explicitly specify an access
method, rather than relying on DEFAULT. This ensures consistent behavior
across local and distributed environments in Citus.

The reason why we currently don't support this is that we can't simply
propagate the command as it is, because the default table access method
may be different across Citus cluster nodes.

Relevant PG commit:
https://github.com/postgres/postgres/commit/d61a6cad6
2024-12-27 15:07:38 +03:00
Naisila Puka b4cc7219ae
Add tests for FORCE_NULL * and FORCE_NOT_NULL * options for COPY FROM (#7812)
These options already existed in PG17, and we support them and have
tests for them in `multi_copy.sql`.

In PG17, their capability was extended to specify ALL columns at once
using *.
Citus performs the COPY correctly, as is validated by the added tests in
this PR.

Relevant PG commit:
https://github.com/postgres/postgres/commit/f6d4c9cf1

Copy-pasting from Postgres documentation what these options do, such
that the reviewer may better understand the tests added:

`FORCE_NOT_NULL`: Do not match the specified columns' values against the
null string. In the default case where the null string is empty, this
means that empty values will be read as zero-length strings rather than
nulls, even when they are not quoted. If * is specified, the option will
be applied to all columns. This option is allowed only in `COPY FROM`,
and only when using `CSV` format.

`FORCE_NULL`: Match the specified columns' values against the null
string, even if it has been quoted, and if a match is found set the
value to `NULL`. In the default case where the null string is empty,
this converts a quoted empty string into `NULL`. If * is specified, the
option will be applied to all columns. This option is allowed only in
`COPY FROM`, and only when using `CSV` format.

`FORCE_NULL` and `FORCE_NOT_NULL` can be used simultaneously on the same
column. This results in converting quoted null strings to null values
and unquoted null strings to empty strings.

Explain it to me like I'm a 5-year-old, for a text column:
`FORCE_NULL` looks for empty strings and registers them as `NULL`
`FORCE_NOT_NULL` looks for null values and registers them as empty
strings.
2024-12-26 16:52:42 +03:00
Naisila Puka d682bf06f7
Error for COPY FROM ... on_error, log_verbosity with Citus tables (#7811)
PG17 added the new ON_ERROR option for COPY FROM. When this option is
specified, COPY skips soft errors and
continues copying.
Relevant PG commits:
-- https://github.com/postgres/postgres/commit/9e2d87011
-- https://github.com/postgres/postgres/commit/b725b7eec

I tried it locally with Citus tables.
Without further implementation, it doesn't work correctly.
Therefore, we error out for now, and add it to future work.

PG17 also added log_verbosity option, which controls the
 amount of messages emitted during processing. This is
 currently used in COPY FROM when ON_ERROR option is set to
 ignore. Therefore, we error out for this option as well.
Relevant PG17 commit:
https://github.com/postgres/postgres/commit/f5a227895
2024-12-26 15:29:44 +03:00
Naisila Puka af88c37b56
PG17: ALTER INDEX ALTER COLUMN SET STATISTICS DEFAULT (#7808)
DESCRIPTION: Propagates ALTER INDEX ALTER COLUMN SET STATISTICS DEFAULT

We automatically support this. Adding tests only.

We currently don't support ALTER TABLE ALTER COLUMN SET STATISTICS

Relevant PG commit:
https://github.com/postgres/postgres/commit/4f622503d
2024-12-25 16:58:51 +03:00
Mehmet YILMAZ 351b1ca63c
PG17 compatibility: Fix Test Failure in multi_alter_table_add_const (#7733)
In earlier versions of PostgreSQL, exclusion constraints were not
allowed on partitioned tables. This is why the error in your regression
test (ERROR: exclusion constraints are not supported on partitioned
tables) was raised in PostgreSQL 16. In PostgreSQL 17, exclusion
constraints are now allowed on partitioned tables, which is why the
error no longer appears when you attempt to add an exclusion constraint.

The constraint exclusion mechanism, described in the documentation,
relies on CHECK constraints to decide which partitions or child tables
need to be queried.

[CHECK
constraints](https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION)

```diff
 -- Check "ADD EXCLUDE" errors out for partitioned table since the postgres does not allow it
 ALTER TABLE AT_AddConstNoName.citus_local_partitioned_table ADD EXCLUDE(partition_col WITH =);
-ERROR:  exclusion constraints are not supported on partitioned tables
 -- Check "ADD CHECK"
 SET client_min_messages TO DEBUG1;
 ALTER TABLE AT_AddConstNoName.citus_local_partitioned_table ADD CHECK (dist_col > 0);
 DEBUG:  the constraint name on the shards of the partition is too long, switching to sequential and local execution mode to prevent self deadlocks: longlonglonglonglonglonglonglonglonglonglonglo_537570f5_5_check
 DEBUG:  verifying table "longlonglonglonglonglonglonglonglonglonglonglonglonglonglongabc"
 DEBUG:  verifying table "p1"
 RESET client_min_messages;
 SELECT con.conname
     FROM pg_catalog.pg_constraint con
       INNER JOIN pg_catalog.pg_class rel ON rel.oid = con.conrelid
       INNER JOIN pg_catalog.pg_namespace nsp ON nsp.oid = connamespace
           WHERE rel.relname = 'citus_local_partitioned_table';
                      conname                      
 --------------------------------------------------
+ citus_local_partitioned_table_partition_col_excl
  citus_local_partitioned_table_check
-(1 row)
+(2 rows)
```
2024-12-23 15:24:46 +03:00
Naisila Puka f123632f85
Fix pg17 test (#7797)
Broken from this commit
e3db375149

https://github.com/citusdata/citus/actions/runs/12429202397/attempts/1#summary-34702334056
2024-12-20 20:13:48 +03:00
Mehmet YILMAZ e3db375149
PG17 compatibility: Fix Test Failure in local_table_join (#7732)
PostgreSQL 17 seems to have introduced improvements in how correlated
subqueries are handled during plan generation. Instead of generating a
trivial subplan with WHERE true, it now applies more specific filtering
(WHERE (key = 5)), which makes the execution plan more efficient.

https://github.com/postgres/postgres/commit/b262ad44


```
diff -dU10 -w /__w/citus/citus/src/test/regress/expected/local_table_join.out /__w/citus/citus/src/test/regress/results/local_table_join.out
--- /__w/citus/citus/src/test/regress/expected/local_table_join.out.modified	2024-11-05 09:53:50.423970699 +0000
+++ /__w/citus/citus/src/test/regress/results/local_table_join.out.modified	2024-11-05 09:53:50.463971296 +0000
@@ -1420,32 +1420,32 @@
   ) as subq_1
 ) as subq_2;
 DEBUG:  Wrapping relation "custom_pg_type" to a subquery
 DEBUG:  generating subplan 204_1 for subquery SELECT typdefault FROM local_table_join.custom_pg_type WHERE true
 ERROR:  direct joins between distributed and local tables are not supported
 HINT:  Use CTE's or subqueries to select from local tables and use them in joins
 -- correlated sublinks are not yet supported because of #4470, unless we convert not-correlated table
 SELECT COUNT(*) FROM distributed_table d1 JOIN postgres_table using(key)
 WHERE d1.key IN (SELECT key FROM distributed_table WHERE d1.key = key and key = 5);
 DEBUG:  Wrapping relation "postgres_table" to a subquery
-DEBUG:  generating subplan XXX_1 for subquery SELECT key FROM local_table_join.postgres_table WHERE true
+DEBUG:  generating subplan 206_1 for subquery SELECT key FROM local_table_join.postgres_table WHERE (key OPERATOR(pg_catalog.=) 5)
```

Co-authored-by: Naisila Puka <37271756+naisila@users.noreply.github.com>
2024-12-19 22:21:51 +03:00
Colm e91ee245ac
PG17 compatibility: account for identity columns in partitioned tables. (#7785)
PG17 added support for identity columns in partitioned tables:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=699586315
A consequence is that a table with an identity column cannot be attached
as a partition. But Citus on Postgres 17 will generate identity column
for the partitions if the parent table has one (or more) identity
columns when propagating distributed table DDL to worker nodes, as
happens in the `generated_identity` regress test in #7768:
```
 CREATE TABLE partitioned_table (
     a bigint CONSTRAINT myconname GENERATED BY DEFAULT AS IDENTITY (START WITH 10 INCREMENT BY 10),
     b bigint GENERATED ALWAYS AS IDENTITY (START WITH 10 INCREMENT BY 10),
     c int
 )
 PARTITION BY RANGE (c);
 CREATE TABLE partitioned_table_1_50 PARTITION OF partitioned_table FOR VALUES FROM (1) TO (50);
 CREATE TABLE partitioned_table_50_500 PARTITION OF partitioned_table FOR VALUES FROM (50) TO (1000);
 SELECT create_distributed_table('partitioned_table', 'a');
- create_distributed_table
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  table "partitioned_table_1_50" being attached contains an identity column "a"
+DETAIL:  The new partition may not contain an identity column.
```
It is the Citus-generated ATTACH PARTITION statement that errors out,
because the Citus-generated CREATE TABLE for the partitions included
identity column definitions. The fix is straightforward - when
propagating the CREATE TABLE ddl for a partition of a table with an
identity column, don't include the identity column(s), they will be
inherited on attaching the partition. In Citus on Postgres 16 (or less)
partitions do not inherit identity; the partitions in the example would
not have any identity columns so it was not an issue previously.
2024-12-18 13:18:53 +00:00
Colm 1c2e78405b
PG17 compatibility: account for MAINTAIN privilege in regress tests (#7774)
This PR addresses regress tests impacted by the introduction of [the
MAINTAIN privilege in
PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ecb0fd337).
The impacted tests include `generated_identity`,
`create_single_shard_table`, `grant_on_sequence_propagation`,
`grant_on_foreign_server_propagation`, `single_node_enterprise`,
`multi_multiuser_master_protocol`,
`multi_alter_table_row_level_security`, `shard_move_constraints` which
show the following error:
```
SELECT start_metadata_sync_to_node('localhost', :worker_2_port);
- start_metadata_sync_to_node
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  unrecognized aclright: 16384
```

and `multi_multiuser_master_protocol`, where the `pg_class.relacl`
column has 'm' for MAINTAIN if applicable:
```
        relname       |   rolname   |                           relacl                           
 ---------------------+-------------+------------------------------------------------------------
  trivial_full_access | full_access | 
- trivial_postgres    | postgres    | {postgres=arwdDxt/postgres,full_access=arwdDxt/postgres}
+ trivial_postgres    | postgres    | {postgres=arwdDxtm/postgres,full_access=arwdDxtm/postgres}
```

The PR updates function `convert_aclright_to_string()` in
citus_ruleutils.c to include a case for `ACL_MAINTAIN`. Per the comment
on `convert_aclright_to_string()` in citus_ruleutils.c, it is a copy of
`convert_aclright_to_string()` in Postgres (where it is in
`src/backend/utils/adt/acl.c`), so requires updating to be consistent
with Postgres. With this change Citus can recognize the MAINTAIN
privilege, and will not emit the `unrecognized aclright` error. The PR
also adds an alternative goldfile for `multi_multiuser_master_protocol`.

Note that `convert_aclright_to_string()` in Postgres includes access
types SET and ALTER SYSTEM on system parameters (aka GUCs), added by
[this PG16
commit](https://github.com/postgres/postgres/commit/a0ffa885e). If Citus
were to have a requirement to support granting SET and ALTER SYSTEM we
would need to update `convert_aclright_to_string()` in citus_ruleutils.c
with SET and ALTER SYSTEM.
2024-12-06 13:03:51 +00:00
Colm 089555c197
PG17 compatibility: fix diff in tableam (#7771)
Test `tableam` expects that this CREATE TABLE statement: `CREATE TABLE
test_partitioned(id int, p int, val int) PARTITION BY RANGE (p) USING
fake_am;`
will produce this error:
`specifying a table access method is not supported on a partitioned
table`

but as of [this PG
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=374c7a229)
it is possible to specify an access method on a partitioned table. This
fix moves the CREATE TABLE statement to pg17, and adds an additional
test to show parent access method is inherited.
2024-12-02 13:47:19 +00:00
Colm 680c23ffcf
PG17 compatibility: add/fix tests with correlated subqueries that can be pulled to a join (#7745)
Fix Test Failure in subquery_in_where, set_operations, dml_recursive in
PG17 #7741

The test failures are caused by[ this commit in
PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639),
which enables correlated subqueries to be pulled up to a join. Prior to
this, the correlated subquery was implemented as a subplan. In citus, it
is not possible to pushdown a correlated subplan, but with a different
plan in PG17 the query can be executed, per the test diff from
`subquery_in_where`:

```
37,39c37,41
< DEBUG:  generating subplan XXX_1 for CTE event_id: SELECT user_id AS events_user_id, "time" AS events_time, event_type FROM public.events_table
< DEBUG:  Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ...
< ERROR:  correlated subqueries are not supported when the FROM clause contains a CTE or subquery
---
>  count
> ---------------------------------------------------------------------
>      0
> (1 row)
> 
```

This is because with pg17 `= ANY subquery` in the queries can be
implemented as a join, instead of as a subplan filter on a table scan.
For example, `SELECT * FROM test a WHERE x IN (SELECT x FROM test b
UNION SELECT y FROM test c WHERE a.x = c.x) ORDER BY 1,2` (from
set_operations) has this plan in pg17; note that the subquery is the
inner side of a nested loop join:
```
┌───────────────────────────────────────────────────┐
│                    QUERY PLAN                     │
├───────────────────────────────────────────────────┤
│ Sort                                              │
│   Sort Key: a.x, a.y                              │
│   ->  Nested Loop                                 │
│         ->  Seq Scan on test a                    │
│         ->  Subquery Scan on "ANY_subquery"       │
│               Filter: (a.x = "ANY_subquery".x)    │
│               ->  HashAggregate                   │
│                     Group Key: b.x                │
│                     ->  Append                    │
│                           ->  Seq Scan on test b  │
│                           ->  Seq Scan on test c  │
│                                 Filter: (a.x = x) │
└───────────────────────────────────────────────────┘
```
and this plan in pg16 (and previous pg versions); the subquery is a
correlated subplan filter on a table scan:
```
┌───────────────────────────────────────────────┐
│                  QUERY PLAN                   │
├───────────────────────────────────────────────┤
│ Sort                                          │
│   Sort Key: a.x, a.y                          │
│   ->  Seq Scan on test a                      │
│         Filter: (SubPlan 1)                   │
│         SubPlan 1                             │
│           ->  HashAggregate                   │
│                 Group Key: b.x                │
│                 ->  Append                    │
│                       ->  Seq Scan on test b  │
│                       ->  Seq Scan on test c  │
│                             Filter: (a.x = x) │
└───────────────────────────────────────────────┘
```

The fix Modifies the queries causing the test failures so that an ANY
subquery is not folded to a join, preserving the expected output of the
tests. A similar approach was taken for existing regress tests in the[
postgres
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639).
See the `join `regress test, for example.

We also add pg17 specific tests that leverage this improvement in Postgres
with Citus distributed planning as well.
2024-11-20 14:51:16 +03:00