16 Commits (84fc6801bacc167fb05ff472cfe292c1c2d387f6)

Author SHA1 Message Date
Mehmet YILMAZ 84fc6801ba
Add PG18 logical replication tests for generated columns via publication column lists (#8382)
DESCRIPTION: Add PG18 tests verifying generated-column replication via
publication lists

https://github.com/postgres/postgres/commit/745217a05

Adds a new regression-test section that validates **end-to-end logical
replication behavior** for PG18 generated columns when using publication
**column lists**, using worker1 as publisher and worker2 as subscriber.

### Case A: column list includes generated column `b`

* Publisher table has `b GENERATED ALWAYS AS (a * 10) STORED`
* Publication uses `FOR TABLE ... (id, a, b)` with
`publish_generated_columns = none`
* Subscriber defines `b` as a **plain column** (not generated) to ensure
replicated values are applied
* Verifies:

  * Initial sync copies `b` values (20, 70)
  * Streaming INSERT/UPDATE replicates `b` values (e.g., `(1,5,50)` and
    `(3,9,90)`)

### Case B: column list excludes `b` (precedence)

* Publisher table still has generated `b`
* Publication uses `FOR TABLE ... (id, a)` with
`publish_generated_columns = stored`
* Subscriber defines `b` as a plain column
* Verifies:

  * `b` is **not** replicated and remains `NULL` for initial sync and
    streaming changes, showing through the data that the column list
    takes precedence over `publish_generated_columns`.
2025-12-17 14:58:20 +03:00
Colm 6d156690b5
PG18: GUC file_copy_method behaves as expected. (#8379)
Add tests to check that the GUC `file_copy_method` behaves as expected
when database commands are propagated. It is not relevant for CREATE
DATABASE .. WITH STRATEGY because citus only supports wal_log here, but
ALTER DATABASE .. SET TABLESPACE can use it to determine the OS-level
file management.
2025-12-16 10:43:50 +00:00
Colm 0e110ee5a9
PG18: drop constraints ONLY and NOT VALID fks on partitioned tables (#8374)
This commit verifies that PG18 behavior relating to partitioned tables
is propagated and behaves consistently on Citus distributed tables:
- Allow dropping of constraints ONLY on partitioned tables
- Allow NOT VALID foreign key constraints on partitioned tables
2025-12-15 10:12:18 +00:00
Colm b68023f91d
PG18: verify text search and LIKE with nondeterministic collations. (#8350)
No code change required in Citus, just verification that the PG18 tests
give the same results on a Citus table. Relevant PG commits:
   - 329304c90 for text functions with nondeterministic collations
   - 85b7efa1c for LIKE with nondeterministic collations
2025-12-09 18:19:09 +00:00
Mehmet YILMAZ 31911d8297
PG18 – Respect VACUUM/ANALYZE ONLY semantics for Citus tables (#8365)
fixes #8364

PostgreSQL 18 changes VACUUM/ANALYZE to recurse into inheritance
children by default, and introduces `ONLY` to limit processing to the
parent. Upstream change:

[https://github.com/postgres/postgres/commit/62ddf7ee9](https://github.com/postgres/postgres/commit/62ddf7ee9)

For Citus tables, we should treat shard placements as “children” and
avoid propagating `VACUUM/ANALYZE` to shards when the user explicitly
asks for `ONLY`.

This PR adjusts the Citus VACUUM handling to align with PG18 semantics,
and adds regression coverage on both regular distributed tables and
partitioned distributed tables.

---

### Behavior changes

* Introduce a per-relation helper struct:

  ```c
  typedef struct CitusVacuumRelation
  {
      VacuumRelation *vacuumRelation;
      Oid             relationId;
  } CitusVacuumRelation;
  ```

  This lets us keep both:

  * the resolved relation OID (for `IsCitusTable`, task building), and
  * the original `VacuumRelation` node (for column list and ONLY/inh
    flag).

* Replace the old `VacuumRelationIdList` / `ExtractVacuumTargetRels`
flow with:

  ```c
  static List *VacuumRelationList(VacuumStmt *vacuumStmt,
                                  CitusVacuumParams vacuumParams);
  ```

  `VacuumRelationList` now:

  * Iterates over `vacuumStmt->rels`.
  * Resolves `relid` via `RangeVarGetRelidExtended` when `relation` is
    present.
  * Falls back to locking `VacuumRelation->oid` when only an OID is
    available.
  * Respects `VACOPT_FULL` for lock mode and `VACOPT_SKIP_LOCKED` for
    locking behavior.
  * Builds a `List *` of `CitusVacuumRelation` entries.

* Update:

  ```c
  IsDistributedVacuumStmt(List *vacuumRelationList);
  ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt,
                                   List *vacuumRelationList,
                                   CitusVacuumParams vacuumParams);
  ```

  to operate on `CitusVacuumRelation` instead of bare OIDs.

* Implement `ONLY` semantics in `ExecuteVacuumOnDistributedTables`:

  ```c
  RangeVar *relation = vacuumRelation->relation;
  if (relation != NULL && !relation->inh)
  {
      /* ONLY specified, so don't recurse to shard placements */
      continue;
  }
  ```

  Effect:

  * `VACUUM / ANALYZE` (no `ONLY`) on a Citus table: behavior unchanged,
    Citus creates tasks and propagates to shard placements.
  * `VACUUM ONLY <citus_table>` / `ANALYZE ONLY <citus_table>`:

    * Core still processes the coordinator relation as usual.
    * Citus **skips** building tasks for shard placements, so we do not
      recurse into distributed children.
  * The code compiles and behaves as before on pre-PG18; the new behavior
    becomes observable only when the core parser starts setting
    `inh = false` for `ONLY` (PG18).

* Unqualified `VACUUM` / `ANALYZE` (no rels) is unchanged and still
handled via `ExecuteUnqualifiedVacuumTasks`.

* Remove now-redundant helpers:

  * `VacuumColumnList`
  * `ExtractVacuumTargetRels`

Column lists are now taken directly from `vacuumRelation->va_cols` via
`CitusVacuumRelation`.
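
The `ONLY` filtering described above can be illustrated with a small standalone sketch. It does not use PostgreSQL or Citus headers: `SketchRangeVar`, `SketchVacuumRelation`, and `CountRelationsNeedingShardTasks` are hypothetical stand-ins invented for illustration, and the `isCitusTable` flag is an assumed simplification of the `IsCitusTable` lookup on the resolved OID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RangeVar; not the real PostgreSQL definition. */
typedef struct SketchRangeVar
{
    const char *relname;
    bool        inh;            /* false when the user wrote ONLY */
} SketchRangeVar;

/* Hypothetical stand-in for a CitusVacuumRelation-like entry. */
typedef struct SketchVacuumRelation
{
    SketchRangeVar *relation;   /* NULL when only an OID is known */
    bool            isCitusTable;
} SketchVacuumRelation;

/* Returns how many relations would have shard-level tasks built for them. */
size_t
CountRelationsNeedingShardTasks(const SketchVacuumRelation *rels, size_t count)
{
    size_t taskRelations = 0;

    assert(rels != NULL || count == 0);

    for (size_t i = 0; i < count; i++)
    {
        const SketchRangeVar *relation = rels[i].relation;

        if (!rels[i].isCitusTable)
        {
            continue;   /* local table: nothing to propagate */
        }

        if (relation != NULL && !relation->inh)
        {
            continue;   /* ONLY specified: do not recurse to shard placements */
        }

        taskRelations++;
    }

    return taskRelations;
}
```

In this model an entry without a `RangeVar` (OID only) is still propagated, which mirrors the `relation != NULL` guard in the snippet above.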

---

### Testing

Extend `src/test/regress/sql/pg18.sql` and `expected/pg18.out` with two
PG18-only blocks that verify we do not recurse into shard placements
when `ONLY` is used:

1. **Simple distributed table (`pg18_vacuum_part`)**

   * Create and distribute a regular table:

     ```sql
     CREATE SCHEMA pg18_vacuum_part;
     SET search_path TO pg18_vacuum_part;
     CREATE TABLE vac_analyze_only (a int);
     SELECT create_distributed_table('vac_analyze_only', 'a');
     INSERT INTO vac_analyze_only VALUES (1), (2), (3);
     ```

   * On the coordinator:

     * Run `ANALYZE vac_analyze_only;` and later
       `ANALYZE ONLY vac_analyze_only;`.
     * Run `VACUUM vac_analyze_only;` and later
       `VACUUM ONLY vac_analyze_only;`.

   * On `worker_1`:

     * Capture `coalesce(max(last_analyze), 'epoch')` from
       `pg_stat_user_tables` for `vac_analyze_only_%` into
       `:analyze_before_only`, then assert:

       ```sql
       SELECT max(last_analyze) = :'analyze_before_only'::timestamptz
              AS analyze_only_skipped;
       ```

     * Capture `coalesce(max(last_vacuum), 'epoch')` into
       `:vacuum_before_only`, then assert:

       ```sql
       SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz
              AS vacuum_only_skipped;
       ```

   Both checks return `t`, confirming `ONLY` does not change `last_analyze`
   / `last_vacuum` on shard tables.

2. **Partitioned distributed table (`pg18_vacuum_part_dist`)**

   * Create a partitioned table whose parent is distributed:

     ```sql
     CREATE SCHEMA pg18_vacuum_part_dist;
     SET search_path TO pg18_vacuum_part_dist;
     SET citus.shard_count = 2;
     SET citus.shard_replication_factor = 1;

     CREATE TABLE part_dist (id int, v int) PARTITION BY RANGE (id);
     CREATE TABLE part_dist_1 PARTITION OF part_dist
         FOR VALUES FROM (1) TO (100);
     CREATE TABLE part_dist_2 PARTITION OF part_dist
         FOR VALUES FROM (100) TO (200);

     SELECT create_distributed_table('part_dist', 'id');
     INSERT INTO part_dist SELECT g, g FROM generate_series(1, 199) g;
     ```

   * On the coordinator:

     * Run `ANALYZE part_dist;` then `ANALYZE ONLY part_dist;`.
     * Run `VACUUM part_dist;` then `VACUUM ONLY part_dist;` (PG18 emits the
       expected warning: `VACUUM ONLY of partitioned table "part_dist" has no
       effect`).

   * On `worker_1`:

     * Capture `coalesce(max(last_analyze), 'epoch')` for `part_dist_%` into
       `:analyze_before_only`, then assert:

       ```sql
       SELECT max(last_analyze) = :'analyze_before_only'::timestamptz
              AS analyze_only_partitioned_skipped;
       ```

     * Capture `coalesce(max(last_vacuum), 'epoch')` into
       `:vacuum_before_only`, then assert:

       ```sql
       SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz
              AS vacuum_only_partitioned_skipped;
       ```

   Both checks return `t`, confirming that even for a partitioned
   distributed parent, `VACUUM/ANALYZE ONLY` does not recurse into shard
   placements, and Citus behavior matches PG18’s “ONLY = parent only”
   semantics.
2025-12-05 16:50:42 +03:00
Colm 3399d660f3
PG18: fix incorrectly failing tests in pg18 regress. (#8370)
Some tests showed diffs because of dependencies on citus GUCs and
getting shuffled around in the test order. This commit fixes that.
2025-12-05 12:37:05 +00:00
Colm 002046b87b
PG18: Add support for virtual generated columns. (#8346)
Generated columns can be virtual (not stored) and this is the default.
This PG18 feature requires tweaking citus_ruleutils and deparse table to
support in Citus. Relevant PG commit: 83ea6c540.
2025-12-04 19:51:45 +00:00
Colm 79cabe7eca
PG18: CHECK constraints can be ENFORCED / NOT ENFORCED. (#8349)
DESCRIPTION: Adds propagation of ENFORCED / NOT ENFORCED on CHECK
constraints.
    
Add propagation support to Citus ruleutils and appropriate regress
tests. Relevant PG commit: ca87c41.
2025-12-03 08:01:01 +00:00
Colm e28591df08
PG18: Foreign key constraint can be specified NOT ENFORCED. (#8347)
Test that FOREIGN KEY .. NOT ENFORCED is propagated when applied to
Citus tables. No C code changes required, ruleutils handles it. Relevant
PG commit eec0040c4. Given that Citus does not yet support ALTER TABLE
.. ALTER CONSTRAINT its usefulness is questionable, but we propagate its
definition at least.
2025-12-03 07:10:55 +00:00
Mehmet YILMAZ c600eabd82
PG18 - Handle publish_generated_columns in distributed publications (#8360)
https://github.com/postgres/postgres/commit/7054186c4

fixes #8358 

This PR wires up PostgreSQL 18’s `publish_generated_columns` publication
option in Citus and adds regression coverage to ensure it behaves
correctly for distributed tables, without changing existing DDL output
for publications that rely on the default.

---

### 1. Preserve `publish_generated_columns` when rebuilding publications

In `BuildCreatePublicationStmt`:

* On PG18+ we now read the new `pubgencols` field from `pg_publication`
and map it as follows:

  * `'n'` → default (`none`)
  * `'s'` → `stored`

* For `pubgencols == 's'` we append a `publish_generated_columns`
defelem to the reconstructed statement:

  ```c
  #if PG_VERSION_NUM >= PG_VERSION_18
      if (publicationForm->pubgencols == 's')    /* stored */
      {
          DefElem *pubGenColsOption =
              makeDefElem("publish_generated_columns",
                          (Node *) makeString("stored"),
                          -1);

          createPubStmt->options =
              lappend(createPubStmt->options, pubGenColsOption);
      }
      else if (publicationForm->pubgencols != 'n') /* 'n' = none (default) */
      {
          ereport(ERROR,
                  (errmsg("unexpected pubgencols value '%c' for publication %u",
                          publicationForm->pubgencols, publicationId)));
      }
  #endif
  ```

* For `pubgencols == 'n'` we do **not** emit an option and rely on
PostgreSQL’s default.

* Any value other than `'n'` or `'s'` raises an error rather than
silently producing incorrect DDL.

This ensures:

* Publications that explicitly use `publish_generated_columns = stored`
are reconstructed with that option on workers, so workers get
`pubgencols = 's'`.
* Publications that use the default (`none`) continue to produce the
same `CREATE PUBLICATION ... WITH (...)` text as before (no extra
`publish_generated_columns = 'none'` noise), fixing the unintended diffs
in existing publication tests.
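
The three-way mapping above (emit `stored`, emit nothing, or fail) can be sketched in isolation. This is a standalone illustration, not the Citus code: `PubGenColsToOption` is a hypothetical helper, and it reports an unexpected value through a return code where the real code uses `ereport(ERROR, ...)`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: maps a pg_publication.pubgencols value to the text of
 * the publish_generated_columns option that should be emitted.
 *
 * Returns 0 on success and -1 for an unexpected catalog value.
 * *optionValue is set to NULL when no option should be emitted.
 */
int
PubGenColsToOption(char pubgencols, const char **optionValue)
{
    assert(optionValue != NULL);

    switch (pubgencols)
    {
        case 's':
            *optionValue = "stored";    /* emit the option explicitly */
            return 0;

        case 'n':
            *optionValue = NULL;        /* default: emit nothing */
            return 0;

        default:
            *optionValue = NULL;        /* unknown value: signal an error */
            return -1;
    }
}
```

Treating `'n'` as "emit nothing" is what keeps the reconstructed DDL text identical to the pre-PG18 output for publications that rely on the default.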

---

### 2. New PG18 regression coverage for distributed publications

In `src/test/regress/sql/pg18.sql`:

* Create a table with a stored generated column and make it distributed
so the publication goes through Citus DDL propagation:

  ```sql
  CREATE TABLE gen_pub_tab (
      id int primary key,
      a  int,
      b  int GENERATED ALWAYS AS (a * 10) STORED
  );

  SELECT create_distributed_table('gen_pub_tab', 'id', colocate_with := 'none');
  ```

* Create two publications that exercise both `pubgencols` values:

  ```sql
  CREATE PUBLICATION pub_gen_cols_stored
      FOR TABLE gen_pub_tab
      WITH (publish = 'insert, update', publish_generated_columns = stored);

  CREATE PUBLICATION pub_gen_cols_none
      FOR TABLE gen_pub_tab
      WITH (publish = 'insert, update', publish_generated_columns = none);
  ```

* On coordinator and both workers, assert the catalog contents:

  ```sql
  SELECT pubname, pubgencols
  FROM pg_publication
  WHERE pubname IN ('pub_gen_cols_stored', 'pub_gen_cols_none')
  ORDER BY pubname;
  ```

  Expected on all three nodes:

  * `pub_gen_cols_stored | s`
  * `pub_gen_cols_none   | n`

This test verifies that:

* `pubgencols` is correctly set on the coordinator for both `stored` and
`none`.
* Citus propagates the setting unchanged to all workers for a
distributed table.
2025-12-01 09:17:57 +00:00
Colm 4e47293f9f
PG18: syntax & semantics behavior in Citus, part 1. (#8335)
Includes PG18 tests for:
- OLD/NEW support in RETURNING clause of DML queries (PG commit 80feb727c)
- WITHOUT OVERLAPS in PRIMARY KEY and UNIQUE constraints (PG commit fc0438b4e)
- COLUMNS clause in JSON_TABLE (PG commit bb766cd)
- Foreign tables created with LIKE <table> clause (PG commit 302cf1575)
- Foreign Key constraint with PERIOD clause (PG commit 89f908a6d)
- COPY command REJECT_LIMIT option (PG commit 4ac2a9bec)
- COPY TABLE TO on a materialized view (PG commit 534874fac)

Partially addresses issue #8250
2025-11-17 11:08:30 +00:00
Colm bf959de39e
PG18: Fix diffs in EXPLAINs introduced by PR #8242 in pg18 goldfile (#8262)
2025-10-19 21:20:16 +01:00
Colm 5d71fca3b4
PG18 regress sanity: disable `enable_self_join_elimination` on queries (#8242)
.. involving Citus tables. Interim fix for #8217 to achieve regress
sanity with PG18. A complete fix will follow with PG18 feature
integration.
2025-10-17 10:25:33 +01:00
Naisila Puka c5dde4b115
Fix crash on create statistics with non-RangeVar type pt2 (#8227)
Fixes #8225 
very similar to #8213 
Also the error message changed between pg18rc1 and pg18.0
2025-10-07 11:56:20 +03:00
Naisila Puka bb840e58a7
Fix crash on create statistics with non-RangeVar type (#8213)
This crash has been there for a while but wasn't tested before pg18.

PG18 added this test:
CREATE STATISTICS tst ON a FROM (VALUES (x)) AS foo;

which tries to create statistics on a derived-on-the-fly table (which is
not allowed). However, Citus assumes we always have a valid table when
intercepting the CREATE STATISTICS command to check for Citus tables.
Added a check to return early if needed.

pg18 commit: https://github.com/postgres/postgres/commit/3eea4dc2c

Fixes #8212
2025-10-01 00:09:11 +03:00
Mehmet YILMAZ 10d62d50ea
Stabilize table_checks across PG15–PG18: switch to pg_constraint, remove dupes, exclude NOT NULL (#8140)
DESCRIPTION: Stabilize table_checks across PG15–PG18: switch to
pg_constraint, remove dupes, exclude NOT NULL

fixes #8138
fixes #8131 

**Problem**

```diff
diff -dU10 -w /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out
--- /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out.modified	2025-08-18 12:26:51.991598284 +0000
+++ /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out.modified	2025-08-18 12:26:52.004598519 +0000
@@ -403,22 +403,30 @@
     relid = 'check_example_partition_col_key_365068'::regclass;
     Column     |  Type   |  Definition   
 ---------------+---------+---------------
  partition_col | integer | partition_col
 (1 row)
 
 SELECT "Constraint", "Definition" FROM table_checks WHERE relid='public.check_example_365068'::regclass;
              Constraint              |            Definition             
 -------------------------------------+-----------------------------------
  check_example_other_col_check       | CHECK other_col >= 100
+ check_example_other_col_check       | CHECK other_col >= 100
+ check_example_other_col_check       | CHECK other_col >= 100
+ check_example_other_col_check       | CHECK other_col >= 100
+ check_example_other_col_check       | CHECK other_col >= 100
  check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
-(2 rows)
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+(10 rows)
```

On PostgreSQL 18, `NOT NULL` is represented as a cataloged constraint
and surfaces through `information_schema.check_constraints`.
Relevant PG commit: 14e87ffa5c.
Our helper view `table_checks` (built on
`information_schema.check_constraints` + `constraint_column_usage`)
started returning:

* Extra `…_not_null` rows (noise for our tests)
* Duplicate rows for real CHECKs due to the one-to-many join via
`constraint_column_usage`
* Occasional literal formatting differences (e.g., dates) coming from
the `information_schema` deparser

### What changed

1. **Rewrite `table_checks` to use system catalogs directly**
We now select only expression-based, table-level constraints—excluding
NOT NULL—by filtering on `contype <> 'n'` and requiring `conbin IS NOT
NULL`. This yields the same effective set as real CHECKs while remaining
future-proof against non-CHECK constraint types.

```sql
CREATE OR REPLACE VIEW table_checks AS
SELECT
  c.conname AS "Constraint",
  'CHECK ' ||
  -- drop a single pair of outer parens if the deparser adds them
  regexp_replace(pg_get_expr(c.conbin, c.conrelid, true), '^\((.*)\)$', '\1')
    AS "Definition",
  c.conrelid AS relid
FROM pg_catalog.pg_constraint AS c
WHERE c.contype <> 'n'         -- drop NOT NULL (PG18)
  AND c.conbin IS NOT NULL     -- only expression-bearing constraints (i.e., CHECKs)
  AND c.conrelid <> 0          -- table-level only (exclude domains)
ORDER BY "Constraint", "Definition";
```

Why this filter?

* `contype <> 'n'` excludes PG18’s NOT NULL rows.
* `conbin IS NOT NULL` restricts to expression-backed constraints
(CHECKs); PK/UNIQUE/FK/EXCLUSION don’t have `conbin`.
* `conrelid <> 0` removes domain constraints.

2. **Add a PG18-specific regression test for `contype = 'n'`**
   New test (`pg18_not_null_constraints`) verifies:

* Coordinator tables have `n` rows for NOT NULL (columns `a`, `c`),
* A worker shard has matching `n` rows,
* Dropping a NOT NULL on the coordinator propagates to shards (count
goes from 2 → 1),
* `table_checks` *never* reports NOT NULL, but does report a real CHECK
added for the test.

---

### Why this works (PG15–PG18)

* **Stable source of truth:** Directly reads `pg_constraint` instead of
`information_schema`.
* **No duplicates:** Eliminates the `constraint_column_usage` join,
removing multiplicity.
* **No NOT NULL noise:** PG18’s `contype = 'n'` is filtered out by
design.
* **Deterministic text:** Uses `pg_get_expr` and strips a single outer
set of parentheses for consistent output.
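
To make the parenthesis-stripping step concrete, here is a minimal C sketch of what the pattern `'^\((.*)\)$'` with replacement `'\1'` does to a deparsed expression. `StripOuterParens` is a hypothetical helper written for illustration; the view itself relies on `regexp_replace`, not on C code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated copy of expr with one leading '(' and one
 * trailing ')' removed when both are present; otherwise a plain copy.
 * The caller frees the result. Returns NULL if allocation fails.
 */
char *
StripOuterParens(const char *expr)
{
    assert(expr != NULL);

    size_t      len = strlen(expr);
    const char *start = expr;

    /* Same condition the regex encodes: first char '(' and last char ')'. */
    if (len >= 2 && expr[0] == '(' && expr[len - 1] == ')')
    {
        start = expr + 1;
        len -= 2;
    }

    char *result = malloc(len + 1);
    if (result == NULL)
    {
        return NULL;
    }

    memcpy(result, start, len);
    result[len] = '\0';
    return result;
}
```

Like the regex, the sketch only looks at the first and last characters and does not check that they form a matching pair, so an input such as `(a > 0) AND (b > 0)` would also lose its first and last parenthesis.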

---

### Impact on tests

* Removes spurious `…_not_null` entries and duplicate `checky_…` rows
(e.g., in `multi_name_lengths` and similar).
* Existing expected files stabilize without adding brittle
normalizations.
* New PG18 test asserts correct catalog behavior and Citus propagation
while remaining a no-op on earlier PG versions.

---
2025-09-22 15:50:32 +03:00