PostgreSQL 18 extends `EXPLAIN (ANALYZE, WAL)` to report how many times
WAL buffers became full (`WAL Buffers Full`).
https://github.com/postgres/postgres/commit/320545bfc
This PR adds regression coverage to ensure Citus preserves the new PG18
field through the distributed EXPLAIN aggregation path:
* Creates a distributed table and forces per-task EXPLAIN output via
`citus.explain_all_tasks`.
* Captures `EXPLAIN (ANALYZE true, WAL true, FORMAT JSON)` into `jsonb`.
* Asserts the plan is distributed by checking for `"Task Count"`.
* Asserts the new PG18 `"WAL Buffers Full"` property exists, and
extracts its value.
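The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual test: the table `wal_explain_demo` and the helper `explain_wal_as_jsonb` are hypothetical names, and the recursive jsonpath (`$.**`) is used because the exact nesting of per-task plans in the Citus JSON output is not spelled out here.

```sql
-- Hypothetical table; the real test uses its own objects.
CREATE TABLE wal_explain_demo (id int, v text);
SELECT create_distributed_table('wal_explain_demo', 'id');

-- Ask Citus to include every task in the EXPLAIN output.
SET citus.explain_all_tasks TO on;

-- Hypothetical helper: run EXPLAIN and hand the JSON plan back as jsonb.
CREATE OR REPLACE FUNCTION explain_wal_as_jsonb(query text)
RETURNS jsonb
LANGUAGE plpgsql AS $$
DECLARE
    plan json;
BEGIN
    EXECUTE format('EXPLAIN (ANALYZE true, WAL true, FORMAT JSON) %s', query)
        INTO plan;
    RETURN plan::jsonb;
END;
$$;

-- Look for the two keys anywhere in the plan tree.
SELECT jsonb_path_exists(p.plan, '$.**."Task Count"')       AS is_distributed,
       jsonb_path_exists(p.plan, '$.**."WAL Buffers Full"') AS has_wal_buffers_full
FROM explain_wal_as_jsonb(
         'INSERT INTO wal_explain_demo VALUES (1, ''x'')') AS p(plan);
```

Checking only for the presence of the key keeps the assertion independent of how much WAL the statement happens to generate.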
#### Why we don’t assert `wal_buffers_full > 0`
Getting `wal_buffers_full > 0` reliably requires either shrinking
`wal_buffers` (postmaster restart) or generating enough WAL to exhaust
the buffer before a flush, which isn’t deterministic in this regression
harness across coordinators/workers. As-is we’re verifying Citus carries
the new field through EXPLAIN; pushing for non-zero would likely be
flaky.
DESCRIPTION: Add PG18 tests verifying generated-column replication via
publication column lists
https://github.com/postgres/postgres/commit/745217a05
Adds a new regression-test section that validates **end-to-end logical
replication behavior** for PG18 generated columns when using publication
**column lists**, using worker1 as publisher and worker2 as subscriber.
### Case A: column list includes generated column `b`
* Publisher table has `b GENERATED ALWAYS AS (a * 10) STORED`
* Publication uses `FOR TABLE ... (id, a, b)` with
`publish_generated_columns = none`
* Subscriber defines `b` as a **plain column** (not generated) to ensure
replicated values are applied
* Verifies:
* Initial sync copies `b` values (20, 70)
* Streaming INSERT/UPDATE replicates `b` values (e.g., `(1,5,50)` and
`(3,9,90)`)
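A minimal sketch of Case A, assuming `wal_level = logical` on the publisher. The object names, the row ids for the initial data, and the connection string are illustrative assumptions; only the `b` values and the two streamed rows come from the description above.

```sql
-- Publisher (worker1 in the test).
CREATE TABLE gen_repl_a (
    id int PRIMARY KEY,
    a  int,
    b  int GENERATED ALWAYS AS (a * 10) STORED
);
INSERT INTO gen_repl_a (id, a) VALUES (1, 2), (2, 7);   -- b is 20 and 70

-- The column list names b explicitly, even though the option says "none".
CREATE PUBLICATION pub_gen_a
    FOR TABLE gen_repl_a (id, a, b)
    WITH (publish_generated_columns = none);

-- Subscriber (worker2 in the test): b is a plain column so it can be written.
CREATE TABLE gen_repl_a (id int PRIMARY KEY, a int, b int);
CREATE SUBSCRIPTION sub_gen_a
    CONNECTION 'host=publisher.example port=5432 dbname=postgres'  -- placeholder
    PUBLICATION pub_gen_a;

-- Publisher again: streaming changes.
UPDATE gen_repl_a SET a = 5 WHERE id = 1;        -- row becomes (1, 5, 50)
INSERT INTO gen_repl_a (id, a) VALUES (3, 9);    -- row is (3, 9, 90)

-- Subscriber, once replication has caught up:
SELECT id, a, b FROM gen_repl_a ORDER BY id;
```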
### Case B: column list excludes `b` (precedence)
* Publisher table still has generated `b`
* Publication uses `FOR TABLE ... (id, a)` with
`publish_generated_columns = stored`
* Subscriber defines `b` as a plain column
* Verifies:
* `b` is **not** replicated and remains `NULL` for initial sync and
streaming changes, demonstrating column-list precedence via data.
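Case B can be sketched the same way. Again, the names and the connection string are placeholders and `wal_level = logical` is assumed. The final query is one possible way to show "not replicated" via data.

```sql
-- Publisher: same table shape, but the column list leaves b out.
CREATE TABLE gen_repl_b (
    id int PRIMARY KEY,
    a  int,
    b  int GENERATED ALWAYS AS (a * 10) STORED
);
INSERT INTO gen_repl_b (id, a) VALUES (1, 2), (2, 7);

CREATE PUBLICATION pub_gen_b
    FOR TABLE gen_repl_b (id, a)
    WITH (publish_generated_columns = stored);

-- Subscriber: b is a plain, nullable column.
CREATE TABLE gen_repl_b (id int PRIMARY KEY, a int, b int);
CREATE SUBSCRIPTION sub_gen_b
    CONNECTION 'host=publisher.example port=5432 dbname=postgres'  -- placeholder
    PUBLICATION pub_gen_b;

-- Subscriber, after initial sync and any streamed changes:
-- if the column list takes precedence, no row carries a value for b.
SELECT count(*) FILTER (WHERE b IS NOT NULL) AS rows_with_b
FROM gen_repl_b;
```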
Add tests to check that the GUC `file_copy_method` behaves as expected
when database commands are propagated. The GUC is not relevant for
`CREATE DATABASE .. WITH STRATEGY`, because Citus only supports `wal_log`
there, but `ALTER DATABASE .. SET TABLESPACE` can use it to determine how
files are copied at the OS level.
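As a rough illustration of the kind of statement sequence involved (not the test itself), assuming a hypothetical database `copy_demo_db` and a tablespace `copy_demo_ts` that already exists on every node:

```sql
-- PG18 GUC; 'copy' is the default.
SHOW file_copy_method;

-- 'clone' is only usable on platforms and filesystems that support it.
SET file_copy_method TO clone;

-- Moving a database between tablespaces copies its files,
-- which is where the GUC comes into play.
ALTER DATABASE copy_demo_db SET TABLESPACE copy_demo_ts;

RESET file_copy_method;
```

`ALTER DATABASE .. SET TABLESPACE` needs the target database to have no active connections, so a sketch like this would be run while connected to a different database.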
This commit verifies that PG18 behavior relating to partitioned tables
is propagated and behaves consistently on Citus distributed tables:
- Allow dropping of constraints ONLY on partitioned tables
- Allow NOT VALID foreign key constraints on partitioned tables
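The two PG18 behaviors can be illustrated with a small sketch. All object names here are hypothetical, and whether each statement behaves identically on a distributed table is precisely what the tests are meant to confirm.

```sql
-- Referenced table, replicated to all nodes.
CREATE TABLE ref_parent (id int PRIMARY KEY);
SELECT create_reference_table('ref_parent');

-- Partitioned, distributed table with a CHECK constraint on the parent.
CREATE TABLE part_demo (
    id     int,
    ref_id int,
    v      int,
    CONSTRAINT v_nonneg CHECK (v >= 0)
) PARTITION BY RANGE (id);
CREATE TABLE part_demo_1 PARTITION OF part_demo FOR VALUES FROM (1) TO (100);
SELECT create_distributed_table('part_demo', 'id');

-- PG18: a NOT VALID foreign key is accepted on a partitioned table.
ALTER TABLE part_demo
    ADD CONSTRAINT part_demo_ref_fk
    FOREIGN KEY (ref_id) REFERENCES ref_parent (id) NOT VALID;

-- PG18: a constraint can be dropped from the partitioned parent ONLY.
ALTER TABLE ONLY part_demo DROP CONSTRAINT v_nonneg;
```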
No code change is required in Citus; this only verifies that the PG18
tests give the same results on a Citus table. Relevant PG commits:
- 329304c90 for text functions with nondeterministic collations
- 85b7efa1c for LIKE with nondeterministic collations
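A minimal sketch of what such a check might look like on a Citus table, assuming a PostgreSQL build with ICU support. The collation and table names are illustrative.

```sql
-- Case-insensitive, nondeterministic ICU collation.
CREATE COLLATION ci_demo (
    provider = icu,
    locale = 'und-u-ks-level2',
    deterministic = false
);

CREATE TABLE coll_demo (id int, t text COLLATE ci_demo);
SELECT create_distributed_table('coll_demo', 'id');
INSERT INTO coll_demo VALUES (1, 'Apple'), (2, 'apricot'), (3, 'Banana');

-- PG18 accepts LIKE on a column with a nondeterministic collation.
SELECT id, t FROM coll_demo WHERE t LIKE 'ap%' ORDER BY id;

-- PG18 also accepts position-search text functions here.
SELECT id, position('PLE' IN t) AS pos FROM coll_demo WHERE id = 1;
```

On earlier PostgreSQL versions both queries are rejected for nondeterministic collations, so comparing the Citus output against plain PG18 output is a reasonable consistency check.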
fixes #8364
PostgreSQL 18 changes VACUUM/ANALYZE to recurse into inheritance
children by default, and introduces `ONLY` to limit processing to the
parent. Upstream change:
[https://github.com/postgres/postgres/commit/62ddf7ee9](https://github.com/postgres/postgres/commit/62ddf7ee9)
For Citus tables, we should treat shard placements as “children” and
avoid propagating `VACUUM/ANALYZE` to shards when the user explicitly
asks for `ONLY`.
This PR adjusts the Citus VACUUM handling to align with PG18 semantics,
and adds regression coverage on both regular distributed tables and
partitioned distributed tables.
---
### Behavior changes
* Introduce a per-relation helper struct:
```c
typedef struct CitusVacuumRelation
{
VacuumRelation *vacuumRelation;
Oid relationId;
} CitusVacuumRelation;
```
This lets us keep both:
* the resolved relation OID (for `IsCitusTable`, task building), and
* the original `VacuumRelation` node (for column list and ONLY/inh
flag).
* Replace the old `VacuumRelationIdList` / `ExtractVacuumTargetRels`
flow with:
```c
static List *VacuumRelationList(VacuumStmt *vacuumStmt,
CitusVacuumParams vacuumParams);
```
`VacuumRelationList` now:
* Iterates over `vacuumStmt->rels`.
* Resolves `relid` via `RangeVarGetRelidExtended` when `relation` is
present.
* Falls back to locking `VacuumRelation->oid` when only an OID is
available.
* Respects `VACOPT_FULL` for lock mode and `VACOPT_SKIP_LOCKED` for
locking behavior.
* Builds a `List *` of `CitusVacuumRelation` entries.
* Update:
```c
IsDistributedVacuumStmt(List *vacuumRelationList);
ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt,
List *vacuumRelationList,
CitusVacuumParams vacuumParams);
```
to operate on `CitusVacuumRelation` instead of bare OIDs.
* Implement `ONLY` semantics in `ExecuteVacuumOnDistributedTables`:
```c
RangeVar *relation = vacuumRelation->relation;
if (relation != NULL && !relation->inh)
{
/* ONLY specified, so don't recurse to shard placements */
continue;
}
```
Effect:
* `VACUUM / ANALYZE` (no `ONLY`) on a Citus table: behavior unchanged,
Citus creates tasks and propagates to shard placements.
* `VACUUM ONLY <citus_table>` / `ANALYZE ONLY <citus_table>`:
* Core still processes the coordinator relation as usual.
* Citus **skips** building tasks for shard placements, so we do not
recurse into distributed children.
* The code compiles and behaves as before on pre-PG18; the new behavior
becomes observable only once the core parser sets `inh = false` for
`ONLY` (PG18).
* Unqualified `VACUUM` / `ANALYZE` (no rels) is unchanged and still
handled via `ExecuteUnqualifiedVacuumTasks`.
* Remove now-redundant helpers:
* `VacuumColumnList`
* `ExtractVacuumTargetRels`
Column lists are now taken directly from `vacuumRelation->va_cols` via
`CitusVacuumRelation`.
---
### Testing
Extend `src/test/regress/sql/pg18.sql` and `expected/pg18.out` with two
PG18-only blocks that verify we do not recurse into shard placements
when `ONLY` is used:
1. **Simple distributed table (`pg18_vacuum_part`)**
* Create and distribute a regular table:
```sql
CREATE SCHEMA pg18_vacuum_part;
SET search_path TO pg18_vacuum_part;
CREATE TABLE vac_analyze_only (a int);
SELECT create_distributed_table('vac_analyze_only', 'a');
INSERT INTO vac_analyze_only VALUES (1), (2), (3);
```
* On the coordinator:
* Run `ANALYZE vac_analyze_only;` and later `ANALYZE ONLY
vac_analyze_only;`.
* Run `VACUUM vac_analyze_only;` and later `VACUUM ONLY
vac_analyze_only;`.
* On `worker_1`:
* Capture `coalesce(max(last_analyze), 'epoch')` from
`pg_stat_user_tables` for `vac_analyze_only_%` into
`:analyze_before_only`, then assert:
```sql
SELECT max(last_analyze) = :'analyze_before_only'::timestamptz AS analyze_only_skipped
FROM pg_stat_user_tables
WHERE relname LIKE 'vac_analyze_only_%';
```
* Capture `coalesce(max(last_vacuum), 'epoch')` into
`:vacuum_before_only`, then assert:
```sql
SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz AS vacuum_only_skipped
FROM pg_stat_user_tables
WHERE relname LIKE 'vac_analyze_only_%';
```
Both checks return `t`, confirming `ONLY` does not change `last_analyze`
/ `last_vacuum` on shard tables.
2. **Partitioned distributed table (`pg18_vacuum_part_dist`)**
* Create a partitioned table whose parent is distributed:
```sql
CREATE SCHEMA pg18_vacuum_part_dist;
SET search_path TO pg18_vacuum_part_dist;
SET citus.shard_count = 2;
SET citus.shard_replication_factor = 1;
CREATE TABLE part_dist (id int, v int) PARTITION BY RANGE (id);
CREATE TABLE part_dist_1 PARTITION OF part_dist FOR VALUES FROM (1) TO (100);
CREATE TABLE part_dist_2 PARTITION OF part_dist FOR VALUES FROM (100) TO (200);
SELECT create_distributed_table('part_dist', 'id');
INSERT INTO part_dist SELECT g, g FROM generate_series(1, 199) g;
```
* On the coordinator:
* Run `ANALYZE part_dist;` then `ANALYZE ONLY part_dist;`.
* Run `VACUUM part_dist;` then `VACUUM ONLY part_dist;` (PG18 emits the
expected warning: `VACUUM ONLY of partitioned table "part_dist" has no
effect`).
* On `worker_1`:
* Capture `coalesce(max(last_analyze), 'epoch')` for `part_dist_%` into
`:analyze_before_only`, then assert:
```sql
SELECT max(last_analyze) = :'analyze_before_only'::timestamptz
AS analyze_only_partitioned_skipped
FROM pg_stat_user_tables
WHERE relname LIKE 'part_dist_%';
```
* Capture `coalesce(max(last_vacuum), 'epoch')` into
`:vacuum_before_only`, then assert:
```sql
SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz
AS vacuum_only_partitioned_skipped
FROM pg_stat_user_tables
WHERE relname LIKE 'part_dist_%';
```
Both checks return `t`, confirming that even for a partitioned
distributed parent, `VACUUM/ANALYZE ONLY` does not recurse into shard
placements, and Citus behavior matches PG18’s “ONLY = parent only”
semantics.
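The capture-then-compare pattern used in both blocks can be sketched as a psql script. The `\c` and `\gset` lines are psql meta-commands, and the port variables `:master_port` and `:worker_1_port` are assumed to be provided by the regression harness. The sketch uses `coalesce` on both sides so the comparison stays well-defined even if no timestamp exists yet.

```sql
-- On the worker: remember the newest shard-level analyze timestamp.
\c - - - :worker_1_port
SELECT coalesce(max(last_analyze), 'epoch') AS analyze_before_only
FROM pg_stat_user_tables
WHERE relname LIKE 'vac_analyze_only_%' \gset

-- On the coordinator: this should touch the parent relation only.
\c - - - :master_port
ANALYZE ONLY pg18_vacuum_part.vac_analyze_only;

-- Back on the worker: the timestamp should be unchanged.
\c - - - :worker_1_port
SELECT coalesce(max(last_analyze), 'epoch') = :'analyze_before_only'::timestamptz
       AS analyze_only_skipped
FROM pg_stat_user_tables
WHERE relname LIKE 'vac_analyze_only_%';
```

Comparing timestamps rather than row counts keeps the check independent of table contents. It does rely on the statistics views being up to date when the second query runs.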
PG18 adds virtual (not stored) generated columns and makes them the
default kind of generated column. Supporting this feature in Citus
requires changes to citus_ruleutils and the table deparsing code.
Relevant PG commit: 83ea6c540.
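A small sketch of the feature on a distributed table, with hypothetical names. The catalog query at the end is one way to see which kind each column ended up as.

```sql
CREATE TABLE virt_gen_demo (
    id int,
    a  int,
    b  int GENERATED ALWAYS AS (a * 2) VIRTUAL,
    c  int GENERATED ALWAYS AS (a * 3),          -- no keyword: virtual on PG18
    d  int GENERATED ALWAYS AS (a * 4) STORED
);
SELECT create_distributed_table('virt_gen_demo', 'id');

-- attgenerated is 'v' for virtual, 's' for stored, empty otherwise.
SELECT attname, attgenerated
FROM pg_attribute
WHERE attrelid = 'virt_gen_demo'::regclass
  AND attnum > 0
  AND NOT attisdropped
ORDER BY attnum;
```

Running the same catalog query against a shard on a worker would show whether the deparsed table definition preserved the virtual/stored distinction.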
DESCRIPTION: Adds propagation of ENFORCED / NOT ENFORCED on CHECK
constraints.
Add propagation support to Citus ruleutils and appropriate regress
tests. Relevant PG commit: ca87c41.
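For illustration, a sketch with hypothetical table and constraint names. The `conenforced` column of `pg_constraint` is new in PG18.

```sql
CREATE TABLE enforced_demo (
    id int,
    v  int,
    CONSTRAINT v_positive  CHECK (v > 0)  NOT ENFORCED,
    CONSTRAINT id_positive CHECK (id > 0) ENFORCED
);
SELECT create_distributed_table('enforced_demo', 'id');

-- Inspect the enforcement flag on the coordinator.
SELECT conname, conenforced
FROM pg_constraint
WHERE conrelid = 'enforced_demo'::regclass
  AND contype = 'c'
ORDER BY conname;
```

The same query against a shard relation on a worker is a simple way to confirm that the flag was propagated rather than silently dropped.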
Test that FOREIGN KEY .. NOT ENFORCED is propagated when applied to
Citus tables. No C code changes are required; ruleutils handles it.
Relevant PG commit: eec0040c4. Given that Citus does not yet support
ALTER TABLE .. ALTER CONSTRAINT, its usefulness is questionable, but we
at least propagate its definition.
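A minimal sketch of such a foreign key between a distributed table and a reference table. The object names are illustrative, not those used in the test.

```sql
CREATE TABLE fk_parent (id int PRIMARY KEY);
SELECT create_reference_table('fk_parent');

CREATE TABLE fk_child (id int, parent_id int);
SELECT create_distributed_table('fk_child', 'id');

-- PG18 syntax: the constraint is recorded but not checked.
ALTER TABLE fk_child
    ADD CONSTRAINT fk_child_parent_fk
    FOREIGN KEY (parent_id) REFERENCES fk_parent (id) NOT ENFORCED;

SELECT conname, conenforced
FROM pg_constraint
WHERE conrelid = 'fk_child'::regclass
  AND contype = 'f';
```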
https://github.com/postgres/postgres/commit/7054186c4
fixes #8358
This PR wires up PostgreSQL 18’s `publish_generated_columns` publication
option in Citus and adds regression coverage to ensure it behaves
correctly for distributed tables, without changing existing DDL output
for publications that rely on the default.
---
### 1. Preserve `publish_generated_columns` when rebuilding publications
In `BuildCreatePublicationStmt`:
* On PG18+ we now read the new `pubgencols` field from `pg_publication`
and map it as follows:
* `'n'` → default (`none`)
* `'s'` → `stored`
* For `pubgencols == 's'` we append a `publish_generated_columns`
defelem to the reconstructed statement:
```c
#if PG_VERSION_NUM >= PG_VERSION_18
if (publicationForm->pubgencols == 's') /* stored */
{
DefElem *pubGenColsOption =
makeDefElem("publish_generated_columns",
(Node *) makeString("stored"),
-1);
createPubStmt->options =
lappend(createPubStmt->options, pubGenColsOption);
}
else if (publicationForm->pubgencols != 'n') /* 'n' = none (default) */
{
ereport(ERROR,
(errmsg("unexpected pubgencols value '%c' for publication %u",
publicationForm->pubgencols, publicationId)));
}
#endif
```
* For `pubgencols == 'n'` we do **not** emit an option and rely on
PostgreSQL’s default.
* Any value other than `'n'` or `'s'` raises an error rather than
silently producing incorrect DDL.
This ensures:
* Publications that explicitly use `publish_generated_columns = stored`
are reconstructed with that option on workers, so workers get
`pubgencols = 's'`.
* Publications that use the default (`none`) continue to produce the
same `CREATE PUBLICATION ... WITH (...)` text as before (no extra
`publish_generated_columns = 'none'` noise), fixing the unintended diffs
in existing publication tests.
---
### 2. New PG18 regression coverage for distributed publications
In `src/test/regress/sql/pg18.sql`:
* Create a table with a stored generated column and make it distributed
so the publication goes through Citus DDL propagation:
```sql
CREATE TABLE gen_pub_tab (
id int primary key,
a int,
b int GENERATED ALWAYS AS (a * 10) STORED
);
SELECT create_distributed_table('gen_pub_tab', 'id', colocate_with :=
'none');
```
* Create two publications that exercise both `pubgencols` values:
```sql
CREATE PUBLICATION pub_gen_cols_stored
FOR TABLE gen_pub_tab
WITH (publish = 'insert, update', publish_generated_columns = stored);
CREATE PUBLICATION pub_gen_cols_none
FOR TABLE gen_pub_tab
WITH (publish = 'insert, update', publish_generated_columns = none);
```
* On coordinator and both workers, assert the catalog contents:
```sql
SELECT pubname, pubgencols
FROM pg_publication
WHERE pubname IN ('pub_gen_cols_stored', 'pub_gen_cols_none')
ORDER BY pubname;
```
Expected on all three nodes (ordered by `pubname`):
* `pub_gen_cols_none | n`
* `pub_gen_cols_stored | s`
This test verifies that:
* `pubgencols` is correctly set on the coordinator for both `stored` and
`none`.
* Citus propagates the setting unchanged to all workers for a
distributed table.
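Outside the regression harness, the worker-side check could also be run from the coordinator in one statement. This is a sketch, assuming the Citus `run_command_on_workers` helper, which expects the inner query to return a single value.

```sql
SELECT nodename, nodeport, success, result
FROM run_command_on_workers($cmd$
    SELECT string_agg(pubname || '=' || pubgencols::text, ', ' ORDER BY pubname)
    FROM pg_publication
    WHERE pubname IN ('pub_gen_cols_stored', 'pub_gen_cols_none')
$cmd$)
ORDER BY nodeport;
```

If propagation works as described, each worker row should report `pub_gen_cols_none=n, pub_gen_cols_stored=s` in `result`.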
PG18: syntax & semantics behavior in Citus, part 1.
Includes PG18 tests for:
- OLD/NEW support in RETURNING clause of DML queries (PG commit
80feb727c)
- WITHOUT OVERLAPS in PRIMARY KEY and UNIQUE constraints (PG commit
fc0438b4e)
- COLUMNS clause in JSON_TABLE (PG commit bb766cd)
- Foreign tables created with LIKE <table> clause (PG commit 302cf1575)
- Foreign Key constraint with PERIOD clause (PG commit 89f908a6d)
- COPY command REJECT_LIMIT option (PG commit 4ac2a9bec)
- COPY TABLE TO on a materialized view (PG commit 534874fac)
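As an example of the first item, a sketch of `OLD`/`NEW` in `RETURNING` against a distributed table. The table name is hypothetical.

```sql
CREATE TABLE ret_demo (id int, v int);
SELECT create_distributed_table('ret_demo', 'id');
INSERT INTO ret_demo VALUES (1, 10);

-- PG18: old.* and new.* expose the row before and after the change.
UPDATE ret_demo
SET v = v + 5
WHERE id = 1
RETURNING old.v AS v_before, new.v AS v_after;
```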
Partially addresses issue #8250
This crash has existed for a while but was not covered by tests before
PG18. PG18 added this test:
CREATE STATISTICS tst ON a FROM (VALUES (x)) AS foo;
which tries to create statistics on a table derived on the fly (which is
not allowed). However, Citus assumed that a valid table is always present
when intercepting the CREATE STATISTICS command to check for Citus
tables. This change adds a check to return early when that is not the
case.
pg18 commit: https://github.com/postgres/postgres/commit/3eea4dc2c
Fixes #8212
DESCRIPTION: Stabilize table_checks across PG15–PG18: switch to
pg_constraint, remove dupes, exclude NOT NULL
fixes #8138
fixes #8131
**Problem**
```diff
diff -dU10 -w /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out
--- /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out.modified 2025-08-18 12:26:51.991598284 +0000
+++ /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out.modified 2025-08-18 12:26:52.004598519 +0000
@@ -403,22 +403,30 @@
relid = 'check_example_partition_col_key_365068'::regclass;
Column | Type | Definition
---------------+---------+---------------
partition_col | integer | partition_col
(1 row)
SELECT "Constraint", "Definition" FROM table_checks WHERE relid='public.check_example_365068'::regclass;
Constraint | Definition
-------------------------------------+-----------------------------------
check_example_other_col_check | CHECK other_col >= 100
+ check_example_other_col_check | CHECK other_col >= 100
+ check_example_other_col_check | CHECK other_col >= 100
+ check_example_other_col_check | CHECK other_col >= 100
+ check_example_other_col_check | CHECK other_col >= 100
check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
-(2 rows)
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+(10 rows)
```
On PostgreSQL 18, `NOT NULL` is represented as a cataloged constraint
and surfaces through `information_schema.check_constraints`.
Relevant PG commit: 14e87ffa5c
Our helper view `table_checks` (built on
`information_schema.check_constraints` + `constraint_column_usage`)
started returning:
* Extra `…_not_null` rows (noise for our tests)
* Duplicate rows for real CHECKs due to the one-to-many join via
`constraint_column_usage`
* Occasional literal formatting differences (e.g., dates) coming from
the `information_schema` deparser
### What changed
1. **Rewrite `table_checks` to use system catalogs directly**
We now select only expression-based, table-level constraints—excluding
NOT NULL—by filtering on `contype <> 'n'` and requiring `conbin IS NOT
NULL`. This yields the same effective set as real CHECKs while remaining
future-proof against non-CHECK constraint types.
```sql
CREATE OR REPLACE VIEW table_checks AS
SELECT
c.conname AS "Constraint",
'CHECK ' ||
-- drop a single pair of outer parens if the deparser adds them
regexp_replace(pg_get_expr(c.conbin, c.conrelid, true), '^\((.*)\)$', '\1')
AS "Definition",
c.conrelid AS relid
FROM pg_catalog.pg_constraint AS c
WHERE c.contype <> 'n' -- drop NOT NULL (PG18)
AND c.conbin IS NOT NULL -- only expression-bearing constraints (i.e., CHECKs)
AND c.conrelid <> 0 -- table-level only (exclude domains)
ORDER BY "Constraint", "Definition";
```
Why this filter?
* `contype <> 'n'` excludes PG18’s NOT NULL rows.
* `conbin IS NOT NULL` restricts to expression-backed constraints
(CHECKs); PK/UNIQUE/FK/EXCLUSION don’t have `conbin`.
* `conrelid <> 0` removes domain constraints.
2. **Add a PG18-specific regression test for `contype = 'n'`**
New test (`pg18_not_null_constraints`) verifies:
* Coordinator tables have `n` rows for NOT NULL (columns `a`, `c`),
* A worker shard has matching `n` rows,
* Dropping a NOT NULL on the coordinator propagates to shards (count
goes from 2 → 1),
* `table_checks` *never* reports NOT NULL, but does report a real CHECK
added for the test.
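The coordinator-side part of such a test might look roughly like this on PG18. The table and constraint names are hypothetical, and the worker-shard checks are omitted.

```sql
CREATE TABLE nn_demo (a int NOT NULL, b int, c text NOT NULL);
SELECT create_distributed_table('nn_demo', 'a');

-- PG18 catalogs NOT NULL as contype = 'n': one row each for a and c.
SELECT count(*) AS not_null_constraints
FROM pg_constraint
WHERE conrelid = 'nn_demo'::regclass AND contype = 'n';

-- Dropping one NOT NULL should leave a single 'n' row.
ALTER TABLE nn_demo ALTER COLUMN c DROP NOT NULL;
SELECT count(*) AS not_null_constraints
FROM pg_constraint
WHERE conrelid = 'nn_demo'::regclass AND contype = 'n';

-- A real CHECK should appear in table_checks; NOT NULL should not.
ALTER TABLE nn_demo ADD CONSTRAINT b_positive CHECK (b > 0);
SELECT "Constraint", "Definition"
FROM table_checks
WHERE relid = 'nn_demo'::regclass;
```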
---
### Why this works (PG15–PG18)
* **Stable source of truth:** Directly reads `pg_constraint` instead of
`information_schema`.
* **No duplicates:** Eliminates the `constraint_column_usage` join,
removing multiplicity.
* **No NOT NULL noise:** PG18’s `contype = 'n'` is filtered out by
design.
* **Deterministic text:** Uses `pg_get_expr` and strips a single outer
set of parentheses for consistent output.
---
### Impact on tests
* Removes spurious `…_not_null` entries and duplicate `checky_…` rows
(e.g., in `multi_name_lengths` and similar).
* Existing expected files stabilize without adding brittle
normalizations.
* New PG18 test asserts correct catalog behavior and Citus propagation
while remaining a no-op on earlier PG versions.
---