citus

Commit Graph

Author	SHA1	Message	Date
Mehmet YILMAZ	2203fe36ea	PG18: Add regression coverage for EXPLAIN (WAL) “WAL Buffers Full” (#8383 ) PostgreSQL 18 extends `EXPLAIN (ANALYZE, WAL)` to report how many times WAL buffers became full (`WAL Buffers Full`). https://github.com/postgres/postgres/commit/320545bfc This PR adds regression coverage to ensure Citus preserves the new PG18 field through the distributed EXPLAIN aggregation path: * Creates a distributed table and forces per-task EXPLAIN output via `citus.explain_all_tasks`. * Captures `EXPLAIN (ANALYZE true, WAL true, FORMAT JSON)` into `jsonb`. * Asserts the plan is distributed by checking for `"Task Count"`. * Asserts the new PG18 `"WAL Buffers Full"` property exists, and extracts its value. #### Why we don’t assert `wal_buffers_full > 0` Getting `wal_buffers_full > 0` reliably requires either shrinking `wal_buffers` (postmaster restart) or generating enough WAL to exhaust the buffer before a flush, which isn’t deterministic in this regression harness across coordinators/workers. As-is we’re verifying Citus carries the new field through EXPLAIN; pushing for non-zero would likely be flaky.	2025-12-18 14:36:38 +03:00
Mehmet YILMAZ	84fc6801ba	Add PG18 logical replication tests for generated columns via publication column lists (#8382 ) DESCRIPTION: Add PG18 tests verifying generated-column replication via publication lists https://github.com/postgres/postgres/commit/745217a05 Adds a new regression-test section that validates end-to-end logical replication behavior for PG18 generated columns when using publication column lists, using worker1 as publisher and worker2 as subscriber. ### Case A: column list includes generated column `b` * Publisher table has `b GENERATED ALWAYS AS (a * 10) STORED` * Publication uses `FOR TABLE ... (id, a, b)` with `publish_generated_columns = none` * Subscriber defines `b` as a plain column (not generated) to ensure replicated values are applied * Verifies: * Initial sync copies `b` values (20, 70) * Streaming INSERT/UPDATE replicates `b` values (e.g., `(1,5,50)` and `(3,9,90)`) ### Case B: column list excludes `b` (precedence) * Publisher table still has generated `b` * Publication uses `FOR TABLE ... (id, a)` with `publish_generated_columns = stored` * Subscriber defines `b` as a plain column * Verifies: * `b` is not replicated and remains `NULL` for initial sync and streaming changes, demonstrating column-list precedence via data.	2025-12-17 14:58:20 +03:00
Colm	6d156690b5	PG18: GUC file_copy_method behaves as expected. (#8379 ) Add tests to check that the GUC `file_copy_method` behaves as expected when database commands are propagated. It is not relevant for CREATE DATABASE .. WITH STRATEGY because citus only supports wal_log here, but ALTER DATABASE .. SET TABLESPACE can use it to determine the OS-level file management.	2025-12-16 10:43:50 +00:00
Colm	0e110ee5a9	PG18: drop constraints ONLY and NOT VALID fks on partitioned tables (#8374 ) This commit verifies that PG18 behavior relating to partitioned tables is propagated and behaves consistently on Citus distributed tables: - Allow dropping of constraints ONLY on partitioned tables - Allow NOT VALID foreign key constraints on partitioned tables	2025-12-15 10:12:18 +00:00
Colm	b68023f91d	PG18: verify text search and LIKE with nondeterministic collations. (#8350 ) No code change required in Citus, just verification that the PG18 tests give the same results on a Citus table. Relevant PG commits: - 329304c90 for text functions with nondeterministic collations - 85b7efa1c for LIKE with nondeterministic collations	2025-12-09 18:19:09 +00:00
Mehmet YILMAZ	31911d8297	PG18 – Respect VACUUM/ANALYZE ONLY semantics for Citus tables (#8365 ) fixes #8364 PostgreSQL 18 changes VACUUM/ANALYZE to recurse into inheritance children by default, and introduces `ONLY` to limit processing to the parent. Upstream change: https://github.com/postgres/postgres/commit/62ddf7ee9 For Citus tables, we should treat shard placements as “children” and avoid propagating `VACUUM/ANALYZE` to shards when the user explicitly asks for `ONLY`. This PR adjusts the Citus VACUUM handling to align with PG18 semantics, and adds regression coverage on both regular distributed tables and partitioned distributed tables. --- ### Behavior changes * Introduce a per-relation helper struct: ```c typedef struct CitusVacuumRelation { VacuumRelation *vacuumRelation; Oid relationId; } CitusVacuumRelation; ``` This lets us keep both: * the resolved relation OID (for `IsCitusTable`, task building), and * the original `VacuumRelation` node (for column list and ONLY/inh flag). * Replace the old `VacuumRelationIdList` / `ExtractVacuumTargetRels` flow with: ```c static List * VacuumRelationList(VacuumStmt *vacuumStmt, CitusVacuumParams vacuumParams); ``` `VacuumRelationList` now: * Iterates over `vacuumStmt->rels`. * Resolves `relid` via `RangeVarGetRelidExtended` when `relation` is present. * Falls back to locking `VacuumRelation->oid` when only an OID is available. * Respects `VACOPT_FULL` for lock mode and `VACOPT_SKIP_LOCKED` for locking behavior. * Builds a `List *` of `CitusVacuumRelation` entries. * Update: ```c IsDistributedVacuumStmt(List *vacuumRelationList); ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt, List *vacuumRelationList, CitusVacuumParams vacuumParams); ``` to operate on `CitusVacuumRelation` instead of bare OIDs. 
* Implement `ONLY` semantics in `ExecuteVacuumOnDistributedTables`: ```c RangeVar *relation = vacuumRelation->relation; if (relation != NULL && !relation->inh) { /* ONLY specified, so don't recurse to shard placements */ continue; } ``` Effect: * `VACUUM / ANALYZE` (no `ONLY`) on a Citus table: behavior unchanged, Citus creates tasks and propagates to shard placements. * `VACUUM ONLY <citus_table>` / `ANALYZE ONLY <citus_table>`: * Core still processes the coordinator relation as usual. * Citus skips building tasks for shard placements, so we do not recurse into distributed children. * The code compiles and behaves as before on pre-PG18; the new behavior becomes observable only when the core planner starts setting `inh = false` for `ONLY` (PG18). * Unqualified `VACUUM` / `ANALYZE` (no rels) is unchanged and still handled via `ExecuteUnqualifiedVacuumTasks`. * Remove now-redundant helpers: * `VacuumColumnList` * `ExtractVacuumTargetRels` Column lists are now taken directly from `vacuumRelation->va_cols` via `CitusVacuumRelation`. --- ### Testing Extend `src/test/regress/sql/pg18.sql` and `expected/pg18.out` with two PG18-only blocks that verify we do not recurse into shard placements when `ONLY` is used: 1. Simple distributed table (`pg18_vacuum_part`) * Create and distribute a regular table: ```sql CREATE SCHEMA pg18_vacuum_part; SET search_path TO pg18_vacuum_part; CREATE TABLE vac_analyze_only (a int); SELECT create_distributed_table('vac_analyze_only', 'a'); INSERT INTO vac_analyze_only VALUES (1), (2), (3); ``` * On the coordinator: * Run `ANALYZE vac_analyze_only;` and later `ANALYZE ONLY vac_analyze_only;`. * Run `VACUUM vac_analyze_only;` and later `VACUUM ONLY vac_analyze_only;`. 
* On `worker_1`: * Capture `coalesce(max(last_analyze), 'epoch')` from `pg_stat_user_tables` for `vac_analyze_only_%` into `:analyze_before_only`, then assert: ```sql SELECT max(last_analyze) = :'analyze_before_only'::timestamptz AS analyze_only_skipped; ``` * Capture `coalesce(max(last_vacuum), 'epoch')` into `:vacuum_before_only`, then assert: ```sql SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz AS vacuum_only_skipped; ``` Both checks return `t`, confirming `ONLY` does not change `last_analyze` / `last_vacuum` on shard tables. 2. Partitioned distributed table (`pg18_vacuum_part_dist`) * Create a partitioned table whose parent is distributed: ```sql CREATE SCHEMA pg18_vacuum_part_dist; SET search_path TO pg18_vacuum_part_dist; SET citus.shard_count = 2; SET citus.shard_replication_factor = 1; CREATE TABLE part_dist (id int, v int) PARTITION BY RANGE (id); CREATE TABLE part_dist_1 PARTITION OF part_dist FOR VALUES FROM (1) TO (100); CREATE TABLE part_dist_2 PARTITION OF part_dist FOR VALUES FROM (100) TO (200); SELECT create_distributed_table('part_dist', 'id'); INSERT INTO part_dist SELECT g, g FROM generate_series(1, 199) g; ``` * On the coordinator: * Run `ANALYZE part_dist;` then `ANALYZE ONLY part_dist;`. * Run `VACUUM part_dist;` then `VACUUM ONLY part_dist;` (PG18 emits the expected warning: `VACUUM ONLY of partitioned table "part_dist" has no effect`). 
* On `worker_1`: * Capture `coalesce(max(last_analyze), 'epoch')` for `part_dist_%` into `:analyze_before_only`, then assert: ```sql SELECT max(last_analyze) = :'analyze_before_only'::timestamptz AS analyze_only_partitioned_skipped; ``` * Capture `coalesce(max(last_vacuum), 'epoch')` into `:vacuum_before_only`, then assert: ```sql SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz AS vacuum_only_partitioned_skipped; ``` Both checks return `t`, confirming that even for a partitioned distributed parent, `VACUUM/ANALYZE ONLY` does not recurse into shard placements, and Citus behavior matches PG18’s “ONLY = parent only” semantics.	2025-12-05 16:50:42 +03:00
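The `ONLY` skip described in commit 31911d8297 can be sketched outside PostgreSQL as follows. This is a minimal, self-contained illustration and not the Citus implementation: `MockRangeVar`, `MockVacuumRelation`, and `CountRelationsToPropagate` are hypothetical stand-ins invented here, and only the `relation != NULL && !relation->inh` test is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's RangeVar; only the fields used here. */
typedef struct MockRangeVar
{
	const char *relname;
	bool inh;                       /* false when the statement said ONLY */
} MockRangeVar;

/* Hypothetical stand-in for the per-relation entry the commit describes. */
typedef struct MockVacuumRelation
{
	const MockRangeVar *relation;   /* NULL when only an OID is known */
	bool isCitusTable;
} MockVacuumRelation;

/* Count the relations for which shard-placement tasks would be built. */
int
CountRelationsToPropagate(const MockVacuumRelation *rels, size_t relCount)
{
	int propagated = 0;

	assert(rels != NULL || relCount == 0);

	for (size_t i = 0; i < relCount; i++)
	{
		const MockRangeVar *relation = rels[i].relation;

		if (!rels[i].isCitusTable)
		{
			/* not a Citus table, nothing to propagate */
			continue;
		}

		if (relation != NULL && !relation->inh)
		{
			/* ONLY specified, so don't recurse to shard placements */
			continue;
		}

		propagated++;
	}

	return propagated;
}
```

With one plain entry, one `ONLY` entry, one OID-only entry, and one non-Citus entry, the sketch counts two relations: the plain one and the OID-only one, since an entry without a `RangeVar` cannot carry an `ONLY` flag.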
Colm	3399d660f3	PG18: fix incorrectly failing tests in pg18 regress. (#8370 ) Some tests showed diffs because of dependencies on citus GUCs and getting shuffled around in the test order. This commit fixes that.	2025-12-05 12:37:05 +00:00
Colm	002046b87b	PG18: Add support for virtual generated columns. (#8346 ) Generated columns can be virtual (not stored) and this is the default. This PG18 feature requires tweaking citus_ruleutils and deparse table to support in Citus. Relevant PG commit: 83ea6c540.	2025-12-04 19:51:45 +00:00
Colm	79cabe7eca	PG18: CHECK constraints can be ENFORCED / NOT ENFORCED. (#8349 ) DESCRIPTION: Adds propagation of ENFORCED / NOT ENFORCED on CHECK constraints. Add propagation support to Citus ruleutils and appropriate regress tests. Relevant PG commit: ca87c41.	2025-12-03 08:01:01 +00:00
Colm	e28591df08	PG18: Foreign key constraint can be specified NOT ENFORCED. (#8347 ) Test that FOREIGN KEY .. NOT ENFORCED is propagated when applied to Citus tables. No C code changes required, ruleutils handles it. Relevant PG commit eec0040c4. Given that Citus does not yet support ALTER TABLE .. ALTER CONSTRAINT its usefulness is questionable, but we propagate its definition at least.	2025-12-03 07:10:55 +00:00
Mehmet YILMAZ	a39ce7942f	PG18 - small fix on pg18.out (#8367 )	2025-12-03 09:41:24 +03:00
Mehmet YILMAZ	c600eabd82	PG18 - Handle publish_generated_columns in distributed publications (#8360 ) https://github.com/postgres/postgres/commit/7054186c4 fixes #8358 This PR wires up PostgreSQL 18’s `publish_generated_columns` publication option in Citus and adds regression coverage to ensure it behaves correctly for distributed tables, without changing existing DDL output for publications that rely on the default. --- ### 1. Preserve `publish_generated_columns` when rebuilding publications In `BuildCreatePublicationStmt`: * On PG18+ we now read the new `pubgencols` field from `pg_publication` and map it as follows: * `'n'` → default (`none`) * `'s'` → `stored` * For `pubgencols == 's'` we append a `publish_generated_columns` defelem to the reconstructed statement: ```c #if PG_VERSION_NUM >= PG_VERSION_18 if (publicationForm->pubgencols == 's') /* stored */ { DefElem *pubGenColsOption = makeDefElem("publish_generated_columns", (Node *) makeString("stored"), -1); createPubStmt->options = lappend(createPubStmt->options, pubGenColsOption); } else if (publicationForm->pubgencols != 'n') /* 'n' = none (default) */ { ereport(ERROR, (errmsg("unexpected pubgencols value '%c' for publication %u", publicationForm->pubgencols, publicationId))); } #endif ``` For `pubgencols == 'n'` we do not emit an option and rely on PostgreSQL’s default. * Any value other than `'n'` or `'s'` raises an error rather than silently producing incorrect DDL. This ensures: * Publications that explicitly use `publish_generated_columns = stored` are reconstructed with that option on workers, so workers get `pubgencols = 's'`. * Publications that use the default (`none`) continue to produce the same `CREATE PUBLICATION ... WITH (...)` text as before (no extra `publish_generated_columns = 'none'` noise), fixing the unintended diffs in existing publication tests. --- ### 2. 
New PG18 regression coverage for distributed publications In `src/test/regress/sql/pg18.sql`: * Create a table with a stored generated column and make it distributed so the publication goes through Citus DDL propagation: ```sql CREATE TABLE gen_pub_tab ( id int primary key, a int, b int GENERATED ALWAYS AS (a * 10) STORED ); SELECT create_distributed_table('gen_pub_tab', 'id', colocate_with := 'none'); ``` * Create two publications that exercise both `pubgencols` values: ```sql CREATE PUBLICATION pub_gen_cols_stored FOR TABLE gen_pub_tab WITH (publish = 'insert, update', publish_generated_columns = stored); CREATE PUBLICATION pub_gen_cols_none FOR TABLE gen_pub_tab WITH (publish = 'insert, update', publish_generated_columns = none); ``` * On coordinator and both workers, assert the catalog contents: ```sql SELECT pubname, pubgencols FROM pg_publication WHERE pubname IN ('pub_gen_cols_stored', 'pub_gen_cols_none') ORDER BY pubname; ``` Expected on all three nodes: * `pub_gen_cols_stored | s` * `pub_gen_cols_none | n` This test verifies that: * `pubgencols` is correctly set on the coordinator for both `stored` and `none`. * Citus propagates the setting unchanged to all workers for a distributed table.	2025-12-01 09:17:57 +00:00
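The `pubgencols` mapping described in commit c600eabd82 can be sketched as a small standalone function. This is an illustration under stated assumptions: `PubGenColsOptionValue` is a hypothetical helper that does not exist in Citus, and where the commit's code raises an `ERROR` through `ereport`, the sketch only reports the unexpected value through an output flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a Citus function. Maps a pg_publication.pubgencols
 * value to the publish_generated_columns value to emit, or NULL when no option
 * should be emitted. *isValid is set to false for values other than 'n' and
 * 's', the case in which the commit's code raises an ERROR.
 */
const char *
PubGenColsOptionValue(char pubgencols, bool *isValid)
{
	assert(isValid != NULL);

	*isValid = true;

	if (pubgencols == 's')
	{
		/* stored: emit the option explicitly */
		return "stored";
	}

	if (pubgencols != 'n')
	{
		/* neither 'n' nor 's': unexpected catalog value */
		*isValid = false;
	}

	/* 'n' = none (default): emit nothing and rely on PostgreSQL's default */
	return NULL;
}
```

Returning `NULL` for `'n'` mirrors the design choice stated in the commit message: publications that rely on the default keep producing the same `CREATE PUBLICATION` text as before.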
Colm	4e47293f9f	PG18: syntax & semantics behavior in Citus, part 1. (#8335 ) PG18: syntax & semantics behavior in Citus, part 1. Includes PG18 tests for: - OLD/NEW support in RETURNING clause of DML queries (PG commit 80feb727c) - WITHOUT OVERLAPS in PRIMARY KEY and UNIQUE constraints (PG commit fc0438b4e) - COLUMNS clause in JSON_TABLE (PG commit bb766cd) - Foreign tables created with LIKE <table> clause (PG commit 302cf1575) - Foreign Key constraint with PERIOD clause (PG commit 89f908a6d) - COPY command REJECT_LIMIT option (PG commit 4ac2a9bec) - COPY TABLE TO on a materialized view (PG commit 534874fac) Partially addresses issue #8250	2025-11-17 11:08:30 +00:00
Colm	bf959de39e	PG18: Fix diffs in EXPLAINs introduced by PR #8242 in pg18 goldfile (#8262 )	2025-10-19 21:20:16 +01:00
Colm	5d71fca3b4	PG18 regress sanity: disable `enable_self_join_elimination` on queries (#8242 ) .. involving Citus tables. Interim fix for #8217 to achieve regress sanity with PG18. A complete fix will follow with PG18 feature integration.	2025-10-17 10:25:33 +01:00
Naisila Puka	c5dde4b115	Fix crash on create statistics with non-RangeVar type pt2 (#8227 ) Fixes #8225, very similar to #8213. Also, the error message changed between pg18rc1 and pg18.0.	2025-10-07 11:56:20 +03:00
Naisila Puka	bb840e58a7	Fix crash on create statistics with non-RangeVar type (#8213 ) This crash has been there for a while but wasn't tested before pg18. PG18 added this test: CREATE STATISTICS tst ON a FROM (VALUES (x)) AS foo; which tries to create statistics on a derived-on-the-fly table (which is not allowed). However, Citus assumes we always have a valid table when intercepting the CREATE STATISTICS command to check for Citus tables. Added a check to return early if needed. pg18 commit: https://github.com/postgres/postgres/commit/3eea4dc2c Fixes #8212	2025-10-01 00:09:11 +03:00
Mehmet YILMAZ	10d62d50ea	Stabilize table_checks across PG15–PG18: switch to pg_constraint, remove dupes, exclude NOT NULL (#8140 ) DESCRIPTION: Stabilize table_checks across PG15–PG18: switch to pg_constraint, remove dupes, exclude NOT NULL fixes #8138 fixes #8131 Problem ```diff diff -dU10 -w /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out --- /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out.modified 2025-08-18 12:26:51.991598284 +0000 +++ /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out.modified 2025-08-18 12:26:52.004598519 +0000 @@ -403,22 +403,30 @@ relid = 'check_example_partition_col_key_365068'::regclass; Column | Type | Definition ---------------+---------+--------------- partition_col | integer | partition_col (1 row) SELECT "Constraint", "Definition" FROM table_checks WHERE relid='public.check_example_365068'::regclass; Constraint | Definition -------------------------------------+----------------------------------- check_example_other_col_check | CHECK other_col >= 100 + check_example_other_col_check | CHECK other_col >= 100 + check_example_other_col_check | CHECK other_col >= 100 + check_example_other_col_check | CHECK other_col >= 100 + check_example_other_col_check | CHECK other_col >= 100 check_example_other_other_col_check | CHECK abs(other_other_col) >= 100 -(2 rows) + check_example_other_other_col_check | CHECK abs(other_other_col) >= 100 + check_example_other_other_col_check | CHECK abs(other_other_col) >= 100 + check_example_other_other_col_check | CHECK abs(other_other_col) >= 100 + check_example_other_other_col_check | CHECK abs(other_other_col) >= 100 +(10 rows) ``` On PostgreSQL 18, `NOT NULL` is represented as a cataloged constraint and surfaces through `information_schema.check_constraints`. 
`14e87ffa5c` Our helper view `table_checks` (built on `information_schema.check_constraints` + `constraint_column_usage`) started returning: * Extra `…_not_null` rows (noise for our tests) * Duplicate rows for real CHECKs due to the one-to-many join via `constraint_column_usage` * Occasional literal formatting differences (e.g., dates) coming from the information_schema deparser ### What changed 1. Rewrite `table_checks` to use system catalogs directly We now select only expression-based, table-level constraints, excluding NOT NULL, by filtering on `contype <> 'n'` and requiring `conbin IS NOT NULL`. This yields the same effective set as real CHECKs while remaining future-proof against non-CHECK constraint types. ```sql CREATE OR REPLACE VIEW table_checks AS SELECT c.conname AS "Constraint", 'CHECK ' || /* drop a single pair of outer parens if the deparser adds them */ regexp_replace(pg_get_expr(c.conbin, c.conrelid, true), '^\((.*)\)$', '\1') AS "Definition", c.conrelid AS relid FROM pg_catalog.pg_constraint AS c WHERE c.contype <> 'n' /* drop NOT NULL (PG18) */ AND c.conbin IS NOT NULL /* only expression-bearing constraints (i.e., CHECKs) */ AND c.conrelid <> 0 /* table-level only (exclude domains) */ ORDER BY "Constraint", "Definition"; ``` Why this filter? * `contype <> 'n'` excludes PG18’s NOT NULL rows. * `conbin IS NOT NULL` restricts to expression-backed constraints (CHECKs); PK/UNIQUE/FK/EXCLUSION don’t have `conbin`. * `conrelid <> 0` removes domain constraints. 2. Add a PG18-specific regression test for `contype = 'n'` New test (`pg18_not_null_constraints`) verifies: * Coordinator tables have `n` rows for NOT NULL (columns `a`, `c`), * A worker shard has matching `n` rows, * Dropping a NOT NULL on the coordinator propagates to shards (count goes from 2 → 1), * `table_checks` never reports NOT NULL, but does report a real CHECK added for the test. --- ### Why this works (PG15–PG18) * Stable source of truth: Directly reads `pg_constraint` instead of `information_schema`. 
* No duplicates: Eliminates the `constraint_column_usage` join, removing multiplicity. * No NOT NULL noise: PG18’s `contype = 'n'` is filtered out by design. * Deterministic text: Uses `pg_get_expr` and strips a single outer set of parentheses for consistent output. --- ### Impact on tests * Removes spurious `…_not_null` entries and duplicate `checky_…` rows (e.g., in `multi_name_lengths` and similar). * Existing expected files stabilize without adding brittle normalizations. * New PG18 test asserts correct catalog behavior and Citus propagation while remaining a no-op on earlier PG versions. ---	2025-09-22 15:50:32 +03:00
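The "strip a single outer set of parentheses" step in commit 10d62d50ea can be illustrated with a short standalone function. This is a sketch that assumes the intended pattern in the view is `^\((.*)\)$`; `StripOuterParens` is a hypothetical name introduced here and is not part of Citus or PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper. Copies expr into out, dropping one leading '(' and one
 * trailing ')' when the string both starts with '(' and ends with ')'.
 * Returns the number of characters written (excluding the terminating NUL),
 * or -1 if out is too small.
 */
long
StripOuterParens(const char *expr, char *out, size_t outSize)
{
	assert(expr != NULL && out != NULL);

	size_t len = strlen(expr);
	const char *start = expr;

	if (len >= 2 && expr[0] == '(' && expr[len - 1] == ')')
	{
		/* skip the first character and drop the last one */
		start = expr + 1;
		len -= 2;
	}

	if (len + 1 > outSize)
	{
		return -1;
	}

	memcpy(out, start, len);
	out[len] = '\0';

	return (long) len;
}
```

For `(other_col >= 100)` the sketch yields `other_col >= 100`, while `abs(other_other_col) >= 100` is left unchanged because it does not start with a parenthesis. Like the assumed pattern, the check only looks at the first and last characters, so it does not verify that the two parentheses actually match each other.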

18 Commits (2203fe36eae478c4a97c2c98b3d70320452b8f77)