citus

Commit Graph

Author	SHA1	Message	Date
Onur Tirtir	c7a55c8606	Merge branch 'main' into remove-stats-collector	2025-10-24 15:45:38 +03:00
Mehmet YILMAZ	95477e6d02	PG18 - Add BUFFERS OFF to remaining EXPLAIN calls (#8288 ) fixes #8093 `c2a4078eba` - Enable buffer-usage reporting by default in `EXPLAIN ANALYZE` on PostgreSQL 18 and above. - Introduce the explicit `BUFFERS OFF` option in every existing regression test to maintain pre-PG18 output consistency. - This appends `BUFFERS OFF` to all `EXPLAIN(...)` calls in src/test/regress/sql and the corresponding .out files.	2025-10-24 15:09:49 +03:00
Colm	bf959de39e	PG18: Fix diffs in EXPLAINs introduced by PR #8242 in pg18 goldfile (#8262 )	2025-10-19 21:20:16 +01:00
Onur Tirtir	90f2ab6648	Actually deprecate mark_tables_colocated()	2025-10-17 11:57:36 +00:00
Colm	5d71fca3b4	PG18 regress sanity: disable `enable_self_join_elimination` on queries (#8242 ) .. involving Citus tables. Interim fix for #8217 to achieve regress sanity with PG18. A complete fix will follow with PG18 feature integration.	2025-10-17 10:25:33 +01:00
eaydingol	aa0ac0af60	Citus upgrade tests (#8237 ) Expand the citus upgrade tests matrix: PG15: v11.1.0 v11.3.0 v12.1.10 PG16: v12.1.10 See https://github.com/citusdata/the-process/pull/174	2025-10-15 15:28:44 +03:00
Naisila Puka	432b69eb9d	PG18 - fix naming diffs of child FK constraints (#8247 ) PG18 changed the names generated for child foreign key constraints. https://github.com/postgres/postgres/commit/3db61db48 The test failures in Citus regression suite are all changing the name of a constraint from `'sensors%'` to `'%to_parent%_1'`: the naming is very nice here because `to_parent` means that we have a foreign key to a parent table. To fix the diff, we exclude those constraints from the output. To verify correctness, we still count the problematic constraints to make sure they are there - we are simply removing them from the first output (we add this count query right after the previous one) Fixes #8126 Co-authored-by: Mehmet YILMAZ <mehmety87@gmail.com>	2025-10-13 13:33:38 +03:00
Naisila Puka	287abea661	PG18 compatibility - varreturningtype additions (#8231 ) This PR solves the following diffs, originating from the addition of `varreturningtype` field to the `Var` struct in PG18: https://github.com/postgres/postgres/commit/80feb727c Previously we didn't account for this new field (as it's new), so this wouldn't allow the parser to correctly reconstruct the `Var` node structure, but rather it would error out with `did not find '}' at end of input node`: ```diff SELECT column_to_column_name(logicalrelid, partkey) FROM pg_dist_partition WHERE partkey IS NOT NULL ORDER BY 1 LIMIT 1; - column_to_column_name ---------------------------------------------------------------------- - a -(1 row) - +ERROR: did not find '}' at end of input node ``` Solution follows precedent https://github.com/citusdata/citus/pull/7107, when varnullingrels field was added to the `Var` struct in PG16. The solution includes: - Taking care of the `partkey` in `pg_dist_partition` table because it's coming from the `Var` struct. This mainly includes fixing the upgrade script to PG18, by saving all the `partkey` infos before upgrading to PG18 (in `citus_prepare_pg_upgrade`), and then re-generating `partkey` columns in `pg_dist_partition` (using `UPDATE`) after upgrading to PG18 (in `citus_finish_pg_upgrade`). - Adding a normalize rule to fix output differences among PG versions. Note that we need two normalize lines: one for PG15 since it doesn't have `varnullingrels`, and one for PG16/PG17. - Small trick on `metadata_sync_helpers` to use different text when generating the `partkey`, based on the PG version. Fixes #8189	2025-10-09 17:35:03 +03:00
Naisila Puka	f0014cf0df	PG18 compatibility: misc output diffs pt2 (#8234 ) 3 minor changes to reduce some noise from the regression diffs. 1 - Reduce verbosity when ALTER EXTENSION fails PG18 has improved reporting of errors in extension script files Relevant PG commit: https://github.com/postgres/postgres/commit/774171c4f There was more context in PG18, so reducing verbosity ``` ALTER EXTENSION citus UPDATE TO '11.0-1'; ERROR: cstore_fdw tables are deprecated as of Citus 11.0 HINT: Install Citus 10.2 and convert your cstore_fdw tables to the columnar access method before upgrading further CONTEXT: PL/pgSQL function inline_code_block line 4 at RAISE +SQL statement "DO LANGUAGE plpgsql +$$ +BEGIN + IF EXISTS (SELECT 1 FROM pg_dist_shard where shardstorage = 'c') THEN + RAISE EXCEPTION 'cstore_fdw tables are deprecated as of Citus 11.0' + USING HINT = 'Install Citus 10.2 and convert your cstore_fdw tables to the columnar access method before upgrading further'; + END IF; +END; +$$" +extension script file "citus--10.2-5--11.0-1.sql", near line 532 ``` 2 - Fix backend type order in tests for PG18 PG18 added another backend type which messed the order in this test Adding a separate IF condition for PG18 Relevant PG commit: https://github.com/postgres/postgres/commit/18d67a8d7d 3 - Ignore "DEBUG: find_in_path" lines in output Relevant PG commit: https://github.com/postgres/postgres/commit/4f7f7b0375 The new GUC extension_control_path specifies a path to look for extension control files.	2025-10-09 16:50:41 +03:00
Naisila Puka	d9652bf5f9	PG18 compatibility: misc output diffs (#8233 ) 6 minor changes to reduce some noise from the regression diffs. 1 - Add ORDER BY to fix subquery_in_where diff 2 - Disable buffers in explain analyze calls Leftover work from https://github.com/citusdata/citus/commit/f1f0b09f7 3 - Reduce verbosity to avoid diffs between PG versions Relevant PG commit: https://github.com/postgres/postgres/commit/0dca5d68d7 diff was: ``` CALL test_procedure_commit(2,5); ERROR: COMMIT is not allowed in an SQL function -CONTEXT: SQL function "test_procedure_commit" during startup +CONTEXT: SQL function "test_procedure_commit" statement 2 ``` 4 - Rename array_sort to array_sort_citus since PG18 added array_sort Relevant PG commit: https://github.com/postgres/postgres/commit/6c12ae09f5a Diff we were seeing in multi_array_agg, because the PG18 test was using PG18's array_sort function instead: ``` -- Check that we return NULL in case there are no input rows to array_agg() SELECT array_sort(array_agg(l_orderkey)) FROM lineitem WHERE l_orderkey < 0; array_sort ------------ - {} + (1 row) ``` 5 - Exclude not-null constraints from output to avoid diffs PG18 has added pg_constraint rows for not-null constraints Relevant PG commit https://github.com/postgres/postgres/commit/14e87ffa5c Remove them by condition contype <> 'n' 6 - Reduce verbosity to avoid md5 pwd deprecation warning in PG18 PG18 has deprecated MD5 passwords Relevant PG commit: https://github.com/postgres/postgres/commit/db6a4a985 Fixes #8154 Fixes #8157	2025-10-08 13:23:55 +03:00
Naisila Puka	c5dde4b115	Fix crash on create statistics with non-RangeVar type pt2 (#8227 ) Fixes #8225 very similar to #8213 Also the error message changed between pg18rc1 and pg18.0	2025-10-07 11:56:20 +03:00
Mehmet YILMAZ	d4dfdd765b	PG18 - Normalize \d+ output in PG18 by filtering “Not-null constraints” blocks (#8183 ) DESCRIPTION: Normalize \d+ output in PG18 by filtering “Not-null constraints” blocks fixes #8095 PR Description Postgres 18 started representing column `NOT NULL` as named constraints in `pg_constraint`, and `psql \d+` now prints them under a `Not-null constraints:` section. This caused extra diffs in our regression tests. `14e87ffa5c` This PR updates the normalization rules to strip those sections during diff filtering by adding two regex rules: * remove the `Not-null constraints:` header * remove any indented constraint lines ending in `_not_null`	2025-10-02 13:48:27 +03:00
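The two filtering rules described in d4dfdd765b above can be pictured as a per-line predicate. This is only a rough sketch in C: the real rules are regexes in the regression normalization step, `ShouldDropNotNullLine` is a hypothetical name, and the exact shape of the `psql \d+` lines (an indented, double-quoted constraint name) is an assumption.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the Citus test harness.
 * Returns true when a line of \d+ output belongs to the PG18
 * "Not-null constraints:" block and should be filtered out.
 */
bool
ShouldDropNotNullLine(const char *line)
{
	/* Rule 1: the section header itself. */
	if (strcmp(line, "Not-null constraints:") == 0)
	{
		return true;
	}

	/*
	 * Rule 2: an indented line whose quoted constraint name ends in
	 * _not_null (assumed format: leading spaces, then "name_not_null").
	 */
	if (line[0] == ' ' && strstr(line, "_not_null\"") != NULL)
	{
		return true;
	}

	return false;
}
```

Lines from other sections of the `\d+` output, such as the index list, pass through unchanged.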
Onur Tirtir	b65096b1d7	Merge branch 'main' into remove-stats-collector	2025-10-01 16:11:56 +03:00
Mehmet YILMAZ	cec1848b13	PG18: adapt multi_sql_function expected output to SQL-function plan cache (#8184 ) `0dca5d68d7` fixes #8153 ```diff diff -dU10 -w /__w/citus/citus/src/test/regress/expected/multi_sql_function.out /__w/citus/citus/src/test/regress/results/multi_sql_function.out --- /__w/citus/citus/src/test/regress/expected/multi_sql_function.out.modified 2025-08-25 12:43:24.373634581 +0000 +++ /__w/citus/citus/src/test/regress/results/multi_sql_function.out.modified 2025-08-25 12:43:24.383634533 +0000 @@ -317,24 +317,25 @@ $$ LANGUAGE SQL STABLE; INSERT INTO test_parameterized_sql VALUES(1, 1); -- all of them should fail SELECT * FROM test_parameterized_sql_function(1); ERROR: cannot perform distributed planning on this query because parameterized queries for SQL functions referencing distributed tables are not supported HINT: Consider using PL/pgSQL functions instead. SELECT (SELECT 1 FROM test_parameterized_sql limit 1) FROM test_parameterized_sql_function(1); ERROR: cannot perform distributed planning on this query because parameterized queries for SQL functions referencing distributed tables are not supported HINT: Consider using PL/pgSQL functions instead. SELECT test_parameterized_sql_function_in_subquery_where(1); -ERROR: could not create distributed plan -DETAIL: Possibly this is caused by the use of parameters in SQL functions, which is not supported in Citus. -HINT: Consider using PL/pgSQL functions instead. -CONTEXT: SQL function "test_parameterized_sql_function_in_subquery_where" statement 1 + test_parameterized_sql_function_in_subquery_where +--------------------------------------------------- + 1 +(1 row) + ``` allows custom vs. 
generic plans for SQL functions; arguments can be folded to consts, enabling more rewrites/optimizations (and in your case, routable Citus plans). It seems that PG18 rewrote how LANGUAGE SQL functions are planned/executed: they now go through the plan cache (like PL/pgSQL does) and can produce custom plans with the function arguments substituted as constants. That means your call SELECT test_parameterized_sql_function_in_subquery_where(1); is planned with org_id_val = 1 baked in, so Citus no longer sees an unresolved Param inside the function body and is able to build a distributed plan instead of tripping the old “params in SQL functions” error path. What’s in here - Update `expected/multi_sql_function.out` to reflect PG18 behavior - Add `expected/multi_sql_function_0.out` as an alternate expected file that retains the pre-PG18 error output for the same test	2025-10-01 16:03:21 +03:00
Naisila Puka	bb840e58a7	Fix crash on create statistics with non-RangeVar type (#8213 ) This crash has been there for a while but wasn't tested before pg18. PG18 added this test: CREATE STATISTICS tst ON a FROM (VALUES (x)) AS foo; which tries to create statistics on a derived-on-the-fly table (which is not allowed) However Citus assumes we always have a valid table when intercepting CREATE STATISTICS command to check for Citus tables Added a check to return early if needed. pg18 commit: https://github.com/postgres/postgres/commit/3eea4dc2c Fixes #8212	2025-10-01 00:09:11 +03:00
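The early return described in bb840e58a7 above amounts to checking the node type of the FROM item before treating it as a table reference. A minimal sketch, assuming simplified stand-in types: `SketchNodeTag`, `SketchNode` and `ShouldInspectStatsTarget` are hypothetical and are not the PostgreSQL or Citus definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for PostgreSQL node tags. */
typedef enum SketchNodeTag
{
	SKETCH_T_RangeVar,       /* a plain table reference */
	SKETCH_T_RangeSubselect  /* e.g. (VALUES (x)) AS foo */
} SketchNodeTag;

typedef struct SketchNode
{
	SketchNodeTag type;
	const char *relname;     /* only meaningful for a RangeVar */
} SketchNode;

/*
 * Returns true only when the FROM item of CREATE STATISTICS is a plain
 * table reference, so the caller may go on to look it up as a possible
 * Citus table. For anything else the caller returns early.
 */
bool
ShouldInspectStatsTarget(const SketchNode *fromItem)
{
	if (fromItem == NULL || fromItem->type != SKETCH_T_RangeVar)
	{
		return false;
	}

	return true;
}
```

With a check of this kind, the derived-table case from the PG18 test never reaches the code that assumes a valid table.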
Onur Tirtir	5eb1d93be1	Properly detect no-op shard-key updates via UPDATE / MERGE (#8214 ) DESCRIPTION: Fixes a bug that causes allowing UPDATE / MERGE queries that may change the distribution column value. Fixes: #8087. Probably as of #769, we were not properly checking if UPDATE may change the distribution column. In #769, we had these checks: ```c if (targetEntry->resno != column->varattno) { /* target entry of the form SET some_other_col = <x> */ isColumnValueChanged = false; } else if (IsA(setExpr, Var)) { Var *newValue = (Var *) setExpr; if (newValue->varattno == column->varattno) { /* target entry of the form SET col = table.col */ isColumnValueChanged = false; } } ``` However, what we check in "if" and in the "else if" are not so different in the sense they both attempt to verify if SET expr of the target entry points to the attno of given column. So, in #5220, we even removed the first check because it was redundant. Also see this PR comment from #5220: https://github.com/citusdata/citus/pull/5220#discussion_r699230597. In #769, probably we actually wanted to first check whether both SET expr of the target entry and given variable are pointing to the same range var entry, but this wasn't what the "if" was checking, so it was removed. As a result, in the cases that are mentioned in the linked issue, we were incorrectly concluding that the SET expr of the target entry won't change given column just because it's pointing to the same attno as given variable, regardless of what range var entries the column and the SET expr are pointing to. Then we also started using the same function to check for such cases for update action of MERGE, so we have the same bug there as well. So with this PR, we properly check for such cases by comparing varno as well in TargetEntryChangesValue(). 
However, then some of the existing tests started failing where the SET expr doesn't directly assign the column to itself but the "where" clause could actually imply that the distribution column won't change. Even before, we were not attempting to verify if "where" clause quals could imply a no-op assignment for the SET expr in such cases, but that was not a problem. This is because, in most cases, we were always qualifying such SET expressions as a no-op update as long as the SET expr's attno is the same as given column's. For this reason, to prevent regressions, this PR also adds some extra logic to understand if the "where" clause quals could imply that SET expr for the distribution key is a no-op. Ideally, we should instead use "relation restriction equivalence" mechanism to understand if the "where" clause implies a no-op update. This is because, for instance, right now we're not able to deduce that the update is a no-op when the "where" clause transitively implies a no-op update, as in the case where we're setting "column a" to "column c" and where clause looks like: "column a = column b AND column b = column c". If this means a regression for some users, we can consider doing it that way. Until then, as a workaround, we can suggest adding additional quals to "where" clause that would directly imply equivalence. Also, after fixing TargetEntryChangesValue(), we started successfully deducing that the update action is a no-op for such MERGE queries: ```sql MERGE INTO dist_1 USING dist_1 src ON (dist_1.a = src.b) WHEN MATCHED THEN UPDATE SET a = src.b; ``` However, we then started seeing below error for above query even though now the update is qualified as a no-op update: ``` ERROR: Unexpected column index of the source list ``` This was because of #8180 and #8201 fixed that. 
In summary, with this PR: * We disallow such queries, ```sql -- attno for dist_1.a, dist_1.b: 1, 2 -- attno for dist_different_order_1.a, dist_different_order_1.b: 2, 1 UPDATE dist_1 SET a = dist_different_order_1.b FROM dist_different_order_1 WHERE dist_1.a = dist_different_order_1.a; -- attno for dist_1.a, dist_1.b: 1, 2 -- but ON (..) doesn't imply a no-op update for SET expr MERGE INTO dist_1 USING dist_1 src ON (dist_1.a = src.b) WHEN MATCHED THEN UPDATE SET a = src.a; ``` * .. and allow such queries, ```sql MERGE INTO dist_1 USING dist_1 src ON (dist_1.a = src.b) WHEN MATCHED THEN UPDATE SET a = src.b; ```	2025-09-30 10:13:47 +00:00
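The difference between the attno-only check and the varno-aware check described in 5eb1d93be1 above can be sketched with simplified stand-in types. Everything here is an illustration under assumptions: `SketchVar`, `SketchEqualityQual` and both functions are hypothetical, not the code of TargetEntryChangesValue(), and the qual handling only covers a direct `col = expr` equality, matching the non-transitive behaviour the message describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for PostgreSQL's Var: only the two fields used here. */
typedef struct SketchVar
{
	int varno;    /* index of the range table entry */
	int varattno; /* attribute number within that entry */
} SketchVar;

/* One "left = right" qual taken from the WHERE / ON clause. */
typedef struct SketchEqualityQual
{
	SketchVar left;
	SketchVar right;
} SketchEqualityQual;

static bool
SameVar(SketchVar a, SketchVar b)
{
	return a.varno == b.varno && a.varattno == b.varattno;
}

/* Behaviour before the fix, as the message describes it: attno only. */
bool
OldCheckSaysUnchanged(SketchVar column, SketchVar setExpr)
{
	return setExpr.varattno == column.varattno;
}

/*
 * Behaviour after the fix (sketch): the assignment is a no-op when the
 * SET expr is the very same column, or when some qual directly equates
 * the column with the SET expr. Transitive equalities are not followed.
 */
bool
NewCheckSaysUnchanged(SketchVar column, SketchVar setExpr,
					  const SketchEqualityQual *quals, size_t qualCount)
{
	if (SameVar(column, setExpr))
	{
		return true;
	}

	for (size_t i = 0; i < qualCount; i++)
	{
		bool forward = SameVar(quals[i].left, column) &&
					   SameVar(quals[i].right, setExpr);
		bool backward = SameVar(quals[i].left, setExpr) &&
						SameVar(quals[i].right, column);

		if (forward || backward)
		{
			return true;
		}
	}

	return false;
}
```

Applied to the queries in the summary, the UPDATE that assigns `dist_different_order_1.b` passes the old check only because both columns happen to have attno 1, while the new check rejects it; the MERGE with `SET a = src.b` is accepted because the ON clause equates exactly those two columns.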
Onur Tirtir	83b25e1fb1	Fix unexpected column index error for repartitioned merge (#8201 ) DESCRIPTION: Fixes a bug that causes an unexpected error when executing repartitioned merge. Fixes #8180. This was happening because of a bug in SourceResultPartitionColumnIndex(). And to fix it, this PR avoids using DistributionColumnIndex() in SourceResultPartitionColumnIndex(). Instead, invents FindTargetListEntryWithVarExprAttno(), which finds the index of the target entry in the source query's target list that can be used to repartition the source for a repartitioned merge. In short, to find the source target entry that references the Var used in ON (..) clause and that references the source rte, we should check the varattno of the underlying expr, which presumably is always a Var for repartitioned merge as we always wrap the source rte with a subquery, where all target entries point to the columns of the original source relation. Using DistributionColumnIndex() prior to 13.0 wasn't causing such an issue because prior to 13.0, the varattno of the underlying expr of the source target entries was almost (1) always equal to resno of the target entry as we were including all target entries of the source relation. However, starting with #7659, which was merged to main before 13.0, we started using CreateFilteredTargetListForRelation() instead of CreateAllTargetListForRelation() to compute the target entry list for the source rte to fix another bug. So we cannot revert to using CreateAllTargetListForRelation() because otherwise we would re-introduce the bug that it helped fix, so we instead had to find a way to properly deal with the "filtered target list"s, as in this commit. Plus (1), even before #7659, probably we would still fail when the source relation has dropped attributes or such because that would probably also cause such a mismatch between the varattno of the underlying expr of the target entry and its resno.	2025-09-23 11:17:51 +00:00
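The lookup described in 83b25e1fb1 above, searching the source target list by the varattno of each entry's underlying Var instead of assuming it equals the entry's resno, can be sketched as follows. The struct and function are simplified, hypothetical stand-ins and not the signature of FindTargetListEntryWithVarExprAttno().

```c
#include <stddef.h>

/* Simplified stand-in for a target entry whose expression is a Var. */
typedef struct SketchTargetEntry
{
	int resno;        /* 1-based position in the target list */
	int exprVarattno; /* varattno of the underlying Var expression */
} SketchTargetEntry;

/*
 * Returns the 0-based index of the first entry whose underlying Var
 * refers to the given attribute number, or -1 when no entry does.
 */
int
FindEntryIndexByVarattno(const SketchTargetEntry *targetList,
						 size_t length, int attno)
{
	for (size_t i = 0; i < length; i++)
	{
		if (targetList[i].exprVarattno == attno)
		{
			return (int) i;
		}
	}

	return -1;
}
```

With a filtered target list that keeps only attributes 3 and 5 of the source relation, the entries have resno 1 and 2, so a lookup keyed on resno would miss attribute 5 while the varattno-based search finds it at index 1.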
Mehmet YILMAZ	10d62d50ea	Stabilize table_checks across PG15–PG18: switch to pg_constraint, remove dupes, exclude NOT NULL (#8140 ) DESCRIPTION: Stabilize table_checks across PG15–PG18: switch to pg_constraint, remove dupes, exclude NOT NULL fixes #8138 fixes #8131 Problem ```diff diff -dU10 -w /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out --- /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out.modified 2025-08-18 12:26:51.991598284 +0000 +++ /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out.modified 2025-08-18 12:26:52.004598519 +0000 @@ -403,22 +403,30 @@ relid = 'check_example_partition_col_key_365068'::regclass; Column \| Type \| Definition ---------------+---------+--------------- partition_col \| integer \| partition_col (1 row) SELECT "Constraint", "Definition" FROM table_checks WHERE relid='public.check_example_365068'::regclass; Constraint \| Definition -------------------------------------+----------------------------------- check_example_other_col_check \| CHECK other_col >= 100 + check_example_other_col_check \| CHECK other_col >= 100 + check_example_other_col_check \| CHECK other_col >= 100 + check_example_other_col_check \| CHECK other_col >= 100 + check_example_other_col_check \| CHECK other_col >= 100 check_example_other_other_col_check \| CHECK abs(other_other_col) >= 100 -(2 rows) + check_example_other_other_col_check \| CHECK abs(other_other_col) >= 100 + check_example_other_other_col_check \| CHECK abs(other_other_col) >= 100 + check_example_other_other_col_check \| CHECK abs(other_other_col) >= 100 + check_example_other_other_col_check \| CHECK abs(other_other_col) >= 100 +(10 rows) ``` On PostgreSQL 18, `NOT NULL` is represented as a cataloged constraint and surfaces through `information_schema.check_constraints`. 
`14e87ffa5c` Our helper view `table_checks` (built on `information_schema.check_constraints` + `constraint_column_usage`) started returning: * Extra `…_not_null` rows (noise for our tests) * Duplicate rows for real CHECKs due to the one-to-many join via `constraint_column_usage` * Occasional literal formatting differences (e.g., dates) coming from the information_schema deparser ### What changed 1. Rewrite `table_checks` to use system catalogs directly We now select only expression-based, table-level constraints—excluding NOT NULL—by filtering on `contype <> 'n'` and requiring `conbin IS NOT NULL`. This yields the same effective set as real CHECKs while remaining future-proof against non-CHECK constraint types. ```sql CREATE OR REPLACE VIEW table_checks AS SELECT c.conname AS "Constraint", 'CHECK ' || -- drop a single pair of outer parens if the deparser adds them regexp_replace(pg_get_expr(c.conbin, c.conrelid, true), '^\((.*)\)$', '\1') AS "Definition", c.conrelid AS relid FROM pg_catalog.pg_constraint AS c WHERE c.contype <> 'n' -- drop NOT NULL (PG18) AND c.conbin IS NOT NULL -- only expression-bearing constraints (i.e., CHECKs) AND c.conrelid <> 0 -- table-level only (exclude domains) ORDER BY "Constraint", "Definition"; ``` Why this filter? * `contype <> 'n'` excludes PG18’s NOT NULL rows. * `conbin IS NOT NULL` restricts to expression-backed constraints (CHECKs); PK/UNIQUE/FK/EXCLUSION don’t have `conbin`. * `conrelid <> 0` removes domain constraints. 2. Add a PG18-specific regression test for `contype = 'n'` New test (`pg18_not_null_constraints`) verifies: * Coordinator tables have `n` rows for NOT NULL (columns `a`, `c`), * A worker shard has matching `n` rows, * Dropping a NOT NULL on the coordinator propagates to shards (count goes from 2 → 1), * `table_checks` never reports NOT NULL, but does report a real CHECK added for the test. --- ### Why this works (PG15–PG18) * Stable source of truth: Directly reads `pg_constraint` instead of `information_schema`. 
* No duplicates: Eliminates the `constraint_column_usage` join, removing multiplicity. * No NOT NULL noise: PG18’s `contype = 'n'` is filtered out by design. * Deterministic text: Uses `pg_get_expr` and strips a single outer set of parentheses for consistent output. --- ### Impact on tests * Removes spurious `…_not_null` entries and duplicate `checky_…` rows (e.g., in `multi_name_lengths` and similar). * Existing expected files stabilize without adding brittle normalizations. * New PG18 test asserts correct catalog behavior and Citus propagation while remaining a no-op on earlier PG versions. ---	2025-09-22 15:50:32 +03:00
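The "strip a single outer set of parentheses" step from 10d62d50ea above is done in SQL inside the view; the sketch below only mirrors that idea in C to show the intended effect on the expected output. `StripOuterParens` is a hypothetical helper, and, as an assumption of this sketch, it does not check that the first and last parentheses actually pair with each other.

```c
#include <string.h>

/*
 * Copies expr into out (capacity outSize), dropping one leading '('
 * and one trailing ')' when both are present. Returns out.
 */
char *
StripOuterParens(const char *expr, char *out, size_t outSize)
{
	size_t len = strlen(expr);
	const char *start = expr;

	if (outSize == 0)
	{
		return out;
	}

	if (len >= 2 && expr[0] == '(' && expr[len - 1] == ')')
	{
		start = expr + 1;
		len -= 2;
	}

	if (len >= outSize)
	{
		len = outSize - 1; /* truncate rather than overflow */
	}

	memcpy(out, start, len);
	out[len] = '\0';
	return out;
}
```

A definition such as `(other_col >= 100)` comes out as `other_col >= 100`, while `abs(other_other_col) >= 100` is left untouched because it does not start with a parenthesis.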
Onur Tirtir	f9b6863bf7	make multi_test_helpers re-runable	2025-09-19 17:43:56 +03:00
Onur Tirtir	762465da2c	Merge remote-tracking branch 'origin/main' into remove-stats-collector	2025-09-19 15:02:43 +03:00
Naisila Puka	b4cb1a94e9	Bump citus and citus_columnar to 14.0devel (#8170 )	2025-09-19 12:54:55 +03:00
Naisila Puka	becc02b398	Cleanup from dropping pg14 in merge isolation tests (#8204 ) These alternative test outputs are redundant since we have dropped PG14 support on main.	2025-09-19 12:01:29 +03:00
Mehmet YILMAZ	b58af1c8d5	PG18: stabilize constraint-name tests by filtering pg_constraint on contype (#8185 ) `14e87ffa5c` PostgreSQL 18 now records column `NOT NULL` constraints in `pg_constraint` (`contype = 'n'`). That means queries that previously listed “all constraints” for a relation now return extra rows, causing noisy diffs in Citus regression tests. This PR narrows each catalog probe to the specific constraint type under test (PK/UNIQUE/EXCLUDE/CHECK), keeping results stable across PG15–PG18. ## What changed * Update `src/test/regress/sql/multi_alter_table_add_constraints_without_name.sql` to: * Add `AND con.contype IN ('p'\|'u'\|'x'\|'c')` in each query, matching the constraint just created. * Join namespace via `rel.relnamespace` for robustness. * Refresh `src/test/regress/expected/multi_alter_table_add_constraints_without_name.out` to reflect the filtered results. ## Why * PG18 adds named `NOT NULL` entries to `pg_constraint`, which previously lived only in `pg_attribute`. Tests that select from `pg_constraint` without filtering now see extra rows (e.g., `*_not_null`), breaking expectations. Filtering by `contype` validates exactly what the test intends (PK/UNIQUE/EXCLUDE/CHECK naming/propagation) and ignores unrelated `NOT NULL` rows. 
```diff diff -dU10 -w /__w/citus/citus/src/test/regress/expected/multi_alter_table_add_constraints_without_name.out /__w/citus/citus/src/test/regress/results/multi_alter_table_add_constraints_without_name.out --- /__w/citus/citus/src/test/regress/expected/multi_alter_table_add_constraints_without_name.out.modified 2025-09-11 14:36:52.521254512 +0000 +++ /__w/citus/citus/src/test/regress/results/multi_alter_table_add_constraints_without_name.out.modified 2025-09-11 14:36:52.549254440 +0000 @@ -20,34 +20,36 @@ ALTER TABLE AT_AddConstNoName.products ADD PRIMARY KEY(product_no); SELECT con.conname FROM pg_catalog.pg_constraint con INNER JOIN pg_catalog.pg_class rel ON rel.oid = con.conrelid INNER JOIN pg_catalog.pg_namespace nsp ON nsp.oid = connamespace WHERE rel.relname = 'products'; conname ------------------------------ products_pkey -(1 row) + products_product_no_not_null +(2 rows) -- Check that the primary key name created on the coordinator is sent to workers and -- the constraints created for the shard tables conform to the <conname>_shardid naming scheme. \c - - :public_worker_1_host :worker_1_port SELECT con.conname FROM pg_catalog.pg_constraint con INNER JOIN pg_catalog.pg_class rel ON rel.oid = con.conrelid INNER JOIN pg_catalog.pg_namespace nsp ON nsp.oid = connamespace WHERE rel.relname = 'products_5410000'; conname -------------------------------------- + products_5410000_product_no_not_null products_pkey_5410000 -(1 row) +(2 rows) ``` after pr: https://github.com/citusdata/citus/actions/runs/17697415668/job/50298622183#step:5:265	2025-09-17 14:12:15 +03:00
Onur Tirtir	f69c62870d	Remove	2025-09-04 14:55:41 +03:00
Naisila Puka	0fd95d71e4	Order same frequency common values, and add test (#8167 ) Added similar test to what @colm-mchugh tested in the original PR https://github.com/citusdata/citus/pull/8026#discussion_r2279021218	2025-08-29 01:41:32 +03:00
Naisila Puka	d5f0ec5cd1	Fix invalid input syntax for type bigint (#8166 ) Fixes #8164	2025-08-29 01:01:18 +03:00
Naisila Puka	544b6c4716	Add GUC for queries with outer joins and pseudoconstant quals (#8163 ) Users can turn on this GUC at their own risk.	2025-08-27 22:31:22 +03:00
Colm	bb6eeb17cc	Fix bug in redundant WHERE clause detection. (#8162 ) Need to also check Postgres plan's rangetables for relations used in Initplans. DESCRIPTION: Fix a bug in redundant WHERE clause detection; we need to additionally check the Postgres plan's range tables for the presence of citus tables, to account for relations that are referenced from scalar subqueries. There is a fundamental flaw in `4139370`, the assumption that, after Postgres planning has completed, all tables used in a query can be obtained by walking the query tree. This is not the case for scalar subqueries, which will be referenced by `PARAM` nodes. The fix adds an additional check of the Postgres plan range tables; if there is at least one citus table in there we do not need to change the needs distributed planning flag. Fixes #8159	2025-08-27 13:32:02 +01:00
Muhammad Usama	62e5fcfe09	Enhance clone node replication status messages (#8152 ) - Downgrade replication lag reporting from NOTICE to DEBUG to reduce noise and improve regression test stability. - Add hints to certain replication status messages for better clarity. - Update expected output files accordingly.	2025-08-26 21:48:07 +03:00
Naisila Puka	aaa31376e0	Make columnar_chunk_filtering pass consecutive runs (#8147 ) Test was not cleaning up after itself therefore failed consecutive runs Test locally with: make check-columnar-minimal \ EXTRA_TESTS='columnar_chunk_filtering columnar_chunk_filtering'	2025-08-25 14:35:37 +03:00
Mehmet YILMAZ	f1f0b09f73	PG18 - Add BUFFERS OFF to EXPLAIN ANALYZE calls (#8101 ) Relevant PG18 commit: `c2a4078eba` - Enable buffer-usage reporting by default in `EXPLAIN ANALYZE` on PostgreSQL 18 and above. Solution: - Introduce the explicit `BUFFERS OFF` option in every existing regression test to maintain pre-PG18 output consistency. - This appends `BUFFERS OFF` to all `EXPLAIN ANALYZE(...)` calls in src/test/regress/sql and the corresponding .out files. fixes #8093	2025-08-21 13:48:50 +03:00
Naisila Puka	eaa609f510	Add citus_stats UDF (#8026 ) DESCRIPTION: Add `citus_stats` UDF This UDF acts on a Citus table, and provides `null_frac`, `most_common_vals` and `most_common_freqs` for each column in the table, based on the definitions of these columns in the Postgres view `pg_stats`. Aggregated Views: pg_stats > citus_stats. citus_stats is a view intended for use in Citus, a distributed extension of PostgreSQL. It collects and returns column-level statistics for a distributed table—specifically, the most common values, their frequencies, and fraction of null values, like pg_stats view does for regular Postgres tables. Use Case This view is useful when: - You need column-level insights on a distributed table. - You're performing query optimization, cardinality estimation, or data profiling across shards. What It Returns A table with: \| Column Name \| Data Type \| Description \| \|---------------------\|-----------\|-----------------------------------------------------------------------------\| \| schemaname \| text \| Name of the schema containing the distributed table \| \| tablename \| text \| Name of the distributed table \| \| attname \| text \| Name of the column (attribute) \| \| null_frac \| float4 \| Estimated fraction of NULLs in the column across all shards \| \| most_common_vals \| text[] \| Array of most common values for the column \| \| most_common_freqs \| float4[] \| Array of corresponding frequencies (as fractions) of the most common values\| Caveats - The function assumes that the array of the most common values among different shards will be the same, therefore it just adds everything up.	2025-08-19 23:17:13 +03:00
Muhammad Usama	be6668e440	Snapshot-Based Node Split – Foundation and Core Implementation (#8122 ) DESCRIPTION: This pull request introduces the foundation and core logic for the snapshot-based node split feature in Citus. This feature enables promoting a streaming replica (referred to as a clone in this feature and UI) to a primary node and rebalancing shards between the original and the newly promoted node without requiring a full data copy. This significantly reduces rebalance times for scale-out operations where the new node already contains a full copy of the data via streaming replication. Key Highlights: 1. Replica (Clone) Registration & Management Infrastructure Introduces a new set of UDFs to register and manage clone nodes: - citus_add_clone_node() - citus_add_clone_node_with_nodeid() - citus_remove_clone_node() - citus_remove_clone_node_with_nodeid() These functions allow administrators to register a streaming replica of an existing worker node as a clone, making it eligible for later promotion via snapshot-based split. 2. Snapshot-Based Node Split (Core Implementation) New core UDF: - citus_promote_clone_and_rebalance() This function implements the full workflow to promote a clone and rebalance shards between the old and new primaries. Steps include: 1. Ensuring Exclusivity – Blocks any concurrent placement-changing operations. 2. Blocking Writes – Temporarily blocks writes on the primary to ensure consistency. 3. Replica Catch-up – Waits for the replica to be fully in sync. 4. Promotion – Promotes the replica to a primary using pg_promote. 5. Metadata Update – Updates metadata to reflect the newly promoted primary node. 6. Shard Rebalancing – Redistributes shards between the old and new primary nodes. 3. Split Plan Preview A new helper UDF get_snapshot_based_node_split_plan() provides a preview of the shard distribution post-split, without executing the promotion. 
Example: ``` reb 63796> select * from pg_catalog.get_snapshot_based_node_split_plan('127.0.0.1',5433,'127.0.0.1',5453); table_name \| shardid \| shard_size \| placement_node --------------+---------+------------+---------------- companies \| 102008 \| 0 \| Primary Node campaigns \| 102010 \| 0 \| Primary Node ads \| 102012 \| 0 \| Primary Node mscompanies \| 102014 \| 0 \| Primary Node mscampaigns \| 102016 \| 0 \| Primary Node msads \| 102018 \| 0 \| Primary Node mscompanies2 \| 102020 \| 0 \| Primary Node mscampaigns2 \| 102022 \| 0 \| Primary Node msads2 \| 102024 \| 0 \| Primary Node companies \| 102009 \| 0 \| Clone Node campaigns \| 102011 \| 0 \| Clone Node ads \| 102013 \| 0 \| Clone Node mscompanies \| 102015 \| 0 \| Clone Node mscampaigns \| 102017 \| 0 \| Clone Node msads \| 102019 \| 0 \| Clone Node mscompanies2 \| 102021 \| 0 \| Clone Node mscampaigns2 \| 102023 \| 0 \| Clone Node msads2 \| 102025 \| 0 \| Clone Node (18 rows) ``` 4 Test Infrastructure Enhancements - Added a new test case scheduler for snapshot-based split scenarios. - Enhanced pg_regress_multi.pl to support creating node backups with slightly modified options to simulate real-world backup-based clone creation. ### 5. Usage Guide The snapshot-based node split can be performed using the following workflow: - Take a Backup of the Worker Node Run pg_basebackup (or an equivalent tool) against the existing worker node to create a physical backup. `pg_basebackup -h <primary_worker_host> -p <port> -D /path/to/replica/data --write-recovery-conf ` - Start the Replica Node Start PostgreSQL on the replica using the backup data directory, ensuring it is configured as a streaming replica of the original worker node. 
- Register the Backup Node as a Clone Mark the registered replica as a clone of its original worker node: `SELECT * FROM citus_add_clone_node('<clone_host>', <clone_port>, '<primary_host>', <primary_port>); ` - Promote and Rebalance the Clone Promote the clone to a primary and rebalance shards between it and the original worker: `SELECT * FROM citus_promote_clone_and_rebalance('clone_node_id'); ` - Drop Any Replication Slots from the Original Worker After promotion, clean up any unused replication slots from the original worker: `SELECT pg_drop_replication_slot('<slot_name>'); `	2025-08-19 14:13:55 +03:00
Muhammad Usama	f743b35fc2	Parallelize Shard Rebalancing & Unlock Concurrent Logical Shard Moves (#7983 ) DESCRIPTION: Parallelizes shard rebalancing and removes the bottlenecks that previously blocked concurrent logical-replication moves. These improvements reduce rebalance windows, particularly for clusters with large reference tables, and enable multiple shard transfers to run in parallel. Motivation: Citus’ shard rebalancer has some key performance bottlenecks: Sequential Movement of Reference Tables: Reference tables are often assumed to be small, but in real-world deployments, they can grow quite large. Previously, reference table shards were transferred as a single unit, making the process monolithic and time-consuming. No Parallelism Within a Colocation Group: Although Citus distributes data using colocated shards, shard movements within the same colocation group were serialized. In environments with hundreds of distributed tables colocated together, this serialization significantly slowed down rebalance operations. Excessive Locking: The rebalancer used restrictive locks and redundant logical replication guards, further limiting concurrency. The goal of this commit is to eliminate these inefficiencies and enable maximum parallelism during rebalance, without compromising correctness or compatibility. Feature Summary: 1. Parallel Reference Table Rebalancing Each reference-table shard is now copied in its own background task. Foreign key and other constraints are deferred until all shards are copied. To move a single shard without considering colocation, a new internal-only UDF `citus_internal_copy_single_shard_placement` is introduced to allow single-shard copy/move operations. Since this function is internal, we do not allow users to call it directly.
Temporary Hack to Set Background Task Context Background tasks cannot currently set custom GUCs like application_name before executing internal-only functions. As a workaround, a 'citus_rebalancer ...' statement is added as a prefix in the task command. This is a temporary hack to label internal tasks until proper GUC injection support is added to the background task executor. 2. Changes in Locking Strategy - Dropped the leftover replication lock that previously serialized shard moves performed via logical replication. This lock was only needed when we used to drop and recreate the subscriptions/publications before each move. Since Citus now removes those objects later as part of the “unused distributed objects” cleanup, shard moves via logical replication can safely run in parallel without additional locking. - Introduced a per-shard advisory lock to prevent concurrent operations on the same shard while allowing maximum parallelism elsewhere. - Changed the lock mode in AcquirePlacementColocationLock from ExclusiveLock to RowExclusiveLock to allow concurrent updates within the same colocation group, while still preventing concurrent DDL operations. 3. citus_rebalance_start() enhancements The citus_rebalance_start() function now accepts two new optional parameters: ``` - parallel_transfer_colocated_shards BOOLEAN DEFAULT false, - parallel_transfer_reference_tables BOOLEAN DEFAULT false ``` Both default to false, which preserves the existing behavior and ensures backward compatibility; when both are set to true, the rebalancer operates with full parallelism. Previous Rebalancer Behavior: `SELECT citus_rebalance_start(shard_transfer_mode := 'force_logical');` This would start a single background task for replicating all reference tables, then move all shards serially, one at a time. ``` Task 1: replicate_reference_tables() ↓ Task 2: move_shard_1() ↓ Task 3: move_shard_2() ↓ Task 4: move_shard_3() ``` Slow and sequential. Reference table copy is a bottleneck.
Colocated shards must wait for each other. New Parallel Rebalancer: ``` SELECT citus_rebalance_start( shard_transfer_mode := 'force_logical', parallel_transfer_colocated_shards := true, parallel_transfer_reference_tables := true ); ``` This would: - Schedule independent background tasks for each reference-table shard. - Move colocated shards in parallel, while still maintaining dependency order. - Defer constraint application until all reference shards are in place. - ``` Task 1: copy_ref_shard_1() Task 2: copy_ref_shard_2() Task 3: copy_ref_shard_3() → Task 4: apply_constraints() ↓ Task 5: copy_shard_1() Task 6: copy_shard_2() Task 7: copy_shard_3() ↓ Task 8-10: move_shard_1..3() ``` Each operation is scheduled independently and can run as soon as dependencies are satisfied.	2025-08-18 17:44:14 +03:00
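The dependency ordering in the task diagram above can be sketched as a small wave scheduler. This is only an illustrative model, not the Citus background task executor: the `schedule_waves` helper and the dependency map are hypothetical, with task names borrowed from the diagram and the edges between them assumed from its arrows.

```python
def schedule_waves(deps):
    """Group tasks into waves; a task is ready once all its dependencies are done."""
    remaining = {task: set(d) for task, d in deps.items()}
    done, waves = set(), []
    while remaining:
        # Every task whose dependencies are all finished can run in parallel.
        ready = sorted(t for t, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


# Hypothetical dependency map mirroring the diagram (edges are an assumption).
ref_copies = [f"copy_ref_shard_{i}" for i in (1, 2, 3)]
deps = {task: [] for task in ref_copies}
deps["apply_constraints"] = ref_copies
for i in (1, 2, 3):
    deps[f"copy_shard_{i}"] = ["apply_constraints"]
    deps[f"move_shard_{i}"] = [f"copy_shard_{i}"]

waves = schedule_waves(deps)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

In this model the three reference-shard copies form the first wave, constraint application waits for all of them, and the colocated shard copies and moves follow in their own parallel waves.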
eaydingol	8d929d3bf8	Push down recurring outer joins when possible (#7973 ) DESCRIPTION: Adds support for pushing down LEFT/RIGHT outer joins that have a reference table on the outer side and a distributed table on the inner side (e.g., <reference table> LEFT JOIN <distributed table>) Partially addresses #6546 1) `<outer:reference>` LEFT JOIN `<inner:distributed>` 2) `<inner:distributed>` RIGHT JOIN `<outer:reference>` Previously, for outer joins of types (1) and (2), the distributed side was computed recursively. This was necessary because, when the inner side of a recurring outer join is a distributed table, it is not possible to directly distribute the join; the preserved (outer and recurring) side may generate rows with join keys that hash to different shards. To implement distributed planning while maintaining consistency with global execution semantics, this PR restricts the outer side only to those partition key values that route to the selected shard during distributed shard query computation. This method is employed when the following criteria are met (recursive planning is applied otherwise): - The join type is (1) or (2) (lateral joins are not supported). - The outer side is a reference table. - The outer join qualifications include an equality condition between the partition column of a distributed table and the recurring table. - The join is not part of a chained join. - The “enable_recurring_outer_join_pushdown” GUC is enabled (default is on). --------- Co-authored-by: ebruaydingol <ebruaydingol@microsoft.com> Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2025-08-18 14:03:44 +03:00
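The reason the outer side has to be restricted per shard can be illustrated with a toy model. This is a sketch under stated assumptions, not the Citus planner: `shard_of` is a hypothetical modulo stand-in for hash partitioning, and the tables are small in-memory lists of `(key, value)` pairs.

```python
from collections import Counter

SHARD_COUNT = 2  # hypothetical shard count for the toy model


def shard_of(key):
    # Toy stand-in for hash partitioning (assumption: modulo instead of a real hash range).
    return key % SHARD_COUNT


def left_join(outer_rows, inner_rows):
    """Plain LEFT JOIN on the first tuple element; unmatched outer rows get None."""
    result = []
    for o_key, o_val in outer_rows:
        matches = [i_val for i_key, i_val in inner_rows if i_key == o_key]
        if matches:
            result.extend((o_key, o_val, m) for m in matches)
        else:
            result.append((o_key, o_val, None))
    return result


reference = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]   # outer, recurring side
distributed = [(1, "x"), (1, "y"), (4, "z")]           # inner, distributed side
shards = {s: [r for r in distributed if shard_of(r[0]) == s]
          for s in range(SHARD_COUNT)}

# What a single-node execution would return.
global_result = left_join(reference, distributed)

# Pushdown with the outer side restricted to keys that route to each shard.
pushed_down = []
for shard_id, shard_rows in shards.items():
    restricted = [r for r in reference if shard_of(r[0]) == shard_id]
    pushed_down.extend(left_join(restricted, shard_rows))

# Naive pushdown: every shard joins against the full reference table.
naive = []
for shard_rows in shards.values():
    naive.extend(left_join(reference, shard_rows))

print(Counter(pushed_down) == Counter(global_result))
print(len(naive), "rows from naive pushdown vs", len(global_result), "expected")
```

Without the restriction, each shard emits NULL-extended rows for reference keys that belong to other shards, so the union contains spurious rows; with it, each outer row is preserved by exactly one shard.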
Onur Tirtir	87a1b631e8	Not automatically create citus_columnar when creating citus extension (#8081 ) DESCRIPTION: Not automatically create citus_columnar when there are no relations using it. Previously, we were always creating citus_columnar when creating citus with version >= 11.1. We did this as follows: * Detach SQL objects owned by old columnar, i.e., "drop" them from citus, but not actually drop them from the database * "old columnar" is the one that we had before Citus 11.1 as part of citus, i.e., before splitting the access method and its catalog to citus_columnar. * Create citus_columnar and attach the SQL objects leftover from old columnar to it so that we can continue supporting the columnar tables that the user had before Citus 11.1 with citus_columnar. The first part is unchanged; however, now we don't create citus_columnar automatically anymore if the user didn't have any relations using columnar. For this reason, as of Citus 13.2, when these SQL objects are not owned by an extension and there are no relations using columnar access method, we drop these SQL objects when updating Citus to 13.2. The net effect is still the same as if we automatically created citus_columnar and user dropped citus_columnar later, so we should not have any issues with dropping them. (Update: Seems we've made some assumptions in citus, e.g., citus_finish_pg_upgrade() still assumes columnar metadata exists and tries to apply some fixes for it, so this PR fixes them as well. See the last section of this PR description.) Also, ideally I was hoping to just remove some lines of code from extension.c, where we decide whether to automatically create citus_columnar when creating citus; however, this turned out not to be possible for two reasons: * We still need to automatically create it for the servers using columnar access method.
* We need to clean up the leftover SQL objects from old columnar when the above is not the case; otherwise we would have leftover SQL objects from old columnar for no reason, and that would confuse users too. * Old columnar cannot be used to create columnar tables properly, so we should clean them up and let the user decide whether they want to create citus_columnar when they really need it later. --- Also made several changes in the test suite because similarly, we don't always want to have citus_columnar created in citus tests anymore: * Now, columnar specific test targets, which cover 41 test sql files, always install columnar by default, by using "--load-extension=citus_columnar". * "--load-extension=citus_columnar" is not added to citus specific test targets because by default we don't want to have citus_columnar created during citus tests. * Excluding citus_columnar specific tests, we have 601 sql files that we have as citus tests and in 27 of them we manually create citus_columnar at the very beginning of the test because these tests do test some functionalities of citus together with columnar tables. Also, before and after schedules for PG upgrade tests are now duplicated so we have two versions of each: one with columnar tests and one without. To choose between them, check-pg-upgrade now supports a "test-with-columnar" option, which can be set to "true" or anything else to logically indicate "false". In CI, we run the check-pg-upgrade test target with both options. The purpose is to ensure we can test PG upgrades where citus_columnar is not created in the cluster before the upgrade as well. Finally, added more tests to multi_extension.sql to test Citus upgrade scenarios with / without columnar tables / citus_columnar extension. --- Also, seems citus_finish_pg_upgrade was assuming that citus_columnar is always created but actually we should have never made such an assumption.
To fix that, moved columnar specific post-PG-upgrade work from citus to a new columnar UDF, which is columnar_finish_pg_upgrade. But to avoid breaking existing customer / managed service scripts, we continue to automatically perform post PG-upgrade work for columnar within citus_finish_pg_upgrade, but only if columnar access method exists this time.	2025-08-18 08:29:27 +01:00
Mehmet YILMAZ	a6161f5a21	Fix CTE traversal for outer Vars in FindReferencedTableColumn (remove assert; correct parentQueryList handling) (#8106 ) fixes #8105 This change lets `FindReferencedTableColumn()` correctly resolve columns through a CTE even when the expression comes from an outer query level (`varlevelsup > 0`, `skipOuterVars = false`). Before, we hit an `Assert(skipOuterVars)` in this path. Problem * Hitting a CTE after walking outer Vars triggered `Assert(skipOuterVars)`. * Cause: we modified `parentQueryList` in place and didn’t rebuild the correct parent chain before recursing into the CTE, so the path was considered unsafe. Fix * Remove the `Assert(skipOuterVars)` in the `RTE_CTE` branch. * Find the CTE’s owning level via `ctelevelsup` and compute `cteParentListIndex`. * Rebuild a private parent list for recursion: `list_copy` → `list_truncate` → `lappend(current query)`. * Add a bounds check before indexing the CTE’s `targetList`. Why it works ```diff -parentQueryList = lappend(parentQueryList, query); -FindReferencedTableColumn(targetEntry->expr, parentQueryList, - cteQuery, column, rteContainingReferencedColumn, - skipOuterVars); + /* hand a private, bounded parent list to the recursion */ + List *newParent = list_copy(parentQueryList); + newParent = list_truncate(newParent, cteParentListIndex + 1); + newParent = lappend(newParent, query); + + FindReferencedTableColumn(targetEntry->expr, + newParent, + cteQuery, + column, + rteContainingReferencedColumn, + skipOuterVars); +} ``` Before: We changed `parentQueryList` in place (`parentQueryList = lappend(...)`) and didn’t trim it to the CTE’s owner level. After: We copy the list, trim it to the CTE’s owner level, then append the current query. This keeps the parent list accurate for the current recursion and safe when following outer Vars.
Example: Nested subquery referencing the CTE (two levels down) ``` WITH c AS MATERIALIZED (SELECT user_id FROM raw_events_first) SELECT 1 FROM raw_events_first t WHERE EXISTS ( SELECT 1 FROM (SELECT user_id FROM c) c2 WHERE c2.user_id = t.user_id ); ``` Levels: Q0 = top SELECT Q1 = EXISTS subquery Q2 = inner (SELECT user_id FROM c) When resolving c2.user_id inside Q2: - parentQueryList is [Q0, Q1, Q2]. - `ctelevelsup`: 2 `cteParentListIndex = length(parentQueryList) - ctelevelsup - 1` - Recurse into the CTE’s query with [Q0, Q2]. Tests (added in `multi_insert_select`) * T1: Correlated subquery that references a CTE (one level down) Verifies that resolving through `RTE_CTE` after following an outer `Var` succeeds, row count matches source table. * T2: Nested subquery that references a CTE (two levels down) Exercises deeper recursion and confirms identical to T1. * T3: Scalar subquery in a target list that reads from the outer CTE Checks expected row count and that no NULLs are inserted. These tests cover the cases that previously hit `Assert(skipOuterVars)` and confirm CTE references while following outer Vars.	2025-08-12 11:49:50 +03:00
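The copy, truncate, append sequence from this commit can be modelled on plain Python lists to check the worked example. This is a sketch only: `rebuild_parent_list` is a hypothetical name, query levels are represented as strings, and the bounds check is an added assumption rather than a copy of the C code.

```python
def rebuild_parent_list(parent_query_list, ctelevelsup, current_query):
    """Model of list_copy -> list_truncate -> lappend without mutating the input."""
    cte_parent_list_index = len(parent_query_list) - ctelevelsup - 1
    if cte_parent_list_index < 0:
        # Assumed guard: ctelevelsup points above the top of the query stack.
        raise ValueError("ctelevelsup exceeds the depth of the parent list")
    new_parent = list(parent_query_list)                    # list_copy
    new_parent = new_parent[:cte_parent_list_index + 1]     # list_truncate
    new_parent.append(current_query)                        # lappend(current query)
    return new_parent


parents = ["Q0", "Q1", "Q2"]
result = rebuild_parent_list(parents, ctelevelsup=2, current_query="Q2")
print(result)   # the list handed to the recursion into the CTE's query
print(parents)  # the caller's list is left untouched
```

For the two-levels-down example, the index evaluates to 0, so the list is trimmed to the CTE's owning level before the current query is appended.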
Mehmet YILMAZ	6b6d959fac	PG18 - pg17.sql Simplify step 10 verification to use COUNT() instead of SELECT (#8111 ) fixes #8096 PostgreSQL 18 adds a `conenforced` flag allowing `CHECK` constraints to be declared `NOT ENFORCED`. `ca87c415e2` ```diff @@ -1256,26 +1278,26 @@ distributed_partitioned_table_id_partition_col_excl \| x (2 rows) -- Step 9: Drop the exclusion constraints from both tables \c - - :master_host :master_port SET search_path TO pg17; ALTER TABLE distributed_partitioned_table DROP CONSTRAINT dist_exclude_named; ALTER TABLE local_partitioned_table DROP CONSTRAINT local_exclude_named; -- Step 10: Verify the constraints were dropped SELECT * FROM pg_constraint WHERE conname = 'dist_exclude_named' AND contype = 'x'; - oid \| conname \| connamespace \| contype \| condeferrable \| condeferred \| convalidated \| conrelid \| contypid \| conindid \| conparentid \| confrelid \| confupdtype \| confdeltype \| confmatchtype \| conislocal \| coninhcount \| connoinherit \| conkey \| confkey \| conpfeqop \| conppeqop \| conffeqop \| confdelsetcols \| conexclop \| conbin + oid \| conname \| connamespace \| contype \| condeferrable \| condeferred \| conenforced \| convalidated \| conrelid \| contypid \| conindid \| conparentid \| confrelid \| confupdtype \| confdeltype \| confmatchtype \| conislocal \| coninhcount \| connoinherit \| conperiod \| conkey \| confkey \| conpfeqop \| conppeqop \| conffeqop \| confdelsetcols \| conexclop \| conbin -----+---------+--------------+---------+---------------+-------------+-------------+--------------+----------+----------+----------+-------------+-----------+-------------+-------------+---------------+------------+-------------+--------------+-----------+--------+---------+-----------+-----------+-----------+----------------+-----------+-------- (0 rows) SELECT * FROM pg_constraint WHERE conname = 'local_exclude_named' AND contype = 'x'; - oid \| conname \| connamespace \| contype \| condeferrable \| condeferred \| convalidated 
\| conrelid \| contypid \| conindid \| conparentid \| confrelid \| confupdtype \| confdeltype \| confmatchtype \| conislocal \| coninhcount \| connoinherit \| conkey \| confkey \| conpfeqop \| conppeqop \| conffeqop \| confdelsetcols \| conexclop \| conbin + oid \| conname \| connamespace \| contype \| condeferrable \| condeferred \| conenforced \| convalidated \| conrelid \| contypid \| conindid \| conparentid \| confrelid \| confupdtype \| confdeltype \| confmatchtype \| conislocal \| coninhcount \| connoinherit \| conperiod \| conkey \| confkey \| conpfeqop \| conppeqop \| conffeqop \| confdelsetcols \| conexclop \| conbin -----+---------+--------------+---------+---------------+-------------+-------------+--------------+----------+----------+----------+-------------+-----------+-------------+-------------+---------------+------------+-------------+--------------+-----------+--------+---------+-----------+-----------+-----------+----------------+-----------+-------- (0 rows) ``` The purpose of step 10 is merely to confirm that the exclusion constraints dist_exclude_named and local_exclude_named have been dropped. There’s no need to pull back every column from pg_constraint—we only care about whether any matching row remains. - Reduces noise in the output - Eliminates dependence on the full set of pg_constraint columns (which can drift across Postgres versions) - Resolves the pg18 regression diff without altering test expectations	2025-08-08 13:46:11 +03:00
eaydingol	3d8fd337e5	Check outer table partition column (#8092 ) DESCRIPTION: Introduce a new check to push down a query including union and outer join to fix #8091 . In "SafeToPushdownUnionSubquery", we check if the distribution column of the outer relation is in the target list.	2025-08-06 16:13:14 +03:00
Teja Mupparti	889aa92ac0	EXPLAIN ANALYZE - Prevent execution of the plan during the plan-print (#8017 ) DESCRIPTION: Fixed a bug in EXPLAIN ANALYZE to prevent unintended (duplicate) execution of the (sub)plans during the explain phase. Fixes #4212 ### 🐞 Bug #4212 : Redundant (Subplan) Execution in `EXPLAIN ANALYZE` codepath #### 🔍 Background In the standard PostgreSQL execution path, `ExplainOnePlan()` is responsible for two distinct operations depending on whether `EXPLAIN ANALYZE` is requested: 1. Execute the plan ```c if (es->analyze) ExecutorRun(queryDesc, direction, 0L, true); ``` 2. Print the plan tree ```c ExplainPrintPlan(es, queryDesc); ``` When printing the plan, the executor should not run the plan again. Execution is only expected to happen once, at the top level, when `es->analyze = true`. --- #### ⚠️ Issue in Citus In Citus, `CustomScanMethods.ExplainCustomScan = CitusExplainScan`, a custom scan explain callback used to print explain information for a Citus plan, incorrectly performs redundant execution inside the explain path of `ExplainPrintPlan()`: ```c ExplainOnePlan() ExplainPrintPlan() ExplainNode() CitusExplainScan() if (distributedPlan->subPlanList != NIL) { ExplainSubPlans(distributedPlan, es); { PlannedStmt plan = subPlan->plan; ExplainOnePlan(plan, ...); // ⚠️ May re-execute subplan if es->analyze is true } } ``` This causes the subplans to be executed again, even though they have already been executed during the top-level plan execution. This behavior violates the expectation in PostgreSQL that `EXPLAIN ANALYZE` should execute each node exactly once for analysis. --- #### ✅ Fix (proposed) Save the output of Subplans during `ExecuteSubPlans()`, and later use it in `ExplainSubPlans()`	2025-07-30 11:29:50 -07:00
Cédric Villemain	0c1b31cdb5	Fix UPDATE stmts with indirection & array/jsonb subscripting with more than 1 field (#7675 ) DESCRIPTION: Fixes problematic UPDATE statements with indirection and array/jsonb subscripting with more than one field. Fixes #4092, #7674 and #5621. Issues #7674 and #4092 involve an UPDATE with out of order columns and a sublink (SELECT) in the source, e.g. `UPDATE T SET (col3, col1, col4) = (SELECT 3, 1, 4)` where an incorrect value could get written to a column because query deparsing generated an incorrect SQL statement. To address this the fix adds an additional check to `ruleutils` to ensure that the target list of an UPDATE statement is in an order so that deparsing can be done safely. It is needed when the source of the UPDATE has a sublink, because Postgres `rewrite` will have put the target list in attribute order, but for deparsing to produce a correct SQL text the target list needs to be in order of the references (or `paramids`) to the target list of the sublink(s). Issue #5621 involves an UPDATE with array/jsonb subscripting that can behave incorrectly with more than one field, again because Citus query deparsing is receiving a post-`rewrite` query tree. The fix also adds a check to `ruleutils` to enable correct query deparsing of the UPDATE. --------- Co-authored-by: Ibrahim Halatci <ihalatci@gmail.com> Co-authored-by: Colm McHugh <colm.mchugh@gmail.com>	2025-07-22 17:49:26 +01:00
Colm	245a62df3e	Avoid query deparse and planning of shard query in local execution. (#8035 ) DESCRIPTION: Avoid query deparse and planning of shard query in local execution. Adds citus.enable_local_execution_local_plan GUC to allow avoiding unnecessary query deparsing to improve performance of fast-path queries targeting local shards. If a fast path query resolves to a shard that is local to the node planning the query, a shortcut can be taken so that the OID of the shard is plugged into the parse tree, which is then planned by Postgres. In `local_executor.c` the task uses that plan instead of parsing and planning a shard query. How this is done: The fast path planner identifies if the shortcut is possible, and then the distributed planner checks, using `CheckAndBuildDelayedFastPathPlan()`, if a local plan can be generated or if the shard query should be generated. This optimization is controlled by a GUC `citus.enable_local_execution_local_plan` which is on by default. A new regress test `local_execution_local_plan` tests both row-sharding and schema sharding. Negative tests are added to `local_shard_execution_dropped_column` to verify that the optimization is not taken when the shard is local but there is a difference between the shard and distributed table because of a dropped column.	2025-07-22 17:16:53 +01:00
SongYoungUk	743c9bbf87	fix #7715 - add assign hook for CDC library path adjustment (#8025 ) DESCRIPTION: Automatically updates dynamic_library_path when CDC is enabled fix : #7715 According to the documentation and `pg_settings`, the context of the `citus.enable_change_data_capture` parameter is user. However, changing this parameter — even as a superuser — doesn't work as expected: while the initial copy phase works correctly, subsequent change events are not propagated. This appears to be due to the fact that `dynamic_library_path` is only updated to `$libdir/citus_decoders:$libdir` when the server is restarted and the `_PG_init` function is invoked. To address this, I added an `EnableChangeDataCaptureAssignHook` that automatically updates `dynamic_library_path` at runtime when `citus.enable_change_data_capture` is enabled, ensuring that the CDC decoder libraries are properly loaded. Note that `dynamic_library_path` is already a `superuser`-context parameter in base PostgreSQL, so updating it from within the assign hook should be safe and consistent with PostgreSQL’s configuration model. If there’s any reason this approach might be problematic or if there’s a preferred alternative, I’d appreciate any feedback. cc. @jy-min --------- Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com> Co-authored-by: ibrahim halatci <ihalatci@gmail.com>	2025-07-18 11:07:17 +03:00
naisila	4cd8bb1b67	Bump Citus version to 13.2devel	2025-06-24 16:21:48 +02:00
Onur Tirtir	55a0d1f730	Add skip_qualify_public param to shard_name() to allow qualifying for "public" schema (#8014 ) DESCRIPTION: Adds skip_qualify_public param to `shard_name()` UDF to allow qualifying for "public" schema when needed.	2025-06-02 10:15:32 +03:00
Alper Kocatas	088ba75057	Add citus_nodes view (#7968 ) DESCRIPTION: Adds `citus_nodes` view that displays the node name, port, role, and "active" for nodes in the cluster. This PR adds `citus_nodes` view to the `pg_catalog` schema. The `citus_nodes` view is created in the `citus` schema and is used to display the node name, port, role, and active status of each node in the `pg_dist_node` table. The view is granted `SELECT` permission to the `PUBLIC` role and is set to the `pg_catalog` schema. Test cases were added to `multi_cluster_management` tests. structs.py was modified to add whitespace as required by `citus_indent`. --------- Co-authored-by: Alper Kocatas <alperkocatas@microsoft.com>	2025-05-14 15:05:12 +03:00
Naisila Puka	a18040869a	Error out for queries with outer joins and pseudoconstant quals in PG<17 (#7937 ) PG15 commit d1ef5631e620f9a5b6480a32bb70124c857af4f1 and PG16 commit 695f5deb7902865901eb2d50a70523af655c3a00 disallow replacing joins with scans in queries with pseudoconstant quals. This commit prevents the set_join_pathlist_hook from being called if any of the join restrictions is a pseudo-constant. So in these cases, citus has no info on the join, never sees that the query has an outer join, and ends up producing an incorrect plan. PG17 fixes this by commit 9e9931d2bf40e2fea447d779c2e133c2c1256ef3 Therefore, we take this extra measure here for PG versions less than 17. hasOuterJoin can never be true when set_join_pathlist_hook is absent.	2025-05-11 21:47:28 +00:00
Mehmet YILMAZ	a4040ba5da	Planner: lift volatile target‑list items in `WrapSubquery` to coordinator (prevents sequence‑leap in distributed `INSERT … SELECT`) (#7976 ) This PR fixes #7784 and refactors the `WrapSubquery(Query subquery)` function to improve clarity and correctness when handling volatile expressions in subqueries during Citus insert-select rewriting. ### Background The `WrapSubquery` function rewrites a query of the form: ```sql INSERT INTO target_table SELECT ... FROM ... ``` ...by wrapping the `SELECT` in a subquery: ```sql SELECT <outer-TL> FROM ( <subquery with volatile expressions replaced with NULL> ) citus_insert_select_subquery ``` This transformation allows: Volatile expressions (e.g., `nextval`, `now`) not used in `GROUP BY` or `ORDER BY` to be evaluated exactly once on the coordinator. * Stable/immutable or sort-relevant expressions to remain in the worker-executed subquery. * Placeholder `NULL`s to maintain column alignment in the inner subquery. ### Fix Details * Restructured the code into labeled logical sections: 1. Build wrapper query (`SELECT … FROM (subquery)`) 2. Rewrite target lists with volatility analysis 3. Assign and return updated query trees * Preserved existing behavior, focusing on clarity and maintainability. ### How the new code handles volatile items stage \| what we look for \| what we do \| why -- \| -- \| -- \| -- scan target list once \| 1. `expr_is_volatile(te->expr)` 2. `te->ressortgroupref != 0` (is the column used in GROUP BY / ORDER BY?) 
\| decide whether to hoist or keep \| we must not hoist an expression the inner query still needs for sorting/grouping, otherwise its `SortGroupClause` breaks volatile & not used in sort/group \| deep‑copy the expression into the outer target list \| executes once on the coordinator \| \| leave a typed `NULL `placeholder (visible, not `resjunk`) in the inner target list \| keeps column numbering stable for helpers that already ran (reorder, cast); the worker sends a cheap constant \| stable / immutable, or volatile but used in sort/group \| keep the original expression in the inner list; outer list references it via a `Var `\| workers can evaluate it safely and, if needed, the inner ORDER BY still works \| ### Example Given this query: ```sql INSERT INTO t SELECT nextval('s'), 42 FROM generate_series(1, 2); ``` The planner rewrites it as: ```sql SELECT nextval('s'), col2 FROM (SELECT NULL::bigint AS col1, 42 AS col2 FROM generate_series(1, 2)) citus_insert_select_subquery; ``` This ensures `nextval('s')` is evaluated only once per row on the coordinator, not on each worker node, preserving correct sequence semantics. #### Outer‑Var guard (`FindReferencedTableColumn`) Because `WrapSubquery` adds an extra query level, lots of Vars that the old code never expected become “outer” Vars; without teaching `FindReferencedTableColumn` to climb that extra level reliably, Citus would intermittently reject valid foreign keys and even hit asserts. * Re‑implemented the outer‑Var guard so that the function: * Walks deterministically up the query stack when `skipOuterVars = false` (default for FK / UNION checks). A new while‑loop copies — rather than truncates — `parentQueryList` on each hop, eliminating list‑aliasing that made issue 5248 fail intermittently in parallel regressions. * Handles multi‑level `varlevelsup` in a single loop; never mutates the caller’s list in place.	2025-05-06 17:45:49 +03:00
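The hoist-or-keep decision from the table in this commit can be sketched as a small model that reproduces the `nextval('s'), 42` example. It is not the `WrapSubquery` implementation: the `TargetEntry` dataclass, the `split_target_list` helper, and the `colN` alias scheme are hypothetical, and target entries are represented as SQL text rather than expression trees.

```python
from dataclasses import dataclass


@dataclass
class TargetEntry:
    expr: str                 # SQL text of the expression (stand-in for te->expr)
    type_name: str            # type used for the typed NULL placeholder
    is_volatile: bool         # stand-in for the volatility check on te->expr
    ressortgroupref: int = 0  # non-zero when used in GROUP BY / ORDER BY


def split_target_list(entries):
    """Return (outer, inner) target lists following the hoist-or-keep rule."""
    outer, inner = [], []
    for position, te in enumerate(entries, start=1):
        alias = f"col{position}"
        if te.is_volatile and te.ressortgroupref == 0:
            # Hoist: evaluate on the coordinator, leave a typed NULL placeholder.
            outer.append(te.expr)
            inner.append(f"NULL::{te.type_name} AS {alias}")
        else:
            # Keep: workers evaluate it; the outer list references the column.
            outer.append(alias)
            inner.append(f"{te.expr} AS {alias}")
    return outer, inner


entries = [
    TargetEntry("nextval('s')", "bigint", is_volatile=True),
    TargetEntry("42", "integer", is_volatile=False),
]
outer, inner = split_target_list(entries)
print(f"SELECT {', '.join(outer)} FROM (SELECT {', '.join(inner)} "
      "FROM generate_series(1, 2)) citus_insert_select_subquery;")
```

A volatile entry with a non-zero `ressortgroupref` stays in the inner list in this model, matching the rule that expressions needed for sorting or grouping are not hoisted.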
Colm	d4dd44e715	Propagate SECURITY LABEL on tables and columns. (#7956 ) Issue #7709 asks for security labels on columns to be propagated, to support the `anon` extension. Before, Citus supported security labels on roles (#7735) and this PR adds support for propagating security labels on tables and columns. All scenarios that involve propagating metadata for a Citus table now include the security labels on the table and on the columns of the table. These scenarios are: - When a table becomes distributed using `create_distributed_table()` or `create_reference_table()`, its security labels (if any) are propagated. - When a security label is defined on a distributed table, or one of its columns, the label is propagated. - When a node is added to a Citus cluster, all distributed tables have their security labels propagated. - When a column of a distributed table is dropped, any security labels on the column are also dropped. - When a column is added to a distributed table, security labels can be defined on the column and are propagated. - Security labels on a distributed table or its columns are not propagated when `citus.enable_metadata_sync` is enabled. Regress test `seclabel` is extended with tests to cover these scenarios. The implementation is somewhat involved because it impacts DDL propagation of Citus tables, but can be broken down as follows: - distributed_object_ops has `Role_SecLabel`, `Table_SecLabel` and `Column_SecLabel` to take care of security labels on roles, tables and columns. `Any_SecLabel` is used for all other security labels and is essentially a nop. - Deparser support - `DeparseRoleSecLabelStmt()`, `DeparseTableSecLabelStmt()` and `DeparseColumnSecLabelStmt()` take care of deparsing security label statements on roles, tables and columns respectively.
- When reconstructing the DDL for a citus table, security labels on the table or its columns are included by having `GetPreLoadTableCreationCommands()` call a new function `CreateSecurityLabelCommands()` to take care of any security labels on the table or its columns. - When changing a distributed table name to a shard name before running a command locally on a worker, function `RelayEventExtendNames()` checks for security labels on a table or its columns.	2025-04-30 18:03:52 +01:00
Onur Tirtir	3d61c4dc71	Add citus_stat_counters view and citus_stat_counters_reset() function to reset it (#7917 ) DESCRIPTION: Adds citus_stat_counters view that can be used to query stat counters that Citus collects while the feature is enabled, which is controlled by citus.enable_stat_counters. citus_stat_counters() can be used to query the stat counters for the provided database oid and citus_stat_counters_reset() can be used to reset them for the provided database oid or for the current database if nothing or 0 is provided. Today we don't persist stat counters on server shutdown. In other words, stat counters are automatically reset in case of a server restart. Details on the underlying design can be found in header comment of stat_counters.c and in the technical readme. ------- Here are the details about what we track as of this PR: For connection management, we have three statistics about the inter-node connections initiated by the node itself: * connection_establishment_succeeded * connection_establishment_failed * connection_reused While the first two are relatively easier to understand, the third one covers the case where a connection is reused. This can happen when a connection was already established to the desired node, Citus decided to cache it for some time (see citus.max_cached_conns_per_worker & citus.max_cached_connection_lifetime), and then reused it for a new remote operation. Here are the other important details about these connection statistics: 1. connection_establishment_failed doesn't care about the connections that we could establish but are lost later in the transaction. Plus, we cannot guarantee that the connections that are counted in connection_establishment_succeeded were not lost later. 2. 
connection_establishment_failed doesn't care about the optional connections (see OPTIONAL_CONNECTION flag) that we gave up establishing because of the connection throttling rules we follow (see citus.max_shared_pool_size & citus.local_shared_pool_size). The reason for this is that we didn't even try to establish these connections. 3. For the rest of the cases where a connection failed for some reason, we always increment connection_establishment_failed even if the caller was okay with the failure and knows how to recover from it (e.g., the adaptive executor knows how to fall back to local execution when the target node is the local node and if it cannot establish a connection to the local node). The reason is that even if it's likely that we can still serve the operation, we still failed to establish the connection and we want to track this. 4. Finally, the connection failures that we count in connection_establishment_failed might be caused by any of the following reasons and for now we prefer to _not_ further distinguish them for simplicity: a. remote node is down or cannot accept any more connections, or is overloaded such that citus.node_connection_timeout is not enough to establish a connection b. any internal Citus error that might result in preparing a bad connection string so that libpq fails when parsing the connection string even before actually trying to establish a connection via connect() call c. broken citus.node_conninfo or such Citus configuration that was incorrectly set by the user can also result in similar outcomes as in b d. internal waitevent set / poll errors or OOM in local node We also track two more statistics for query execution: * query_execution_single_shard * query_execution_multi_shard And more importantly, both query_execution_single_shard and query_execution_multi_shard are not only tracked for the top-level queries but also for the subplans etc. 
The reason is that for some queries, e.g., the ones that go through recursive planning, after Citus performs the heavy work as part of subplans, the work that needs to be done for the top-level query becomes quite straightforward. And for such query types, it would be deceiving if we only incremented the query stat counters for the top-level query. Similarly, for non-pushable INSERT .. SELECT and MERGE queries, we perform separate counter increments for the SELECT / source part of the query besides the final INSERT / MERGE query.	2025-04-28 12:23:52 +00:00
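
The counting rules for the three connection counters in commit 3d61c4dc71 can be illustrated with a small toy model. This is only a sketch of the rules as the commit message states them, not Citus code: the class `ConnStatCounters`, the function `record_connection_attempt`, and its flags `cached`, `optional`, `throttled` and `established` are hypothetical names introduced for illustration. It also assumes that throttling only causes a skip for optional connections, since the message describes no other case.

```python
from dataclasses import dataclass


@dataclass
class ConnStatCounters:
    """Toy stand-in for the three connection counters named in the commit message."""
    connection_establishment_succeeded: int = 0
    connection_establishment_failed: int = 0
    connection_reused: int = 0


def record_connection_attempt(counters, *, cached, optional, throttled, established):
    """Apply the described counting rules to one hypothetical connection request."""
    if cached:
        # A connection to the target node was already cached and is reused.
        counters.connection_reused += 1
        return "reused"
    if optional and throttled:
        # An optional connection given up because of throttling was never
        # attempted, so it is not counted as a failure.
        return "skipped"
    if established:
        counters.connection_establishment_succeeded += 1
        return "succeeded"
    # Counted even when the caller can recover, e.g. by falling back to
    # local execution.
    counters.connection_establishment_failed += 1
    return "failed"


counters = ConnStatCounters()
record_connection_attempt(counters, cached=False, optional=False, throttled=False, established=True)
record_connection_attempt(counters, cached=True, optional=False, throttled=False, established=True)
record_connection_attempt(counters, cached=False, optional=True, throttled=True, established=False)
record_connection_attempt(counters, cached=False, optional=False, throttled=False, established=False)
print(counters)
```

In this model the throttled optional request leaves all three counters unchanged, which mirrors point 2 of the commit message; a connection that succeeds and is lost later in the transaction would still be counted only as succeeded, as point 1 notes.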


2798 Commits (c7a55c8606a795ff81347840cb4216cfa73da8eb)