citus

Commit Graph

Author	SHA1	Message	Date
naisila	bcabf9175d	Upgrade uncrustify from 0.68.1 to 0.82.0	2025-12-07 23:54:28 +03:00
Mehmet YILMAZ	31911d8297	PG18 – Respect VACUUM/ANALYZE ONLY semantics for Citus tables (#8365) fixes #8364 PostgreSQL 18 changes VACUUM/ANALYZE to recurse into inheritance children by default, and introduces `ONLY` to limit processing to the parent. Upstream change: [https://github.com/postgres/postgres/commit/62ddf7ee9](https://github.com/postgres/postgres/commit/62ddf7ee9) For Citus tables, we should treat shard placements as “children” and avoid propagating `VACUUM/ANALYZE` to shards when the user explicitly asks for `ONLY`. This PR adjusts the Citus VACUUM handling to align with PG18 semantics, and adds regression coverage on both regular distributed tables and partitioned distributed tables. --- ### Behavior changes * Introduce a per-relation helper struct: ```c typedef struct CitusVacuumRelation { VacuumRelation *vacuumRelation; Oid relationId; } CitusVacuumRelation; ``` This lets us keep both: * the resolved relation OID (for `IsCitusTable`, task building), and * the original `VacuumRelation` node (for column list and ONLY/inh flag). * Replace the old `VacuumRelationIdList` / `ExtractVacuumTargetRels` flow with: ```c static List *VacuumRelationList(VacuumStmt *vacuumStmt, CitusVacuumParams vacuumParams); ``` `VacuumRelationList` now: * Iterates over `vacuumStmt->rels`. * Resolves `relid` via `RangeVarGetRelidExtended` when `relation` is present. * Falls back to locking `VacuumRelation->oid` when only an OID is available. * Respects `VACOPT_FULL` for lock mode and `VACOPT_SKIP_LOCKED` for locking behavior. * Builds a `List *` of `CitusVacuumRelation` entries. * Update: ```c IsDistributedVacuumStmt(List *vacuumRelationList); ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt, List *vacuumRelationList, CitusVacuumParams vacuumParams); ``` to operate on `CitusVacuumRelation` instead of bare OIDs. 
Implement `ONLY` semantics in `ExecuteVacuumOnDistributedTables`: ```c RangeVar *relation = vacuumRelation->relation; if (relation != NULL && !relation->inh) { /* ONLY specified, so don't recurse to shard placements */ continue; } ``` Effect: * `VACUUM / ANALYZE` (no `ONLY`) on a Citus table: behavior unchanged, Citus creates tasks and propagates to shard placements. * `VACUUM ONLY <citus_table>` / `ANALYZE ONLY <citus_table>`: * Core still processes the coordinator relation as usual. * Citus skips building tasks for shard placements, so we do not recurse into distributed children. * The code compiles and behaves as before on pre-PG18; the new behavior becomes observable only when the core planner starts setting `inh = false` for `ONLY` (PG18). * Unqualified `VACUUM` / `ANALYZE` (no rels) is unchanged and still handled via `ExecuteUnqualifiedVacuumTasks`. * Remove now-redundant helpers: * `VacuumColumnList` * `ExtractVacuumTargetRels` Column lists are now taken directly from `vacuumRelation->va_cols` via `CitusVacuumRelation`. --- ### Testing Extend `src/test/regress/sql/pg18.sql` and `expected/pg18.out` with two PG18-only blocks that verify we do not recurse into shard placements when `ONLY` is used: 1. Simple distributed table (`pg18_vacuum_part`) * Create and distribute a regular table: ```sql CREATE SCHEMA pg18_vacuum_part; SET search_path TO pg18_vacuum_part; CREATE TABLE vac_analyze_only (a int); SELECT create_distributed_table('vac_analyze_only', 'a'); INSERT INTO vac_analyze_only VALUES (1), (2), (3); ``` * On the coordinator: * Run `ANALYZE vac_analyze_only;` and later `ANALYZE ONLY vac_analyze_only;`. * Run `VACUUM vac_analyze_only;` and later `VACUUM ONLY vac_analyze_only;`. 
* On `worker_1`: * Capture `coalesce(max(last_analyze), 'epoch')` from `pg_stat_user_tables` for `vac_analyze_only_%` into `:analyze_before_only`, then assert: ```sql SELECT max(last_analyze) = :'analyze_before_only'::timestamptz AS analyze_only_skipped; ``` * Capture `coalesce(max(last_vacuum), 'epoch')` into `:vacuum_before_only`, then assert: ```sql SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz AS vacuum_only_skipped; ``` Both checks return `t`, confirming `ONLY` does not change `last_analyze` / `last_vacuum` on shard tables. 2. Partitioned distributed table (`pg18_vacuum_part_dist`) * Create a partitioned table whose parent is distributed: ```sql CREATE SCHEMA pg18_vacuum_part_dist; SET search_path TO pg18_vacuum_part_dist; SET citus.shard_count = 2; SET citus.shard_replication_factor = 1; CREATE TABLE part_dist (id int, v int) PARTITION BY RANGE (id); CREATE TABLE part_dist_1 PARTITION OF part_dist FOR VALUES FROM (1) TO (100); CREATE TABLE part_dist_2 PARTITION OF part_dist FOR VALUES FROM (100) TO (200); SELECT create_distributed_table('part_dist', 'id'); INSERT INTO part_dist SELECT g, g FROM generate_series(1, 199) g; ``` * On the coordinator: * Run `ANALYZE part_dist;` then `ANALYZE ONLY part_dist;`. * Run `VACUUM part_dist;` then `VACUUM ONLY part_dist;` (PG18 emits the expected warning: `VACUUM ONLY of partitioned table "part_dist" has no effect`). 
* On `worker_1`: * Capture `coalesce(max(last_analyze), 'epoch')` for `part_dist_%` into `:analyze_before_only`, then assert: ```sql SELECT max(last_analyze) = :'analyze_before_only'::timestamptz AS analyze_only_partitioned_skipped; ``` * Capture `coalesce(max(last_vacuum), 'epoch')` into `:vacuum_before_only`, then assert: ```sql SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz AS vacuum_only_partitioned_skipped; ``` Both checks return `t`, confirming that even for a partitioned distributed parent, `VACUUM/ANALYZE ONLY` does not recurse into shard placements, and Citus behavior matches PG18’s “ONLY = parent only” semantics.	2025-12-05 16:50:42 +03:00
Colm	002046b87b	PG18: Add support for virtual generated columns. (#8346) Generated columns can be virtual (not stored) and this is the default. This PG18 feature requires tweaking citus_ruleutils and deparse table to support in Citus. Relevant PG commit: 83ea6c540.	2025-12-04 19:51:45 +00:00
Colm	79cabe7eca	PG18: CHECK constraints can be ENFORCED / NOT ENFORCED. (#8349) DESCRIPTION: Adds propagation of ENFORCED / NOT ENFORCED on CHECK constraints. Add propagation support to Citus ruleutils and appropriate regress tests. Relevant PG commit: ca87c41.	2025-12-03 08:01:01 +00:00
Mehmet YILMAZ	c600eabd82	PG18 - Handle publish_generated_columns in distributed publications (#8360) https://github.com/postgres/postgres/commit/7054186c4 fixes #8358 This PR wires up PostgreSQL 18’s `publish_generated_columns` publication option in Citus and adds regression coverage to ensure it behaves correctly for distributed tables, without changing existing DDL output for publications that rely on the default. --- ### 1. Preserve `publish_generated_columns` when rebuilding publications In `BuildCreatePublicationStmt`: * On PG18+ we now read the new `pubgencols` field from `pg_publication` and map it as follows: * `'n'` → default (`none`) * `'s'` → `stored` * For `pubgencols == 's'` we append a `publish_generated_columns` defelem to the reconstructed statement: ```c #if PG_VERSION_NUM >= PG_VERSION_18 if (publicationForm->pubgencols == 's') /* stored */ { DefElem *pubGenColsOption = makeDefElem("publish_generated_columns", (Node *) makeString("stored"), -1); createPubStmt->options = lappend(createPubStmt->options, pubGenColsOption); } else if (publicationForm->pubgencols != 'n') /* 'n' = none (default) */ { ereport(ERROR, (errmsg("unexpected pubgencols value '%c' for publication %u", publicationForm->pubgencols, publicationId))); } #endif ``` For `pubgencols == 'n'` we do not emit an option and rely on PostgreSQL’s default. * Any value other than `'n'` or `'s'` raises an error rather than silently producing incorrect DDL. This ensures: * Publications that explicitly use `publish_generated_columns = stored` are reconstructed with that option on workers, so workers get `pubgencols = 's'`. * Publications that use the default (`none`) continue to produce the same `CREATE PUBLICATION ... WITH (...)` text as before (no extra `publish_generated_columns = 'none'` noise), fixing the unintended diffs in existing publication tests. --- ### 2. 
New PG18 regression coverage for distributed publications In `src/test/regress/sql/pg18.sql`: * Create a table with a stored generated column and make it distributed so the publication goes through Citus DDL propagation: ```sql CREATE TABLE gen_pub_tab ( id int primary key, a int, b int GENERATED ALWAYS AS (a * 10) STORED ); SELECT create_distributed_table('gen_pub_tab', 'id', colocate_with := 'none'); ``` * Create two publications that exercise both `pubgencols` values: ```sql CREATE PUBLICATION pub_gen_cols_stored FOR TABLE gen_pub_tab WITH (publish = 'insert, update', publish_generated_columns = stored); CREATE PUBLICATION pub_gen_cols_none FOR TABLE gen_pub_tab WITH (publish = 'insert, update', publish_generated_columns = none); ``` * On coordinator and both workers, assert the catalog contents: ```sql SELECT pubname, pubgencols FROM pg_publication WHERE pubname IN ('pub_gen_cols_stored', 'pub_gen_cols_none') ORDER BY pubname; ``` Expected on all three nodes: * `pub_gen_cols_stored | s` * `pub_gen_cols_none | n` This test verifies that: * `pubgencols` is correctly set on the coordinator for both `stored` and `none`. * Citus propagates the setting unchanged to all workers for a distributed table.	2025-12-01 09:17:57 +00:00
Colm	bc41e7b94f	PG18: fix query results diff in merge regress test. (#8323) The `merge` regress test uses SQL functions which can be cached in PG18+ since commit [0dca5d68d](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0dca5d68d7bebf2c1036fd84875533afef6df992). Distributed plan's copy function did not include the `sourceResultRepartitionColumnIndex` field, which is critical for MERGE queries, and for cached distributed plans this field was always 0 leading to the problem (#8285). Ensuring it is copied fixes it. This was an oversight in Citus, and not specific to PG18.	2025-11-06 12:31:43 +00:00
Mehmet YILMAZ	b63572d72f	PG18 deparser: map Vars through JOIN aliases (fixes whole-row join column names) (#8300) fixes #8278 Please check issue: https://github.com/citusdata/citus/issues/8278#issuecomment-3431707484 `f4e7756ef9` ### What PG18 changed The SELECT that creates the diff has a named join: ```sql (...) AS unsupported_join (x,y,z,t,e,f,q) ``` On PG17, `COUNT(unsupported_join.*)` stayed as a single whole-row Var that referenced the join alias. On PG18, the parser expands that whole-row Var early into a `ROW(...)` of base columns: ``` ROW(a.user_id, a.item_id, a.buy_count, b.id, b.it_name, b.k_no, c.id, c.it_name, c.k_no) ``` But since the join is named, inner aliases `a/b/c` are hidden. Referencing them later blows up with “invalid reference to FROM-clause entry for table ‘a’”. ### What this PR changes 1. Retarget at `RowExpr` deparse (not in `get_variable`) * In `get_rule_expr()`’s `T_RowExpr` branch, each element `e` of `ROW(...)` is examined. * If `e` unwraps to a simple, same-level `Var` (`varlevelsup == 0`, `varattno > 0`) and there is a named `RTE_JOIN` with `joinaliasvars`, we do not change `varno/varattno`. * Instead, we build a copy of the Var and set `varnosyn/varattnosyn` to the matching join alias column (from `joinaliasvars`). * Then we deparse that Var via `get_rule_expr_toplevel(...)`, which naturally prints `join_alias.colname`. * Scope is limited to query deparsing (`dpns->plan == NULL`), exactly where PG18 expands whole-row vars into `ROW(...)` of base Vars. 2. Helpers (PG18-only file) * `unwrap_simple_var(Node *)`: strips trivial wrappers (`RelabelType`, `CoerceToDomain`, `CollateExpr`) to reveal a `Var`. * `var_matches_base(const Var *, int varno, AttrNumber attno)`: matches canonical or synonym identity. * `dpns_has_named_join(const deparse_namespace *)`: fast precheck for any named join with `joinaliasvars`. 
* `map_var_through_join_alias(...)`: scans `joinaliasvars` to locate the JOIN RTE index + attno for a 1:1 alias; the caller uses these to set `varnosyn/varattnosyn`. 3. Safety and non-goals * No effect on plan deparsing (`dpns->plan != NULL`). * No change to semantic identity: we leave `varno/varattno` untouched; only set `varnosyn/varattnosyn`. * Skip whole-row/system columns (`attno <= 0`) and non-simple join columns (computed expressions). * Works with named joins with or without an explicit column list (we rely on `joinaliasvars`, not the alias collist). ### Reproducer ```sql CREATE TABLE distributed_table(user_id int, item_id int, buy_count int); CREATE TABLE reference_table(id int, it_name varchar(25), k_no int); SELECT create_distributed_table('distributed_table', 'user_id'); SELECT COUNT(unsupported_join.*) FROM (distributed_table a LEFT JOIN reference_table b ON true RIGHT JOIN reference_table c ON true) AS unsupported_join (x,y,z,t,e,f,q) JOIN (reference_table d JOIN reference_table e ON true) ON true; ``` Before (PG18): deparser emitted `ROW(a.user_id, …)` → `ERROR: invalid reference to FROM-clause entry for table "a"` After: deparser emits `ROW(unsupported_join.x, ..., unsupported_join.k_no)` → runs successfully. Now maps to `unsupported_join.<auto_col_names>` and runs.	2025-11-05 11:08:58 +00:00
Colm	7a7a0ba9c7	PG18: Fix missing Planning Fast Path Query DEBUG messages (#8320) With PG18's GROUP RTE, queries that should have been eligible for fast path planning were skipped because the fast path planner allows exactly one range table only. This fix extends that to account for a GROUP RTE.	2025-11-04 12:33:21 +00:00
Colm	5a71f0d1ca	PG18: Print names in order in tables are not colocated error detail. (#8319) Fixes #8275 by printing the names in order so that in every message `DETAIL: x and y are not co-located` x precedes (or is lexicographically less than) y.	2025-11-03 18:56:17 +00:00
Vinod Sridharan	86010de733	Update GUC setting to not crash with ASAN (#8301) The GUC configuration for SkipAdvisoryLockPermissionChecks had misconfigured the settings for GUC_SUPERUSER_ONLY for PGC_SUSET - when PostgreSQL is running with ASAN, this fails when querying pg_settings due to exceeding the size of the array GucContext_Names. Fix up this GUC declaration to not crash with ASAN.	2025-10-31 10:21:58 +03:00
Colm	458299035b	PG18: fix regress test failures in subquery_in_targetlist. The failing queries all have a GROUP BY, and the fix teaches the Citus recursive planner how to handle a PG18 GROUP range table in the outer query: - In recursive query planning, don't recurse into subquery expressions in a GROUP BY clause - Flatten references to a GROUP rte before creating the worker subquery in pushdown planning - If a PARAM node points to a GROUP rte then tunnel through to the underlying expression Fixes #8296.	2025-10-30 11:49:28 +00:00
Onur Tirtir	90f2ab6648	Actually deprecate mark_tables_colocated()	2025-10-17 11:57:36 +00:00
Colm	3ca66e1fcc	PG18: Fix "Unrecognized range table id" in INSERT .. SELECT planning (#8256) The error `Unrecognized range table id` seen in regress test `insert_select_into_local_tables` is a consequence of the INSERT .. SELECT planner getting confused by a SELECT query with a GROUP BY and hence a Group RTE, introduced in PG18 (commit 247dea89f). The solution is to flatten the relevant parts of the SELECT query before preparing the INSERT .. SELECT query tree for use by Citus.	2025-10-17 11:21:25 +01:00
Colm	5d71fca3b4	PG18 regress sanity: disable `enable_self_join_elimination` on queries (#8242) .. involving Citus tables. Interim fix for #8217 to achieve regress sanity with PG18. A complete fix will follow with PG18 feature integration.	2025-10-17 10:25:33 +01:00
Naisila Puka	f1dd976a14	Fix vanilla tests with domain creation (#8238) Qualify create domain stmt after local execution, to avoid such diffs in PG vanilla tests: ```diff create domain d_fail as anyelement; -ERROR: "anyelement" is not a valid base type for a domain +ERROR: "pg_catalog.anyelement" is not a valid base type for a domain ``` These tests were newly added in PG18, however this is not new PG18 behavior, just some added tests. https://github.com/postgres/postgres/commit/0172b4c94 Fixes #8042	2025-10-10 15:34:32 +03:00
Naisila Puka	351cb2044d	PG18 - define some EXPLAIN funcs and structs only in PG17 (#8239) PG18 changed the visibility of various Explain Serialize functions and structs to `extern`. Previously, for PG17 support, these were `static`, so we had to copy paste their definitions from `explain.c` to Citus's `multi_explain.c`. Relevant PG18 commits: https://github.com/postgres/postgres/commit/555960a0 https://github.com/postgres/postgres/commit/77cb08be Now we don't need to define the following anymore in Citus, since they are extern in PG18: - typedef struct SerializeMetrics - void ExplainIndentText(ExplainState *es); - SerializeMetrics GetSerializationMetrics(DestReceiver *dest); - typedef struct SerializeDestReceiver (this is not extern, however it is only used by GetSerializationMetrics function) This was incorrectly handled in https://github.com/citusdata/citus/commit/9e42f3f2c by wrapping these definitions and usages in PG17 only, causing such diffs in PG18 (not able to see serialization at all): ```diff citus/src/test/regress/expected/pg17.out select public.explain_filter('explain (analyze, serialize binary,buffers,timing) select * from int8_tbl i8'); ... Planning Time: N.N ms - Serialization: time=N.N ms output=NkB format=binary Execution Time: N.N ms Planning Time: N.N ms Serialization: time=N.N ms output=NkB format=binary Execution Time: N.N ms -(14 rows) +(13 rows) ```	2025-10-10 15:05:47 +03:00
Naisila Puka	287abea661	PG18 compatibility - varreturningtype additions (#8231) This PR solves the following diffs, originating from the addition of `varreturningtype` field to the `Var` struct in PG18: https://github.com/postgres/postgres/commit/80feb727c Previously we didn't account for this new field (as it's new), so this wouldn't allow the parser to correctly reconstruct the `Var` node structure, but rather it would error out with `did not find '}' at end of input node`: ```diff SELECT column_to_column_name(logicalrelid, partkey) FROM pg_dist_partition WHERE partkey IS NOT NULL ORDER BY 1 LIMIT 1; - column_to_column_name ---------------------------------------------------------------------- - a -(1 row) - +ERROR: did not find '}' at end of input node ``` Solution follows precedent https://github.com/citusdata/citus/pull/7107, when varnullingrels field was added to the `Var` struct in PG16. The solution includes: - Taking care of the `partkey` in `pg_dist_partition` table because it's coming from the `Var` struct. This mainly includes fixing the upgrade script to PG18, by saving all the `partkey` values before upgrading to PG18 (in `citus_prepare_pg_upgrade`), and then re-generating `partkey` columns in `pg_dist_partition` (using `UPDATE`) after upgrading to PG18 (in `citus_finish_pg_upgrade`). - Adding a normalize rule to fix output differences among PG versions. Note that we need two normalize lines: one for PG15 since it doesn't have `varnullingrels`, and one for PG16/PG17. - Small trick on `metadata_sync_helpers` to use different text when generating the `partkey`, based on the PG version. Fixes #8189	2025-10-09 17:35:03 +03:00
Naisila Puka	c5dde4b115	Fix crash on create statistics with non-RangeVar type pt2 (#8227) Fixes #8225 very similar to #8213 Also the error message changed between pg18rc1 and pg18.0	2025-10-07 11:56:20 +03:00
Naisila Puka	bb840e58a7	Fix crash on create statistics with non-RangeVar type (#8213) This crash has been there for a while but wasn't tested before pg18. PG18 added this test: CREATE STATISTICS tst ON a FROM (VALUES (x)) AS foo; which tries to create statistics on a derived-on-the-fly table (which is not allowed) However Citus assumes we always have a valid table when intercepting CREATE STATISTICS command to check for Citus tables Added a check to return early if needed. pg18 commit: https://github.com/postgres/postgres/commit/3eea4dc2c Fixes #8212	2025-10-01 00:09:11 +03:00
Onur Tirtir	5eb1d93be1	Properly detect no-op shard-key updates via UPDATE / MERGE (#8214) DESCRIPTION: Fixes a bug that allowed UPDATE / MERGE queries that may change the distribution column value. Fixes: #8087. Probably as of #769, we were not properly checking if UPDATE may change the distribution column. In #769, we had these checks: ```c if (targetEntry->resno != column->varattno) { /* target entry of the form SET some_other_col = <x> */ isColumnValueChanged = false; } else if (IsA(setExpr, Var)) { Var *newValue = (Var *) setExpr; if (newValue->varattno == column->varattno) { /* target entry of the form SET col = table.col */ isColumnValueChanged = false; } } ``` However, what we check in "if" and in the "else if" are not so different in the sense they both attempt to verify if SET expr of the target entry points to the attno of given column. So, in #5220, we even removed the first check because it was redundant. Also see this PR comment from #5220: https://github.com/citusdata/citus/pull/5220#discussion_r699230597. In #769, probably we actually wanted to first check whether both SET expr of the target entry and given variable are pointing to the same range var entry, but this wasn't what the "if" was checking, so removed. As a result, in the cases that are mentioned in the linked issue, we were incorrectly concluding that the SET expr of the target entry won't change given column just because it's pointing to the same attno as given variable, regardless of what range var entries the column and the SET expr are pointing to. Then we also started using the same function to check for such cases for update action of MERGE, so we have the same bug there as well. So with this PR, we properly check for such cases by comparing varno as well in TargetEntryChangesValue(). 
However, then some of the existing tests started failing where the SET expr doesn't directly assign the column to itself but the "where" clause could actually imply that the distribution column won't change. Even before we were not attempting to verify if "where" clause quals could imply a no-op assignment for the SET expr in such cases but that was not a problem. This is because, in most cases, we were always qualifying such SET expressions as a no-op update as long as the SET expr's attno is the same as given column's. For this reason, to prevent regressions, this PR also adds some extra logic to understand if the "where" clause quals could imply that SET expr for the distribution key is a no-op. Ideally, we should instead use "relation restriction equivalence" mechanism to understand if the "where" clause implies a no-op update. This is because, for instance, right now we're not able to deduce that the update is a no-op when the "where" clause transitively implies a no-op update, as in the case where we're setting "column a" to "column c" and where clause looks like: "column a = column b AND column b = column c". If this means a regression for some users, we can consider doing it that way. Until then, as a workaround, we can suggest adding additional quals to "where" clause that would directly imply equivalence. Also, after fixing TargetEntryChangesValue(), we started successfully deducing that the update action is a no-op for such MERGE queries: ```sql MERGE INTO dist_1 USING dist_1 src ON (dist_1.a = src.b) WHEN MATCHED THEN UPDATE SET a = src.b; ``` However, we then started seeing below error for above query even though now the update is qualified as a no-op update: ``` ERROR: Unexpected column index of the source list ``` This was because of #8180 and #8201 fixed that. 
In summary, with this PR: We disallow such queries, ```sql -- attno for dist_1.a, dist_1.b: 1, 2 -- attno for dist_different_order_1.a, dist_different_order_1.b: 2, 1 UPDATE dist_1 SET a = dist_different_order_1.b FROM dist_different_order_1 WHERE dist_1.a dist_different_order_1.a; -- attno for dist_1.a, dist_1.b: 1, 2 -- but ON (..) doesn't imply a no-op update for SET expr MERGE INTO dist_1 USING dist_1 src ON (dist_1.a = src.b) WHEN MATCHED THEN UPDATE SET a = src.a; ``` * .. and allow such queries, ```sql MERGE INTO dist_1 USING dist_1 src ON (dist_1.a = src.b) WHEN MATCHED THEN UPDATE SET a = src.b; ```	2025-09-30 10:13:47 +00:00
Naisila Puka	de045402f3	PG18 - register snapshot where needed (#8196) Register and push snapshots as needed per the relevant PG18 commits `8076c00592` https://github.com/postgres/postgres/commit/706054b `citus_split_shard_columnar_partitioned`, `multi_partitioning` tests are handled. Fixes #8195	2025-09-26 18:04:34 +03:00
Colm McHugh	81776fe190	Fix crash in Range Table identity check. The range table entry array created by the Postgres planner for each SELECT in a query may have NULL entries as of PG18. Add a NULL check to skip over these when looking for matches in rte identities.	2025-09-26 14:53:30 +01:00
Colm	80945212ae	PG18 regress sanity: update pg18 ruleutils with fix #7675 (#8216) Fix deparsing of UPDATE statements with indirection (#7675) involved changing ruleutils of our supported Postgres versions. It means that when integrating a new Postgres version we need to update its ruleutils with the relevant parts of #7675; basically PG ruleutils needs to call the `citus_ruleutils.c` functions added by #7675.	2025-09-26 13:19:47 +01:00
Onur Tirtir	83b25e1fb1	Fix unexpected column index error for repartitioned merge (#8201) DESCRIPTION: Fixes a bug that causes an unexpected error when executing repartitioned merge. Fixes #8180. This was happening because of a bug in SourceResultPartitionColumnIndex(). And to fix it, this PR avoids using DistributionColumnIndex() in SourceResultPartitionColumnIndex(). Instead, invents FindTargetListEntryWithVarExprAttno(), which finds the index of the target entry in the source query's target list that can be used to repartition the source for a repartitioned merge. In short, to find the source target entry that references the Var used in ON (..) clause and that references the source rte, we should check the varattno of the underlying expr, which presumably is always a Var for repartitioned merge as we always wrap the source rte with a subquery, where all target entries point to the columns of the original source relation. Using DistributionColumnIndex() prior to 13.0 wasn't causing such an issue because prior to 13.0, the varattno of the underlying expr of the source target entries was almost (1) always equal to resno of the target entry as we were including all target entries of the source relation. However, starting with #7659, which is merged to main before 13.0, we started using CreateFilteredTargetListForRelation() instead of CreateAllTargetListForRelation() to compute the target entry list for the source rte to fix another bug. So we cannot revert to using CreateAllTargetListForRelation() because otherwise we would re-introduce the bug that it helped fix, so we instead had to find a way to properly deal with the "filtered target list"s, as in this commit. Plus (1), even before #7659, probably we would still fail when the source relation has dropped attributes or such because that would probably also cause such a mismatch between the varattno of the underlying expr of the target entry and its resno.	2025-09-23 11:17:51 +00:00
Colm	b5e70f56ab	Postgres 18: Fix regress tests caused by GROUP RTE. (#8206) The change in `merge_planner.c` fixes _unrecognized range table entry_ diffs in merge regress tests (category 2 diffs in #7992), the change in `multi_router_planner.c` fixes _column reference ... is ambiguous_ diffs in `multi_insert_select` and `multi_insert_select_window` (category 3 diffs in #7992). Edit to `common.py` enables standalone regress tests with pg18 (e.g. `citus_tests/run_test.py merge`).	2025-09-22 16:13:59 +03:00
Colm	d2ea4043d4	Postgres 18: fix 'column does not exist' errors in grouping regress tests. (#8199) DESCRIPTION: Fix 'column does not exist' errors in grouping regress tests. Postgres 18's GROUP RTE was being ignored by query pushdown planning when constructing the query tree for the worker subquery. The solution is straightforward - ensure the worker subquery tree has the same groupRTE property as the original query. Postgres ruleutils then does the right thing when generating the pushed down query. Fixes category 1 in #7992.	2025-09-22 16:13:59 +03:00
Naisila Puka	b4cb1a94e9	Bump citus and citus_columnar to 14.0devel (#8170)	2025-09-19 12:54:55 +03:00
eaydingol	360fbe3b99	Technical document update for outer join pushdown (#8200) Outer join pushdown entry and an example.	2025-09-17 17:01:45 +03:00
Colm	b7bfe42f1a	Document delayed fast path planning in README (#8176) Added detailed explanation of delayed fast path planning in Citus 13.2, including conditions and processes involved. --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2025-09-09 11:54:13 +01:00
Onur Tirtir	0c658b73fc	Fix an assertion failure in Citus maintenance daemon that can happen in very slow systems (#8158) Fixes #5808. DESCRIPTION: Fixes an assertion failure in Citus maintenance daemon that can happen in very slow systems. Try running `make -C src/test/regress/ check-multi-1-vg` - while the tests will exit with code 2 at least 50% of the time in the very early stages of the test suite by producing a core-dump on main, it won't be the case on this branch, at least based on my trials :)	2025-09-04 12:13:57 +00:00
manaldush	2834fa26c9	Fix an undefined behavior for bit shift in citus_stat_tenants.c (#7954) DESCRIPTION: Fixes an undefined behavior that could happen when computing tenant score for citus_stat_tenants Add check for shift size, reset to zero in case of overflow Fixes #7953. --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2025-09-04 10:57:45 +00:00
Onur Tirtir	8ece8acac7	Check citus version in citus_promote_clone_and_rebalance (#8169)	2025-08-29 11:19:50 +03:00
Naisila Puka	0fd95d71e4	Order same frequency common values, and add test (#8167) Added similar test to what @colm-mchugh tested in the original PR https://github.com/citusdata/citus/pull/8026#discussion_r2279021218	2025-08-29 01:41:32 +03:00
Naisila Puka	d5f0ec5cd1	Fix invalid input syntax for type bigint (#8166) Fixes #8164	2025-08-29 01:01:18 +03:00
Naisila Puka	544b6c4716	Add GUC for queries with outer joins and pseudoconstant quals (#8163) Users can turn on this GUC at their own risk.	2025-08-27 22:31:22 +03:00
Colm	bb6eeb17cc	Fix bug in redundant WHERE clause detection. (#8162) Need to also check Postgres plan's rangetables for relations used in Initplans. DESCRIPTION: Fix a bug in redundant WHERE clause detection; we need to additionally check the Postgres plan's range tables for the presence of citus tables, to account for relations that are referenced from scalar subqueries. There is a fundamental flaw in `4139370`, the assumption that, after Postgres planning has completed, all tables used in a query can be obtained by walking the query tree. This is not the case for scalar subqueries, which will be referenced by `PARAM` nodes. The fix adds an additional check of the Postgres plan range tables; if there is at least one citus table in there we do not need to change the needs distributed planning flag. Fixes #8159	2025-08-27 13:32:02 +01:00
Colm	0a5cae19ed	In UPDATE deparse, check for a subscript before processing the targets. (#8155) DESCRIPTION: Checking first for the presence of subscript ops avoids a shallow copy of the target list for target lists where there are no array or json subscripts. Commit `0c1b31c` fixed a bug in UPDATE statements with array or json subscripting in the target list. This commit modifies that to first check that the target list has a subscript and avoid a shallow copy of the target list for UPDATE statements with no array/json subscripting.	2025-08-27 11:00:27 +00:00
Muhammad Usama	62e5fcfe09	Enhance clone node replication status messages (#8152) - Downgrade replication lag reporting from NOTICE to DEBUG to reduce noise and improve regression test stability. - Add hints to certain replication status messages for better clarity. - Update expected output files accordingly.	2025-08-26 21:48:07 +03:00
Naisila Puka	ce7ddc0d3d	Bump PG versions to 17.6, 16.10, 15.14 (#8142) Sister PR https://github.com/citusdata/the-process/pull/172 Fixes #8134 #8149	2025-08-25 15:34:13 +03:00
Onur Tirtir	439870f3a9	Fix incorrect usage of TupleDescSize() in #7950 , #8120 , #8124 , #8121 and #8114 (#8146 ) In #7950, #8120, #8124, #8121 and #8114, TupleDescSize() was used to check whether the tuple length is `Natts_<catalog_table_name>`. However, this was wrong because TupleDescSize() returns the size of the tupledesc, not its length (i.e., the number of attributes). In fact, `TupleDescSize(tupleDesc) == Natts_<catalog_table_name>` was always returning false, but this didn't cause any problems because using `tupleDesc->natts - 1` when `tupleDesc->natts == Natts_<catalog_table_name>` had the same effect as using `Anum_<column_added_later> - 1` in that case. This also makes me think of always returning `tupleDesc->natts - 1` (or `tupleDesc->natts - 2` if it's the second to last attribute), but being more explicit seems more useful. Moreover, in the future we should probably switch to a different implementation if / when we think of adding more columns to those tables. We should probably scan the non-dropped attributes of the relation, enumerate them and return the attribute number of the one that we're looking for, but it seems this is not needed right now.	2025-08-22 11:46:06 +00:00
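The distinction described in 439870f3a9 can be illustrated with a minimal standalone sketch. `MockTupleDesc`, `MockAttribute` and `MOCK_NATTS_CATALOG` are hypothetical stand-ins introduced here for illustration; they are not PostgreSQL's real `TupleDesc` or any Citus catalog constant. The only point is that a descriptor's byte size and its attribute count are different quantities, so comparing the former with a column count does not express the intended check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one attribute entry; not PostgreSQL's layout. */
typedef struct MockAttribute
{
	int attnum;
	bool attisdropped;
} MockAttribute;

/* Hypothetical stand-in for a tuple descriptor. */
typedef struct MockTupleDesc
{
	int natts;              /* number of attributes */
	MockAttribute attrs[8]; /* fixed capacity keeps the sketch simple */
} MockTupleDesc;

/* Assumed column count compiled into the extension for some catalog table. */
#define MOCK_NATTS_CATALOG 3

/* Byte size of the used part of the descriptor (a size, not a count). */
static size_t
MockTupleDescSize(const MockTupleDesc *desc)
{
	return offsetof(MockTupleDesc, attrs) +
		   (size_t) desc->natts * sizeof(MockAttribute);
}

/* Mistaken check: compares a byte size with an attribute count. */
static bool
HasExpectedColumnsWrong(const MockTupleDesc *desc)
{
	return MockTupleDescSize(desc) == MOCK_NATTS_CATALOG;
}

/* Intended check: compares the attribute count with the expected count. */
static bool
HasExpectedColumns(const MockTupleDesc *desc)
{
	return desc->natts == MOCK_NATTS_CATALOG;
}
```

With a three-attribute descriptor, the intended check passes while the size-based check does not, which matches the commit's observation that the size-based comparison never held.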
Onur Tirtir	785287c58f	Fix memory corruptions around pg_dist_node accessors after a Citus downgrade is followed by an upgrade (#8144 ) Unlike what has been fixed in #7950, #8120, #8124, #8121 and #8114, this was not an issue in older releases but is a potential issue to be introduced by the current (13.2) release, because one of the recent commits (#8122) added two columns to pg_dist_node. In other words, none of the older releases since we started supporting downgrades added new columns to pg_dist_node. The mentioned PR actually attempted to avoid these kinds of issues in one of the code paths but not in others. So this PR avoids memory corruptions around pg_dist_node accessors in a standardized way (as implemented in the other example PRs) and in all code paths.	2025-08-22 14:07:44 +03:00
Naisila Puka	eaa609f510	Add citus_stats UDF (#8026 ) DESCRIPTION: Add `citus_stats` UDF This UDF acts on a Citus table, and provides `null_frac`, `most_common_vals` and `most_common_freqs` for each column in the table, based on the definitions of these columns in the Postgres view `pg_stats`. Aggregated Views: pg_stats > citus_stats citus_stats is a view intended for use in Citus, a distributed extension of PostgreSQL. It collects and returns column-level statistics for a distributed table, specifically the most common values, their frequencies, and the fraction of null values, as the pg_stats view does for regular Postgres tables. Use Case This view is useful when: - You need column-level insights on a distributed table. - You're performing query optimization, cardinality estimation, or data profiling across shards. What It Returns A table with: | Column Name | Data Type | Description | |---|---|---| | schemaname | text | Name of the schema containing the distributed table | | tablename | text | Name of the distributed table | | attname | text | Name of the column (attribute) | | null_frac | float4 | Estimated fraction of NULLs in the column across all shards | | most_common_vals | text[] | Array of most common values for the column | | most_common_freqs | float4[] | Array of corresponding frequencies (as fractions) of the most common values | Caveats - The function assumes that the array of the most common values among different shards will be the same, therefore it just adds everything up.	2025-08-19 23:17:13 +03:00
Colm	bd0558fe39	Remove incorrect assertion from Postgres ruleutils. (#8136 ) DESCRIPTION: Remove an assertion from Postgres ruleutils that was rendered meaningless by a previous Citus commit. Fixes #8123. This has been present since `00068e0`, which changed the code preceding the assert as follows: ``` #ifdef USE_ASSERT_CHECKING - while (i < colinfo->num_cols && colinfo->colnames[i] == NULL) - i++; + for (int col_index = 0; col_index < colinfo->num_cols; col_index++) + { + /* + * In the above processing-loops, "i" advances only if + * the column is not new, check if this is a new column. + */ + if (colinfo->is_new_col[col_index]) + i++; + } Assert(i == colinfo->num_cols); Assert(j == nnewcolumns); #endif ``` This commit altered both the loop condition and the incrementing of `i`. After analysis, the assert no longer makes sense.	2025-08-19 15:52:13 +01:00
Muhammad Usama	be6668e440	Snapshot-Based Node Split – Foundation and Core Implementation (#8122 ) DESCRIPTION: This pull request introduces the foundation and core logic for the snapshot-based node split feature in Citus. This feature enables promoting a streaming replica (referred to as a clone in this feature and UI) to a primary node and rebalancing shards between the original and the newly promoted node without requiring a full data copy. This significantly reduces rebalance times for scale-out operations where the new node already contains a full copy of the data via streaming replication. Key Highlights: 1. Replica (Clone) Registration & Management Infrastructure Introduces a new set of UDFs to register and manage clone nodes: - citus_add_clone_node() - citus_add_clone_node_with_nodeid() - citus_remove_clone_node() - citus_remove_clone_node_with_nodeid() These functions allow administrators to register a streaming replica of an existing worker node as a clone, making it eligible for later promotion via snapshot-based split. 2. Snapshot-Based Node Split (Core Implementation) New core UDF: - citus_promote_clone_and_rebalance() This function implements the full workflow to promote a clone and rebalance shards between the old and new primaries. Steps include: 1. Ensuring Exclusivity – Blocks any concurrent placement-changing operations. 2. Blocking Writes – Temporarily blocks writes on the primary to ensure consistency. 3. Replica Catch-up – Waits for the replica to be fully in sync. 4. Promotion – Promotes the replica to a primary using pg_promote. 5. Metadata Update – Updates metadata to reflect the newly promoted primary node. 6. Shard Rebalancing – Redistributes shards between the old and new primary nodes. 3. Split Plan Preview A new helper UDF get_snapshot_based_node_split_plan() provides a preview of the shard distribution post-split, without executing the promotion. 
Example: ``` reb 63796> select * from pg_catalog.get_snapshot_based_node_split_plan('127.0.0.1',5433,'127.0.0.1',5453); table_name | shardid | shard_size | placement_node --------------+---------+------------+---------------- companies | 102008 | 0 | Primary Node campaigns | 102010 | 0 | Primary Node ads | 102012 | 0 | Primary Node mscompanies | 102014 | 0 | Primary Node mscampaigns | 102016 | 0 | Primary Node msads | 102018 | 0 | Primary Node mscompanies2 | 102020 | 0 | Primary Node mscampaigns2 | 102022 | 0 | Primary Node msads2 | 102024 | 0 | Primary Node companies | 102009 | 0 | Clone Node campaigns | 102011 | 0 | Clone Node ads | 102013 | 0 | Clone Node mscompanies | 102015 | 0 | Clone Node mscampaigns | 102017 | 0 | Clone Node msads | 102019 | 0 | Clone Node mscompanies2 | 102021 | 0 | Clone Node mscampaigns2 | 102023 | 0 | Clone Node msads2 | 102025 | 0 | Clone Node (18 rows) ``` 4. Test Infrastructure Enhancements - Added a new test case scheduler for snapshot-based split scenarios. - Enhanced pg_regress_multi.pl to support creating node backups with slightly modified options to simulate real-world backup-based clone creation. 5. Usage Guide The snapshot-based node split can be performed using the following workflow: - Take a Backup of the Worker Node Run pg_basebackup (or an equivalent tool) against the existing worker node to create a physical backup. `pg_basebackup -h <primary_worker_host> -p <port> -D /path/to/replica/data --write-recovery-conf` - Start the Replica Node Start PostgreSQL on the replica using the backup data directory, ensuring it is configured as a streaming replica of the original worker node. 
- Register the Backup Node as a Clone Mark the registered replica as a clone of its original worker node: `SELECT * FROM citus_add_clone_node('<clone_host>', <clone_port>, '<primary_host>', <primary_port>); ` - Promote and Rebalance the Clone Promote the clone to a primary and rebalance shards between it and the original worker: `SELECT * FROM citus_promote_clone_and_rebalance('clone_node_id'); ` - Drop Any Replication Slots from the Original Worker After promotion, clean up any unused replication slots from the original worker: `SELECT pg_drop_replication_slot('<slot_name>'); `	2025-08-19 14:13:55 +03:00
Muhammad Usama	f743b35fc2	Parallelize Shard Rebalancing & Unlock Concurrent Logical Shard Moves (#7983 ) DESCRIPTION: Parallelizes shard rebalancing and removes the bottlenecks that previously blocked concurrent logical-replication moves. These improvements reduce rebalance windows, particularly for clusters with large reference tables, and enable multiple shard transfers to run in parallel. Motivation: Citus’ shard rebalancer has some key performance bottlenecks: Sequential Movement of Reference Tables: Reference tables are often assumed to be small, but in real-world deployments, they can grow significantly large. Previously, reference table shards were transferred as a single unit, making the process monolithic and time-consuming. No Parallelism Within a Colocation Group: Although Citus distributes data using colocated shards, shard movements within the same colocation group were serialized. In environments with hundreds of distributed tables colocated together, this serialization significantly slowed down rebalance operations. Excessive Locking: Rebalancer used restrictive locks and redundant logical replication guards, further limiting concurrency. The goal of this commit is to eliminate these inefficiencies and enable maximum parallelism during rebalance, without compromising correctness or compatibility. Parallelize shard rebalancing to reduce rebalance time. Feature Summary: 1. Parallel Reference Table Rebalancing Each reference-table shard is now copied in its own background task. Foreign key and other constraints are deferred until all shards are copied. For single-shard movement without considering colocation, a new internal-only UDF `citus_internal_copy_single_shard_placement` is introduced to allow single-shard copy/move operations. Since this function is internal, we do not allow users to call it directly. 
Temporary Hack to Set Background Task Context Background tasks cannot currently set custom GUCs like application_name before executing internal-only functions. As a workaround, a 'citus_rebalancer ...' statement is added as a prefix in the task command. This is a temporary hack to label internal tasks until proper GUC injection support is added to the background task executor. 2. Changes in Locking Strategy - Drop the leftover replication lock that previously serialized shard moves performed via logical replication. This lock was only needed when we used to drop and recreate the subscriptions/publications before each move. Since Citus now removes those objects later as part of the “unused distributed objects” cleanup, shard moves via logical replication can safely run in parallel without additional locking. - Introduce a per-shard advisory lock to prevent concurrent operations on the same shard while allowing maximum parallelism elsewhere. - Change the lock mode in AcquirePlacementColocationLock from ExclusiveLock to RowExclusiveLock to allow concurrent updates within the same colocation group, while still preventing concurrent DDL operations. 3. citus_rebalance_start() enhancements The citus_rebalance_start() function now accepts two new optional parameters: ``` - parallel_transfer_colocated_shards BOOLEAN DEFAULT false, - parallel_transfer_reference_tables BOOLEAN DEFAULT false ``` Both default to false, which ensures backward compatibility by preserving the existing behavior and avoiding any disruption to user expectations. When both are set to true, the rebalancer operates with full parallelism. Previous Rebalancer Behavior: `SELECT citus_rebalance_start(shard_transfer_mode := 'force_logical');` This would start a single background task for replicating all reference tables, then move all shards serially, one at a time. ``` Task 1: replicate_reference_tables() ↓ Task 2: move_shard_1() ↓ Task 3: move_shard_2() ↓ Task 4: move_shard_3() ``` Slow and sequential. Reference table copy is a bottleneck. 
Colocated shards must wait for each other. New Parallel Rebalancer: ``` SELECT citus_rebalance_start( shard_transfer_mode := 'force_logical', parallel_transfer_colocated_shards := true, parallel_transfer_reference_tables := true ); ``` This would: - Schedule independent background tasks for each reference-table shard. - Move colocated shards in parallel, while still maintaining dependency order. - Defer constraint application until all reference shards are in place. ``` Task 1: copy_ref_shard_1() Task 2: copy_ref_shard_2() Task 3: copy_ref_shard_3() → Task 4: apply_constraints() ↓ Task 5: copy_shard_1() Task 6: copy_shard_2() Task 7: copy_shard_3() ↓ Task 8-10: move_shard_1..3() ``` Each operation is scheduled independently and can run as soon as dependencies are satisfied.	2025-08-18 17:44:14 +03:00
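The scheduling rule stated at the end of f743b35fc2 (a task can run as soon as its dependencies are satisfied) can be sketched in a few lines. This is an illustrative model only: `SketchTask`, `TaskIsRunnable` and `CountRunnableTasks` are hypothetical names, and the struct is not the layout of the Citus background-task catalog.

```c
#include <stdbool.h>

#define SKETCH_MAX_DEPS 4

/* Hypothetical task record; not the Citus background-task catalog layout. */
typedef struct SketchTask
{
	int dependsOn[SKETCH_MAX_DEPS]; /* indexes of tasks that must finish first */
	int dependencyCount;
	bool done;
} SketchTask;

/* A task is runnable when it is not done and every dependency is done. */
static bool
TaskIsRunnable(const SketchTask *tasks, int taskIndex)
{
	const SketchTask *task = &tasks[taskIndex];

	if (task->done)
	{
		return false;
	}

	for (int i = 0; i < task->dependencyCount; i++)
	{
		if (!tasks[task->dependsOn[i]].done)
		{
			return false;
		}
	}

	return true;
}

/* Number of tasks that could be started concurrently right now. */
static int
CountRunnableTasks(const SketchTask *tasks, int taskCount)
{
	int runnable = 0;

	for (int i = 0; i < taskCount; i++)
	{
		if (TaskIsRunnable(tasks, i))
		{
			runnable++;
		}
	}

	return runnable;
}
```

Modeling the first four tasks of the diagram above (three reference-shard copies followed by a constraint step that depends on all of them), the three copies are runnable at once, and the constraint step becomes runnable only after the last copy finishes.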
Karina	2095679dc8	Fix memory corruptions around pg_dist_object accessors after a Citus downgrade is followed by an upgrade (#8120 ) DESCRIPTION: Fixes potential memory corruptions that could happen when accessing pg_dist_object after a Citus downgrade is followed by a Citus upgrade. In case of Citus downgrade and further upgrade an undefined behavior may be encountered. The reason is that Citus hardcoded the number of columns in the extension's tables, but in case of downgrade and following update some of these tables can have more columns, and some of them can be marked as dropped. This PR fixes all such tables using the approach introduced in #7950, which solved the problem for the pg_dist_partition table. See #7515 for a more thorough explanation. --------- Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com> Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2025-08-18 12:52:34 +00:00
Karina	badaa21cb1	Fix memory corruptions around pg_dist_transaction accessors after a Citus downgrade is followed by an upgrade (#8121 ) DESCRIPTION: Fixes potential memory corruptions that could happen when accessing pg_dist_transaction after a Citus downgrade is followed by a Citus upgrade. In case of Citus downgrade and further upgrade an undefined behavior may be encountered. The reason is that Citus hardcoded the number of columns in the extension's tables, but in case of downgrade and following update some of these tables can have more columns, and some of them can be marked as dropped. This PR fixes all such tables using the approach introduced in #7950, which solved the problem for the pg_dist_partition table. See #7515 for a more thorough explanation. Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>	2025-08-18 11:22:28 +00:00
eaydingol	8d929d3bf8	Push down recurring outer joins when possible (#7973 ) DESCRIPTION: Adds support for pushing down LEFT/RIGHT outer joins having a reference table in the outer side and a distributed table on the inner side (e.g., <reference table> LEFT JOIN <distributed table>) Partially addresses #6546 1) `<outer:reference>` LEFT JOIN `<inner:distributed>` 2) `<inner:distributed>` RIGHT JOIN `<outer:reference>` Previously, for outer joins of types (1) and (2), the distributed side was computed recursively. This was necessary because, when the inner side of a recurring outer join is a distributed table, it is not possible to directly distribute the join; the preserved (outer and recurring) side may generate rows with join keys that hash to different shards. To implement distributed planning while maintaining consistency with global execution semantics, this PR restricts the outer side only to those partition key values that route to the selected shard during distributed shard query computation. This method is employed when the following criteria are met (recursive planning is applied otherwise): - The join type is (1) or (2) (lateral joins are not supported). - The outer side is a reference table. - The outer join qualifications include an equality condition between the partition column of a distributed table and the recurring table. - The join is not part of a chained join. - The “enable_recurring_outer_join_pushdown” GUC is enabled (default is on). --------- Co-authored-by: ebruaydingol <ebruaydingol@microsoft.com> Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2025-08-18 14:03:44 +03:00
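The restriction described in 8d929d3bf8 (limit the outer, recurring side to the partition key values that route to the shard being queried) can be modeled with a small sketch. Everything here is an assumption made for illustration: `SketchShard`, `ToyHash`, `KeyRoutesToShard` and `CountShardsAcceptingKey` are hypothetical names, the hash is a toy multiplicative hash rather than the type-specific hash Citus uses, and the unsigned hash ranges are a simplification. The property being shown is that when the shard ranges partition the hash space, each outer row is accepted by exactly one shard query, so preserved rows are neither duplicated nor lost across shards.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shard descriptor: an inclusive range of hash values. */
typedef struct SketchShard
{
	uint32_t minHash;
	uint32_t maxHash;
} SketchShard;

/* Toy hash used only for this sketch; unsigned arithmetic wraps by design. */
static uint32_t
ToyHash(int32_t key)
{
	return (uint32_t) key * 2654435761u;
}

/* Filter applied to the outer (reference) side for one shard query. */
static bool
KeyRoutesToShard(int32_t key, SketchShard shard)
{
	uint32_t hashValue = ToyHash(key);

	return hashValue >= shard.minHash && hashValue <= shard.maxHash;
}

/* How many shard queries would keep an outer row with this join key. */
static int
CountShardsAcceptingKey(int32_t key, const SketchShard *shards, int shardCount)
{
	int accepted = 0;

	for (int i = 0; i < shardCount; i++)
	{
		if (KeyRoutesToShard(key, shards[i]))
		{
			accepted++;
		}
	}

	return accepted;
}
```

Without the filter, every shard query would emit each unmatched outer row, which is why the join could not previously be distributed directly.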
Onur Tirtir	87a1b631e8	Not automatically create citus_columnar when creating citus extension (#8081 ) DESCRIPTION: Not automatically create citus_columnar when there are no relations using it. Previously, we were always creating citus_columnar when creating citus with version >= 11.1. We did so as follows: * Detach SQL objects owned by old columnar, i.e., "drop" them from citus, but not actually drop them from the database * "old columnar" is the one that we had before Citus 11.1 as part of citus, i.e., before splitting the access method and its catalog to citus_columnar. * Create citus_columnar and attach the SQL objects leftover from old columnar to it so that we can continue supporting the columnar tables that the user had before Citus 11.1 with citus_columnar. The first part is unchanged; however, we no longer create citus_columnar automatically if the user didn't have any relations using columnar. For this reason, as of Citus 13.2, when these SQL objects are not owned by an extension and there are no relations using the columnar access method, we drop these SQL objects when updating Citus to 13.2. The net effect is still the same as if we automatically created citus_columnar and the user dropped citus_columnar later, so we should not have any issues with dropping them. (Update: It seems we've made some assumptions in citus, e.g., citus_finish_pg_upgrade() still assumes columnar metadata exists and tries to apply some fixes for it, so this PR fixes them as well. See the last section of this PR description.) Also, ideally I was hoping to just remove some lines of code from extension.c, where we decide on automatically creating citus_columnar when creating citus; however, this didn't happen to be the case for two reasons: * We still need to automatically create it for the servers using the columnar access method. 
* We need to clean up the leftover SQL objects from old columnar when the above is not the case; otherwise we would have leftover SQL objects from old columnar for no reason, and that would confuse users too. * Old columnar cannot be used to create columnar tables properly, so we should clean them up and let the user decide whether they want to create citus_columnar when they really need it later. --- Also made several changes in the test suite because, similarly, we don't always want to have citus_columnar created in citus tests anymore: * Now, columnar specific test targets, which cover 41 test sql files, always install columnar by default, by using "--load-extension=citus_columnar". * "--load-extension=citus_columnar" is not added to citus specific test targets because by default we don't want to have citus_columnar created during citus tests. * Excluding citus_columnar specific tests, we have 601 sql files as citus tests, and in 27 of them we manually create citus_columnar at the very beginning of the test because these tests exercise some functionality of citus together with columnar tables. Also, before and after schedules for PG upgrade tests are now duplicated so we have two versions of each: one with columnar tests and one without. To choose between them, check-pg-upgrade now supports a "test-with-columnar" option, which can be set to "true" or anything else to logically indicate "false". In CI, we run the check-pg-upgrade test target with both options. The purpose is to ensure we can test PG upgrades where citus_columnar is not created in the cluster before the upgrade as well. Finally, added more tests to multi_extension.sql to test Citus upgrade scenarios with / without columnar tables / citus_columnar extension. --- Also, it seems citus_finish_pg_upgrade was assuming that citus_columnar is always created, but we should never have made such an assumption. 
To fix that, moved columnar specific post-PG-upgrade work from citus to a new columnar UDF, which is columnar_finish_pg_upgrade. But to avoid breaking existing customer / managed service scripts, we continue to automatically perform post PG-upgrade work for columnar within citus_finish_pg_upgrade, but only if columnar access method exists this time.	2025-08-18 08:29:27 +01:00
ibrahim halatci	f73da1ed40	Refactor background worker setup for security improvements (#8078 ) Enhance security by addressing a code scanning alert and refactoring the background worker setup code for better maintainability and clarity. --------- Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>	2025-08-13 19:25:31 +03:00

3385 Commits (5b05d44a69b275e8c3f96e027e3f7c3cfaedd9dc)