fixes#8364
PostgreSQL 18 changes VACUUM/ANALYZE to recurse into inheritance
children by default, and introduces `ONLY` to limit processing to the
parent. Upstream change:
[https://github.com/postgres/postgres/commit/62ddf7ee9](https://github.com/postgres/postgres/commit/62ddf7ee9)
For Citus tables, we should treat shard placements as “children” and
avoid propagating `VACUUM/ANALYZE` to shards when the user explicitly
asks for `ONLY`.
This PR adjusts the Citus VACUUM handling to align with PG18 semantics,
and adds regression coverage on both regular distributed tables and
partitioned distributed tables.
---
### Behavior changes
* Introduce a per-relation helper struct:
```c
typedef struct CitusVacuumRelation
{
VacuumRelation *vacuumRelation;
Oid relationId;
} CitusVacuumRelation;
```
This lets us keep both:
* the resolved relation OID (for `IsCitusTable`, task building), and
* the original `VacuumRelation` node (for column list and ONLY/inh
flag).
* Replace the old `VacuumRelationIdList` / `ExtractVacuumTargetRels`
flow with:
```c
static List *VacuumRelationList(VacuumStmt *vacuumStmt,
CitusVacuumParams vacuumParams);
```
`VacuumRelationList` now:
* Iterates over `vacuumStmt->rels`.
* Resolves `relid` via `RangeVarGetRelidExtended` when `relation` is
present.
* Falls back to locking `VacuumRelation->oid` when only an OID is
available.
* Respects `VACOPT_FULL` for lock mode and `VACOPT_SKIP_LOCKED` for
locking behavior.
* Builds a `List *` of `CitusVacuumRelation` entries.
* Update:
```c
IsDistributedVacuumStmt(List *vacuumRelationList);
ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt,
List *vacuumRelationList,
CitusVacuumParams vacuumParams);
```
to operate on `CitusVacuumRelation` instead of bare OIDs.
* Implement `ONLY` semantics in `ExecuteVacuumOnDistributedTables`:
```c
RangeVar *relation = vacuumRelation->relation;
if (relation != NULL && !relation->inh)
{
/* ONLY specified, so don't recurse to shard placements */
continue;
}
```
Effect:
* `VACUUM / ANALYZE` (no `ONLY`) on a Citus table: behavior unchanged,
Citus creates tasks and propagates to shard placements.
* `VACUUM ONLY <citus_table>` / `ANALYZE ONLY <citus_table>`:
* Core still processes the coordinator relation as usual.
* Citus **skips** building tasks for shard placements, so we do not
recurse into distributed children.
* The code compiles and behaves as before on pre-PG18; the new behavior
becomes observable only when the core planner starts setting `inh =
false` for `ONLY` (PG18).
* Unqualified `VACUUM` / `ANALYZE` (no rels) is unchanged and still
handled via `ExecuteUnqualifiedVacuumTasks`.
* Remove now-redundant helpers:
* `VacuumColumnList`
* `ExtractVacuumTargetRels`
Column lists are now taken directly from `vacuumRelation->va_cols` via
`CitusVacuumRelation`.
---
### Testing
Extend `src/test/regress/sql/pg18.sql` and `expected/pg18.out` with two
PG18-only blocks that verify we do not recurse into shard placements
when `ONLY` is used:
1. **Simple distributed table (`pg18_vacuum_part`)**
* Create and distribute a regular table:
```sql
CREATE SCHEMA pg18_vacuum_part;
SET search_path TO pg18_vacuum_part;
CREATE TABLE vac_analyze_only (a int);
SELECT create_distributed_table('vac_analyze_only', 'a');
INSERT INTO vac_analyze_only VALUES (1), (2), (3);
```
* On the coordinator:
* Run `ANALYZE vac_analyze_only;` and later `ANALYZE ONLY
vac_analyze_only;`.
* Run `VACUUM vac_analyze_only;` and later `VACUUM ONLY
vac_analyze_only;`.
* On `worker_1`:
* Capture `coalesce(max(last_analyze), 'epoch')` from
`pg_stat_user_tables` for `vac_analyze_only_%` into
`:analyze_before_only`, then assert:
```sql
SELECT max(last_analyze) = :'analyze_before_only'::timestamptz AS
analyze_only_skipped;
```
* Capture `coalesce(max(last_vacuum), 'epoch')` into
`:vacuum_before_only`, then assert:
```sql
SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz AS
vacuum_only_skipped;
```
Both checks return `t`, confirming `ONLY` does not change `last_analyze`
/ `last_vacuum` on shard tables.
2. **Partitioned distributed table (`pg18_vacuum_part_dist`)**
* Create a partitioned table whose parent is distributed:
```sql
CREATE SCHEMA pg18_vacuum_part_dist;
SET search_path TO pg18_vacuum_part_dist;
SET citus.shard_count = 2;
SET citus.shard_replication_factor = 1;
CREATE TABLE part_dist (id int, v int) PARTITION BY RANGE (id);
CREATE TABLE part_dist_1 PARTITION OF part_dist FOR VALUES FROM (1) TO
(100);
CREATE TABLE part_dist_2 PARTITION OF part_dist FOR VALUES FROM (100) TO
(200);
SELECT create_distributed_table('part_dist', 'id');
INSERT INTO part_dist SELECT g, g FROM generate_series(1, 199) g;
```
* On the coordinator:
* Run `ANALYZE part_dist;` then `ANALYZE ONLY part_dist;`.
* Run `VACUUM part_dist;` then `VACUUM ONLY part_dist;` (PG18 emits the
expected warning: `VACUUM ONLY of partitioned table "part_dist" has no
effect`).
* On `worker_1`:
* Capture `coalesce(max(last_analyze), 'epoch')` for `part_dist_%` into
`:analyze_before_only`, then assert:
```sql
SELECT max(last_analyze) = :'analyze_before_only'::timestamptz
AS analyze_only_partitioned_skipped;
```
* Capture `coalesce(max(last_vacuum), 'epoch')` into
`:vacuum_before_only`, then assert:
```sql
SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz
AS vacuum_only_partitioned_skipped;
```
Both checks return `t`, confirming that even for a partitioned
distributed parent, `VACUUM/ANALYZE ONLY` does not recurse into shard
placements, and Citus behavior matches PG18’s “ONLY = parent only”
semantics.
Generated columns can be virtual (not stored) and this is the default.
This PG18 feature requires tweaking citus_ruleutils and deparse table to
support in Citus. Relevant PG commit: 83ea6c540.
DESCRIPTION: Adds propagation of ENFORCED / NOT ENFORCED on CHECK
constraints.
Add propagation support to Citus ruleutils and appropriate regress
tests. Relevant PG commit: ca87c41.
Test that FOREIGN KEY .. NOT ENFORCED is propagated when applied to
Citus tables. No C code changes required, ruleutils handles it. Relevant
PG commit eec0040c4. Given that Citus does not yet support ALTER TABLE
.. ALTER CONSTRAINT its usefulness is questionable, but we propagate its
definition at least.
PG18 introduces `Disabled: <bool>` and planning/memory details that
cause unstable diffs, especially with `FORMAT JSON` when keys are
removed. This PR:
* Switches EXPLAIN assertions to `FORMAT YAML` wherever JSON structure
is not required.
* Stops normalizing/removing `"Disabled"` in JSON to avoid creating
invalid trailing commas.
* Adds alternative expected files for tests that must remain in JSON and
include `Disabled` lines.
### Rationale
* YAML is line-oriented and avoids JSON’s trailing-comma pitfalls; it
yields more stable diffs across PG15–PG18.
* Some tests intentionally assert JSON structure; those stay JSON but
use dedicated `.out` alternates rather than regex-based key removal.
* A proper “parse → modify → re-emit JSON” helper is being explored but
is non-trivial; this PR adopts the lowest-risk path now.
### What changed
1. **Test format**
* Convert `EXPLAIN (..., FORMAT JSON)` to `FORMAT YAML` in tests that
only care about plan shape/fields.
* Keep JSON only where the test explicitly validates JSON; introduce alt
expected files for those.
2. **Normalization**
* Remove JSON-side filtering of `"Disabled"` (no deletion, no regex
rewrite).
* Retain existing normalization for text/YAML (numbers, `Index
Searches`, `Window: ...`, buffers, etc.).
3. **Expected files**
* Update YAML-based expected outputs.
* Add `.out` alternates for JSON cases that include `Disabled`.
https://github.com/postgres/postgres/commit/7054186c4fixes#8358
This PR wires up PostgreSQL 18’s `publish_generated_columns` publication
option in Citus and adds regression coverage to ensure it behaves
correctly for distributed tables, without changing existing DDL output
for publications that rely on the default.
---
### 1. Preserve `publish_generated_columns` when rebuilding publications
In `BuildCreatePublicationStmt`:
* On PG18+ we now read the new `pubgencols` field from `pg_publication`
and map it as follows:
* `'n'` → default (`none`)
* `'s'` → `stored`
* For `pubgencols == 's'` we append a `publish_generated_columns`
defelem to the reconstructed statement:
```c
#if PG_VERSION_NUM >= PG_VERSION_18
if (publicationForm->pubgencols == 's') /* stored */
{
DefElem *pubGenColsOption =
makeDefElem("publish_generated_columns",
(Node *) makeString("stored"),
-1);
createPubStmt->options =
lappend(createPubStmt->options, pubGenColsOption);
}
else if (publicationForm->pubgencols != 'n') /* 'n' = none (default) */
{
ereport(ERROR,
(errmsg("unexpected pubgencols value '%c' for publication %u",
publicationForm->pubgencols, publicationId)));
}
#endif
```
* For `pubgencols == 'n'` we do **not** emit an option and rely on
PostgreSQL’s default.
* Any value other than `'n'` or `'s'` raises an error rather than
silently producing incorrect DDL.
This ensures:
* Publications that explicitly use `publish_generated_columns = stored`
are reconstructed with that option on workers, so workers get
`pubgencols = 's'`.
* Publications that use the default (`none`) continue to produce the
same `CREATE PUBLICATION ... WITH (...)` text as before (no extra
`publish_generated_columns = 'none'` noise), fixing the unintended diffs
in existing publication tests.
---
### 2. New PG18 regression coverage for distributed publications
In `src/test/regress/sql/pg18.sql`:
* Create a table with a stored generated column and make it distributed
so the publication goes through Citus DDL propagation:
```sql
CREATE TABLE gen_pub_tab (
id int primary key,
a int,
b int GENERATED ALWAYS AS (a * 10) STORED
);
SELECT create_distributed_table('gen_pub_tab', 'id', colocate_with :=
'none');
```
* Create two publications that exercise both `pubgencols` values:
```sql
CREATE PUBLICATION pub_gen_cols_stored
FOR TABLE gen_pub_tab
WITH (publish = 'insert, update', publish_generated_columns = stored);
CREATE PUBLICATION pub_gen_cols_none
FOR TABLE gen_pub_tab
WITH (publish = 'insert, update', publish_generated_columns = none);
```
* On coordinator and both workers, assert the catalog contents:
```sql
SELECT pubname, pubgencols
FROM pg_publication
WHERE pubname IN ('pub_gen_cols_stored', 'pub_gen_cols_none')
ORDER BY pubname;
```
Expected on all three nodes:
* `pub_gen_cols_stored | s`
* `pub_gen_cols_none | n`
This test verifies that:
* `pubgencols` is correctly set on the coordinator for both `stored` and
`none`.
* Citus propagates the setting unchanged to all workers for a
distributed table.
fixes#81568b1b342544
* Extend `normalize.sed` to:
* Rewrite auto-generated window names like `OVER w1` back to `OVER (?)`
on:
* `Sort Key: …`
* `Group Key: …`
* `Output: …`
* Leave functional window specs like `OVER (PARTITION BY …)` untouched.
* Use `public.explain_filter(...)` around EXPLAINs in window-related
tests to:
* Avoid plan text churn from PG18 planner/EXPLAIN changes while still
checking that we use the Citus executor and expected node types.
* Update expected outputs in:
* `mixed_relkind_tests.out`
* `multi_explain*.out`
* `multi_outer_join_columns*.out`
* `multi_subquery_window_functions.out`
* `multi_test_helpers.out`
* `window_functions.out`
to match the filtered EXPLAIN output on PG18 while remaining compatible
with older PG versions.
PG18: syntax & semantics behavior in Citus, part 1.
Includes PG18 tests for:
- OLD/NEW support in RETURNING clause of DML queries (PG commit
80feb727c)
- WITHOUT OVERLAPS in PRIMARY KEY and UNIQUE constraints (PG commit
fc0438b4e)
- COLUMNS clause in JSON_TABLE (PG commit bb766cd)
- Foreign tables created with LIKE <table> clause (PG commit 302cf1575)
- Foreign Key constraint with PERIOD clause (PG commit 89f908a6d)
- COPY command REJECT_LIMIT option (PG commit 4ac2a9bec)
- COPY TABLE TO on a materialized view (PG commit 534874fac)
Partially addresses issue #8250
fixes#8279
PostgreSQL 18 planner changes (probably AIO and updated cost model) make
sequential scans cheaper, so the psql `\d table`-style query that uses a
regex on `pg_class.relname` no longer chooses an index scan. This causes
a plan difference in the `mx_hide_shard_names` regression test.
This patch adds an alternative out for the multi_mx_hide_shard_names
test.
This PR introduces infrastructure and validation to detect breaking
changes during Citus minor version upgrades, designed to run in release
branches only.
**Breaking change detection:**
- [GUCs] Detects removed GUCs and changes to default values
- [UDFs] Detects removed functions and function signature changes
-- Supports backward-compatible function overloading (new optional
parameters allowed)
- [types] Detects removed data types
- [tables/views] Detects removed tables/views and removed/changed
columns
- New make targets for minor version upgrade tests
- Follow-up PRs will add test schedules with different upgrade scenarios
The test will be enabled in release branches (e.g., release-13) via the
new test-citus-minor-upgrade job shown below. It will not run on the
main branch.
Testing
Verified locally with sample breaking changes:
`make check-citus-minor-upgrade-local citus-old-version=v13.2.0
`
**Test case 1:** Backward-compatible signature change (allowed)
```
-- Old: CREATE FUNCTION pg_catalog.citus_blocking_pids(pBlockedPid integer)
-- New: CREATE FUNCTION pg_catalog.citus_blocking_pids(pBlockedPid integer, pBlockedByPid integer DEFAULT NULL)
```
No breaking change detected (new parameter has DEFAULT)
**Test case 2:** Incompatible signature change (breaking)
```
-- Old: CREATE FUNCTION pg_catalog.citus_blocking_pids(pBlockedPid integer)
-- New: CREATE FUNCTION pg_catalog.citus_blocking_pids(pBlockedPid integer, pBlockedByPid integer)
```
Breaking change detected:
`UDF signature removed: pg_catalog.citus_blocking_pids(pblockedpid
integer) RETURNS integer[]`
**Test case 3:** GUC changes (breaking)
- Removed `citus.max_worker_nodes_tracked`
- Changed default value of `citus.max_shared_pool_size` from 0 to 4
Breaking change detected:
```
The default value of GUC citus.max_shared_pool_size was changed from 0 to 4
GUC citus.max_worker_nodes_tracked was removed
```
**Test case 4:** Table/view changes
- Dropped `pg_catalog.pg_dist_rebalance_strategy` and removed a column
from `pg_catalog.citus_lock_waits`
```
- Column blocking_nodeid in table/view pg_catalog.citus_lock_waits was removed
- Table/view pg_catalog.pg_dist_rebalance_strategy was removed
```
**Test case 5:** Remove a custom type
- Dropped `cluster_clock` and the objects depend on it. In addition to
the dependent objects, test shows:
```
- Type pg_catalog.cluster_clock was removed
```
Sample new job for build and test workflow (for release branches):
```
test-citus-minor-upgrade:
name: PG17 - check-citus-minor-upgrade
runs-on: ubuntu-latest
container:
image: "${{ needs.params.outputs.citusupgrade_image_name }}:${{ fromJson(needs.params.outputs.pg17_version).full }}${{ needs.params.outputs.image_suffix }}"
options: --user root
needs:
- params
- build
env:
citus_version: 13.2
steps:
- uses: actions/checkout@v4
- uses: "./.github/actions/setup_extension"
with:
skip_installation: true
- name: Install and test citus minor version upgrade
run: |-
gosu circleci \
make -C src/test/regress \
check-citus-minor-upgrade \
bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
citus-post-tar=${GITHUB_WORKSPACE}/install-$PG_MAJOR.tar;
- uses: "./.github/actions/save_logs_and_results"
if: always()
with:
folder: ${{ env.PG_MAJOR }}_citus_minor_upgrade
- uses: "./.github/actions/upload_coverage"
if: always()
with:
flags: ${{ env.PG_MAJOR }}_citus_minor_upgrade
codecov_token: ${{ secrets.CODECOV_TOKEN }}
```
fixes#8264
PostgreSQL 18 introduced a planner improvement (commit `ae4569161`) that
rewrites simple `OR` equality clauses into `= ANY(...)` forms, allowing
the use of a single index scan instead of multiple scans or a custom
scan.
This change affects the columnar path tests where queries like `a=0 OR
a=5` previously chose a Columnar or Seq Scan plan.
In this PR:
* Updated test expectations for `uses_custom_scan` and `uses_seq_scan`
to reflect the new index scan plan.
This keeps the test output consistent with PostgreSQL 18’s updated
planner behavior.
fixes#83170d8bd0a72e
PG18 changed the wording of connection failures during `CREATE
SUBSCRIPTION` to include a subscription prefix and a verbose “connection
to server … failed:” preamble. This breaks one regression output
(`multi_move_mx`). This PR adds normalization rules to map PG18 output
back to the prior form so results are stable across PG15–PG18.
**What changes**
Add two rules in `src/test/regress/bin/normalize.sed`:
```sed
# PG18: drop 'subscription "<name>"' prefix
# remove when PG18 is the minimum supported version
s/^[[:space:]]*ERROR:[[:space:]]+subscription "[^"]+" could not connect to the publisher:[[:space:]]*/ERROR: could not connect to the publisher: /I
# PG18: drop verbose 'connection to server … failed:' preamble
s/^[[:space:]]*ERROR:[[:space:]]+could not connect to the publisher:[[:space:]]*connection to server .* failed:[[:space:]]*/ERROR: could not connect to the publisher: /I
```
**Before (PG18)**
```
ERROR: subscription "subs_01" could not connect to the publisher:
connection to server at "localhost" (::1), port 57637 failed:
root certificate file "/non/existing/certificate.crt" does not exist
```
**After normalization**
```
ERROR: could not connect to the publisher:
root certificate file "/non/existing/certificate.crt" does not exist
```
**Why**
Maintain identical regression outputs across supported PG versions while
Citus still supports PG<18.
PostgreSQL 18 adds a new line to text EXPLAIN with ANALYZE (`Index
Searches: N`). That extra line both creates noise and bumps psql’s `(N
rows)` footer. This PR keeps ANALYZE (so statements still execute) while
removing the version-specific churn in our regress outputs.
### What changed
* **Use `explain_filter(...)` instead of raw text EXPLAIN**
* In `local_shard_execution.sql` and
`local_shard_execution_replicated.sql`, replace direct:
```sql
EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF)
<stmt>;
```
with:
```sql
\pset footer off
SELECT public.explain_filter('EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF,
TIMING OFF, BUFFERS OFF) <stmt>');
\pset footer on
```
* Expected files updated accordingly to show the `explain_filter` output
block instead of raw EXPLAIN text.
* **Extend `explain_filter` to drop the PG18 line**
* Filter now removes any `Index Searches: <number>` line before
normalizing numeric fields, preventing the “N” version of the same line
from sneaking in.
* **Keep suite-wide normalizer intact**
Fixes https://github.com/citusdata/citus/issues/8235
PG18 and PG latest minors ignore temporary relations in
`RelidByRelfilenumber` (`RelidByRelfilenode` in PG15)
Relevant PG commit:
https://github.com/postgres/postgres/commit/86831952
Here we are keeping temp reloids instead of getting it with
RelidByRelfilenumber, for example, in some cases, we can directly get
reloid from relations, in other cases we keep it in some structures.
Note: there is still an outstanding issue with columnar temp tables in
concurrent sessions, that will be fixed in PR
https://github.com/citusdata/citus/pull/8252
The `merge` regress test uses SQL functions which can be cached in PG18+
since commit
[0dca5d68d](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0dca5d68d7bebf2c1036fd84875533afef6df992).
Distributed plan's copy function did not include the
`sourceResultRepartitionColumnIndex` field, which is critical for MERGE
queries, and for cached distributed plans this field was always 0
leading to the problem (#8285). Ensuring it is copied fixes it. This was
an oversight in Citus, and not specific to PG18.
fixes#8318
PostgreSQL 18 started printing an extra line in `EXPLAIN (ANALYZE …)`
for index scans:
```
Index Searches: N
```
This makes our test output flap (extra line, footer row count changes)
while the intent of the test is simply to **prove we plan an index
scan**, not to assert runtime counters.
### What this PR does
```sql
EXPLAIN (... analyze on, ...)
```
to
```sql
EXPLAIN (... analyze off, ...)
```
### Why this approach
* **Minimal change**: keeps the test’s purpose (“we exercise an index
scan”) without introducing normalization rules
* **Version-stable**: avoids PG18’s new text while remaining valid on
PG15–PG18.
* **No behavioral impact**: we still validate the plan node is an Index
Scan; we just don’t request execution.
### Before → After (essence)
Before (PG18):
```
Aggregate
-> Index Scan ... (actual rows=1 loops=1)
Index Cond: ...
Index Searches: 1
```
After:
```
Aggregate
-> Index Scan ...
Index Cond: ...
```
fixes#8278
Please check issue:
https://github.com/citusdata/citus/issues/8278#issuecomment-3431707484f4e7756ef9
### What PG18 changed
SELECT creates diff has a **named join**:
```sql
(...) AS unsupported_join (x,y,z,t,e,f,q)
```
On PG17, `COUNT(unsupported_join.*)` stayed as a single whole-row Var
that referenced the **join alias**.
On PG18, the parser expands that whole-row Var **early** into a
`ROW(...)` of **base** columns:
```
ROW(a.user_id, a.item_id, a.buy_count,
b.id, b.it_name, b.k_no,
c.id, c.it_name, c.k_no)
```
But since the join is *named*, inner aliases `a/b/c` are hidden.
Referencing them later blows up with
“invalid reference to FROM-clause entry for table ‘a’”.
### What this PR changes
1. **Retarget at `RowExpr` deparse (not in `get_variable`)**
* In `get_rule_expr()`’s `T_RowExpr` branch, each element `e` of
`ROW(...)` is examined.
* If `e` unwraps to a simple, same-level `Var` (`varlevelsup == 0`,
`varattno > 0`) and there is a **named `RTE_JOIN`** with
`joinaliasvars`, we **do not** change `varno/varattno`.
* Instead, we build a copy of the Var and set **`varnosyn/varattnosyn`**
to the matching join alias column (from `joinaliasvars`).
* Then we deparse that Var via `get_rule_expr_toplevel(...)`, which
naturally prints `join_alias.colname`.
* Scope is limited to **query deparsing** (`dpns->plan == NULL`),
exactly where PG18 expands whole-row vars into `ROW(...)` of base Vars.
2. **Helpers (PG18-only file)**
* `unwrap_simple_var(Node*)`: strips trivial wrappers (`RelabelType`,
`CoerceToDomain`, `CollateExpr`) to reveal a `Var`.
* `var_matches_base(const Var*, int varno, AttrNumber attno)`: matches
canonical or synonym identity.
* `dpns_has_named_join(const deparse_namespace*)`: fast precheck for any
named join with `joinaliasvars`.
* `map_var_through_join_alias(...)`: scans `joinaliasvars` to locate the
**JOIN RTE index + attno** for a 1:1 alias; the caller uses these to set
`varnosyn/varattnosyn`.
3. **Safety and non-goals**
* **No effect on plan deparsing** (`dpns->plan != NULL`).
* **No change to semantic identity**: we leave `varno/varattno`
untouched; only set `varnosyn/varattnosyn`.
* Skip whole-row/system columns (`attno <= 0`) and non-simple join
columns (computed expressions).
* Works with named joins **with or without** an explicit column list (we
rely on `joinaliasvars`, not the alias collist).
### Reproducer
```sql
CREATE TABLE distributed_table(user_id int, item_id int, buy_count int);
CREATE TABLE reference_table(id int, it_name varchar(25), k_no int);
SELECT create_distributed_table('distributed_table', 'user_id');
SELECT COUNT(unsupported_join.*)
FROM (distributed_table a
LEFT JOIN reference_table b ON true
RIGHT JOIN reference_table c ON true)
AS unsupported_join (x,y,z,t,e,f,q)
JOIN (reference_table d JOIN reference_table e ON true) ON true;
```
**Before (PG18):** deparser emitted `ROW(a.user_id, …)` → `ERROR:
invalid reference to FROM-clause entry for table "a"`
**After:** deparser emits
`ROW(unsupported_join.x, ..., unsupported_join.k_no)` → runs
successfully.
Now maps to `unsupported_join.<auto_col_names>` and runs.
With PG18's GROUP RTE, queries that should have been eligible for fast
path planning were skipped because the fast path planner allows exactly
one range table only. This fix extends that to account for a GROUP RTE.
Fixes#8275 by printing the names in order so that in every message
`DETAIL: x and y are not co-located` x precedes (or is lexicographically
less than) y.
fixes#8267
* Extend `src/test/regress/bin/normalize.sed` to drop the new PG18
EXPLAIN instrumentation line:
```
Storage: <Memory|Disk|Memory and Disk> Maximum Storage: <size>
```
which appears under `Materialize`, some `CTE Scan`s, etc. when `ANALYZE`
is on.
**Why**
* PG18 added storage usage reporting for materialization/tuplestore
nodes. It’s useful for humans but creates noisy, non-semantic diffs in
regression output. There’s no EXPLAIN flag to suppress it, so we
normalize in tests instead. This PR wires that normalization into our
sed pipeline.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1eff8279d
**How**
* Add a narrowly scoped sed rule that matches only lines starting with
`Storage:` (keeping `Sort Method`, `Hash`, `Buffers`, etc. intact). Use
ERE compatible with `sed -Ef` and Python `re` (no POSIX character
classes), e.g.:
```
/^[ \t]*Storage:[ \t].*$/d
```
fixes#8263
* Introduces `columnar_test_helpers._plan_json(q text) -> jsonb`, which
runs `EXPLAIN (FORMAT JSON, COSTS OFF, ANALYZE OFF)` and returns the
plan as `jsonb`. This lets us assert on plan structure instead of text
matching.
* Updates `columnar_indexes.sql` tests to detect whether any index-based
scan is used by searching the JSON plan’s `"Node Type"` (e.g., `Index
Scan`, `Index Only Scan`, `Bitmap Index Scan`).
## Notable changes
* New helper in `src/test/regress/sql/columnar_test_helpers.sql`:
* `EXECUTE format('EXPLAIN (FORMAT JSON, COSTS OFF, ANALYZE OFF) %s', q)
INTO j;`
* Returns the `jsonb` plan for downstream assertions.
```sql
SELECT NOT jsonb_path_exists(
columnar_test_helpers._plan_json('SELECT b FROM columnar_table WHERE b =
30000'),
'$[*].Plan.** ? (@."Node Type" like_regex "^(Index|Bitmap
Index).*Scan$")'
) AS uses_no_index_scan;
```
to verify no index scan occurs at the partial index boundary (`b =
30000`) and that an index scan is used where expected (`b = 30001`).
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
fixes#8277https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=45188c2ea
PostgreSQL 18 + newer OpenSSL builds surface `ssl_ciphers` as a **rule
string** (e.g., `HIGH:MEDIUM:+3DES:!aNULL`) instead of an expanded
cipher list. Our tests hard-pinned the literal list and started failing
on PG18. Also, with TLS 1.3 in the picture, we need to assert that
cipher configuration is sane without coupling to OpenSSL’s expansion.
**What changed**
* **sql/ssl_by_default.sql**
* Replace brittle `SHOW ssl_ciphers` string matching with invariant
checks:
* non-empty ciphers: `current_setting('ssl_ciphers') <> ''`
* looks like a rule/list: `position(':' in
current_setting('ssl_ciphers')) > 0`
* Run the same checks on **workers** via `run_command_on_workers`.
* Keep existing validations for `ssl=on`, `sslmode=require` in
`citus.node_conninfo`, and `pg_stat_ssl.ssl = true`.
* **expected/ssl_by_default.out**
* Update expected output to booleans for the new checks (less diff-prone
across PG/SSL variants).
Fixes#8268
PG18 added core async I/O infrastructure. Relevant PG commit:
https://github.com/postgres/postgres/commit/da72269
In `pg16.sql test`, we tested `REINDEX SYSTEM` command, which would
later cause io debug lines in a command of `multi_insert_select.sql`
test, that runs _after_ `pg16.sql` test.
"This is expected because the I/O subsystem is repopulating its internal
callback pool after heavy catalog I/O churn caused by REINDEX SYSTEM." -
chatgpt (seems right 😄)
This PR removes `REINDEX SYSTEM` test as its not crucial anyway.
`REINDEX DATABASE` is sufficient for the purpose presented in pg16.sql
test.
In the PR [8142](https://github.com/citusdata/citus/pull/8142) was added
`PushActiveSnapshot`.
This commit places it outside a for loop as taking a snapshot is a resource
heavy operation.
The GUC configuration for SkipAdvisoryLockPermissionChecks had
misconfigured the settings for GUC_SUPERUSER_ONLY for PGC_SUSET - when
PostgreSQL running with ASAN, this fails when querying pg_settings due
to exceeding the size of the array GucContext_Names. Fix up this GUC
declaration to not crash with ASAN.
The failing queries all have a GROUP BY, and the fix teaches the Citus recursive planner how to handle a PG18 GROUP range table in the outer query:
- In recursive query planning, don't recurse into subquery expressions in a GROUP BY clause
- Flatten references to a GROUP rte before creating the worker subquery in pushdown planning
- If a PARAM node points to a GROUP rte then tunnel through to the underlying expression
Fixes#8296.
fixes#82650fbceae841
PostgreSQL 18 started printing an extra line in `EXPLAIN (ANALYZE …)`
for index scans:
```
Index Searches: N
```
**normalize.sed**: add a rule to remove the PG18-only line
```
/^\s*Index Searches:\s*\d+\s*$/d
```
0dca5d68d7fixes#8153
```diff
/citus/src/test/regress/expected/multi_subquery_misc.out
-- should error out
SELECT sql_subquery_test(1,1);
-ERROR: could not create distributed plan
-DETAIL: Possibly this is caused by the use of parameters in SQL functions, which is not supported in Citus.
-HINT: Consider using PL/pgSQL functions instead.
-CONTEXT: SQL function "sql_subquery_test" statement 1
+ sql_subquery_test
+-------------------
+ 307
+(1 row)
+
```
PostgreSQL 18 changes planner behavior for inlining/parameter handling
in SQL functions (pg18 commit `0dca5d68d`). As a result, a query in
`multi_subquery_misc` that previously failed to create a distributed
plan now succeeds. This PR updates the regression **expected** file to
reflect the new outcome on PG18+.
### What changed
* Updated `src/test/regress/expected/multi_subquery_misc.out`:
* Replaced the previous error block:
```
ERROR: could not create distributed plan
DETAIL: Possibly this is caused by the use of parameters in SQL
functions, which is not supported in Citus.
HINT: Consider using PL/pgSQL functions instead.
CONTEXT: SQL function "sql_subquery_test" statement 1
```
* with the actual successful result on PG18+:
```
sql_subquery_test
--------------------------------------------
307
(1 row)
```
* **PG < 18:** Behavior remains unchanged; the test still errors on
older versions.
* **PG ≥ 18:** The call succeeds; the updated expected output matches
actual results.
similar PR: https://github.com/citusdata/citus/pull/8184
fixes#8093c2a4078eba
- Enable buffer-usage reporting by default in `EXPLAIN ANALYZE` on
PostgreSQL 18 and above.
- Introduce the explicit `BUFFERS OFF` option in every existing
regression test to maintain pre-PG18 output consistency.
- This appends, `BUFFERS OFF` to all `EXPLAIN(...)` calls in
src/test/regress/sql and the corresponding .out files.
The error `Unrecognized range table id` seen in regress test
`insert_select_into_local_tables` is a consequence of the INSERT ..
SELECT planner getting confused by a SELECT query with a GROUP BY and
hence a Group RTE, introduced in PG18 (commit 247dea89f). The solution
is to flatten the relevant parts of the SELECT query before preparing
the INSERT .. SELECT query tree for use by Citus.
PG18 has removed heap_inplace_update(), which is crucial for
citus_columnar extension because we always want to update
stripe entries for columnar in-place.
Relevant PG18 commit:
https://github.com/postgres/postgres/commit/a07e03f
heap_inplace_update() has been replaced by
heap_inplace_update_and_unlock, which is used inside
systable_inplace_update_finish, which is used together with
systable_inplace_update_begin. This change has been back-patched
up to v12, which is enough for us since the oldest version
Citus supports is v15.
In PG<18, a deprecated heap_inplace_update() is retained,
however, let's start using the new functions because they are
better, and such that we don't need to wrap these changes in
PG18 version conditionals.
Basically, in this commit we replace the following:
SysScanDesc scanDescriptor = systable_beginscan(columnarStripes,
indexId, indexOk, &dirtySnapshot, 2, scanKey);
heap_inplace_update(columnarStripes, modifiedTuple);
with the following:
systable_inplace_update_begin(columnarStripes, indexId, indexOk,
NULL, 2, scanKey, &tuple, &state);
systable_inplace_update_finish(state, tuple);
For more understanding, it's best to refer to an example:
REL_18_0/src/backend/catalog/toasting.c#L349-L371
of how systable_inplace_update_begin and
systable_inplace_update_finish are used in PG18, because they
mirror the need of citus columnar.
Fixes#8207
This reverts commit 5d805eb10b.
heap_inplace_update was incorrectly replaced by
CatalogTupleUpdate in 5d805eb. In Citus, we assume a stripe
entry with some columns set to null means that a write
is in-progress, because otherwise we wouldn't see a such row.
But this breaks when we use CatalogTupleUpdate because it
inserts a new version of the row, which leaves the
in-progress version behind. Among other things, this was
causing various issues in PG18 - check-columnar test.
PG18 changed the names generated for child foreign key constraints.
https://github.com/postgres/postgres/commit/3db61db48
The test failures in Citus regression suite are all changing the name of
a constraint from `'sensors%'` to `'%to_parent%_1'`: the naming is very
nice here because `to_parent` means that we have a foreign key to a
parent table.
To fix the diff, we exclude those constraints from the output. To verify
correctness, we still count the problematic constraints to make sure
they are there - we are simply removing them from the first output (we
add this count query right after the previous one)
Fixes#8126
Co-authored-by: Mehmet YILMAZ <mehmety87@gmail.com>
Qualify create domain stmt after local execution, to avoid such diffs in
PG vanilla tests:
```diff
create domain d_fail as anyelement;
-ERROR: "anyelement" is not a valid base type for a domain
+ERROR: "pg_catalog.anyelement" is not a valid base type for a domain
```
These tests were newly added in PG18, however this is not new PG18
behavior, just some added tests.
https://github.com/postgres/postgres/commit/0172b4c94Fixes#8042
PG18 changed the visibility of various Explain Serialize functions and
structs to `extern`. Previously, for PG17 support, these were `static`,
so we had to copy paste their definitions from `explain.c` to Citus's
`multi_explain.c`.
Relevant PG18 commits:
https://github.com/postgres/postgres/commit/555960a0https://github.com/postgres/postgres/commit/77cb08be
Now we don't need to define the following anymore in Citus, since they
are extern in PG18:
- typedef struct SerializeMetrics
- void ExplainIndentText(ExplainState *es);
- SerializeMetrics GetSerializationMetrics(DestReceiver *dest);
- typedef struct SerializeDestReceiver (this is not extern, however it
is only used by GetSerializationMetrics function)
This was incorrectly handled in
https://github.com/citusdata/citus/commit/9e42f3f2c
by wrapping these definitions and usages in PG17 only,
causing such diffs in PG18 (not able to see serialization at all):
```diff
citus/src/test/regress/expected/pg17.out
select public.explain_filter('explain (analyze,
serialize binary,buffers,timing) select * from int8_tbl i8');
...
Planning Time: N.N ms
- Serialization: time=N.N ms output=NkB format=binary
Execution Time: N.N ms
Planning Time: N.N ms
Serialization: time=N.N ms output=NkB format=binary
Execution Time: N.N ms
-(14 rows)
+(13 rows)
```
This PR solves the following diffs, originating from the addition of
`varreturningtype` field to the `Var` struct in PG18:
https://github.com/postgres/postgres/commit/80feb727c
Previously we didn't account for this new field (as it's new), so this
wouldn't allow the parser to correctly reconstruct the `Var` node
structure, but rather it would error out with `did not find '}' at end
of input node`:
```diff
SELECT column_to_column_name(logicalrelid, partkey)
FROM pg_dist_partition WHERE partkey IS NOT NULL ORDER BY 1 LIMIT 1;
- column_to_column_name
----------------------------------------------------------------------
- a
-(1 row)
-
+ERROR: did not find '}' at end of input node
```
Solution follows precedent https://github.com/citusdata/citus/pull/7107,
when varnullingrels field was added to the `Var` struct in PG16.
The solution includes:
- Taking care of the `partkey` in `pg_dist_partition` table because it's
coming from the `Var` struct. This mainly includes fixing the upgrade
script to PG18, by saving all the `partkey` infos before upgrading to
PG18 (in `citus_prepare_pg_upgrade`), and then re-generating `partkey`
columns in `pg_dist_partition` (using `UPDATE`) after upgrading to PG18
(in `citus_finish_pg_upgrade`).
- Adding a normalize rule to fix output differences among PG versions.
Note that we need two normalize lines: one for PG15 since it doesn't
have `varnullingrels`, and one for PG16/PG17.
- Small trick on `metadata_sync_helpers` to use different text when
generating the `partkey`, based on the PG version.
Fixes#8189
3 minor changes to reduce some noise from the regression diffs.
1 - Reduce verbosity when ALTER EXTENSION fails
PG18 has improved reporting of errors in extension script files
Relevant PG commit:
https://github.com/postgres/postgres/commit/774171c4f
There was more context in PG18, so reducing verbosity
```
ALTER EXTENSION citus UPDATE TO '11.0-1';
ERROR: cstore_fdw tables are deprecated as of Citus 11.0
HINT: Install Citus 10.2 and convert your cstore_fdw tables to the
columnar access method before upgrading further
CONTEXT: PL/pgSQL function inline_code_block line 4 at RAISE
+SQL statement "DO LANGUAGE plpgsql
+$$
+BEGIN
+ IF EXISTS (SELECT 1 FROM pg_dist_shard where shardstorage = 'c') THEN
+ RAISE EXCEPTION 'cstore_fdw tables are deprecated as of Citus 11.0'
+ USING HINT = 'Install Citus 10.2 and convert your cstore_fdw tables
to the columnar access method before upgrading further';
+ END IF;
+END;
+$$"
+extension script file "citus--10.2-5--11.0-1.sql", near line 532
```
2 - Fix backend type order in tests for PG18
PG18 added another backend type which messed the order
in this test
Adding a separate IF condition for PG18
Relevant PG commit:
https://github.com/postgres/postgres/commit/18d67a8d7d
3 - Ignore "DEBUG: find_in_path" lines in output
Relevant PG commit:
https://github.com/postgres/postgres/commit/4f7f7b0375
The new GUC extension_control_path specifies a path to look for
extension control files.