Fixes #8364
PostgreSQL 18 changes VACUUM/ANALYZE to recurse into inheritance
children by default, and introduces `ONLY` to limit processing to the
parent. Upstream change:
[https://github.com/postgres/postgres/commit/62ddf7ee9](https://github.com/postgres/postgres/commit/62ddf7ee9)
For Citus tables, we should treat shard placements as “children” and
avoid propagating `VACUUM/ANALYZE` to shards when the user explicitly
asks for `ONLY`.
This PR adjusts the Citus VACUUM handling to align with PG18 semantics,
and adds regression coverage on both regular distributed tables and
partitioned distributed tables.
---
### Behavior changes
* Introduce a per-relation helper struct:
```c
typedef struct CitusVacuumRelation
{
VacuumRelation *vacuumRelation;
Oid relationId;
} CitusVacuumRelation;
```
This lets us keep both:
* the resolved relation OID (for `IsCitusTable`, task building), and
* the original `VacuumRelation` node (for column list and ONLY/inh
flag).
* Replace the old `VacuumRelationIdList` / `ExtractVacuumTargetRels`
flow with:
```c
static List *VacuumRelationList(VacuumStmt *vacuumStmt,
CitusVacuumParams vacuumParams);
```
`VacuumRelationList` now:
* Iterates over `vacuumStmt->rels`.
* Resolves `relid` via `RangeVarGetRelidExtended` when `relation` is
present.
* Falls back to locking `VacuumRelation->oid` when only an OID is
available.
* Respects `VACOPT_FULL` for lock mode and `VACOPT_SKIP_LOCKED` for
locking behavior.
* Builds a `List *` of `CitusVacuumRelation` entries.
* Update:
```c
IsDistributedVacuumStmt(List *vacuumRelationList);
ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt,
List *vacuumRelationList,
CitusVacuumParams vacuumParams);
```
to operate on `CitusVacuumRelation` instead of bare OIDs.
* Implement `ONLY` semantics in `ExecuteVacuumOnDistributedTables`:
```c
RangeVar *relation = vacuumRelation->relation;
if (relation != NULL && !relation->inh)
{
/* ONLY specified, so don't recurse to shard placements */
continue;
}
```
Effect:
* `VACUUM / ANALYZE` (no `ONLY`) on a Citus table: behavior unchanged,
Citus creates tasks and propagates to shard placements.
* `VACUUM ONLY <citus_table>` / `ANALYZE ONLY <citus_table>`:
* Core still processes the coordinator relation as usual.
* Citus **skips** building tasks for shard placements, so we do not
recurse into distributed children.
* The code compiles and behaves as before on pre-PG18; the new behavior
becomes observable only when the core parser starts setting
`inh = false` for `ONLY` (PG18).
* Unqualified `VACUUM` / `ANALYZE` (no rels) is unchanged and still
handled via `ExecuteUnqualifiedVacuumTasks`.
* Remove now-redundant helpers:
* `VacuumColumnList`
* `ExtractVacuumTargetRels`
Column lists are now taken directly from `vacuumRelation->va_cols` via
`CitusVacuumRelation`.
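The `ONLY` check can be illustrated outside of PostgreSQL with a minimal sketch. `SketchRangeVar`, `SketchVacuumRelation` and `ShouldPropagateToShards` are hypothetical stand-ins for `RangeVar`, `VacuumRelation` and the check inside `ExecuteVacuumOnDistributedTables`; only the `inh` logic mirrors the snippet in this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's RangeVar; only `inh` matters here. */
typedef struct SketchRangeVar
{
    const char *relname;
    bool inh;                   /* false when the user wrote ONLY */
} SketchRangeVar;

/* Hypothetical stand-in for VacuumRelation. */
typedef struct SketchVacuumRelation
{
    SketchRangeVar *relation;   /* NULL when only an OID is available */
    unsigned int oid;
} SketchVacuumRelation;

/*
 * Returns true when shard tasks should be built for this relation.
 * An entry without a RangeVar (OID only) is treated as "recurse",
 * matching the `relation != NULL && !relation->inh` condition.
 */
bool
ShouldPropagateToShards(const SketchVacuumRelation *vacuumRelation)
{
    assert(vacuumRelation != NULL);

    const SketchRangeVar *relation = vacuumRelation->relation;
    if (relation != NULL && !relation->inh)
    {
        /* ONLY specified, so don't recurse to shard placements */
        return false;
    }
    return true;
}
```

In this sketch a relation given as `ONLY t` yields `false`, while a plain `t` or an OID-only entry yields `true`.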
---
### Testing
Extend `src/test/regress/sql/pg18.sql` and `expected/pg18.out` with two
PG18-only blocks that verify we do not recurse into shard placements
when `ONLY` is used:
1. **Simple distributed table (`pg18_vacuum_part`)**
* Create and distribute a regular table:
```sql
CREATE SCHEMA pg18_vacuum_part;
SET search_path TO pg18_vacuum_part;
CREATE TABLE vac_analyze_only (a int);
SELECT create_distributed_table('vac_analyze_only', 'a');
INSERT INTO vac_analyze_only VALUES (1), (2), (3);
```
* On the coordinator:
* Run `ANALYZE vac_analyze_only;` and later `ANALYZE ONLY
vac_analyze_only;`.
* Run `VACUUM vac_analyze_only;` and later `VACUUM ONLY
vac_analyze_only;`.
* On `worker_1`:
* Capture `coalesce(max(last_analyze), 'epoch')` from
`pg_stat_user_tables` for `vac_analyze_only_%` into
`:analyze_before_only`, then assert:
```sql
SELECT max(last_analyze) = :'analyze_before_only'::timestamptz AS
analyze_only_skipped;
```
* Capture `coalesce(max(last_vacuum), 'epoch')` into
`:vacuum_before_only`, then assert:
```sql
SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz AS
vacuum_only_skipped;
```
Both checks return `t`, confirming `ONLY` does not change `last_analyze`
/ `last_vacuum` on shard tables.
2. **Partitioned distributed table (`pg18_vacuum_part_dist`)**
* Create a partitioned table whose parent is distributed:
```sql
CREATE SCHEMA pg18_vacuum_part_dist;
SET search_path TO pg18_vacuum_part_dist;
SET citus.shard_count = 2;
SET citus.shard_replication_factor = 1;
CREATE TABLE part_dist (id int, v int) PARTITION BY RANGE (id);
CREATE TABLE part_dist_1 PARTITION OF part_dist FOR VALUES FROM (1) TO
(100);
CREATE TABLE part_dist_2 PARTITION OF part_dist FOR VALUES FROM (100) TO
(200);
SELECT create_distributed_table('part_dist', 'id');
INSERT INTO part_dist SELECT g, g FROM generate_series(1, 199) g;
```
* On the coordinator:
* Run `ANALYZE part_dist;` then `ANALYZE ONLY part_dist;`.
* Run `VACUUM part_dist;` then `VACUUM ONLY part_dist;` (PG18 emits the
expected warning: `VACUUM ONLY of partitioned table "part_dist" has no
effect`).
* On `worker_1`:
* Capture `coalesce(max(last_analyze), 'epoch')` for `part_dist_%` into
`:analyze_before_only`, then assert:
```sql
SELECT max(last_analyze) = :'analyze_before_only'::timestamptz
AS analyze_only_partitioned_skipped;
```
* Capture `coalesce(max(last_vacuum), 'epoch')` into
`:vacuum_before_only`, then assert:
```sql
SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz
AS vacuum_only_partitioned_skipped;
```
Both checks return `t`, confirming that even for a partitioned
distributed parent, `VACUUM/ANALYZE ONLY` does not recurse into shard
placements, and Citus behavior matches PG18’s “ONLY = parent only”
semantics.
In PG18, generated columns can be virtual (not stored), and virtual is
the default. Supporting this in Citus requires tweaking citus_ruleutils
and the table deparsing code. Relevant PG commit: 83ea6c540.
DESCRIPTION: Adds propagation of ENFORCED / NOT ENFORCED on CHECK
constraints.
Add propagation support to Citus ruleutils and appropriate regress
tests. Relevant PG commit: ca87c41.
https://github.com/postgres/postgres/commit/7054186c4
Fixes #8358
This PR wires up PostgreSQL 18’s `publish_generated_columns` publication
option in Citus and adds regression coverage to ensure it behaves
correctly for distributed tables, without changing existing DDL output
for publications that rely on the default.
---
### 1. Preserve `publish_generated_columns` when rebuilding publications
In `BuildCreatePublicationStmt`:
* On PG18+ we now read the new `pubgencols` field from `pg_publication`
and map it as follows:
* `'n'` → default (`none`)
* `'s'` → `stored`
* For `pubgencols == 's'` we append a `publish_generated_columns`
defelem to the reconstructed statement:
```c
#if PG_VERSION_NUM >= PG_VERSION_18
if (publicationForm->pubgencols == 's') /* stored */
{
DefElem *pubGenColsOption =
makeDefElem("publish_generated_columns",
(Node *) makeString("stored"),
-1);
createPubStmt->options =
lappend(createPubStmt->options, pubGenColsOption);
}
else if (publicationForm->pubgencols != 'n') /* 'n' = none (default) */
{
ereport(ERROR,
(errmsg("unexpected pubgencols value '%c' for publication %u",
publicationForm->pubgencols, publicationId)));
}
#endif
```
* For `pubgencols == 'n'` we do **not** emit an option and rely on
PostgreSQL’s default.
* Any value other than `'n'` or `'s'` raises an error rather than
silently producing incorrect DDL.
This ensures:
* Publications that explicitly use `publish_generated_columns = stored`
are reconstructed with that option on workers, so workers get
`pubgencols = 's'`.
* Publications that use the default (`none`) continue to produce the
same `CREATE PUBLICATION ... WITH (...)` text as before (no extra
`publish_generated_columns = 'none'` noise), fixing the unintended diffs
in existing publication tests.
---
### 2. New PG18 regression coverage for distributed publications
In `src/test/regress/sql/pg18.sql`:
* Create a table with a stored generated column and make it distributed
so the publication goes through Citus DDL propagation:
```sql
CREATE TABLE gen_pub_tab (
id int primary key,
a int,
b int GENERATED ALWAYS AS (a * 10) STORED
);
SELECT create_distributed_table('gen_pub_tab', 'id', colocate_with :=
'none');
```
* Create two publications that exercise both `pubgencols` values:
```sql
CREATE PUBLICATION pub_gen_cols_stored
FOR TABLE gen_pub_tab
WITH (publish = 'insert, update', publish_generated_columns = stored);
CREATE PUBLICATION pub_gen_cols_none
FOR TABLE gen_pub_tab
WITH (publish = 'insert, update', publish_generated_columns = none);
```
* On coordinator and both workers, assert the catalog contents:
```sql
SELECT pubname, pubgencols
FROM pg_publication
WHERE pubname IN ('pub_gen_cols_stored', 'pub_gen_cols_none')
ORDER BY pubname;
```
Expected on all three nodes:
* `pub_gen_cols_stored | s`
* `pub_gen_cols_none | n`
This test verifies that:
* `pubgencols` is correctly set on the coordinator for both `stored` and
`none`.
* Citus propagates the setting unchanged to all workers for a
distributed table.
Fixes https://github.com/citusdata/citus/issues/8235
PG18 and the latest PG minor releases ignore temporary relations in
`RelidByRelfilenumber` (`RelidByRelfilenode` in PG15).
Relevant PG commit:
https://github.com/postgres/postgres/commit/86831952
This PR keeps track of temp relation OIDs instead of looking them up
with `RelidByRelfilenumber`. In some cases we can read the OID directly
from the relation; in other cases we keep it in a structure.
Note: there is still an outstanding issue with columnar temp tables in
concurrent sessions, which will be fixed in PR
https://github.com/citusdata/citus/pull/8252
The `merge` regress test uses SQL functions which can be cached in PG18+
since commit
[0dca5d68d](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0dca5d68d7bebf2c1036fd84875533afef6df992).
Distributed plan's copy function did not include the
`sourceResultRepartitionColumnIndex` field, which is critical for MERGE
queries, and for cached distributed plans this field was always 0
leading to the problem (#8285). Ensuring it is copied fixes it. This was
an oversight in Citus, and not specific to PG18.
Fixes #8278
Please check issue:
https://github.com/citusdata/citus/issues/8278#issuecomment-3431707484
f4e7756ef9
### What PG18 changed
The `SELECT` that produces the diff has a **named join**:
```sql
(...) AS unsupported_join (x,y,z,t,e,f,q)
```
On PG17, `COUNT(unsupported_join.*)` stayed as a single whole-row Var
that referenced the **join alias**.
On PG18, the parser expands that whole-row Var **early** into a
`ROW(...)` of **base** columns:
```
ROW(a.user_id, a.item_id, a.buy_count,
b.id, b.it_name, b.k_no,
c.id, c.it_name, c.k_no)
```
But since the join is *named*, inner aliases `a/b/c` are hidden.
Referencing them later blows up with
“invalid reference to FROM-clause entry for table ‘a’”.
### What this PR changes
1. **Retarget at `RowExpr` deparse (not in `get_variable`)**
* In `get_rule_expr()`’s `T_RowExpr` branch, each element `e` of
`ROW(...)` is examined.
* If `e` unwraps to a simple, same-level `Var` (`varlevelsup == 0`,
`varattno > 0`) and there is a **named `RTE_JOIN`** with
`joinaliasvars`, we **do not** change `varno/varattno`.
* Instead, we build a copy of the Var and set **`varnosyn/varattnosyn`**
to the matching join alias column (from `joinaliasvars`).
* Then we deparse that Var via `get_rule_expr_toplevel(...)`, which
naturally prints `join_alias.colname`.
* Scope is limited to **query deparsing** (`dpns->plan == NULL`),
exactly where PG18 expands whole-row vars into `ROW(...)` of base Vars.
2. **Helpers (PG18-only file)**
* `unwrap_simple_var(Node*)`: strips trivial wrappers (`RelabelType`,
`CoerceToDomain`, `CollateExpr`) to reveal a `Var`.
* `var_matches_base(const Var*, int varno, AttrNumber attno)`: matches
canonical or synonym identity.
* `dpns_has_named_join(const deparse_namespace*)`: fast precheck for any
named join with `joinaliasvars`.
* `map_var_through_join_alias(...)`: scans `joinaliasvars` to locate the
**JOIN RTE index + attno** for a 1:1 alias; the caller uses these to set
`varnosyn/varattnosyn`.
3. **Safety and non-goals**
* **No effect on plan deparsing** (`dpns->plan != NULL`).
* **No change to semantic identity**: we leave `varno/varattno`
untouched; only set `varnosyn/varattnosyn`.
* Skip whole-row/system columns (`attno <= 0`) and non-simple join
columns (computed expressions).
* Works with named joins **with or without** an explicit column list (we
rely on `joinaliasvars`, not the alias collist).
### Reproducer
```sql
CREATE TABLE distributed_table(user_id int, item_id int, buy_count int);
CREATE TABLE reference_table(id int, it_name varchar(25), k_no int);
SELECT create_distributed_table('distributed_table', 'user_id');
SELECT COUNT(unsupported_join.*)
FROM (distributed_table a
LEFT JOIN reference_table b ON true
RIGHT JOIN reference_table c ON true)
AS unsupported_join (x,y,z,t,e,f,q)
JOIN (reference_table d JOIN reference_table e ON true) ON true;
```
**Before (PG18):** deparser emitted `ROW(a.user_id, …)` → `ERROR:
invalid reference to FROM-clause entry for table "a"`
**After:** deparser emits
`ROW(unsupported_join.x, ..., unsupported_join.k_no)` → runs
successfully.
Now maps to `unsupported_join.<auto_col_names>` and runs.
With PG18's GROUP RTE, queries that should have been eligible for fast
path planning were skipped because the fast path planner allows exactly
one range table only. This fix extends that to account for a GROUP RTE.
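A minimal sketch of what such a relaxed check could look like. The enum, the function name, and the exact rule (one relation plus at most one GROUP entry) are assumptions made for illustration; the actual Citus fast path planner inspects real `RangeTblEntry` nodes and may apply further conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of range table entry kinds. */
typedef enum SketchRteKind
{
    SKETCH_RTE_RELATION,
    SKETCH_RTE_SUBQUERY,
    SKETCH_RTE_GROUP        /* stands in for the PG18 GROUP RTE */
} SketchRteKind;

/*
 * Returns true when the range table consists of exactly one relation,
 * optionally accompanied by a single GROUP entry.
 */
bool
IsFastPathRangeTable(const SketchRteKind *rangeTable, size_t rteCount)
{
    assert(rangeTable != NULL || rteCount == 0);

    size_t relationCount = 0;
    size_t groupCount = 0;

    for (size_t i = 0; i < rteCount; i++)
    {
        switch (rangeTable[i])
        {
            case SKETCH_RTE_RELATION:
                relationCount++;
                break;

            case SKETCH_RTE_GROUP:
                groupCount++;
                break;

            default:
                /* any other entry kind disqualifies the query */
                return false;
        }
    }

    return relationCount == 1 && groupCount <= 1;
}
```

Under these assumptions, a range table of `{RELATION}` or `{RELATION, GROUP}` qualifies, while `{RELATION, RELATION}` does not.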
Fixes #8275 by printing the names in order so that in every message
`DETAIL: x and y are not co-located` x precedes (or is lexicographically
less than) y.
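The ordering idea can be sketched as follows. `FormatNotColocatedDetail` is a hypothetical helper, and using `strcmp` (byte-wise comparison) is an assumption; the actual change may compare names differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes "<x> and <y> are not co-located" into buffer, with the two
 * names ordered so that x <= y under strcmp. Returns snprintf's result.
 */
int
FormatNotColocatedDetail(char *buffer, size_t bufferSize,
                         const char *leftName, const char *rightName)
{
    assert(buffer != NULL && leftName != NULL && rightName != NULL);

    const char *first = leftName;
    const char *second = rightName;

    if (strcmp(first, second) > 0)
    {
        /* swap so the smaller name is always printed first */
        first = rightName;
        second = leftName;
    }

    return snprintf(buffer, bufferSize, "%s and %s are not co-located",
                    first, second);
}
```

Whichever order the two names arrive in, the message text is the same, which keeps expected test output stable.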
PR [8142](https://github.com/citusdata/citus/pull/8142) added a
`PushActiveSnapshot` call.
This commit moves it outside the for loop, since taking a snapshot is a
resource-heavy operation.
The GUC declaration for SkipAdvisoryLockPermissionChecks misconfigured
the GUC_SUPERUSER_ONLY / PGC_SUSET settings. When PostgreSQL runs with
ASAN, querying pg_settings then fails because the lookup exceeds the
size of the GucContext_Names array. Fix up this GUC declaration so that
it no longer crashes with ASAN.
The failing queries all have a GROUP BY, and the fix teaches the Citus recursive planner how to handle a PG18 GROUP range table in the outer query:
- In recursive query planning, don't recurse into subquery expressions in a GROUP BY clause
- Flatten references to a GROUP rte before creating the worker subquery in pushdown planning
- If a PARAM node points to a GROUP rte then tunnel through to the underlying expression
Fixes #8296.
The error `Unrecognized range table id` seen in regress test
`insert_select_into_local_tables` is a consequence of the INSERT ..
SELECT planner getting confused by a SELECT query with a GROUP BY and
hence a Group RTE, introduced in PG18 (commit 247dea89f). The solution
is to flatten the relevant parts of the SELECT query before preparing
the INSERT .. SELECT query tree for use by Citus.
PG18 has removed heap_inplace_update(), which is crucial for
citus_columnar extension because we always want to update
stripe entries for columnar in-place.
Relevant PG18 commit:
https://github.com/postgres/postgres/commit/a07e03f
heap_inplace_update() has been replaced by
heap_inplace_update_and_unlock, which is used inside
systable_inplace_update_finish, which is used together with
systable_inplace_update_begin. This change has been back-patched
up to v12, which is enough for us since the oldest version
Citus supports is v15.
In PG<18, a deprecated heap_inplace_update() is retained,
however, let's start using the new functions because they are
better, and such that we don't need to wrap these changes in
PG18 version conditionals.
Basically, in this commit we replace the following:
SysScanDesc scanDescriptor = systable_beginscan(columnarStripes,
indexId, indexOk, &dirtySnapshot, 2, scanKey);
heap_inplace_update(columnarStripes, modifiedTuple);
with the following:
systable_inplace_update_begin(columnarStripes, indexId, indexOk,
NULL, 2, scanKey, &tuple, &state);
systable_inplace_update_finish(state, tuple);
For more understanding, it's best to refer to an example:
REL_18_0/src/backend/catalog/toasting.c#L349-L371
of how systable_inplace_update_begin and
systable_inplace_update_finish are used in PG18, because they
mirror the need of citus columnar.
Fixes #8207
This reverts commit 5d805eb10b.
heap_inplace_update was incorrectly replaced by
CatalogTupleUpdate in 5d805eb. In Citus, we assume a stripe
entry with some columns set to null means that a write
is in-progress, because otherwise we wouldn't see such a row.
But this breaks when we use CatalogTupleUpdate because it
inserts a new version of the row, which leaves the
in-progress version behind. Among other things, this was
causing various issues in PG18 - check-columnar test.
Qualify create domain stmt after local execution, to avoid such diffs in
PG vanilla tests:
```diff
create domain d_fail as anyelement;
-ERROR: "anyelement" is not a valid base type for a domain
+ERROR: "pg_catalog.anyelement" is not a valid base type for a domain
```
These tests were newly added in PG18, however this is not new PG18
behavior, just some added tests.
https://github.com/postgres/postgres/commit/0172b4c94
Fixes #8042
PG18 changed the visibility of various Explain Serialize functions and
structs to `extern`. Previously, for PG17 support, these were `static`,
so we had to copy paste their definitions from `explain.c` to Citus's
`multi_explain.c`.
Relevant PG18 commits:
https://github.com/postgres/postgres/commit/555960a0
https://github.com/postgres/postgres/commit/77cb08be
Now we don't need to define the following anymore in Citus, since they
are extern in PG18:
- typedef struct SerializeMetrics
- void ExplainIndentText(ExplainState *es);
- SerializeMetrics GetSerializationMetrics(DestReceiver *dest);
- typedef struct SerializeDestReceiver (this is not extern, however it
is only used by GetSerializationMetrics function)
This was incorrectly handled in
https://github.com/citusdata/citus/commit/9e42f3f2c
by wrapping these definitions and usages in PG17 only,
causing such diffs in PG18 (not able to see serialization at all):
```diff
citus/src/test/regress/expected/pg17.out
select public.explain_filter('explain (analyze,
serialize binary,buffers,timing) select * from int8_tbl i8');
...
Planning Time: N.N ms
- Serialization: time=N.N ms output=NkB format=binary
Execution Time: N.N ms
Planning Time: N.N ms
Serialization: time=N.N ms output=NkB format=binary
Execution Time: N.N ms
-(14 rows)
+(13 rows)
```
This PR solves the following diffs, originating from the addition of
`varreturningtype` field to the `Var` struct in PG18:
https://github.com/postgres/postgres/commit/80feb727c
Previously we didn't account for this new field (as it's new), so this
wouldn't allow the parser to correctly reconstruct the `Var` node
structure, but rather it would error out with `did not find '}' at end
of input node`:
```diff
SELECT column_to_column_name(logicalrelid, partkey)
FROM pg_dist_partition WHERE partkey IS NOT NULL ORDER BY 1 LIMIT 1;
- column_to_column_name
----------------------------------------------------------------------
- a
-(1 row)
-
+ERROR: did not find '}' at end of input node
```
Solution follows precedent https://github.com/citusdata/citus/pull/7107,
when varnullingrels field was added to the `Var` struct in PG16.
The solution includes:
- Taking care of the `partkey` in `pg_dist_partition` table because it's
coming from the `Var` struct. This mainly includes fixing the upgrade
script to PG18, by saving all the `partkey` infos before upgrading to
PG18 (in `citus_prepare_pg_upgrade`), and then re-generating `partkey`
columns in `pg_dist_partition` (using `UPDATE`) after upgrading to PG18
(in `citus_finish_pg_upgrade`).
- Adding a normalize rule to fix output differences among PG versions.
Note that we need two normalize lines: one for PG15 since it doesn't
have `varnullingrels`, and one for PG16/PG17.
- Small trick on `metadata_sync_helpers` to use different text when
generating the `partkey`, based on the PG version.
Fixes #8189
This crash has been there for a while but wasn't tested before PG18.
PG18 added this test:
CREATE STATISTICS tst ON a FROM (VALUES (x)) AS foo;
which tries to create statistics on a derived-on-the-fly table (which is
not allowed). However, Citus assumes we always have a valid table when
intercepting the CREATE STATISTICS command to check for Citus tables.
Added a check to return early if needed.
PG18 commit: https://github.com/postgres/postgres/commit/3eea4dc2c
Fixes #8212
DESCRIPTION: Fixes a bug that causes allowing UPDATE / MERGE queries
that may change the distribution column value.
Fixes: #8087.
Probably as of #769, we were not properly checking if UPDATE
may change the distribution column.
In #769, we had these checks:
```c
if (targetEntry->resno != column->varattno)
{
/* target entry of the form SET some_other_col = <x> */
isColumnValueChanged = false;
}
else if (IsA(setExpr, Var))
{
Var *newValue = (Var *) setExpr;
if (newValue->varattno == column->varattno)
{
/* target entry of the form SET col = table.col */
isColumnValueChanged = false;
}
}
```
However, what we check in "if" and in the "else if" are not so
different in the sense they both attempt to verify if SET expr
of the target entry points to the attno of given column. So, in
#5220, we even removed the first check because it was redundant.
Also see this PR comment from #5220:
https://github.com/citusdata/citus/pull/5220#discussion_r699230597.
In #769, probably we actually wanted to first check whether both
SET expr of the target entry and given variable are pointing to the
same range var entry, but this wasn't what the "if" was checking,
so removed.
As a result, in the cases that are mentioned in the linked issue,
we were incorrectly concluding that the SET expr of the target
entry won't change given column just because it's pointing to the
same attno as given variable, regardless of what range var entries
the column and the SET expr are pointing to. Then we also started
using the same function to check for such cases for update action
of MERGE, so we have the same bug there as well.
So with this PR, we properly check for such cases by comparing
varno as well in TargetEntryChangesValue(). However, then some of
the existing tests started failing where the SET expr doesn't
directly assign the column to itself but the "where" clause could
actually imply that the distribution column won't change. Even before
we were not attempting to verify if "where" clause quals could imply a
no-op assignment for the SET expr in such cases but that was not a
problem. This is because, for the most cases, we were always qualifying
such SET expressions as a no-op update as long as the SET expr's
attno is the same as given column's. For this reason, to prevent
regressions, this PR also adds some extra logic as well to understand
if the "where" clause quals could imply that SET expr for the
distribution key is a no-op.
Ideally, we should instead use "relation restriction equivalence"
mechanism to understand if the "where" clause implies a no-op
update. This is because, for instance, right now we're not able to
deduce that the update is a no-op when the "where" clause transitively
implies a no-op update, as in the case where we're setting "column a"
to "column c" and where clause looks like:
"column a = column b AND column b = column c".
If this means a regression for some users, we can consider doing it
that way. Until then, as a workaround, we can suggest adding additional
quals to "where" clause that would directly imply equivalence.
Also, after fixing TargetEntryChangesValue(), we started successfully
deducing that the update action is a no-op for such MERGE queries:
```sql
MERGE INTO dist_1
USING dist_1 src
ON (dist_1.a = src.b)
WHEN MATCHED THEN UPDATE SET a = src.b;
```
However, we then started seeing below error for above query even
though now the update is qualified as a no-op update:
```
ERROR: Unexpected column index of the source list
```
This was because of #8180, which #8201 fixed.
In summary, with this PR:
* We disallow such queries,
```sql
-- attno for dist_1.a, dist_1.b: 1, 2
-- attno for dist_different_order_1.a, dist_different_order_1.b: 2, 1
UPDATE dist_1 SET a = dist_different_order_1.b
FROM dist_different_order_1
WHERE dist_1.a = dist_different_order_1.a;
-- attno for dist_1.a, dist_1.b: 1, 2
-- but ON (..) doesn't imply a no-op update for SET expr
MERGE INTO dist_1
USING dist_1 src
ON (dist_1.a = src.b)
WHEN MATCHED THEN UPDATE SET a = src.a;
```
* .. and allow such queries,
```sql
MERGE INTO dist_1
USING dist_1 src
ON (dist_1.a = src.b)
WHEN MATCHED THEN UPDATE SET a = src.b;
```
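The core of the fix, comparing `varno` in addition to `varattno`, can be sketched in isolation. `SketchVar` and both functions are hypothetical simplifications of `Var` and `TargetEntryChangesValue()`; the extra "where"-clause implication logic described above is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Var: which range table entry, which attribute. */
typedef struct SketchVar
{
    int varno;
    int varattno;
} SketchVar;

/* Old behavior: only the attribute number is compared. */
bool
OldSetExprMayChangeColumn(const SketchVar *column, const SketchVar *setExpr)
{
    assert(column != NULL && setExpr != NULL);
    return setExpr->varattno != column->varattno;
}

/*
 * Fixed behavior: the assignment is a no-op only when the SET expr
 * points to the same range table entry AND the same attribute.
 */
bool
SetExprMayChangeColumn(const SketchVar *column, const SketchVar *setExpr)
{
    assert(column != NULL && setExpr != NULL);

    bool sameColumn = setExpr->varno == column->varno &&
                      setExpr->varattno == column->varattno;
    return !sameColumn;
}
```

With `dist_1.a` as `{varno 1, varattno 1}` and `dist_different_order_1.b` as `{varno 2, varattno 1}`, the old check treats `SET a = dist_different_order_1.b` as a no-op, while the fixed check reports that the value may change.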
The range table entry array created by the Postgres planner for each
SELECT in a query may have NULL entries as of PG18. Add a NULL check
to skip over these when looking for matches in rte identities.
Fix deparsing of UPDATE statements with indirection (#7675) involved
changing ruleutils of our supported Postgres versions. It means that
when integrating a new Postgres version we need to update its ruleutils
with the relevant parts of #7675; basically PG ruleutils needs to call
the `citus_ruleutils.c` functions added by #7675.
DESCRIPTION: Fixes a bug that causes an unexpected error when executing
repartitioned merge.
Fixes #8180.
This was happening because of a bug in
SourceResultPartitionColumnIndex(). And to fix it, this PR avoids
using DistributionColumnIndex() in SourceResultPartitionColumnIndex().
Instead, invents FindTargetListEntryWithVarExprAttno(), which finds
the index of the target entry in the source query's target list that
can be used to repartition the source for a repartitioned merge. In
short, to find the source target entry that references the Var used in
ON (..) clause and that references the source rte, we should check the
varattno of the underlying expr, which presumably is always a Var for
repartitioned merge as we always wrap the source rte with a subquery,
where all target entries point to the columns of the original source
relation.
Using DistributionColumnIndex() prior to 13.0 wasn't causing such an
issue because prior to 13.0, the varattno of the underlying expr of
the source target entries was almost (*1) always equal to resno of the
target entry as we were including all target entries of the source
relation. However, starting with #7659, which is merged to main before
13.0, we started using CreateFilteredTargetListForRelation() instead of
CreateAllTargetListForRelation() to compute the target entry list for
the source rte to fix another bug. So we cannot revert to using
CreateAllTargetListForRelation() because otherwise we would re-introduce
the bug that it helped fix, so we instead had to find a way to properly
deal with the "filtered target list"s, as in this commit. Plus (*1),
even before #7659, probably we would still fail when the source relation
has dropped attributes or such because that would probably also cause
such a mismatch between the varattno of the underlying expr of the
target entry and its resno.
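To illustrate why `resno` and `varattno` diverge with a filtered target list, here is a minimal sketch. `SketchTargetEntry` and `FindTargetEntryIndexWithVarAttno` are hypothetical; the signature of the real `FindTargetListEntryWithVarExprAttno()` is not shown in this description, and the 0-based return value is a choice made for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a target entry whose expression is a plain Var:
 * resno is the position in the target list, varattno is the attribute
 * number of the referenced column in the source relation.
 */
typedef struct SketchTargetEntry
{
    int resno;
    int varattno;
} SketchTargetEntry;

/*
 * Returns the 0-based index of the entry whose underlying Var has the
 * given varattno, or -1 when the filtered list does not contain it.
 */
int
FindTargetEntryIndexWithVarAttno(const SketchTargetEntry *targetList,
                                 int length, int varattno)
{
    assert(targetList != NULL || length == 0);

    for (int i = 0; i < length; i++)
    {
        if (targetList[i].varattno == varattno)
        {
            return i;
        }
    }
    return -1;
}
```

For a source relation with columns at attribute numbers 1, 2 and 3, a filtered list holding only columns 2 and 3 has entries `{resno 1, varattno 2}` and `{resno 2, varattno 3}`. Looking up attribute 3 finds the second entry, whereas assuming `resno == varattno` would point past the end of the list.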
The change in `merge_planner.c` fixes _unrecognized range table entry_
diffs in merge regress tests (category 2 diffs in #7992), the change in
`multi_router_planner.c` fixes _column reference ... is ambiguous_ diffs
in `multi_insert_select` and `multi_insert_select_window` (category 3
diffs in #7992). Edit to `common.py` enables standalone regress tests
with pg18 (e.g. `citus_tests/run_test.py merge`).
DESCRIPTION: Fix 'column does not exist' errors in grouping regress
tests.
Postgres 18's GROUP RTE was being ignored by query pushdown planning
when constructing the query tree for the worker subquery. The solution
is straightforward - ensure the worker subquery tree has the same
groupRTE property as the original query. Postgres ruleutils then does
the right thing when generating the pushed down query. Fixes category 1
in #7992.
Added detailed explanation of delayed fast path planning in Citus 13.2,
including conditions and processes involved.
---------
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Fixes #5808.
DESCRIPTION: Fixes an assertion failure in Citus maintenance daemon that
can happen in very slow systems.
Try running `make -C src/test/regress/ check-multi-1-vg`. On main, the
tests exit with code 2 at least 50% of the time, in the very early
stages of the test suite, by producing a core dump; that won't be
the case on this branch, at least based on my trials :)
DESCRIPTION: Fixes an undefined behavior that could happen when
computing tenant score for citus_stat_tenants
Add check for shift size, reset to zero in case of overflow
Fixes #7953.
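A sketch of the kind of guard described, assuming the score is decayed with a right shift by the number of elapsed periods. `DecayScore` and its parameters are hypothetical, and the actual `citus_stat_tenants` code may be structured differently. In C, shifting by an amount greater than or equal to the width of the type is undefined behavior, so the shift size is checked first.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Halves `score` once per elapsed period. When the shift size would
 * reach the bit width of the type, the result is reset to zero instead
 * of performing an undefined shift.
 */
uint64_t
DecayScore(uint64_t score, uint64_t elapsedPeriods)
{
    const uint64_t bitWidth = sizeof(score) * CHAR_BIT;

    if (elapsedPeriods >= bitWidth)
    {
        return 0;
    }

    uint64_t decayed = score >> elapsedPeriods;
    assert(decayed <= score);
    return decayed;
}
```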
---------
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Need to also check Postgres plan's rangetables for relations used in Initplans.
DESCRIPTION: Fix a bug in redundant WHERE clause detection; we need to
additionally check the Postgres plan's range tables for the presence of
citus tables, to account for relations that are referenced from scalar
subqueries.
There is a fundamental flaw in 4139370, the assumption that, after
Postgres planning has completed, all tables used in a query can be
obtained by walking the query tree. This is not the case for scalar
subqueries, which will be referenced by `PARAM` nodes. The fix adds an
additional check of the Postgres plan range tables; if there is at least
one citus table in there we do not need to change the needs distributed
planning flag.
Fixes #8159
DESCRIPTION: Checking first for the presence of subscript ops avoids a
shallow copy of the target list for target lists where there are no
array or json subscripts.
Commit 0c1b31c fixed a bug in UPDATE statements with array or json
subscripting in the target list. This commit modifies that to first
check that the target list has a subscript and avoid a shallow copy of
the target list for UPDATE statements with no array/json subscripting.
- Downgrade replication lag reporting from NOTICE to DEBUG to reduce
noise and improve regression test stability.
- Add hints to certain replication status messages for better clarity.
- Update expected output files accordingly.
In #7950, #8120, #8124, #8121 and #8114, TupleDescSize() was used to
check whether the tuple length is `Natts_<catalog_table_name>`. However
this was wrong because TupleDescSize() returns the size of the
tupledesc, not the length of it (i.e., number of attributes).
Actually `TupleDescSize(tupleDesc) == Natts_<catalog_table_name>` was
always returning false but this didn't cause any problems because using
`tupleDesc->natts - 1` when `tupleDesc->natts ==
Natts_<catalog_table_name>` too had the same effect as using
`Anum_<column_added_later> - 1` in that case.
So this also makes me think of always returning
`tupleDesc->natts - 1` (or `tupleDesc->natts - 2` if it's the second to
last attribute) but being more explicit seems more useful.
Even more, in the future we should probably switch to a different
implementation if / when we think of adding more columns to those
tables. We should probably scan non-dropped attributes of the relation,
enumerate them and return the attribute number of the one that we're
looking for, but seems this is not needed right now.
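The difference between the size of a tuple descriptor and its attribute count can be shown with a reduced model. `SketchTupleDesc`, `SKETCH_ATTR_BYTES` and `SKETCH_NATTS_CATALOG` are hypothetical; the sketch only assumes that a `TupleDescSize()`-style helper returns a size in bytes that grows with `natts`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NATTS_CATALOG 3      /* hypothetical Natts_<catalog_table_name> */
#define SKETCH_ATTR_BYTES 104       /* hypothetical per-attribute footprint */

/* Hypothetical stand-in for a tuple descriptor header. */
typedef struct SketchTupleDesc
{
    int natts;
} SketchTupleDesc;

/* Size in bytes of the descriptor, NOT the number of attributes. */
size_t
SketchTupleDescSize(const SketchTupleDesc *desc)
{
    assert(desc != NULL);
    return sizeof(SketchTupleDesc) + (size_t) desc->natts * SKETCH_ATTR_BYTES;
}

/* Buggy check: compares a byte size against an attribute count. */
bool
MatchesCatalogBuggy(const SketchTupleDesc *desc)
{
    return SketchTupleDescSize(desc) == SKETCH_NATTS_CATALOG;
}

/* Intended check: compares attribute counts. */
bool
MatchesCatalog(const SketchTupleDesc *desc)
{
    assert(desc != NULL);
    return desc->natts == SKETCH_NATTS_CATALOG;
}
```

For a descriptor with three attributes, the buggy check is false because the byte size is far larger than 3, while the intended check is true.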
Unlike what has been fixed in #7950, #8120, #8124, #8121 and #8114, this
was not an issue in older releases but is a potential issue to be
introduced by the current (13.2) release because in one of the recent
commits (#8122) two columns have been added to pg_dist_node. In other
words, none of the older releases since we started supporting downgrades
added new columns to pg_dist_node.
The mentioned PR actually attempted to avoid this kind of issue in one
of the code-paths but not in some others.
So this PR avoids memory corruption around pg_dist_node accessors in
a standardized way (as implemented in other example PRs) and in all
code-paths.