From 5005be31e6770e38fb3d700489533aaca67bca1a Mon Sep 17 00:00:00 2001 From: Mehmet YILMAZ Date: Wed, 16 Jul 2025 23:23:14 +0300 Subject: [PATCH] =?UTF-8?q?PG18=20-=20Handle=20PG18=E2=80=99s=20synthetic?= =?UTF-8?q?=20`RTE=5FGROUP`=20in=20`FindReferencedTableColumn`=20for=20cor?= =?UTF-8?q?rect=20GROUP=20BY=20pushdown=20(#8034)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fixes #8032 PostgreSQL 18 introduces a dedicated “grouping-step” range table entry (`RTE_GROUP`) whose target columns are exactly the expressions in our `GROUP BY` clause, rather than hiding them as `resjunk` items. In Citus’s distributed planner, the function `FindReferencedTableColumn` must be able to map from a `Var` referencing a grouped column back to the underlying table column. Without special handling for `RTE_GROUP`, queries that rely on pushdown of `GROUP BY` expressions can fail or mis-identify their target columns. This PR adds support for `RTE_GROUP` in Citus when built against PG 18 or later, ensuring that: * Each grouped expression is correctly resolved. * The pushdown planner can trace a `Var`’s `varattno` into the corresponding `groupexprs` list. * Existing behavior on PG < 18 is unchanged. --- ## What’s Changed In **`src/backend/distributed/planner/multi_logical_optimizer.c`**, inside `FindReferencedTableColumn`: * **Under** `#if PG_VERSION_NUM >= PG_VERSION_18` Introduce an `else if` branch for ```c rangeTableEntry->rtekind == RTE_GROUP ``` * **Extraction of grouped expressions:** ```c List *groupexprs = rangeTableEntry->groupexprs; AttrNumber groupIndex = candidateColumn->varattno - 1; ``` * **Safety check** to guard against malformed `Var` numbers: ```c if (groupIndex < 0 || groupIndex >= list_length(groupexprs)) return; /* malformed Var */ ``` * **Recursive descent:** Fetch the corresponding expression from `groupexprs` and call ```c FindReferencedTableColumn(groupExpr, parentQueryList, query, column, rteContainingReferencedColumn, skipOuterVars); ``` so that the normal resolution logic applies to the underlying expression. * **Unchanged code path** for PG < 18 and for other `rtekind` values. --- --- .../planner/multi_logical_optimizer.c | 23 +++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/src/backend/distributed/planner/multi_logical_optimizer.c b/src/backend/distributed/planner/multi_logical_optimizer.c index c4c11f4eb..7deced084 100644 --- a/src/backend/distributed/planner/multi_logical_optimizer.c +++ b/src/backend/distributed/planner/multi_logical_optimizer.c @@ -4557,6 +4557,29 @@ FindReferencedTableColumn(Expr *columnExpression, List *parentQueryList, Query * FindReferencedTableColumn(joinColumn, parentQueryList, query, column, rteContainingReferencedColumn, skipOuterVars); } +#if PG_VERSION_NUM >= PG_VERSION_18 + else if (rangeTableEntry->rtekind == RTE_GROUP) + { + /* + * PG 18: synthetic GROUP RTE. Each groupexprs item corresponds to the + * columns produced by the grouping step, in the *same ordinal order* as + * the Vars that reference them. + */ + List *groupexprs = rangeTableEntry->groupexprs; + AttrNumber groupIndex = candidateColumn->varattno - 1; + + /* this must always hold unless upstream Postgres mis-constructed the RTE_GROUP */ + Assert(groupIndex >= 0 && groupIndex < list_length(groupexprs)); + + Expr *groupExpr = (Expr *) list_nth(groupexprs, groupIndex); + + /* Recurse on the underlying expression (stay in the same query) */ + FindReferencedTableColumn(groupExpr, parentQueryList, query, + column, rteContainingReferencedColumn, + skipOuterVars); + } +#endif /* PG_VERSION_NUM >= 180000 */ + else if (rangeTableEntry->rtekind == RTE_CTE) { /*