From da24ede835a308540344ed74f35b50d79ab8c7d4 Mon Sep 17 00:00:00 2001
From: Mehmet YILMAZ <mehmety87@gmail.com>
Date: Thu, 17 Jul 2025 13:15:31 +0300
Subject: [PATCH] =?UTF-8?q?Support=20PostgreSQL=2018=E2=80=99s=20new=20RTE?=
 =?UTF-8?q?=20kinds=20in=20Citus=20deparser=20(#8023)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fixes #8020

PostgreSQL 18 introduces two new, *pseudo* rangetable‐entry kinds that
Citus’ downstream deparser must recognize:

1. **Pulled-up shard RTE clones** (`CITUS_RTE_SHARD` with `relid ==
InvalidOid`)
2. **Grouping-step RTE** (`RTE_GROUP`, alias `*GROUP*`, not actually in
the FROM clause)

Without special handling, Citus crashes or emits invalid SQL when
running against PG 18beta1:

* **`ERROR: could not open relation with OID 0`**
Citus was unconditionally calling `relation_open(rte->relid,…)` on
entries whose `relid` is 0.
* **`ERROR: missing FROM-clause entry for table "*GROUP*"`**
Citus’ `set_rtable_names()` assigned the synthetic `*GROUP*` alias but
never printed a matching FROM item.

This PR teaches Citus’ `ruleutils_18.c` to skip catalog lookups for RTEs
without valid OIDs and to suppress the grouping-RTE alias, restoring
compatibility with both PG 17 and PG 18.

---

## Background

* **Upstream commit
[[247dea8](https://github.com/postgres/postgres/commit/247dea89f7616fdf06b7272b74abafc29e8e5860)](https://github.com/postgres/postgres/commit/247dea89f7616fdf06b7272b74abafc29e8e5860)**
Introduced `RTE_GROUP` for the grouping step so that multiple subqueries
in `GROUP BY`/`HAVING` can be deduplicated and planned correctly.
* **Citus PR
[[#6428](https://github.com/citusdata/citus/pull/6428)](https://github.com/citusdata/citus/pull/6428)**
Added initial support for treating shard RTEs like real
relations—calling `relation_open()` to pick up renamed-column fixes.
Worked fine on PG 11–17, but PG 18’s pull-up logic clones those shard
RTEs with `relid=0`, leading to OID 0 crashes.

---

## Changes

1. **Guard `relation_open()`**
In `set_relation_column_names()`, only call `relation_open(rte->relid,
…)` when

   ```c
   OidIsValid(rte->relid)
   ```

Prevents the “could not open relation with OID 0” crash on both
pulled-up shards and synthetic RTEs.

2. **Handle pulled-up shards** (`CITUS_RTE_SHARD` with `relid=0`)
Copy column names directly from `rte->eref->colnames` instead of hitting
the catalog.

3. **Handle grouping RTE** (`RTE_GROUP`)

* **In `set_relation_column_names()`**: fallback to
`rte->eref->colnames` for `RTE_GROUP`.
   * **In `set_rtable_names()`**: explicitly assign

     ```c
     refname = NULL;  /* never show *GROUP* in FROM */
     ```

     so that no `*GROUP*` alias is ever printed.

   **Why this is required:**
PostgreSQL 18’s parser now represents the grouping step with a synthetic
RTE whose alias is always `*GROUP*`—and that RTE is **never** actually
listed in the `FROM` clause. If Citus’ deparser assigns and emits
`*GROUP*` as a table reference, the pushed-down SQL becomes:

   ```sql
SELECT *GROUP*.mygroupcol … -- but there is no “*GROUP*” in the FROM
list
   ```

   Workers then fail:

   ```
   ERROR: missing FROM-clause entry for table "*GROUP*"
   ```

By setting `refname = NULL` for `RTE_GROUP` in `set_rtable_names()`, the
deparser prints just the column name unqualified, exactly matching
upstream PG 18’s behavior and yielding valid SQL on the workers.

4. **Maintain existing behavior on PG 15–17**

* Shard RTEs *with* valid `relid` still open the catalog to pick up
renamed-column fixes.
   * No impact on other RTE kinds or versions prior to PG 18.

---
---
 .../distributed/deparser/ruleutils_18.c       | 32 +++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/backend/distributed/deparser/ruleutils_18.c b/src/backend/distributed/deparser/ruleutils_18.c
index 31d78b122..7d4b99741 100644
--- a/src/backend/distributed/deparser/ruleutils_18.c
+++ b/src/backend/distributed/deparser/ruleutils_18.c
@@ -795,6 +795,11 @@ set_rtable_names(deparse_namespace *dpns, List *parent_namespaces,
 			/* Unnamed join has no refname */
 			refname = NULL;
 		}
+		else if (rte->rtekind == RTE_GROUP)
+		{
+			/* Use the name of the group */
+			refname = NULL;
+		}
 		else
 		{
 			/* Otherwise use whatever the parser assigned */
@@ -1191,8 +1196,9 @@ set_relation_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
 	 * real_colnames[] will be indexed by physical column number, with NULL
 	 * entries for dropped columns.
 	 */
-	if (rte->rtekind == RTE_RELATION ||
-        GetRangeTblKind(rte) == CITUS_RTE_SHARD)
+	if ((rte->rtekind == RTE_RELATION ||
+		GetRangeTblKind(rte) == CITUS_RTE_SHARD) &&
+		OidIsValid(rte->relid))
 	{
 		/* Relation --- look to the system catalogs for up-to-date info */
 		Relation	rel;
@@ -1215,6 +1221,28 @@ set_relation_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
 		}
 		relation_close(rel, AccessShareLock);
 	}
+	else if (GetRangeTblKind(rte) == CITUS_RTE_SHARD)
+	{
+		/* shard RTE without relid (pulled-up clone in PG18)            */
+		/* use the column aliases already stored in rte->eref->colnames */
+
+		ncolumns      = list_length(rte->eref->colnames);
+		real_colnames = (char **) palloc0(ncolumns * sizeof(char *));
+
+		for (i = 0; i < ncolumns; i++)
+			real_colnames[i] = pstrdup(strVal(list_nth(rte->eref->colnames, i)));
+
+		/* keep changed_any / has_anonymous defaults */
+	}
+    else if (rte->rtekind == RTE_GROUP)
+    {
+        /* ----- synthetic PG 18 RTE for GROUP BY / HAVING ----- */
+        ncolumns      = list_length(rte->eref->colnames);
+        real_colnames = (char **) palloc0(ncolumns * sizeof(char *));
+
+        for (i = 0; i < ncolumns; i++)
+            real_colnames[i] = pstrdup(strVal(list_nth(rte->eref->colnames, i)));
+    }
 	else
 	{
 		/* Otherwise get the column names from eref or expandRTE() */