This PR makes Citus build successfully against PG18Beta1. The RuleUtils PR was
reviewed separately: #8010
## PG 18Beta1–related changes for building Citus
### TupleDesc / Attr layout
**What changed in PG:** Postgres consolidated the
`TupleDescData.attrs[]` array into a more compact representation. Direct
field access (`tupdesc->attrs[i]`) was replaced by the new
`TupleDescAttr()` API.
**Citus adaptation:** Everywhere we previously used
`tupdesc->attrs[...]`, we now call `TupleDescAttr(tupdesc, idx)` (or our
own `Attr()` macro) under a compatibility guard.
* 5983a4cffc
General Logic:
* Use `Attr(...)` in places where `columnar_version_compat.h` is
included. This avoids the need to sprinkle `#if PG_VERSION_NUM` guards
around each attribute access.
* Use `TupleDescAttr(tupdesc, i)` when the relevant PostgreSQL header is
already included and the additional macro indirection is unnecessary.
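As a rough illustration of the accessor-macro pattern described above, here is a minimal sketch. `MockTupleDesc`, `MockAttr`, `MockTupleDescAttr()` and `MOCK_PG_VERSION_NUM` are hypothetical stand-ins so the snippet compiles on its own; they are not the PostgreSQL or Citus definitions.

```c
/* Hypothetical stand-in for PG_VERSION_NUM; flip it to exercise the other branch. */
#define MOCK_PG_VERSION_NUM 180000

/* Hypothetical stand-ins for Form_pg_attribute and TupleDesc. */
typedef struct MockAttr
{
    int attlen;
} MockAttr;

typedef struct MockTupleDesc
{
    int natts;
    MockAttr attrs[4];
} MockTupleDesc;

/* Accessor function playing the role of TupleDescAttr(). */
static MockAttr *
MockTupleDescAttr(MockTupleDesc *tupdesc, int i)
{
    return &tupdesc->attrs[i];
}

/* One macro for callers, so the version check lives in a single place. */
#if MOCK_PG_VERSION_NUM >= 180000
#define Attr(tupdesc, i) MockTupleDescAttr((tupdesc), (i))
#else
#define Attr(tupdesc, i) (&(tupdesc)->attrs[(i)])
#endif

/* Example caller: sums attribute lengths without a version guard of its own. */
static int
TotalAttrLength(MockTupleDesc *tupdesc)
{
    int total = 0;

    for (int i = 0; i < tupdesc->natts; i++)
    {
        total += Attr(tupdesc, i)->attlen;
    }
    return total;
}
```

Callers only ever see `Attr(...)`, which is the point of the indirection: the `#if` appears once instead of at every attribute access.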
### Collation‐aware `LIKE`
**What changed in PG:** The `textlike` operator now requires an explicit
collation, to avoid ambiguous‐collation errors. Core code switched from
`DirectFunctionCall2(textlike, ...)` to
`DirectFunctionCall2Coll(textlike, DEFAULT_COLLATION_OID, ...)`.
**Citus adaptation:** In `remote_commands.c` and any other LIKE call, we
now use `DirectFunctionCall2Coll(textlike, DEFAULT_COLLATION_OID, ...)`
and `#include <utils/pg_collation.h>`.
* 85b7efa1cd
### Columnar storage API
* Adapt `columnar_relation_set_new_filelocator` (and related init
routines) for PG 18’s revised SMGR and storage-initialization hooks.
* Pull in the new headers (`explain_format.h`,
`columnar_version_compat.h`) so the columnar module compiles cleanly
against PG 18.
* `heap_modify_tuple` + `heap_inplace_update` only exist on PG < 18; on PG 18
the in-place helper was removed upstream
* a07e03fd8f
### OpenSSL / TLS integration
**What changed in PG:** Moved from the legacy `SSL_library_init()` to
`OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL)`, updated certificate
API calls (`X509_getm_notBefore`, `X509_getm_notAfter`), and
standardized on `TLS_method()`.
**Citus adaptation:** We now `#include <openssl/opensslv.h>` and use
`#if OPENSSL_VERSION_NUMBER >= 0x10100000L` to choose between
`OPENSSL_init_ssl()` and `SSL_library_init()`, and wrap
`X509_gmtime_adj()` calls around the new accessor functions.
* 6c66b7443c
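The version guard above is a plain integer comparison on `OPENSSL_VERSION_NUMBER`. The sketch below expresses the same comparison as a runtime function purely so it can be exercised in isolation; in the real code the preprocessor makes this decision. `NeedsLegacySslInit()` and `MOCK_OPENSSL_1_1_0` are hypothetical names, not part of OpenSSL or Citus.

```c
#include <stdbool.h>

/* Threshold used by the guard: OpenSSL 1.1.0, the first release with OPENSSL_init_ssl(). */
#define MOCK_OPENSSL_1_1_0 0x10100000UL

/*
 * Returns true when the legacy SSL_library_init() branch would be selected
 * for the given OPENSSL_VERSION_NUMBER value.
 */
static bool
NeedsLegacySslInit(unsigned long opensslVersionNumber)
{
    return opensslVersionNumber < MOCK_OPENSSL_1_1_0;
}
```

For example, `0x1000200fL` (the 1.0.2 release) falls on the legacy side, while `0x10100000L` and `0x30000000L` (3.0.0) take the `OPENSSL_init_ssl()` path.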
### Adapt `ExtractColumns()` to the new PG-18 `expandRTE()` signature
PostgreSQL 18 (commit 80feb727c8) inserted a new fourth argument of type
`VarReturningType` into `expandRTE()`, so calls that use the old
7-parameter form no longer compile. This patch:
* Wraps the `expandRTE(...)` call in a `#if PG_VERSION_NUM >= 180000`
guard.
* On PG 18+ passes the new `VAR_RETURNING_DEFAULT` argument before
`location`.
* On PG 15–17 continues to call the original 7-arg form.
* Adds the necessary includes (`parser/parse_relation.h` for `expandRTE`
and `VarReturningType`, and `pg_version_constants.h` for
`PG_VERSION_NUM`).
### Adapt `ExecutorStart`/`ExecutorRun` hooks to PG-18’s new signatures
PostgreSQL 18 (commit 525392d572) changed the signatures of the executor hooks:
* `ExecutorStart_hook` now returns `bool` instead of `void`, and
* `ExecutorRun_hook` drops its old `run_once` argument.
This patch preserves Citus’s existing hook logic by:
1. **Adding two adapter functions** under `#if PG_VERSION_NUM >=
PG_VERSION_18`:
* `citus_executor_start_adapter(QueryDesc *queryDesc, int eflags)`
Calls the old `CitusExecutorStart(queryDesc, eflags)` and then returns
`true` to satisfy the new hook’s `bool` return type.
* `citus_executor_run_adapter(QueryDesc *queryDesc, ScanDirection
direction, uint64 count)`
Calls the old `CitusExecutorRun(queryDesc, direction, count, true)`
(passing `true` for the dropped `run_once` argument), and returns
`void`.
2. **Installing the adapters** in `_PG_init()` instead of the original
hooks when building against PG 18+:
```c
#if PG_VERSION_NUM >= PG_VERSION_18
ExecutorStart_hook = citus_executor_start_adapter;
ExecutorRun_hook = citus_executor_run_adapter;
#else
ExecutorStart_hook = CitusExecutorStart;
ExecutorRun_hook = CitusExecutorRun;
#endif
```
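For readers who want to see the adapter idea in isolation, here is a minimal sketch with mocked types. `MockQueryDesc`, the `Legacy*` functions, and the `Mock*Hook` typedefs are hypothetical stand-ins for `QueryDesc`, the existing Citus hook bodies, and the PG 18 hook types; the real adapters live in the Citus source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for QueryDesc; it only records what happened. */
typedef struct MockQueryDesc
{
    bool started;
    int runCount;
    bool lastRunOnce;
} MockQueryDesc;

/* Old-style hook bodies: void return, and run still takes run_once. */
static void
LegacyExecutorStart(MockQueryDesc *queryDesc, int eflags)
{
    (void) eflags;
    queryDesc->started = true;
}

static void
LegacyExecutorRun(MockQueryDesc *queryDesc, int direction, uint64_t count,
                  bool runOnce)
{
    (void) direction;
    (void) count;
    queryDesc->runCount++;
    queryDesc->lastRunOnce = runOnce;
}

/* New-style hook types: start returns bool, run has no run_once. */
typedef bool (*MockExecutorStartHook)(MockQueryDesc *queryDesc, int eflags);
typedef void (*MockExecutorRunHook)(MockQueryDesc *queryDesc, int direction,
                                    uint64_t count);

/* Adapter: call the old body, then report success. */
static bool
StartAdapter(MockQueryDesc *queryDesc, int eflags)
{
    LegacyExecutorStart(queryDesc, eflags);
    return true;
}

/* Adapter: supply true for the dropped run_once argument. */
static void
RunAdapter(MockQueryDesc *queryDesc, int direction, uint64_t count)
{
    LegacyExecutorRun(queryDesc, direction, count, true);
}

/* Installing the adapters, as _PG_init() would do with the real hooks. */
static MockExecutorStartHook mockStartHook = StartAdapter;
static MockExecutorRunHook mockRunHook = RunAdapter;
```

The existing hook logic stays untouched; only the thin wrappers know about the new signatures.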
### Adapt to PG-18’s removal of the “run\_once” flag from ExecutorRun/PortalRun
PostgreSQL commit
[3eea7a0](3eea7a0c97)
rationalized the executor’s parallelism logic by moving the “execute a
plan only once” check into `ExecutePlan()` itself and dropping the old
`bool run_once` argument from the public APIs:
```diff
- void ExecutorRun(QueryDesc *queryDesc,
- ScanDirection direction,
- uint64 count,
- bool run_once);
+ void ExecutorRun(QueryDesc *queryDesc,
+ ScanDirection direction,
+ uint64 count);
```
(and similarly for `PortalRun()`).
To stay compatible across PG 15–18, Citus now:
1. **Updates all internal calls** to `ExecutorRun(...)` and
`PortalRun(...)`:
* On PG 18+, use the new three-argument form (`ExecutorRun(qd, dir,
count)`).
* On PG 15–17, keep the old four-arg form (`ExecutorRun(qd, dir, count,
true)`) under a `#if PG_VERSION_NUM < 180000` guard.
2. **Guards the dispatcher hooks** via the adapter functions (from the
earlier patch) so that Citus’s executor hooks continue to work under
both the old and new signatures.
### Adapt to PG-18’s shortened PortalRun signature
PostgreSQL 18’s refactoring (see commit
[3eea7a0](3eea7a0c97))
also removed the old `run_once` argument from the public `PortalRun()`
API; the `altdest` argument is unchanged. The signature changed from:
```diff
- bool PortalRun(Portal portal,
- long count,
- bool isTopLevel,
- bool run_once,
- DestReceiver *dest,
- DestReceiver *altdest,
- QueryCompletion *qc);
+ bool PortalRun(Portal portal,
+ long count,
+ bool isTopLevel,
+ DestReceiver *dest,
+ DestReceiver *altdest,
+ QueryCompletion *qc);
```
To support both versions in Citus, we:
1. **Version-guard each call** to `PortalRun()`:
* **On PG 18+** invoke the new 6-argument form.
* **On PG 15–17** fall back to the legacy 7-argument form, passing
`true` for `run_once`.
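A call-site guard of this kind might look roughly like the sketch below. `MockPortal`, `MockPortalRun()`, `RunPortalOnce()` and `MOCK_PG_VERSION_NUM` are hypothetical and trimmed down (the receiver and completion arguments are omitted) so the snippet compiles without PostgreSQL headers.

```c
#include <stdbool.h>

/* Hypothetical stand-in for PG_VERSION_NUM. */
#define MOCK_PG_VERSION_NUM 180000

/* Hypothetical stand-in for Portal; it only records what was requested. */
typedef struct MockPortal
{
    long rowsRequested;
    bool runOnceSeen;
} MockPortal;

#if MOCK_PG_VERSION_NUM >= 180000
/* Mock of the shortened signature: no run_once parameter. */
static bool
MockPortalRun(MockPortal *portal, long count, bool isTopLevel)
{
    (void) isTopLevel;
    portal->rowsRequested += count;
    return true;
}
#else
/* Mock of the legacy signature: run_once follows isTopLevel. */
static bool
MockPortalRun(MockPortal *portal, long count, bool isTopLevel, bool runOnce)
{
    (void) isTopLevel;
    portal->rowsRequested += count;
    portal->runOnceSeen = runOnce;
    return true;
}
#endif

/* Wrapping the call once keeps the version guard out of the callers. */
static bool
RunPortalOnce(MockPortal *portal, long count)
{
#if MOCK_PG_VERSION_NUM >= 180000
    return MockPortalRun(portal, count, true);
#else
    return MockPortalRun(portal, count, true, true);
#endif
}
```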
### Add support for PG-18’s new `plansource` argument in `PortalDefineQuery`
PostgreSQL 18 extended the `PortalDefineQuery` API to carry a
`CachedPlanSource *plansource` pointer so that the portal machinery can
track cached‐plan invalidation (as introduced alongside deferred-locking
in commit 525392d572).
To remain compatible across PG 15–18, Citus now wraps its calls under a
version guard:
```diff
- PortalDefineQuery(portal, NULL, sql, commandTag, plantree_list, NULL);
+#if PG_VERSION_NUM >= 180000
+ /* PG 18+: seven-arg signature (adds plansource) */
+ PortalDefineQuery(
+ portal,
+ NULL, /* no prepared-stmt name */
+ sql, /* the query text */
+ commandTag, /* the CommandTag */
+ plantree_list, /* List of PlannedStmt* */
+ NULL, /* no CachedPlan */
+ NULL /* no CachedPlanSource */
+ );
+#else
+ /* PG 15–17: six-arg signature */
+ PortalDefineQuery(
+ portal,
+ NULL, /* no prepared-stmt name */
+ sql, /* the query text */
+ commandTag, /* the CommandTag */
+ plantree_list, /* List of PlannedStmt* */
+ NULL /* no CachedPlan */
+ );
+#endif
```
### Adapt ExecInitRangeTable() calls to PG-18’s new signature
PostgreSQL commit
[cbc127917e04a978a788b8bc9d35a70244396d5b](cbc127917e)
changed the executor API for range‐table initialization: on **PG 18+**,
`ExecInitRangeTable()` takes a fourth `Bitmapset *unpruned_relids`
argument to support deferred partition pruning.
In Citus’s `create_estate_for_relation()` (in `columnar_metadata.c`), we
now wrap the call in a compile‐time guard so that the code compiles
correctly on all supported PostgreSQL versions:
```c
/* Prepare permission info on PG 16+ */
#if PG_VERSION_NUM >= PG_VERSION_16
List *perminfos = NIL;
addRTEPermissionInfo(&perminfos, rte);
#else
List *perminfos = NIL; /* unused on PG 15 */
#endif
/* Initialize the range table, with the right signature for each PG version */
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG 18+: four‐arg signature (adds unpruned_relids) */
ExecInitRangeTable(
estate,
list_make1(rte),
perminfos,
NULL /* unpruned_relids: not used by columnar */
);
#elif PG_VERSION_NUM >= PG_VERSION_16
/* PG 16–17: three‐arg signature (permInfos) */
ExecInitRangeTable(
estate,
list_make1(rte),
perminfos
);
#else
/* PG 15: two‐arg signature */
ExecInitRangeTable(
estate,
list_make1(rte)
);
#endif
estate->es_output_cid = GetCurrentCommandId(true);
```
### Adapt `pgstat_report_vacuum()` to PG-18’s new timestamp argument
PostgreSQL commit
[30a6ed0ce4bb18212ec38cdb537ea4b43bc99b83](30a6ed0ce4)
extended the `pgstat_report_vacuum()` API by adding a `TimestampTz
start_time` parameter at the end so that the VACUUM statistics collector
can record when the operation began:
```diff
/* PG ≤17: four-arg signature */
- void pgstat_report_vacuum(Oid tableoid,
- bool shared,
- double num_live_tuples,
- double num_dead_tuples);
+/* PG ≥18: five-arg signature adds a start_time */
+ void pgstat_report_vacuum(Oid tableoid,
+ bool shared,
+ double num_live_tuples,
+ double num_dead_tuples,
+ TimestampTz start_time);
```
To support both versions, we now wrap the call in `columnar_tableam.c`
with a version guard, supplying `GetCurrentTimestamp()` for PG-18+:
```c
#if PG_VERSION_NUM >= 180000
/* PG 18+: include start_timestamp */
pgstat_report_vacuum(
RelationGetRelid(rel),
rel->rd_rel->relisshared,
Max(new_live_tuples, 0), /* live tuples */
0, /* dead tuples */
GetCurrentTimestamp() /* start time */
);
#else
/* PG 15–17: original signature */
pgstat_report_vacuum(
RelationGetRelid(rel),
rel->rd_rel->relisshared,
Max(new_live_tuples, 0), /* live tuples */
0 /* dead tuples */
);
#endif
```
### Adapt `ExecuteTaskPlan()` to PG-18’s expanded `CreateQueryDesc()` signature
PostgreSQL 18 changed `CreateQueryDesc()` from an eight-argument to a
nine-argument call by inserting a `CachedPlan *cplan` parameter
immediately after the `PlannedStmt *plannedstmt` argument (see commit
525392d572).
To remain compatible with PG 15–17, Citus now wraps its invocation in
`local_executor.c` with a version guard:
```diff
- /* PG15–17: eight-arg CreateQueryDesc without cached plan */
- QueryDesc *queryDesc = CreateQueryDesc(
- taskPlan, /* PlannedStmt *plannedstmt */
- queryString, /* const char *sourceText */
- GetActiveSnapshot(),/* Snapshot snapshot */
- InvalidSnapshot, /* Snapshot crosscheck_snapshot */
- destReceiver, /* DestReceiver *dest */
- paramListInfo, /* ParamListInfo params */
- queryEnv, /* QueryEnvironment *queryEnv */
- 0 /* int instrument_options */
- );
+#if PG_VERSION_NUM >= 180000
+ /* PG18+: nine-arg CreateQueryDesc with a CachedPlan slot */
+ QueryDesc *queryDesc = CreateQueryDesc(
+ taskPlan, /* PlannedStmt *plannedstmt */
+ NULL, /* CachedPlan *cplan (none) */
+ queryString, /* const char *sourceText */
+ GetActiveSnapshot(),/* Snapshot snapshot */
+ InvalidSnapshot, /* Snapshot crosscheck_snapshot */
+ destReceiver, /* DestReceiver *dest */
+ paramListInfo, /* ParamListInfo params */
+ queryEnv, /* QueryEnvironment *queryEnv */
+ 0 /* int instrument_options */
+ );
+#else
+ /* PG15–17: eight-arg CreateQueryDesc without cached plan */
+ QueryDesc *queryDesc = CreateQueryDesc(
+ taskPlan, /* PlannedStmt *plannedstmt */
+ queryString, /* const char *sourceText */
+ GetActiveSnapshot(),/* Snapshot snapshot */
+ InvalidSnapshot, /* Snapshot crosscheck_snapshot */
+ destReceiver, /* DestReceiver *dest */
+ paramListInfo, /* ParamListInfo params */
+ queryEnv, /* QueryEnvironment *queryEnv */
+ 0 /* int instrument_options */
+ );
+#endif
```
### Adapt `RelationGetPrimaryKeyIndex()` to PG-18’s new “deferrable\_ok” flag
PostgreSQL commit
14e87ffa5c
added a new Boolean `deferrable_ok` parameter to
`RelationGetPrimaryKeyIndex()` so that callers can state whether a
primary key declared `DEFERRABLE` is acceptable. The API changed from:
```c
RelationGetPrimaryKeyIndex(Relation relation)
```
to:
```c
RelationGetPrimaryKeyIndex(Relation relation, bool deferrable_ok)
```
```diff
diff --git a/src/backend/distributed/metadata/node_metadata.c b/src/backend/distributed/metadata/node_metadata.c
index e3a1b2c..f4d5e6f 100644
--- a/src/backend/distributed/metadata/node_metadata.c
+++ b/src/backend/distributed/metadata/node_metadata.c
@@ -2965,8 +2965,18 @@
*/
- Relation replicaIndex =
- index_open(RelationGetPrimaryKeyIndex(pgDistNode),
- AccessShareLock);
+ #if PG_VERSION_NUM >= PG_VERSION_18
+ /* PG 18+ adds a bool "deferrable_ok" parameter */
+ Relation replicaIndex =
+ index_open(
+ RelationGetPrimaryKeyIndex(pgDistNode, false),
+ AccessShareLock);
+ #else
+ Relation replicaIndex =
+ index_open(
+ RelationGetPrimaryKeyIndex(pgDistNode),
+ AccessShareLock);
+ #endif
ScanKeyInit(&scanKey[0], Anum_pg_dist_node_nodename,
BTEqualStrategyNumber, F_TEXTEQ, CStringGetTextDatum(nodeName));
```
```diff
diff --git a/src/backend/distributed/operations/node_protocol.c b/src/backend/distributed/operations/node_protocol.c
index e3a1b2c..f4d5e6f 100644
--- a/src/backend/distributed/operations/node_protocol.c
+++ b/src/backend/distributed/operations/node_protocol.c
@@ -746,7 +746,12 @@
if (!OidIsValid(idxoid))
{
- idxoid = RelationGetPrimaryKeyIndex(rel);
+ /* Determine the index OID of the primary key (PG18 adds a second parameter) */
+#if PG_VERSION_NUM >= PG_VERSION_18
+ idxoid = RelationGetPrimaryKeyIndex(rel, false);
+#else
+ idxoid = RelationGetPrimaryKeyIndex(rel);
+#endif
}
return idxoid;
```
Before PG 18, the one-argument call never returned the index of a
`DEFERRABLE` primary key, so we pass `false` to keep that behavior.
Passing `true` would also accept deferrable primary keys, which we
don’t want.
### Adapt `ExplainOnePlan()` to PG-18’s expanded API
PostgreSQL 18 (commit 525392d572) extended the `ExplainOnePlan()`
function to carry the `CachedPlan *` and `CachedPlanSource *` pointers
plus an explicit `query_index`, letting the EXPLAIN machinery track
plan‐source invalidation. The old signature:
```c
/* PG 17; PG 15–16 lack the trailing mem_counters argument */
void
ExplainOnePlan(PlannedStmt *plannedstmt,
IntoClause *into,
struct ExplainState *es,
const char *queryString,
ParamListInfo params,
QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
const MemoryContextCounters *mem_counters);
```
became, in PG 18:
```c
/* PG ≥18 */
void
ExplainOnePlan(PlannedStmt *plannedstmt,
CachedPlan *cplan,
CachedPlanSource *plansource,
int query_index,
IntoClause *into,
struct ExplainState *es,
const char *queryString,
ParamListInfo params,
QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
const MemoryContextCounters *mem_counters);
```
To compile under both versions, Citus now wraps each call in
`multi_explain.c` with:
```c
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG 18+: pass NULL for the new cached‐plan fields and zero for query_index */
ExplainOnePlan(
plan, /* PlannedStmt *plannedstmt */
NULL, /* CachedPlan *cplan */
NULL, /* CachedPlanSource *plansource */
0, /* query_index */
into, /* IntoClause *into */
es, /* ExplainState *es */
queryString, /* const char *queryString */
params, /* ParamListInfo params */
NULL, /* QueryEnvironment *queryEnv */
&planduration,/* const instr_time *planduration */
(es->buffers ? &bufusage : NULL),
(es->memory ? &mem_counters : NULL)
);
#elif PG_VERSION_NUM >= PG_VERSION_17
/* PG 17: same as before, plus passing mem_counters if enabled */
ExplainOnePlan(
plan,
into,
es,
queryString,
params,
queryEnv,
&planduration,
(es->buffers ? &bufusage : NULL),
(es->memory ? &mem_counters : NULL)
);
#else
/* PG 15–16: original eight-arg form */
ExplainOnePlan(
plan,
into,
es,
queryString,
params,
queryEnv,
&planduration,
(es->buffers ? &bufusage : NULL)
);
#endif
```
### Adapt to the unified “index interpretation” API in PG 18 (commit a8025f544854)
PostgreSQL commit
a8025f5448
generalized the old btree‐specific operator‐interpretation API into a
single “index interpretation” interface:
* **Renamed type**:
`OpBtreeInterpretation` → `OpIndexInterpretation`
* **Renamed function**:
`get_op_btree_interpretation(opno)` →
`get_op_index_interpretation(opno)`
* **Unified field**:
Each interpretation now carries `cmptype` instead of `strategy`.
To build cleanly on PG 18 while still supporting PG 15–17, Citus’s
shard‐pruning code now wraps these changes:
```c
#include "pg_version_constants.h"
#if PG_VERSION_NUM >= PG_VERSION_18
/* On PG 18+ the btree‐only APIs vanished; alias them to the new generic versions */
typedef OpIndexInterpretation OpBtreeInterpretation;
#define get_op_btree_interpretation(opno) get_op_index_interpretation(opno)
#define ROWCOMPARE_NE COMPARE_NE
#endif
/* … later, when checking an interpretation … */
OpBtreeInterpretation *interp =
(OpBtreeInterpretation *) lfirst(cell);
#if PG_VERSION_NUM >= PG_VERSION_18
/* use cmptype on PG 18+ */
if (interp->cmptype == ROWCOMPARE_NE)
#else
/* use strategy on PG 15–17 */
if (interp->strategy == ROWCOMPARE_NE)
#endif
{
/* … */
}
```
### Adapt `create_foreignscan_path()` for PG-18’s revised signature
PostgreSQL commit
e222534679
added an `int disabled_nodes` parameter to the FDW‐path builder, directly
after `rows`; the remaining parameters keep their order:
* **Pre-PG 18 signature (11 args, as of PG 17)**
```c
create_foreignscan_path(PlannerInfo *root,
RelOptInfo *rel,
PathTarget *target,
double rows,
Cost startup_cost,
Cost total_cost,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
List *fdw_restrictinfo,
List *fdw_private);
```
* **PG 18+ signature (12 args)**
```c
create_foreignscan_path(PlannerInfo *root,
RelOptInfo *rel,
PathTarget *target,
double rows,
int disabled_nodes,
Cost startup_cost,
Cost total_cost,
List *pathkeys,
Relids required_outer,
Path *fdw_outerpath,
List *fdw_restrictinfo,
List *fdw_private);
```
To support both, Citus now defines a compatibility macro in
`pg_version_compat.h`:
```c
#include "nodes/bitmapset.h" /* for Relids */
#include "nodes/pg_list.h" /* for List */
#include "optimizer/pathnode.h" /* for create_foreignscan_path() */
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG18+: insert disabled_nodes after rows, pass everything else through */
#define create_foreignscan_path_compat(a, b, c, d, e, f, g, h, i, j, k) \
create_foreignscan_path( \
(a), /* root */ \
(b), /* rel */ \
(c), /* target */ \
(d), /* rows */ \
(0), /* disabled_nodes (unused by Citus) */ \
(e), /* startup_cost */ \
(f), /* total_cost */ \
(g), /* pathkeys */ \
(h), /* required_outer */ \
(i), /* fdw_outerpath */ \
(j), /* fdw_restrictinfo */ \
(k) /* fdw_private */ \
)
#else
/* PG15–17: original signature */
#define create_foreignscan_path_compat(a, b, c, d, e, f, g, h, i, j, k) \
create_foreignscan_path( \
(a), (b), (c), (d), \
(e), (f), \
(g), (h), (i), (j), (k) \
)
#endif
```
Now every call to `create_foreignscan_path_compat(...)`—even in tests
like `fake_fdw.c`—automatically picks the correct argument list for
PG 15 through PG 18.
### Drop the obsolete bitmap‐scan hooks on PG 18+
PostgreSQL commit
c3953226a0
cleaned up the `TableAmRoutine` API by removing the two bitmap‐scan
callback slots:
* `scan_bitmap_next_block`
* `scan_bitmap_next_tuple`
Since those hook‐slots no longer exist in PG 18, Citus now wraps their
NULL‐initialization in a `#if PG_VERSION_NUM < PG_VERSION_18` guard. On
PG 15–17 we still explicitly set them to `NULL` (to satisfy the old
struct layout), and on PG 18+ we omit them entirely:
```c
#if PG_VERSION_NUM < PG_VERSION_18
/* PG 15–17 only: these fields were removed upstream in PG 18 */
.scan_bitmap_next_block = NULL,
.scan_bitmap_next_tuple = NULL,
#endif
```
### Adapt `vac_update_relstats()` invocation to PG-18’s new “all\_frozen” argument
PostgreSQL commit
99f8f3fbbc
extended the `vac_update_relstats()` API by inserting a
`num_all_frozen_pages` parameter between the existing
`num_all_visible_pages` and `hasindex` arguments:
```diff
- /* PG ≤17: */
- void
- vac_update_relstats(Relation relation,
- BlockNumber num_pages,
- double num_tuples,
- BlockNumber num_all_visible_pages,
- bool hasindex,
- TransactionId frozenxid,
- MultiXactId minmulti,
- bool *frozenxid_updated,
- bool *minmulti_updated,
- bool in_outer_xact);
+ /* PG ≥18: adds num_all_frozen_pages */
+ void
+ vac_update_relstats(Relation relation,
+ BlockNumber num_pages,
+ double num_tuples,
+ BlockNumber num_all_visible_pages,
+ BlockNumber num_all_frozen_pages,
+ bool hasindex,
+ TransactionId frozenxid,
+ MultiXactId minmulti,
+ bool *frozenxid_updated,
+ bool *minmulti_updated,
+ bool in_outer_xact);
```
To compile cleanly on both PG 15–17 and PG 18+, Citus wraps its call in
a version guard and supplies a zero placeholder for the new field:
```c
#if PG_VERSION_NUM >= 180000
/* PG 18+: supply explicit “all_frozen” count */
vac_update_relstats(
rel,
new_rel_pages,
new_live_tuples,
new_rel_allvisible, /* allvisible */
0, /* all_frozen */
nindexes > 0,
newRelFrozenXid,
newRelminMxid,
&frozenxid_updated,
&minmulti_updated,
false /* in_outer_xact */
);
#else
/* PG 15–17: original signature */
vac_update_relstats(
rel,
new_rel_pages,
new_live_tuples,
new_rel_allvisible,
nindexes > 0,
newRelFrozenXid,
newRelminMxid,
&frozenxid_updated,
&minmulti_updated,
false /* in_outer_xact */
);
#endif
```
**Why all_frozen = 0?**
Columnar storage never embeds transaction IDs in its pages, so it never
needs to track “all‐frozen” pages the way a heap does. Setting both
allvisible and allfrozen to zero simply tells Postgres “there are no
pages with the visibility or frozen‐status bits set,” matching our
existing behavior.
This change ensures Citus’s VACUUM‐statistic updates work unmodified
across all supported Postgres versions.
DESCRIPTION: Drops PG14 support
1. Remove "$version_num" != 'xx' from configure file
2. delete all PG_VERSION_NUM = PG_VERSION_XX references in the code
3. Look at pg_version_compat.h file, remove all _compat functions etc
defined specifically for PGXX differences
4. delete all PG_VERSION_NUM >= PG_VERSION_(XX+1), PG_VERSION_NUM <
PG_VERSION_(XX+1) ifs in the codebase
5. delete ruleutils_xx.c file
6. cleanup normalize.sed file from pg14 specific lines
7. delete all alternative output files for that particular PG version,
server_version_ge variable helps here
DESCRIPTION: Propagates MEMORY and SERIALIZE options of EXPLAIN
The options for `MEMORY` can be true or false. Default is false.
The options for `SERIALIZE` can be none, text or binary. Default is
none.
I referred to how we added support for WAL option in this PR [Support
EXPLAIN(ANALYZE, WAL)](https://github.com/citusdata/citus/pull/4196).
For the tests, however, I did not follow the WAL PR. I used exactly the
same tests as Postgres does and simply distributed the table beforehand.
See below the relevant Postgres commits, which also show the tests that
were added:
- [Add EXPLAIN
(MEMORY)](https://github.com/postgres/postgres/commit/5de890e36)
- [Invent SERIALIZE option for
EXPLAIN.](https://github.com/postgres/postgres/commit/06286709e)
This PR required a lot of copying of Postgres static functions regarding
how `EXPLAIN` works for `MEMORY` and `SERIALIZE` options. Specifically,
these copy-pastes were required for updating `ExplainWorkerPlan()`
function, which is in fact based on postgres' `ExplainOnePlan()`:
```C
/* copied from explain.c to update ExplainWorkerPlan() in citus according to ExplainOnePlan() in postgres */
#define BYTES_TO_KILOBYTES(b)
typedef struct SerializeMetrics
static bool peek_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_memory_counters(ExplainState *es, const MemoryContextCounters *mem_counters);
static void ExplainIndentText(ExplainState *es);
static void ExplainPrintSerialize(ExplainState *es, SerializeMetrics *metrics);
static SerializeMetrics GetSerializationMetrics(DestReceiver *dest);
```
_Note_: it looks like we were missing some `buffers` option details as
well. I put them together with the memory option, like the code in
Postgres explain.c, as I didn't want to change the copied code. However,
I tested locally and found no significant problem in previous Citus
versions, and the existing Citus tests with `buffers true` didn't change
either. Therefore, I prefer not to backport the `buffers` changes to
previous versions.
This PR provides successful compilation against PG17.0.
- Remove ExecFreeExprContext call
Relevant PG commit:
d060e921ea5aa47b6265174c32e1128cebdbc3df
- PG17 uses streaming IO in analyze, fix scan_analyze_next_block function
Relevant PG commit:
041b96802efa33d2bc9456f2ad946976b92b5ae1
- Define ObjectClass for PG17+ only since it's removed
Relevant PG commit:
89e5ef7e21812916c9cf9fcf56e45f0f74034656
- Remove ReorderBufferTupleBuf structure.
Relevant PG commit:
08e6344fd6423210b339e92c069bb979ba4e7cd6
- Define colliculocale and daticulocale since they have been renamed
Relevant PG commit:
f696c0cd5f299f1b51e214efc55a22a782cc175d
- makeStringConst defined in PG17
Relevant PG commit:
de3600452b61d1bc3967e9e37e86db8956c8f577
- RangeVarCallbackOwnsTable was replaced by RangeVarCallbackMaintainsTable
Relevant PG commit:
ecb0fd33720fab91df1207e85704f382f55e1eb7
- attstattarget is nullable, define pg compatible functions for it
Relevant PG commit:
4f622503d6de975ac87448aea5cea7de4bc140d5
- stxstattarget is nullable in PG17, write compat functions for it
Relevant PG commit:
012460ee93c304fbc7220e5b55d9d0577fc766ab
- Use ResourceOwner to track WaitEventSet in PG17
Relevant PG commit:
50c67c2019ab9ade8aa8768bfe604cd802fe8591
- getIdentitySequence now uses Relation instead of relation_id
Relevant PG commit:
509199587df73f06eda898ae13284292f4ae573a
- Remove no-op tuplestore_donestoring function
Relevant PG commit:
75680c3d805e2323cd437ac567f0677fdfc7b680
- MergeAction can have 3 merge kinds (now enum) in PG17, write compat
Relevant PG commit:
0294df2f1f842dfb0eed79007b21016f486a3c6c
- EXPLAIN (MEMORY) is added, make changes to ExplainOnePlan
Relevant PG commit:
5de890e3610d5a12cdaea36413d967cf5c544e20
- LIMIT_OPTION_DEFAULT has been removed as it's useless, use LIMIT_OPTION_COUNT
Relevant PG commit:
a6be0600ac3b71dda8277ab0fcbe59ee101ac1ce
- Write compat for create_foreignscan_path because of more arguments in PG17
Relevant PG commit:
9e9931d2bf40e2fea447d779c2e133c2c1256ef3
- pgprocno and lxid have been combined into a struct in PGPROC
Relevant PG commits:
28f3915b73f75bd1b50ba070f56b34241fe53fd1
ab355e3a88de745607f6dd4c21f0119b5c68f2ad
024c521117579a6d356050ad3d78fdc95e44eefa
- Simplify CitusNewNode (#7434)
Postgres refactored `newNode()` in PG 17; the main reason is that the
original tricks are no longer necessary for modern compilers[1].
This does the same for Citus.
This should have no backward compatibility issues since it just replaces
`palloc0fast` with `palloc0`.
This is good for forward compatibility since `palloc0fast` no longer
exists in PG 17.
[1]
https://www.postgresql.org/message-id/b51f1fa7-7e6a-4ecc-936d-90a8a1659e7c@iki.fi
(cherry picked from commit 4b295cc)
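To make the shape of the simplified allocator concrete, here is a minimal sketch using `calloc()` in place of `palloc0()`. `MockNode`, `MockJob`, `MockNodeTag` and `MockNewNode()` are hypothetical names chosen for illustration; the actual `CitusNewNode` code is not reproduced here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node tags and node header, mimicking the Node layout. */
typedef enum MockNodeTag
{
    T_MockInvalid = 0,
    T_MockJob = 1
} MockNodeTag;

typedef struct MockNode
{
    MockNodeTag type;
} MockNode;

/* A node type embeds the header as its first member. */
typedef struct MockJob
{
    MockNode node;
    int jobId;
} MockJob;

/*
 * Allocate a zero-filled node and stamp its tag. A plain zeroing
 * allocation is enough; no special-cased fast path is needed.
 */
static MockNode *
MockNewNode(size_t size, MockNodeTag tag)
{
    MockNode *result = calloc(1, size);

    if (result == NULL)
    {
        abort();
    }
    result->type = tag;
    return result;
}
```

A caller would write `MockJob *job = (MockJob *) MockNewNode(sizeof(MockJob), T_MockJob);` and can rely on every other field starting out as zero.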
This is prep work for successful compilation with PG17.
PG17 added the foreach_ptr, foreach_int and foreach_oid macros.
Relevant PG commit
14dd0f27d7cd56ffae9ecdbe324965073d01a9ff
We already have these macros, but they differ from the
PG17 ones because our macros take a DECLARED variable, whereas
the PG17 macros declare a locally-scoped loop variable themselves.
Hence I am renaming our macros to foreach_declared_
I am separating this into its own PR since it touches many files. The
main compilation PR is https://github.com/citusdata/citus/pull/7699
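The difference between the two macro styles can be sketched as follows. `MockList` and both `mock_foreach_*` macros are hypothetical simplifications written for this note; they are not the PostgreSQL or Citus macro definitions.

```c
#include <stddef.h>

/* Hypothetical minimal list; PostgreSQL's List is richer than this. */
typedef struct MockList
{
    int length;
    void **elements;
} MockList;

/* Declared-variable style: the caller owns `var`, the macro only assigns it. */
#define mock_foreach_declared_ptr(var, list) \
    for (int var ## _index = 0; \
         var ## _index < (list)->length && \
         (((var) = (list)->elements[var ## _index]), 1); \
         var ## _index++)

/* Scoped style: the macro declares a `type *var` that lives only for the loop. */
#define mock_foreach_ptr(type, var, list) \
    for (int var ## _index = 0, var ## _guard = 1; \
         var ## _guard; \
         var ## _guard = 0) \
        for (type *var = NULL; \
             var ## _index < (list)->length && \
             ((var = (type *) (list)->elements[var ## _index]), 1); \
             var ## _index++)

static int
SumDeclared(MockList *list)
{
    int *value = NULL; /* must be declared before the loop */
    int total = 0;

    mock_foreach_declared_ptr(value, list)
    {
        total += *value;
    }
    return total;
}

static int
SumScoped(MockList *list)
{
    int total = 0;

    mock_foreach_ptr(int, value, list) /* `value` is declared by the macro */
    {
        total += *value;
    }
    return total;
}
```

Because the two styles take different arguments under what would otherwise be the same name, keeping both requires distinct names, which is what the rename provides.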
We thought we provided support for this in
b8c493f2c4
However, the use of parameters in SQL is not supported in Citus. Since
generic plan queries use parameters, we can't support them for now.
Relevant PG16 commit https://github.com/postgres/postgres/commit/3c05284
Fixes #7813 with proper error message
This change adds a script to programmatically group all includes in a
specific order. The script was used as a one time invocation to group
and sort all includes throughout our formatted code. The grouping is as
follows:
- System includes (eg. `#include<...>`)
- Postgres.h (eg. `#include "postgres.h"`)
- Toplevel imports from postgres, not contained in a directory (eg.
`#include "miscadmin.h"`)
- General postgres includes (eg. `#include "nodes/..."`)
- Toplevel citus includes, not contained in a directory (eg. `#include
"citus_version.h"`)
- Columnar includes (eg. `#include "columnar/..."`)
- Distributed includes (eg. `#include "distributed/..."`)
Because it is quite hard to understand the difference between toplevel
citus includes and toplevel postgres includes it hardcodes the list of
toplevel citus includes. In the same manner it assumes anything not
prefixed with `columnar/` or `distributed/` as a postgres include.
The sorting/grouping is enforced by CI. Since we do so with our own
script, no changes are required in our uncrustify configuration.
Similar to https://github.com/citusdata/citus/pull/7077.
As PG 16+ has changed the join restriction information for certain outer
joins, MERGE is also impacted, given that it is also built on an outer
join underneath.
See #7077 for the details.
This commit is the second and last phase of dropping PG13 support.
It consists of the following:
- Removes all PG_VERSION_13 & PG_VERSION_14 from codepaths
- Removes pg_version_compat entries and columnar_version_compat entries
specific for PG13
- Removes alternative pg13 test outputs
- Removes PG13 normalize lines and fix the test outputs based on that
It is a continuation of 5bf163a27d
1) For distributed tables that are not colocated.
2) When joining on a non-distribution column for colocated tables.
3) When merging into a distributed table using reference or citus-local tables as the data source.
This is accomplished primarily through the implementation of the following two strategies.
Repartition: Plan the source query independently,
execute the results into intermediate files, and repartition the files to
co-locate them with the merge-target table. Subsequently, compile a final
merge query on the target table using the intermediate results as the data
source.
Pull-to-coordinator: Execute the plan that requires evaluation at the coordinator,
run the query on the coordinator, and redistribute the resulting rows to ensure
colocation with the target shards. Direct the MERGE SQL operation to the worker
nodes' target shards, using the intermediate files colocated with the data as the
data source.
When auto_explain module is loaded and configured, EXPLAIN will be
implicitly run for all the supported commands. Postgres does not support
`EXPLAIN` for `ALTER` command. However, auto_explain will try to
`EXPLAIN` other supported commands internally triggered by `ALTER`.
For instance,
`ALTER TABLE target_table ADD CONSTRAINT fkey_167 FOREIGN KEY (col_1)
REFERENCES ref_table(key) ... `
command may trigger a SELECT command in the following form for foreign
key validation purpose:
`SELECT fk.col_1 FROM ONLY target_table fk LEFT OUTER JOIN ONLY
ref_table pk ON ( pk.key OPERATOR(pg_catalog.=) fk.col_1) WHERE pk.key
IS NULL AND (fk.col_1 IS NOT NULL) `
For Citus tables, the Citus utility hook should ensure that constraint
validation is skipped for shell tables but they are done for shard
tables. The reason behind this design choice can be summed up as:
- An ALTER TABLE command via coordinator node is run in a distributed
transaction.
- Citus does not support nested distributed transactions.
- A SELECT query on a distributed table (aka shell table) is also run in
a distributed transaction.
- Therefore, Citus does not support running a SELECT query on a shell
table while an ALTER TABLE command is running.
With
eadc88a800
a bug was introduced that breaks the skip constraint validation behaviour
of Citus. With this change, we see that validation queries on distributed
tables are triggered within `ALTER` commands that add constraints with a
validation check. This regression did not cause an issue for regular use
cases since the citus executor hook blocks those queries heuristically
when there is an ALTER TABLE command in progress.
The issue surfaces as a crash when auto_explain is enabled (#6424:
"Workers, when configured to use auto_explain, crash during distributed
transactions"). This is because auto_explain tries to execute the SELECT
queries in a nested distributed transaction.
Now that the regression with constraint validation is fixed in
https://github.com/citusdata/citus/issues/6543, we should be able to
remove the workaround.
Fixes a bug that causes a crash when using the auto_explain extension
with ALTER TABLE...ADD FOREIGN KEY... queries.
Those queries trigger a SELECT query on the citus tables as part of the
foreign key constraint validation check. At the explain hook, workers
try to explain this SELECT query as a distributed query causing memory
corruption in the connection data structures. Hence, we will not explain
ALTER TABLE...ADD FOREIGN KEY... and the triggered queries on the
workers.
Fixes #6424.
This crash happens with recursively planned queries. For such queries,
subplans are explained via PostgreSQL's ExplainOnePlan function.
This function reconstructs the query description from the plan, so it
expects the ActiveSnapshot for the query to be available. This fix makes
sure that the snapshot is on the stack before calling ExplainOnePlan.
Fixes #2920.
Use the RecurseObjectDependencies API to find whether an object depends on Citus
Make vanilla tests runnable to see whether the citus_depended function works correctly
* Remove if conditions with PG_VERSION_NUM < 13
* Remove server_above_twelve(&eleven) checks from tests
* Fix tests
* Remove pg12 and pg11 alternative test output files
* Remove pg12 specific normalization rules
* Some more if conditions in the code
* Change RemoteCollationIdExpression and some pg12/pg13 comments
* Remove some more normalization rules
The error occurs because the jsonb datum in pg_dist_node_metadata.metadata is 0 in some scenarios. This is likely caused by not copying the data when fetching a datum from a tuple, so that Postgres deallocates that memory when the table the tuple came from is closed.
Also fix another place in the code that might have been susceptible to this issue.
I tested on both multi-vg and multi-1-vg and the tests were successful.
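The failure mode described above can be illustrated without Postgres. The following is a minimal standalone sketch, not the actual fix: `SketchRelation`, `OpenSketchRelation`, `CloseSketchRelation` and `ReadMetadataCopy` are hypothetical names, and plain `malloc`/`free` stand in for tuple memory that goes away when the table is closed. In Postgres itself, a helper such as `datumCopy` would presumably play the role of the explicit copy.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an open catalog table that owns its tuple memory. */
typedef struct SketchRelation
{
	char *tupleData;
} SketchRelation;

/* Allocate a relation whose tuple memory holds a private copy of contents. */
SketchRelation *
OpenSketchRelation(const char *contents)
{
	SketchRelation *relation = malloc(sizeof(SketchRelation));
	if (relation == NULL)
	{
		return NULL;
	}

	relation->tupleData = malloc(strlen(contents) + 1);
	if (relation->tupleData == NULL)
	{
		free(relation);
		return NULL;
	}

	strcpy(relation->tupleData, contents);
	return relation;
}

/* Closing the relation releases the tuple memory, as in the bug report. */
void
CloseSketchRelation(SketchRelation *relation)
{
	free(relation->tupleData);
	free(relation);
}

/*
 * Copy the value out of the tuple BEFORE closing the relation. Returning
 * relation->tupleData directly would hand the caller a dangling pointer.
 */
char *
ReadMetadataCopy(const char *contents)
{
	SketchRelation *relation = OpenSketchRelation(contents);
	if (relation == NULL)
	{
		return NULL;
	}

	size_t length = strlen(relation->tupleData) + 1;
	char *copy = malloc(length);
	if (copy != NULL)
	{
		memcpy(copy, relation->tupleData, length);
	}

	CloseSketchRelation(relation);
	return copy;
}
```

The caller owns the returned buffer and is expected to `free` it.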
We've had custom versions of Postgres's `foreach` macro with a
hidden ListCell for quite some time now. People like these custom
macros because they are easier to use and require less boilerplate.
This adds similar custom versions of Postgres's `forboth` macro. Now
you don't need ListCells anymore when looping over two lists at the same
time.
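To show the idea of a hidden cell, here is a minimal standalone sketch. It does not use the real Postgres `List`/`ListCell` types or the macro names added by this change: `SketchCell`, `forboth_ptr_sketch` and `DotProduct` are hypothetical, and the list is a plain singly linked list.

```c
#include <stddef.h>

/* Hypothetical minimal list cell; the real Postgres ListCell differs. */
typedef struct SketchCell
{
	void *value;
	struct SketchCell *next;
} SketchCell;

/*
 * Walk two lists in lockstep. Both cells are declared inside the for
 * statement, so the caller only declares the element variables.
 * Iteration stops at the end of the shorter list.
 */
#define forboth_ptr_sketch(var1, list1, var2, list2) \
	for (SketchCell *var1 ## Cell = (list1), *var2 ## Cell = (list2); \
		 var1 ## Cell != NULL && var2 ## Cell != NULL && \
		 (((var1) = var1 ## Cell->value), ((var2) = var2 ## Cell->value), 1); \
		 var1 ## Cell = var1 ## Cell->next, var2 ## Cell = var2 ## Cell->next)

/* Example use: sum of pairwise products of two lists of int pointers. */
int
DotProduct(SketchCell *leftList, SketchCell *rightList)
{
	int *left = NULL;
	int *right = NULL;
	int total = 0;

	forboth_ptr_sketch(left, leftList, right, rightList)
	{
		total += (*left) * (*right);
	}

	return total;
}
```

The loop body only sees `left` and `right`; the two cells exist solely inside the macro expansion.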
If a worker node is being added, a command is sent to get the server_id of the worker from the pg_dist_node_metadata table. If the worker's id is the same as the node executing the code, we will know the node is trying to add itself. If the node tries to add itself without specifying `groupid:=0` the operation will result in an error.
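The check described above boils down to a comparison of two identifiers plus a group id test. The sketch below is standalone and only illustrative: `NodeAdditionAllowedSketch` is a hypothetical name, the server ids are assumed to be comparable as C strings, and fetching the remote id from the worker is left out.

```c
#include <stdbool.h>
#include <string.h>

/* Group id that the text above requires when a node adds itself. */
#define COORDINATOR_GROUP_ID_SKETCH 0

/*
 * Return false when the node being added reports the same server id as
 * the local node and the caller did not ask for group 0. In every other
 * case the addition may proceed.
 */
bool
NodeAdditionAllowedSketch(const char *localServerId,
						  const char *remoteServerId,
						  int requestedGroupId)
{
	bool addingSelf = strcmp(localServerId, remoteServerId) == 0;

	if (addingSelf && requestedGroupId != COORDINATOR_GROUP_ID_SKETCH)
	{
		return false;
	}

	return true;
}
```

A caller would raise the error when this function returns false.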
PostgreSQL has not needed this function to be called since the 7.4
release; it is a no-op.
For more details, see the PostgreSQL commit below:
commit dd04e958c8b03c0f0512497651678c7816af3198
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun Mar 9 03:34:10 2003 +0000
tuplestore_donestoring() isn't needed anymore, but provide a no-op
macro definition so as not to create compatibility problems.
diff --git a/src/include/utils/tuplestore.h b/src/include/utils/tuplestore.h
index b46babacd1..76fe9fb428 100644
--- a/src/include/utils/tuplestore.h
+++ b/src/include/utils/tuplestore.h
@@ -17,7 +17,7 @@
* Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
- * $Id: tuplestore.h,v 1.8 2003/03/09 02:19:13 tgl Exp $
+ * $Id: tuplestore.h,v 1.9 2003/03/09 03:34:10 tgl Exp $
*
*-------------------------------------------------------------------------
*/
@@ -41,6 +41,9 @@ extern Tuplestorestate *tuplestore_begin_heap(bool randomAccess,
extern void tuplestore_puttuple(Tuplestorestate *state, void *tuple);
+/* tuplestore_donestoring() used to be required, but is no longer used */
+#define tuplestore_donestoring(state) ((void) 0)
+
/* backwards scan is only allowed if randomAccess was specified 'true' */
extern void *tuplestore_gettuple(Tuplestorestate *state, bool forward,
bool *should_free);
When queryId is not 0 and verbose is true, the query identifier is
emitted in the explain output, which breaks the Postgres outputs.
We disable the query identifier calculation in the tests.
Commit on PG that introduced the query identifier in the explain output:
4f0b0966c866ae9f0e15d7cc73ccf7ce4e1af84b
Postgres no longer accepts NULL for queryStrings in explain plans.
Internally, there are some places in Postgres where NULLs were changed
to "" (the empty string), so we do the same on the Citus side.
Commit on Postgres:
1111b2668d89bfcb6f502789158b1233ab4217a6
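The substitution is a one-line guard. A minimal standalone sketch, with `QueryStringOrEmptySketch` as a hypothetical helper name rather than anything taken from the Citus or Postgres sources:

```c
#include <stddef.h>

/*
 * Map a missing query string to the empty string, so that callers which
 * no longer accept NULL always receive a valid C string.
 */
const char *
QueryStringOrEmptySketch(const char *queryString)
{
	return queryString != NULL ? queryString : "";
}
```

Non-NULL strings are returned unchanged, so the helper can be applied at every call site without affecting existing behaviour.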
The SetTuplestoreDestReceiverParams function now has two new parameters.
A new macro lets us pass these parameters on PG14 and omits them on previous versions.
Existing parameters are set to NULL to keep the previous behavior.
Relevant PG commit:
2f48ede080f42b97b594fb14102c82ca1001b80c
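The version-dependent macro pattern can be sketched in isolation as follows. Everything here is a stand-in: `SetReceiverParamsSketch` and `SetReceiverParamsCompat` are hypothetical names, the parameter lists are simplified, and `PG_VERSION_NUM` is given a default only so the snippet compiles outside a Postgres build. Passing NULL for the two extra arguments is likewise an assumption made for the example.

```c
#include <stddef.h>

/* Assumption: default to PG14 when not building against Postgres headers. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 140000
#endif

#define PG_VERSION_14_SKETCH 140000

#if PG_VERSION_NUM >= PG_VERSION_14_SKETCH

/* Newer signature: two extra parameters. Returns how many it received. */
int
SetReceiverParamsSketch(void *store, int detoast,
						void *targetTupleDesc, const char *mapFailureMessage)
{
	(void) store;
	(void) detoast;
	(void) targetTupleDesc;
	(void) mapFailureMessage;
	return 4;
}

/* On PG14 and later, forward every argument. */
#define SetReceiverParamsCompat(store, detoast, tupleDesc, message) \
	SetReceiverParamsSketch(store, detoast, tupleDesc, message)

#else

/* Older signature without the two extra parameters. */
int
SetReceiverParamsSketch(void *store, int detoast)
{
	(void) store;
	(void) detoast;
	return 2;
}

/* On older versions, silently drop the two extra arguments. */
#define SetReceiverParamsCompat(store, detoast, tupleDesc, message) \
	SetReceiverParamsSketch(store, detoast)

#endif

/* Call sites are written once and compile against either signature. */
int
CallReceiverSetupSketch(void)
{
	return SetReceiverParamsCompat(NULL, 0, NULL, NULL);
}
```

The benefit of this shape is that call sites need no `#if` guards of their own; only the macro definition varies by version.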
We do not include the dummy column if the original task didn't return
any columns.
Otherwise, the number of columns that the original task returned wouldn't
match the number of columns returned by worker_save_query_explain_analyze.
It seems that we forgot to pass the relevant
flag to enable Postgres' parallel query
capabilities on the shards when a user runs
EXPLAIN ANALYZE on a distributed table.
Add sort method parameter for regression tests
Fix check-style
Change sorting method parameters to enum
Polish
Add task fields to OutTask
Add test into multi_explain
Fix isolation test
It seems that we currently process even Postgres tables in explain
commands. This is because we register a hook for explain and have no
check to see whether the query involves any Citus table.
With this commit, we now send the buffer usage as well to the relevant
API. There is some duplication in the code, but that is due to the
existing structure; we can refactor this separately.
The error message when an index has opclassopts is improved, and the
commit from the Postgres side is also included for future reference.
Some minor style-related changes are also applied.
This commit mostly adds pg_get_triggerdef_command to our ruleutils_13.
This doesn't add anything extra for ruleutils 13 so it is basically a copy
of the change on ruleutils_12
Since ExplainOnePlan expects BufferUsage as well with PG >= 13,
ExplainOnePlanCompat is added.
Commit on Postgres side:
ed7a5095716ee498ecc406e1b8d5ab92c7662d10
As the new planner and pg_plan_query_compat methods expect the query
string as well, macros are defined to be compatible in different
versions of postgres.
Relevant commit on Postgres:
6aba63ef3e606db71beb596210dd95fa73c44ce2
Command on Postgres:
git log --all --grep="pg_plan_query"
With this patch, we introduce `locally_reserved_shared_connections.c/h` files
which are responsible for reserving some space in shared memory counters
upfront.
We sometimes need to reserve connections, but not necessarily
establish them. For example:
- COPY command should reserve connections as it cannot know which
connections it needs in which order. COPY establishes connections
as any input data hits the workers. For example, for router COPY
command, it only establishes 1 connection.
As discussed here (https://github.com/citusdata/citus/pull/3849#pullrequestreview-431792473),
COPY needs to reserve connections up-front, otherwise we can end
up with resource starvation/un-detected deadlocks.
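The reserve-before-establish idea can be sketched with a single counter. This is a simplified, single-process illustration: `SharedConnectionCounterSketch`, `ReserveConnectionsSketch` and `ReleaseConnectionsSketch` are hypothetical names, and the locking a real shared-memory counter would need is omitted.

```c
#include <stdbool.h>

/* Hypothetical counter of connection slots towards the workers. */
typedef struct SharedConnectionCounterSketch
{
	int maxConnections;
	int reservedOrInUse;
} SharedConnectionCounterSketch;

/*
 * Reserve count slots up-front, all or nothing. A failed reservation
 * leaves the counter untouched, so the caller never holds a partial
 * set of slots while waiting for the rest.
 */
bool
ReserveConnectionsSketch(SharedConnectionCounterSketch *counter, int count)
{
	if (count < 0 ||
		count > counter->maxConnections - counter->reservedOrInUse)
	{
		return false;
	}

	counter->reservedOrInUse += count;
	return true;
}

/* Give back slots that were reserved, whether or not they were used. */
void
ReleaseConnectionsSketch(SharedConnectionCounterSketch *counter, int count)
{
	counter->reservedOrInUse -= count;
	if (counter->reservedOrInUse < 0)
	{
		counter->reservedOrInUse = 0;
	}
}
```

Under this sketch, a COPY would reserve one slot per worker it might touch before reading any input, and establish the actual connections later as data arrives.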