Commit Graph

46 Commits (9e42f3f2c41b40f143f594aedb5802402ac58cf7)

Author SHA1 Message Date
Mehmet YILMAZ 9e42f3f2c4
Add PG 18Beta1 compatibility (Build + RuleUtils) (#7981)
This PR provides successful build against PG18Beta1. RuleUtils PR was
reviewed separately: #8010

## PG 18Beta1–related changes for building Citus


### TupleDesc / Attr layout

**What changed in PG:** Postgres consolidated the
`TupleDescData.attrs[]` array into a more compact representation. Direct
field access (tupdesc->attrs[i]) was replaced by the new
`TupleDescAttr()` API.

**Citus adaptation:** Everywhere we previously used
`tupdesc->attrs[...]`, we now call `TupleDescAttr(tupdesc, idx)` (or our
own `Attr()` macro) under a compatibility guard.
*
5983a4cffc

General Logic:

* Use `Attr(...)` in places where `columnar_version_compat.h` is
included. This avoids the need to sprinkle `#if PG_VERSION_NUM` guards
around each attribute access.

* Use `TupleDescAttr(tupdesc, i)` when the relevant PostgreSQL header is
already included and the additional macro indirection is unnecessary.


### Collation‐aware `LIKE`

**What changed in PG:** The `textlike` operator now requires an explicit
collation, to avoid ambiguous‐collation errors. Core code switched from
`DirectFunctionCall2(textlike, ...)` to
`DirectFunctionCall2Coll(textlike, DEFAULT_COLLATION_OID, ...)`.

**Citus adaptation:** In `remote_commands.c` and any other LIKE call, we
now use `DirectFunctionCall2Coll(textlike, DEFAULT_COLLATION_OID, ...)`
and `#include <utils/pg_collation.h>`.

*
85b7efa1cd

### Columnar storage API

* Adapt `columnar_relation_set_new_filelocator` (and related init
routines) for PG 18’s revised SMGR and storage-initialization hooks.
* Pull in the new headers (`explain_format.h`,
`columnar_version_compat.h`) so the columnar module compiles cleanly
against PG 18.
- heap_modify_tuple + heap_inplace_update only exist on PG < 18; on PG18
the in-place helper was removed upstream


-
a07e03fd8f

### OpenSSL / TLS integration

**What changed in PG:** Moved from the legacy `SSL_library_init()` to
`OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL)`, updated certificate
API calls (`X509_getm_notBefore`, `X509_getm_notAfter`), and
standardized on `TLS_method()`.

**Citus adaptation:** We now `#include <openssl/opensslv.h>` and use
`#if OPENSSL_VERSION_NUMBER >= 0x10100000L` to choose between`
OPENSSL_init_ssl()` or `SSL_library_init()`, and wrap`
X509_gmtime_adj()` calls around the new accessor functions.

*
6c66b7443c


### Adapt `ExtractColumns()` to the new PG-18 `expandRTE()` signature

PostgreSQL 18
80feb727c8
added a fourth argument of type `VarReturningType` to `expandRTE()`, so
calls that used the old 7-parameter form no longer compile. This patch:

* Wraps the `expandRTE(...)` call in a `#if PG_VERSION_NUM >= 180000`
guard.
* On PG 18+ passes the new `VAR_RETURNING_DEFAULT` argument before
`location`.
* On PG 15–17 continues to call the original 7-arg form.
* Adds the necessary includes (`parser/parse_relation.h` for `expandRTE`
and `VarReturningType`, and `pg_version_constants.h` for
`PG_VERSION_NUM`).



### Adapt `ExecutorStart`/`ExecutorRun` hooks to PG-18’s new signatures

PostgreSQL 18
525392d572
changed the signatures of the executor hooks:

* `ExecutorStart_hook` now returns `bool` instead of `void`, and
* `ExecutorRun_hook` drops its old `run_once` argument.

This patch preserves Citus’s existing hook logic by:

1. **Adding two adapter functions** under `#if PG_VERSION_NUM >=
PG_VERSION_18`:

   * `citus_executor_start_adapter(QueryDesc *queryDesc, int eflags)`
Calls the old `CitusExecutorStart(queryDesc, eflags)` and then returns
`true` to satisfy the new hook’s `bool` return type.
* `citus_executor_run_adapter(QueryDesc *queryDesc, ScanDirection
direction, uint64 count)`
Calls the old `CitusExecutorRun(queryDesc, direction, count, true)`
(passing `true` for the dropped `run_once` argument), and returns
`void`.

2. **Installing the adapters** in `_PG_init()` instead of the original
hooks when building against PG 18+:

   ```c
   #if PG_VERSION_NUM >= PG_VERSION_18
       ExecutorStart_hook = citus_executor_start_adapter;
       ExecutorRun_hook   = citus_executor_run_adapter;
   #else
       ExecutorStart_hook = CitusExecutorStart;
       ExecutorRun_hook   = CitusExecutorRun;
   #endif
   ```
   
### Adapt to PG-18’s removal of the “run\_once” flag from
ExecutorRun/PortalRun

PostgreSQL commit
[[3eea7a0](3eea7a0c97)
rationalized the executor’s parallelism logic by moving the “execute a
plan only once” check into `ExecutePlan()` itself and dropping the old
`bool run_once` argument from the public APIs:

```diff
- void ExecutorRun(QueryDesc *queryDesc,
-                  ScanDirection direction,
-                  uint64 count,
-                  bool run_once);
+ void ExecutorRun(QueryDesc *queryDesc,
+                  ScanDirection direction,
+                  uint64 count);
```

(and similarly for `PortalRun()`).

To stay compatible across PG 15–18, Citus now:

1. **Updates all internal calls** to `ExecutorRun(...)` and
`PortalRun(...)`:

* On PG 18+, use the new three-argument form (`ExecutorRun(qd, dir,
count)`).
* On PG 15–17, keep the old four-arg form (`ExecutorRun(qd, dir, count,
true)`) under a `#if PG_VERSION_NUM < 180000` guard.

2. **Guards the dispatcher hooks** via the adapter functions (from the
earlier patch) so that Citus’s executor hooks continue to work under
both the old and new signatures.


### Adapt to PG-18’s shortened PortalRun signature

PostgreSQL 18’s refactoring (see commit
[3eea7a0](3eea7a0c97))
also removed the old run_once and alternate‐dest arguments from the
public PortalRun() API. The signature changed from:



```diff
- bool PortalRun(Portal portal,
-                long count,
-                bool isTopLevel,
-                bool run_once,
-                DestReceiver *dest,
-                DestReceiver *altdest,
-                QueryCompletion *qc);
+ bool PortalRun(Portal portal,
+                long count,
+                bool isTopLevel,
+                DestReceiver *dest,
+                DestReceiver *altdest,
+                QueryCompletion *qc);
```

To support both versions in Citus, we:

1. **Version-guard each call** to `PortalRun()`:

   * **On PG 18+** invoke the new 6-argument form.
* **On PG 15–17** fall back to the legacy 7-argument form, passing
`true` for `run_once`.
   
### Add support for PG-18’s new `plansource` argument in
`PortalDefineQuery`**

PostgreSQL 18 extended the `PortalDefineQuery` API to carry a
`CachedPlanSource *plansource` pointer so that the portal machinery can
track cached‐plan invalidation (as introduced alongside deferred-locking
in commit
525392d572.
To remain compatible across PG 15–18, Citus now wraps its calls under a
version guard:

```diff
-   PortalDefineQuery(portal, NULL, sql, commandTag, plantree_list, NULL);
+#if PG_VERSION_NUM >= 180000
+   /* PG 18+: seven-arg signature (adds plansource) */
+   PortalDefineQuery(
+       portal,
+       NULL,            /* no prepared-stmt name */
+       sql,             /* the query text */
+       commandTag,      /* the CommandTag */
+       plantree_list,   /* List of PlannedStmt* */
+       NULL,            /* no CachedPlan */
+       NULL             /* no CachedPlanSource */
+   );
+#else
+   /* PG 15–17: six-arg signature */
+   PortalDefineQuery(
+       portal,
+       NULL,            /* no prepared-stmt name */
+       sql,             /* the query text */
+       commandTag,      /* the CommandTag */
+       plantree_list,   /* List of PlannedStmt* */
+       NULL             /* no CachedPlan */
+   );
+#endif
```


### Adapt ExecInitRangeTable() calls to PG-18’s new signature

PostgreSQL commit
[cbc127917e04a978a788b8bc9d35a70244396d5b](cbc127917e)
overhauled the planner API for range‐table initialization:

**PG 18+**: added a fourth `Bitmapset *unpruned_relids` argument to
support deferred partition pruning

In Citus’s `create_estate_for_relation()` (in `columnar_metadata.c`), we
now wrap the call in a compile‐time guard so that the code compiles
correctly on all supported PostgreSQL versions:

```
/* Prepare permission info on PG 16+ */
#if PG_VERSION_NUM >= PG_VERSION_16
    List *perminfos = NIL;
    addRTEPermissionInfo(&perminfos, rte);
#else
    List *perminfos = NIL;  /* unused on PG 15 */
#endif

/* Initialize the range table, with the right signature for each PG version */
#if PG_VERSION_NUM >= PG_VERSION_18
    /* PG 18+: four‐arg signature (adds unpruned_relids) */
    ExecInitRangeTable(
        estate,
        list_make1(rte),
        perminfos,
        NULL        /* unpruned_relids: not used by columnar */
    );
#elif PG_VERSION_NUM >= PG_VERSION_16
    /* PG 16–17: three‐arg signature (permInfos) */
    ExecInitRangeTable(
        estate,
        list_make1(rte),
        perminfos
    );
#else
    /* PG 15: two‐arg signature */
    ExecInitRangeTable(
        estate,
        list_make1(rte)
    );
#endif

estate->es_output_cid = GetCurrentCommandId(true);
```

### Adapt `pgstat_report_vacuum()` to PG-18’s new timestamp argument

PostgreSQL commit
[[30a6ed0ce4bb18212ec38cdb537ea4b43bc99b83](30a6ed0ce4)
extended the `pgstat_report_vacuum()` API by adding a `TimestampTz
start_time` parameter at the end so that the VACUUM statistics collector
can record when the operation began:

```diff
/* PG ≤17: four-arg signature */
- void pgstat_report_vacuum(Oid tableoid,
-                           bool shared,
-                           double num_live_tuples,
-                           double num_dead_tuples);
+/* PG ≥18: five-arg signature adds a start_time */
+ void pgstat_report_vacuum(Oid tableoid,
+                           bool shared,
+                           double num_live_tuples,
+                           double num_dead_tuples,
+                           TimestampTz start_time);
```

To support both versions, we now wrap the call in `columnar_tableam.c`
with a version guard, supplying `GetCurrentTimestamp()` for PG-18+:

```c
#if PG_VERSION_NUM >= 180000
    /* PG 18+: include start_timestamp */
    pgstat_report_vacuum(
        RelationGetRelid(rel),
        rel->rd_rel->relisshared,
        Max(new_live_tuples, 0),  /* live tuples */
        0,                        /* dead tuples */
        GetCurrentTimestamp()     /* start time */
    );
#else
    /* PG 15–17: original signature */
    pgstat_report_vacuum(
        RelationGetRelid(rel),
        rel->rd_rel->relisshared,
        Max(new_live_tuples, 0),  /* live tuples */
        0                         /* dead tuples */
    );
#endif
```


### Adapt `ExecuteTaskPlan()` to PG-18’s expanded `CreateQueryDesc()`
signature

PostgreSQL 18 changed `CreateQueryDesc()` from an eight-argument to a
nine-argument call by inserting a `CachedPlan *cplan` parameter
immediately after the `PlannedStmt *plannedstmt` argument (see commit
525392d572).
To remain compatible with PG 15–17, Citus now wraps its invocation in
`local_executor.c` with a version guard:

```diff
-    /* PG15–17: eight-arg CreateQueryDesc without cached plan */
-    QueryDesc *queryDesc = CreateQueryDesc(
-        taskPlan,           /* PlannedStmt *plannedstmt */
-        queryString,        /* const char *sourceText */
-        GetActiveSnapshot(),/* Snapshot snapshot */
-        InvalidSnapshot,    /* Snapshot crosscheck_snapshot */
-        destReceiver,       /* DestReceiver *dest */
-        paramListInfo,      /* ParamListInfo params */
-        queryEnv,           /* QueryEnvironment *queryEnv */
-        0                   /* int instrument_options */
-    );
+#if PG_VERSION_NUM >= 180000
+    /* PG18+: nine-arg CreateQueryDesc with a CachedPlan slot */
+    QueryDesc *queryDesc = CreateQueryDesc(
+        taskPlan,           /* PlannedStmt *plannedstmt */
+        NULL,               /* CachedPlan *cplan (none) */
+        queryString,        /* const char *sourceText */
+        GetActiveSnapshot(),/* Snapshot snapshot */
+        InvalidSnapshot,    /* Snapshot crosscheck_snapshot */
+        destReceiver,       /* DestReceiver *dest */
+        paramListInfo,      /* ParamListInfo params */
+        queryEnv,           /* QueryEnvironment *queryEnv */
+        0                   /* int instrument_options */
+    );
+#else
+    /* PG15–17: eight-arg CreateQueryDesc without cached plan */
+    QueryDesc *queryDesc = CreateQueryDesc(
+        taskPlan,           /* PlannedStmt *plannedstmt */
+        queryString,        /* const char *sourceText */
+        GetActiveSnapshot(),/* Snapshot snapshot */
+        InvalidSnapshot,    /* Snapshot crosscheck_snapshot */
+        destReceiver,       /* DestReceiver *dest */
+        paramListInfo,      /* ParamListInfo params */
+        queryEnv,           /* QueryEnvironment *queryEnv */
+        0                   /* int instrument_options */
+    );
+#endif
```



### Adapt `RelationGetPrimaryKeyIndex()` to PG-18’s new “deferrable\_ok”
flag

PostgreSQL commit
14e87ffa5c
added a new Boolean `deferrable_ok` parameter to
`RelationGetPrimaryKeyIndex()` so that the lock manager can defer
unique‐constraint locks when requested. The API changed from:

```c
RelationGetPrimaryKeyIndex(Relation relation)
```

to:

```c
RelationGetPrimaryKeyIndex(Relation relation, bool deferrable_ok)
 ```
                
```diff
diff --git a/src/backend/distributed/metadata/node_metadata.c
b/src/backend/distributed/metadata/node_metadata.c
index e3a1b2c..f4d5e6f 100644
--- a/src/backend/distributed/metadata/node_metadata.c
+++ b/src/backend/distributed/metadata/node_metadata.c
@@ -2965,8 +2965,18 @@
     */
- Relation replicaIndex =
index_open(RelationGetPrimaryKeyIndex(pgDistNode),
-                                      AccessShareLock);
+    #if PG_VERSION_NUM >= PG_VERSION_18
+        /* PG 18+ adds a bool "deferrable_ok" parameter */
+        Relation replicaIndex =
+            index_open(
+                RelationGetPrimaryKeyIndex(pgDistNode, false),
+                AccessShareLock);
+    #else
+        Relation replicaIndex =
+            index_open(
+                RelationGetPrimaryKeyIndex(pgDistNode),
+                AccessShareLock);
+    #endif

     ScanKeyInit(&scanKey[0], Anum_pg_dist_node_nodename,
BTEqualStrategyNumber, F_TEXTEQ, CStringGetTextDatum(nodeName));

```
  
  ```diff
  diff --git a/src/backend/distributed/operations/node_protocol.c b/src/backend/distributed/operations/node_protocol.c
index e3a1b2c..f4d5e6f 100644
--- a/src/backend/distributed/operations/node_protocol.c
+++ b/src/backend/distributed/operations/node_protocol.c
@@ -746,7 +746,12 @@
     if (!OidIsValid(idxoid))
     {
-        idxoid = RelationGetPrimaryKeyIndex(rel);
+        /* Determine the index OID of the primary key (PG18 adds a second parameter) */
+#if PG_VERSION_NUM >= PG_VERSION_18
+        idxoid = RelationGetPrimaryKeyIndex(rel, false);
+#else
+        idxoid = RelationGetPrimaryKeyIndex(rel);
+#endif
     }

     return idxoid;

```
  
Because Citus has always taken the lock immediately—just as the old
two-arg call did—we pass `false` to keep that same immediate-lock
behavior. Passing `true` would switch to deferred locking, which we
don’t want.



### Adapt `ExplainOnePlan()` to PG-18’s expanded API

PostgreSQL 18 extended
525392d572
the `ExplainOnePlan()` function to carry the `CachedPlan *` and
`CachedPlanSource *` pointers plus an explicit `query_index`, letting
the EXPLAIN machinery track plan‐source invalidation. The old signature:

```c
/* PG ≤17 */
void
ExplainOnePlan(PlannedStmt *plannedstmt,
               IntoClause *into,
               struct ExplainState *es,
               const char *queryString,
               ParamListInfo params,
               QueryEnvironment *queryEnv,
               const instr_time *planduration,
               const BufferUsage *bufusage);
```

became, in PG 18:

```c
/* PG ≥18 */
void
ExplainOnePlan(PlannedStmt *plannedstmt,
               CachedPlan   *cplan,
               CachedPlanSource *plansource,
               int            query_index,
               IntoClause    *into,
               struct ExplainState *es,
               const char   *queryString,
               ParamListInfo params,
               QueryEnvironment *queryEnv,
               const instr_time *planduration,
               const BufferUsage *bufusage,
               const MemoryContextCounters *mem_counters);
```

To compile under both versions, Citus now wraps each call in
`multi_explain.c` with:

```c
#if PG_VERSION_NUM >= PG_VERSION_18
    /* PG 18+: pass NULL for the new cached‐plan fields and zero for query_index */
    ExplainOnePlan(
        plan,         /* PlannedStmt *plannedstmt */
        NULL,         /* CachedPlan *cplan */
        NULL,         /* CachedPlanSource *plansource */
        0,            /* query_index */
        into,         /* IntoClause *into */
        es,           /* ExplainState *es */
        queryString,  /* const char *queryString */
        params,       /* ParamListInfo params */
        NULL,         /* QueryEnvironment *queryEnv */
        &planduration,/* const instr_time *planduration */
        (es->buffers ? &bufusage : NULL),
        (es->memory  ? &mem_counters : NULL)
    );
#elif PG_VERSION_NUM >= PG_VERSION_17
    /* PG 17: same as before, plus passing mem_counters if enabled */
    ExplainOnePlan(
        plan,
        into,
        es,
        queryString,
        params,
        queryEnv,
        &planduration,
        (es->buffers ? &bufusage : NULL),
        (es->memory ? &mem_counters : NULL)
    );
#else
    /* PG 15–16: original seven-arg form */
    ExplainOnePlan(
        plan,
        into,
        es,
        queryString,
        params,
        queryEnv,
        &planduration,
        (es->buffers ? &bufusage : NULL)
    );
#endif
```


### Adapt to the unified “index interpretation” API in PG 18 (commit
a8025f544854)

PostgreSQL commit
a8025f5448
generalized the old btree‐specific operator‐interpretation API into a
single “index interpretation” interface:

* **Renamed type**:
  `OpBtreeInterpretation` → `OpIndexInterpretation`
* **Renamed function**:
`get_op_btree_interpretation(opno)` →
`get_op_index_interpretation(opno)`
* **Unified field**:
  Each interpretation now carries `cmptype` instead of `strategy`.

To build cleanly on PG 18 while still supporting PG 15–17, Citus’s
shard‐pruning code now wraps these changes:

```c
#include "pg_version_constants.h"

#if PG_VERSION_NUM >= PG_VERSION_18
/* On PG 18+ the btree‐only APIs vanished; alias them to the new generic versions */
typedef OpIndexInterpretation OpBtreeInterpretation;
#define get_op_btree_interpretation(opno)  get_op_index_interpretation(opno)
#define ROWCOMPARE_NE  COMPARE_NE
#endif

/* … later, when checking an interpretation … */
OpBtreeInterpretation *interp =
    (OpBtreeInterpretation *) lfirst(cell);

#if PG_VERSION_NUM >= PG_VERSION_18
    /* use cmptype on PG 18+ */
    if (interp->cmptype == ROWCOMPARE_NE)
#else
    /* use strategy on PG 15–17 */
    if (interp->strategy == ROWCOMPARE_NE)
#endif
{
    /* … */
}
```


### Adapt `create_foreignscan_path()` for PG-18’s revised signature

PostgreSQL commit
e222534679
reordered and removed a couple of parameters in the FDW‐path builder:

* **PG 15–17 signature (11 args)**

  ```c
  create_foreignscan_path(PlannerInfo   *root,
                          RelOptInfo    *rel,
                          PathTarget    *target,
                          double         rows,
                          Cost           startup_cost,
                          Cost           total_cost,
                          List          *pathkeys,
                          Relids         required_outer,
                          Path          *fdw_outerpath,
                          List          *fdw_restrictinfo,
                          List          *fdw_private);
  ```
* **PG 18+ signature (9 args)**

  ```c
  create_foreignscan_path(PlannerInfo   *root,
                          RelOptInfo    *rel,
                          PathTarget    *target,
                          double         rows,
                          int            disabled_nodes,
                          Cost           startup_cost,
                          Cost           total_cost,
                          Relids         required_outer,
                          Path          *fdw_outerpath,
                          List          *fdw_private);
  ```

To support both, Citus now defines a compatibility macro in
`pg_version_compat.h`:

```c
#include "nodes/bitmapset.h"   /* for Relids */
#include "nodes/pg_list.h"     /* for List */
#include "optimizer/pathnode.h" /* for create_foreignscan_path() */

#if PG_VERSION_NUM >= PG_VERSION_18

/* PG18+: drop pathkeys & fdw_restrictinfo, add disabled_nodes */
#define create_foreignscan_path_compat(a, b, c, d, e, f, g, h, i, j, k) \
    create_foreignscan_path(                                            \
        (a),          /* root */                                       \
        (b),          /* rel */                                        \
        (c),          /* target */                                     \
        (d),          /* rows */                                       \
        (0),          /* disabled_nodes (unused by Citus) */           \
        (e),          /* startup_cost */                              \
        (f),          /* total_cost */                                \
        (g),          /* required_outer */                            \
        (h),          /* fdw_outerpath */                             \
        (k)           /* fdw_private */                               \
    )

#else

/* PG15–17: original signature */
#define create_foreignscan_path_compat(a, b, c, d, e, f, g, h, i, j, k) \
    create_foreignscan_path(                                            \
        (a), (b), (c), (d),                                            \
        (e), (f),                                                      \
        (g), (h), (i), (j), (k)                                        \
    )
#endif
```

Now every call to `create_foreignscan_path_compat(...)`—even in tests
like `fake_fdw.c`—automatically picks the correct argument list for
PG 15 through PG 18.



### Drop the obsolete bitmap‐scan hooks on PG 18+

PostgreSQL commit
c3953226a0
cleaned up the `TableAmRoutine` API by removing the two bitmap‐scan
callback slots:

* `scan_bitmap_next_block`
* `scan_bitmap_next_tuple`

Since those hook‐slots no longer exist in PG 18, Citus now wraps their
NULL‐initialization in a `#if PG_VERSION_NUM < PG_VERSION_18` guard. On
PG 15–17 we still explicitly set them to `NULL` (to satisfy the old
struct layout), and on PG 18+ we omit them entirely:

```c

#if PG_VERSION_NUM < PG_VERSION_18
    /* PG 15–17 only: these fields were removed upstream in PG 18 */
    .scan_bitmap_next_block = NULL,
    .scan_bitmap_next_tuple = NULL,
#endif


```


### Adapt `vac_update_relstats()` invocation to PG-18’s new
“all\_frozen” argument

PostgreSQL commit
99f8f3fbbc
extended the `vac_update_relstats()` API by inserting a
`num_all_frozen_pages` parameter between the existing
`num_all_visible_pages` and `hasindex` arguments:

```diff
- /* PG ≤17: */
- void
- vac_update_relstats(Relation relation,
-                    BlockNumber num_pages,
-                    double     num_tuples,
-                    BlockNumber num_all_visible_pages,
-                    bool       hasindex,
-                    TransactionId frozenxid,
-                    MultiXactId  minmulti,
-                    bool      *frozenxid_updated,
-                    bool      *minmulti_updated,
-                    bool       in_outer_xact);
+ /* PG ≥18: adds num_all_frozen_pages */
+ void
+ vac_update_relstats(Relation    relation,
+                    BlockNumber num_pages,
+                    double      num_tuples,
+                    BlockNumber num_all_visible_pages,
+                    BlockNumber num_all_frozen_pages,
+                    bool        hasindex,
+                    TransactionId frozenxid,
+                    MultiXactId  minmulti,
+                    bool      *frozenxid_updated,
+                    bool      *minmulti_updated,
+                    bool       in_outer_xact);
```

To compile cleanly on both PG 15–17 and PG 18+, Citus wraps its call in
a version guard and supplies a zero placeholder for the new field:

```c
#if PG_VERSION_NUM >= 180000
    /* PG 18+: supply explicit “all_frozen” count */
    vac_update_relstats(
        rel,
        new_rel_pages,
        new_live_tuples,
        new_rel_allvisible,    /* allvisible */
        0,                     /* all_frozen */
        nindexes > 0,
        newRelFrozenXid,
        newRelminMxid,
        &frozenxid_updated,
        &minmulti_updated,
        false                  /* in_outer_xact */
    );
#else
    /* PG 15–17: original signature */
    vac_update_relstats(
        rel,
        new_rel_pages,
        new_live_tuples,
        new_rel_allvisible,
        nindexes > 0,
        newRelFrozenXid,
        newRelminMxid,
        &frozenxid_updated,
        &minmulti_updated,
        false                  /* in_outer_xact */
    );
#endif
```

**Why all_frozen = 0?**
Columnar storage never embeds transaction IDs in its pages, so it never
needs to track “all‐frozen” pages the way a heap does. Setting both
allvisible and allfrozen to zero simply tells Postgres “there are no
pages with the visibility or frozen‐status bits set,” matching our
existing behavior.

This change ensures Citus’s VACUUM‐statistic updates work unmodified
across all supported Postgres versions.
2025-07-16 15:30:41 +03:00
Muhammad Usama 95da74c47f Fix Deadlock with transaction recovery is possible during Citus upgrades (#7910)
DESCRIPTION: Fixes deadlock with transaction recovery that is possible
during Citus upgrades.

Fixes #7875.

This commit addresses two interrelated deadlock issues uncovered during Citus
upgrades:
1. Local Deadlock:
   - **Problem:**
     In `RecoverWorkerTransactions()`, a new connection is created for each worker
     node to perform transaction recovery by locking the
     `pg_dist_transaction` catalog table until the end of the transaction. When
     `RecoverTwoPhaseCommits()` calls this function for each worker node, the order
     of acquiring locks on `pg_dist_authinfo` and `pg_dist_transaction` can alternate.
     This reversal can lead to a deadlock if any concurrent process requires locks on
     these tables.
   - **Fix:**
     Pre-establish all worker node connections upfront so that
     `RecoverWorkerTransactions()` operates with a single, consistent connection.
     This ensures that locks on `pg_dist_authinfo` and `pg_dist_transaction` are always
     acquired in the correct order, thereby preventing the local deadlock.

2. Distributed Deadlock:
   - **Problem:**
     After resolving the local deadlock, a distributed deadlock issue emerges. The
     maintenance daemon calls `RecoverWorkerTransactions()` on each worker node—
     including the local node—which leads to a complex locking sequence:
       - A RowExclusiveLock is taken on the `pg_dist_transaction` table in
         `RecoverWorkerTransactions()`.
       - An update extension then tries to acquire an AccessExclusiveLock on the same
         table, getting blocked by the RowExclusiveLock.
       - A subsequent query (e.g., a SELECT on `pg_prepared_xacts`) issued using a
         separate connection on the local node gets blocked due to locks held during a
         call to `BuildCitusTableCacheEntry()`.
       - The maintenance daemon waits for this query, resulting in a circular wait and
         stalling the entire cluster.
   - **Fix:**
     Avoid cache lookups for internal PostgreSQL tables by implementing an early bailout
     for relation IDs below `FirstNormalObjectId` (system objects). This eliminates
     unnecessary calls to `BuildCitusTableCache`, reducing lock contention and mitigating
     the distributed deadlock.
     Furthermore, this optimization improves performance in fast
     connect→query_catalog→disconnect cycles by eliminating redundant
     cache creation and lookups.

3. Also reverts the commit that disabled the relevant test cases.
2025-03-12 12:43:01 +03:00
Onur Tirtir 7004295065 Revert "Release RowExclusiveLock on pg_dist_transaction as soon as remote xacts are recovered"
This reverts commit 684b4c6b96.
2025-03-12 12:43:01 +03:00
Onur Tirtir d5618b6b4c Release RowExclusiveLock on pg_dist_transaction as soon as remote xacts are recovered
As of this commit, after recovering the remote transactions, now we release the lock
on pg_dist_transaction while closing it to avoid deadlocks that might occur because
of trying to acquire a lock on pg_dist_authinfo while holding a lock on
pg_dist_transaction. Such a scenario can only cause a deadlock if another transaction
is trying to acquire a strong lock on pg_dist_transaction while holding a lock on
pg_dist_authinfo. As of today, we (implicitly) acquire a strong lock on
pg_dist_transaction only when upgrading Citus to 11.3-1 and this happens when creating
a REPLICA IDENTITY on pg_dist_transaction.

And regardless of the code-path we are in, it should be okay to release the lock there
because all we do after that point is to abort the prepared transactions that are not
part of an in-progress distributed transaction and releasing the lock before doing so
should be just fine.

This also changes the blocking behavior between citus_create_restore_point and the
transaction recovery code-path in the sense that now citus_create_restore_point doesn't
until transaction recovery completes aborting the prepared transactions that are not
part of an in-progress distributed transaction. However, this should be fine because
even before this was possible, e.g., if transaction recovery fails to open a remote
connection to a node.
2025-03-12 12:43:01 +03:00
Naisila Puka 6bd3474804 Rename foreach_ macros to foreach_declared_ macros (#7700)
This is prep work for successful compilation with PG17

PG17added foreach_ptr, foreach_int and foreach_oid macros
Relevant PG commit
14dd0f27d7cd56ffae9ecdbe324965073d01a9ff

14dd0f27d7

We already have these macros, but they are different with the
PG17 ones because our macros take a DECLARED variable, whereas
the PG16 macros declare a locally-scoped loop variable themselves.

Hence I am renaming our macros to foreach_declared_

I am separating this into its own PR since it touches many files. The
main compilation PR is https://github.com/citusdata/citus/pull/7699
2025-03-12 11:01:49 +03:00
Onur Tirtir 0acb5f6e86
Fix assertion failure in maintenance daemon during Citus upgrades (#7537)
Fixes https://github.com/citusdata/citus/issues/7536.

Note to reviewer:

Before this commit, the following results in an assertion failure when
executed locally and this won't be the case anymore:
```console
make -C src/test/regress/ check-citus-upgrade-local citus-old-version=v10.2.0
```

Note that this doesn't happen on CI as we don't enable assertions there.

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2024-03-20 00:10:12 +00:00
Halil Ozan Akgül b877d606c7
Adds 2PC distributed commands from other databases (#7203)
DESCRIPTION: Adds support for 2PC from non-Citus main databases

This PR only adds support for `CREATE USER` queries, other queries need
to be added. But it should be simple because this PR creates the
underlying structure.

Citus main database is the database where the Citus extension is
created. A non-main database is all the other databases that are in the
same node with a Citus main database.

When a `CREATE USER` query is run on a non-main database we:

1. Run `start_management_transaction` on the main database. This
function saves the outer transaction's xid (the non-main database
query's transaction id) and marks the current query as main db command.
2. Run `execute_command_on_remote_nodes_as_user("CREATE USER
<username>", <username to run the command>)` on the main database. This
function creates the users in the rest of the cluster by running the
query on the other nodes. The user on the current node is created by the
query on the outer, non-main db, query to make sure consequent commands
in the same transaction can see this user.
3. Run `mark_object_distributed` on the main database. This function
adds the user to `pg_dist_object` in all of the nodes, including the
current one.

This PR also implements transaction recovery for the queries from
non-main databases.
2023-12-22 19:19:41 +03:00
Nils Dijk 0620c8f9a6
Sort includes (#7326)
This change adds a script to programatically group all includes in a
specific order. The script was used as a one time invocation to group
and sort all includes throught our formatted code. The grouping is as
follows:

 - System includes (eg. `#include<...>`)
 - Postgres.h (eg. `#include "postgres.h"`)
- Toplevel imports from postgres, not contained in a directory (eg.
`#include "miscadmin.h"`)
 - General postgres includes (eg . `#include "nodes/..."`)
- Toplevel citus includes, not contained in a directory (eg. `#include
"citus_verion.h"`)
 - Columnar includes (eg. `#include "columnar/..."`)
 - Distributed includes (eg. `#include "distributed/..."`)

Because it is quite hard to understand the difference between toplevel
citus includes and toplevel postgres includes it hardcodes the list of
toplevel citus includes. In the same manner it assumes anything not
prefixed with `columnar/` or `distributed/` as a postgres include.

The sorting/grouping is enforced by CI. Since we do so with our own
script there are not changes required in our uncrustify configuration.
2023-11-23 18:19:54 +01:00
Nils Dijk 0dac63afc0
move pg_version_constants.h to toplevel include (#7335)
In preparation of sorting and grouping all includes we wanted to move
this file to the toplevel includes for good grouping/sorting.
2023-11-09 15:09:39 +00:00
Ivan Vyazmitinov e94bf93152
#6548 2PC recovery is extremely ineffective on a cluster with multiple DATABASEs fix (#7174) 2023-09-04 15:28:22 +02:00
Onder Kalaci 149771792b Remove useless version compats
most likely leftover from earlier versions
2022-07-29 10:31:55 +02:00
Hanefi Onaldi 76176caea7 Fix typo s/exlusive/exclusive/ 2021-12-23 01:35:01 +03:00
SaitTalhaNisanci 03832f353c Drop postgres 11 support 2021-03-25 09:20:28 +03:00
Ahmet Gedemenli 936775e8e3 Delete transactions when removing node
With this commit, we delete entries in pg_dist_transaction
for the primary nodes that are removed by `master_remove_node`.
2020-12-07 11:35:20 +03:00
SaitTalhaNisanci e7cd1ed0ee
Not take ShareUpdateExlusiveLock on pg_dist_transaction (#4184)
* Not take ShareUpdateExlusiveLock on pg_dist_transaction

We were taking ShareUpdateExlusiveLock on pg_dist_transaction during
recovery to prevent multiple recoveries happening concurrenly. VACUUM(
not FULL) also takes ShareUpdateExclusiveLock, and they can conflict. It
seems that VACUUM will skip the table if there is a conflicting lock
already taken unless it is doing the vacuum to prevent id wraparound, in
which case there can be a deadlock. I guess the deadlock happens if:

- VACUUM takes a lock on pg_dist_transaction and is done for id
wraparound problem
- The transaction in the maintenance tries to take a lock but
cannot as that conflicts with the lock acquired by VACUUM
- The transaction in the maintenance daemon has a very old xid hence
VACUUM cannot proceed.

If we take a row exclusive lock in transaction recovery then it wouldn't
conflict with VACUUM hence it could proceed so the deadlock would be
resolved. To prevent concurrent transaction recoveries happening, an
advisory lock is taken with ShareUpdateExlusiveLock as before.

* Use CITUS_OPERATIONS tag
2020-09-21 15:20:38 +03:00
Sait Talha Nisanci b641f63bfd Use CMDTAG_SELECT_COMPAT
CMDTAG_SELECT exists in PG12 hence defining a MACRO such as
CMDTAG_SELECT -> "SELECT" is not possible. I chose CMDTAG_SELECT_COMPAT
because with the COMPAT suffix it is explicit that it maps to different
things in different versions and also has a less chance of mapping
something irrevelant. For example if we used SELECT as a macro, then it
would map every SELECT to whatever it is mapping to, which might have
unexpected/undesired behaviour.
2020-08-04 15:38:13 +03:00
Sait Talha Nisanci bf831d2e59 Use table_openXXX methods in the codebase
With PG13 heap_* (heap_open, heap_close etc) are replaced with table_*
(table_open, table_close etc).

It is better to use the new table access methods in the codebase and
define the macros for the previous versions as we can easily remove the
macro without having to change the codebase when we drop the support for
the old version.

Commits that introduced this change on Postgres:
f25968c49697db673f6cd2a07b3f7626779f1827
e0c4ec07284db817e1f8d9adfb3fffc952252db0
4b21acf522d751ba5b6679df391d5121b6c4a35f

Command to see relevant commits on Postgres side:
git log --all --grep="heap_open"
2020-08-04 15:10:22 +03:00
SaitTalhaNisanci ba01f3457a
use macros for pg versions instead of hardcoded values (#3694)
3 Macros are defined for removing the hardcoded pg versions.
PG_VERSION_11, PG_VERSION_12 and PG_VERSION_13.
2020-04-01 17:01:52 +03:00
Onur Tirtir 88bfd2e4b7 refactor around local group id checks
Mostyl optimizes the calls made to GetLocalGroupId and refactors
its usages
2020-03-05 20:20:41 +03:00
Philip Dubé 20abc4d2b5
Replace foreach with foreach_ptr/foreach_oid (#3544) 2020-02-27 16:54:49 +01:00
Philip Dubé 3a906b8210 Fix typos noticed while reading through code trying to understand HAVING 2020-02-11 19:55:10 +00:00
Jelte Fennema 1d8dde232f
Automatically convert useless declarations using regex replace (#3181)
* Add declaration removal to CI

* Convert declarations
2019-11-21 13:47:29 +01:00
Philip Dubé e8ecbbfcb3 Escape transaction names 2019-11-08 21:23:01 +00:00
SaitTalhaNisanci 94a7e6475c
Remove copyright years (#2918)
* Update year as 2012-2019

* Remove copyright years
2019-10-15 17:44:30 +03:00
Philip Dubé 492d1b2cba ActivePrimaryNodeList: add lockMode parameter 2019-09-13 17:44:56 +00:00
Philip Dubé 68c4b71f93 Fix up includes with pg12 changes 2019-08-22 18:56:21 +00:00
Marco Slot bb3a96eacb Cache a configurable number of connections at xact end 2019-05-29 13:24:31 +02:00
Hadi Moshayedi f4d3b94e22
Fix some of the casts for groupId (#2609)
A small change which partially addresses #2608.
2019-03-05 12:06:44 -08:00
Onder Kalaci bf28dd0cff Do not recover wrong distributed transactions in MX 2018-09-07 09:52:46 +03:00
Murat Tuncer a6fe5ca183 PG11 compatibility update
- changes in ruleutils_11.c is reflected
- vacuum statement api change is handled. We now allow
  multi-table vacuum commands.
- some other function header changes are reflected
- api conflicts between PG11 and earlier versions
  are handled by adding shims in version_compat.h
- various regression tests are fixed due output and
  functionality in PG1
- no change is made to support new features in PG11
  they need to be handled by new commit
2018-04-26 11:29:43 +03:00
Marco Slot 8486f76e15 Auto-recover 2PC transactions 2017-11-22 11:26:58 +01:00
Marco Slot 9793218122 Do not commit already-committed prepared transactions in recovery 2017-11-20 13:18:48 +01:00
Marco Slot ae47df01ea Observe prepared xacts twice in RecoverWorkerTransactions to avoid race condition 2017-11-20 11:44:08 +01:00
Marco Slot 2410c2e450 Rewrite recover_prepared_transactions to be fast, non-blocking 2017-11-20 11:27:40 +01:00
Marco Slot 9e516513fc Use local group ID when querying for prepared transactions 2017-10-03 16:36:53 +02:00
Marco Slot da6b42a3e2 Use unique constraint index for transaction record deletion 2017-09-28 12:04:56 +02:00
Marco Slot 3d7f79127d Do not release locks in LogTransactionRecord 2017-07-24 20:44:38 +02:00
Brian Cloutier ec99f8f983 Add nodeRole column
- master_add_node enforces that there is only one primary per group
- there's also a trigger on pg_dist_node to prevent multiple primaries
  per group
- functions in metadata cache only return primary nodes
- Rename ActiveWorkerNodeList -> ActivePrimaryNodeList
- Rename WorkerGetLive{Node->Group}Count()
- Refactor WorkerGetRandomCandidateNode
- master_remove_node only complains about active shard placements if the
  node being removed is a primary.
- master_remove_node only deletes all reference table placements in the
  group if the node being removed is the primary.
- Rename {Node->NodeGroup}HasShardPlacements, this reflects the behavior it
  already had.
- Rename DeleteAllReferenceTablePlacementsFrom{Node->NodeGroup}. This also
  reflects the behavior it already had, but the new signature forces the
  caller to pass in a groupId
- Rename {WorkerGetLiveGroup->ActivePrimaryNode}Count
2017-07-24 11:57:46 +03:00
Jason Petersen 2204da19f0 Support PostgreSQL 10 (#1379)
Adds support for PostgreSQL 10 by copying in the requisite ruleutils
and updating all API usages to conform with changes in PostgreSQL 10.
Most changes are fairly minor but they are numerous. One particular
obstacle was the change in \d behavior in PostgreSQL 10's psql; I had
to add SQL implementations (views, mostly) to mimic the pre-10 output.
2017-06-26 02:35:46 -06:00
Burak Yucesoy 9fb15c439c Add version checks to necessary UDFs 2017-05-22 09:53:29 +03:00
Burak Yucesoy e9095e62ec Decouple reference table replication
With this change we add an option to add a node without replicating all reference
tables to that node. If a node is added with this option, we mark the node as
inactive and no queries will sent to that node.

We also added two new UDFs;
 - master_activate_node(host, port):
    - marks node as active and replicates all reference tables to that node
 - master_add_inactive_node(host, port):
    - only adds node to pg_dist_node
2017-04-17 13:33:31 +03:00
Brian Cloutier b1b2b4fadf Create ExecuteOptionalRemoteCommand
A small refactor which pulls some code out of `RecoverWorkerTransactions`
and into `remote_commands.c`. This code block currently only occurs in
`RecoverWorkerTransactions` but will be useful to other functions
shortly.

Unfortunately we couldn't call it `ExecuteRemoteCommand`, that name was
already taken.
2017-01-17 17:04:37 +02:00
Marco Slot 31231ce196 Use GetNodeConnection to establish a connection in transaction recovery 2017-01-10 02:44:34 +01:00
Andres Freund d256f3fca9 Remove unused LogPreparedTransactions() function.
This is unused since 92c7567008.
2017-01-06 09:15:01 -08:00
Marco Slot 6cbc1945f9 Enable transaction recovery in connection API 2016-12-23 16:14:29 +01:00
Eren Basak cee7b54e7c Add worker transaction and transaction recovery infrastructure 2016-10-18 14:18:14 +03:00