Compare commits

...

155 Commits

Author SHA1 Message Date
Mehmet YILMAZ 31911d8297
PG18 – Respect VACUUM/ANALYZE ONLY semantics for Citus tables (#8365)
fixes #8364

PostgreSQL 18 changes VACUUM/ANALYZE to recurse into inheritance
children by default, and introduces `ONLY` to limit processing to the
parent. Upstream change:

[https://github.com/postgres/postgres/commit/62ddf7ee9](https://github.com/postgres/postgres/commit/62ddf7ee9)

For Citus tables, we should treat shard placements as “children” and
avoid propagating `VACUUM/ANALYZE` to shards when the user explicitly
asks for `ONLY`.

This PR adjusts the Citus VACUUM handling to align with PG18 semantics,
and adds regression coverage on both regular distributed tables and
partitioned distributed tables.

---

### Behavior changes

* Introduce a per-relation helper struct:

  ```c
  typedef struct CitusVacuumRelation
  {
      VacuumRelation *vacuumRelation;
      Oid             relationId;
  } CitusVacuumRelation;
  ```

  This lets us keep both:

  * the resolved relation OID (for `IsCitusTable`, task building), and
  * the original `VacuumRelation` node (for column list and ONLY/inh flag).

* Replace the old `VacuumRelationIdList` / `ExtractVacuumTargetRels` flow with:

  ```c
  static List *VacuumRelationList(VacuumStmt *vacuumStmt,
                                  CitusVacuumParams vacuumParams);
  ```

  `VacuumRelationList` now:

  * Iterates over `vacuumStmt->rels`.
  * Resolves `relid` via `RangeVarGetRelidExtended` when `relation` is present.
  * Falls back to locking `VacuumRelation->oid` when only an OID is available.
  * Respects `VACOPT_FULL` for lock mode and `VACOPT_SKIP_LOCKED` for locking behavior.
  * Builds a `List *` of `CitusVacuumRelation` entries.

* Update:

  ```c
  IsDistributedVacuumStmt(List *vacuumRelationList);
  ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt,
                                   List *vacuumRelationList,
                                   CitusVacuumParams vacuumParams);
  ```

  to operate on `CitusVacuumRelation` instead of bare OIDs.

* Implement `ONLY` semantics in `ExecuteVacuumOnDistributedTables`:

  ```c
  RangeVar *relation = vacuumRelation->relation;
  if (relation != NULL && !relation->inh)
  {
      /* ONLY specified, so don't recurse to shard placements */
      continue;
  }
  ```

  Effect:

  * `VACUUM / ANALYZE` (no `ONLY`) on a Citus table: behavior unchanged, Citus creates tasks and propagates to shard placements.
  * `VACUUM ONLY <citus_table>` / `ANALYZE ONLY <citus_table>`:

    * Core still processes the coordinator relation as usual.
    * Citus **skips** building tasks for shard placements, so we do not recurse into distributed children.
  * The code compiles and behaves as before on pre-PG18; the new behavior becomes observable only when the core planner starts setting `inh = false` for `ONLY` (PG18).

* Unqualified `VACUUM` / `ANALYZE` (no rels) is unchanged and still handled via `ExecuteUnqualifiedVacuumTasks`.

* Remove now-redundant helpers:

  * `VacuumColumnList`
  * `ExtractVacuumTargetRels`

  Column lists are now taken directly from `vacuumRelation->va_cols` via `CitusVacuumRelation`.

---

### Testing

Extend `src/test/regress/sql/pg18.sql` and `expected/pg18.out` with two
PG18-only blocks that verify we do not recurse into shard placements
when `ONLY` is used:

1. **Simple distributed table (`pg18_vacuum_part`)**

   * Create and distribute a regular table:

     ```sql
     CREATE SCHEMA pg18_vacuum_part;
     SET search_path TO pg18_vacuum_part;
     CREATE TABLE vac_analyze_only (a int);
     SELECT create_distributed_table('vac_analyze_only', 'a');
     INSERT INTO vac_analyze_only VALUES (1), (2), (3);
     ```

   * On the coordinator:

     * Run `ANALYZE vac_analyze_only;` and later `ANALYZE ONLY vac_analyze_only;`.
     * Run `VACUUM vac_analyze_only;` and later `VACUUM ONLY vac_analyze_only;`.

   * On `worker_1`:

     * Capture `coalesce(max(last_analyze), 'epoch')` from `pg_stat_user_tables` for `vac_analyze_only_%` into `:analyze_before_only`, then assert:

       ```sql
       SELECT max(last_analyze) = :'analyze_before_only'::timestamptz AS analyze_only_skipped;
       ```

     * Capture `coalesce(max(last_vacuum), 'epoch')` into `:vacuum_before_only`, then assert:

       ```sql
       SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz AS vacuum_only_skipped;
       ```

Both checks return `t`, confirming `ONLY` does not change `last_analyze`
/ `last_vacuum` on shard tables.

2. **Partitioned distributed table (`pg18_vacuum_part_dist`)**

   * Create a partitioned table whose parent is distributed:

     ```sql
     CREATE SCHEMA pg18_vacuum_part_dist;
     SET search_path TO pg18_vacuum_part_dist;
     SET citus.shard_count = 2;
     SET citus.shard_replication_factor = 1;

     CREATE TABLE part_dist (id int, v int) PARTITION BY RANGE (id);
     CREATE TABLE part_dist_1 PARTITION OF part_dist FOR VALUES FROM (1) TO (100);
     CREATE TABLE part_dist_2 PARTITION OF part_dist FOR VALUES FROM (100) TO (200);

     SELECT create_distributed_table('part_dist', 'id');
     INSERT INTO part_dist SELECT g, g FROM generate_series(1, 199) g;
     ```

   * On the coordinator:

     * Run `ANALYZE part_dist;` then `ANALYZE ONLY part_dist;`.
     * Run `VACUUM part_dist;` then `VACUUM ONLY part_dist;` (PG18 emits the expected warning: `VACUUM ONLY of partitioned table "part_dist" has no effect`).

   * On `worker_1`:

     * Capture `coalesce(max(last_analyze), 'epoch')` for `part_dist_%` into `:analyze_before_only`, then assert:

       ```sql
       SELECT max(last_analyze) = :'analyze_before_only'::timestamptz
              AS analyze_only_partitioned_skipped;
       ```

     * Capture `coalesce(max(last_vacuum), 'epoch')` into `:vacuum_before_only`, then assert:

       ```sql
       SELECT max(last_vacuum) = :'vacuum_before_only'::timestamptz
              AS vacuum_only_partitioned_skipped;
       ```

Both checks return `t`, confirming that even for a partitioned
distributed parent, `VACUUM/ANALYZE ONLY` does not recurse into shard
placements, and Citus behavior matches PG18’s “ONLY = parent only”
semantics.
2025-12-05 16:50:42 +03:00
Colm 3399d660f3
PG18: fix incorrectly failing tests in pg18 regress. (#8370)
Some tests showed diffs because of dependencies on citus GUCs and
getting shuffled around in the test order. This commit fixes that.
2025-12-05 12:37:05 +00:00
Colm 002046b87b
PG18: Add support for virtual generated columns. (#8346)
Generated columns can be virtual (not stored), and this is the default.
Supporting this PG18 feature in Citus requires tweaks to citus_ruleutils
and table deparsing. Relevant PG commit: 83ea6c540.
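
As an illustration (table and column names here are hypothetical, not taken from the Citus test suite), the PG18 syntax this enables on distributed tables looks roughly like:

```sql
CREATE TABLE measurements (
    id int,
    raw_value int,
    -- VIRTUAL is the PG18 default for generated columns; nothing is stored on disk
    doubled int GENERATED ALWAYS AS (raw_value * 2) VIRTUAL
);
SELECT create_distributed_table('measurements', 'id');
```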
2025-12-04 19:51:45 +00:00
Colm 79cabe7eca
PG18: CHECK constraints can be ENFORCED / NOT ENFORCED. (#8349)
DESCRIPTION: Adds propagation of ENFORCED / NOT ENFORCED on CHECK
constraints.
    
Add propagation support to Citus ruleutils and appropriate regress
tests. Relevant PG commit: ca87c41.
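
A minimal sketch of the propagated syntax (table and constraint names are hypothetical):

```sql
CREATE TABLE orders (
    id int,
    qty int,
    -- PG18 allows marking a CHECK constraint as NOT ENFORCED
    CONSTRAINT qty_positive CHECK (qty > 0) NOT ENFORCED
);
SELECT create_distributed_table('orders', 'id');
```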
2025-12-03 08:01:01 +00:00
Colm e28591df08
PG18: Foreign key constraint can be specified NOT ENFORCED. (#8347)
Test that FOREIGN KEY .. NOT ENFORCED is propagated when applied to
Citus tables. No C code changes required; ruleutils handles it. Relevant
PG commit eec0040c4. Given that Citus does not yet support ALTER TABLE
.. ALTER CONSTRAINT, its usefulness is questionable, but we at least
propagate its definition.
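
A minimal sketch of the syntax being propagated (table and constraint names are hypothetical):

```sql
CREATE TABLE countries (code text PRIMARY KEY);
SELECT create_reference_table('countries');

CREATE TABLE cities (name text, country_code text);
SELECT create_distributed_table('cities', 'country_code');

-- PG18 allows declaring the foreign key as NOT ENFORCED
ALTER TABLE cities
    ADD CONSTRAINT cities_country_fk
    FOREIGN KEY (country_code) REFERENCES countries (code) NOT ENFORCED;
```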
2025-12-03 07:10:55 +00:00
Mehmet YILMAZ a39ce7942f
PG18 - small fix on pg18.out (#8367) 2025-12-03 09:41:24 +03:00
Mehmet YILMAZ ae2eb65be0
PG18 - Stabilize EXPLAIN outputs: prefer YAML; drop JSON-side “Disabled” filtering; add alt expected files (#8336)
PG18 introduces `Disabled: <bool>` and planning/memory details that
cause unstable diffs, especially with `FORMAT JSON` when keys are
removed. This PR:

* Switches EXPLAIN assertions to `FORMAT YAML` wherever JSON structure
is not required.
* Stops normalizing/removing `"Disabled"` in JSON to avoid creating
invalid trailing commas.
* Adds alternative expected files for tests that must remain in JSON and
include `Disabled` lines.

### Rationale

* YAML is line-oriented and avoids JSON’s trailing-comma pitfalls; it
yields more stable diffs across PG15–PG18.
* Some tests intentionally assert JSON structure; those stay JSON but
use dedicated `.out` alternates rather than regex-based key removal.
* A proper “parse → modify → re-emit JSON” helper is being explored but
is non-trivial; this PR adopts the lowest-risk path now.

### What changed

1. **Test format**

* Convert `EXPLAIN (..., FORMAT JSON)` to `FORMAT YAML` in tests that
only care about plan shape/fields.
* Keep JSON only where the test explicitly validates JSON; introduce alt
expected files for those.

2. **Normalization**

* Remove JSON-side filtering of `"Disabled"` (no deletion, no regex
rewrite).
* Retain existing normalization for text/YAML (numbers, `Index
Searches`, `Window: ...`, buffers, etc.).

3. **Expected files**

   * Update YAML-based expected outputs.
   * Add `.out` alternates for JSON cases that include `Disabled`.
2025-12-01 15:29:03 +03:00
Mehmet YILMAZ c600eabd82
PG18 - Handle publish_generated_columns in distributed publications (#8360)
https://github.com/postgres/postgres/commit/7054186c4

fixes #8358 

This PR wires up PostgreSQL 18’s `publish_generated_columns` publication
option in Citus and adds regression coverage to ensure it behaves
correctly for distributed tables, without changing existing DDL output
for publications that rely on the default.

---

### 1. Preserve `publish_generated_columns` when rebuilding publications

In `BuildCreatePublicationStmt`:

* On PG18+ we now read the new `pubgencols` field from `pg_publication`
and map it as follows:

  * `'n'` → default (`none`)
  * `'s'` → `stored`

* For `pubgencols == 's'` we append a `publish_generated_columns`
defelem to the reconstructed statement:

  ```c
  #if PG_VERSION_NUM >= PG_VERSION_18
      if (publicationForm->pubgencols == 's')    /* stored */
      {
          DefElem *pubGenColsOption =
              makeDefElem("publish_generated_columns",
                          (Node *) makeString("stored"),
                          -1);

          createPubStmt->options =
              lappend(createPubStmt->options, pubGenColsOption);
      }
      else if (publicationForm->pubgencols != 'n')    /* 'n' = none (default) */
      {
          ereport(ERROR,
                  (errmsg("unexpected pubgencols value '%c' for publication %u",
                          publicationForm->pubgencols, publicationId)));
      }
  #endif
  ```

* For `pubgencols == 'n'` we do **not** emit an option and rely on
PostgreSQL’s default.

* Any value other than `'n'` or `'s'` raises an error rather than
silently producing incorrect DDL.

This ensures:

* Publications that explicitly use `publish_generated_columns = stored`
are reconstructed with that option on workers, so workers get
`pubgencols = 's'`.
* Publications that use the default (`none`) continue to produce the
same `CREATE PUBLICATION ... WITH (...)` text as before (no extra
`publish_generated_columns = 'none'` noise), fixing the unintended diffs
in existing publication tests.

---

### 2. New PG18 regression coverage for distributed publications

In `src/test/regress/sql/pg18.sql`:

* Create a table with a stored generated column and make it distributed
so the publication goes through Citus DDL propagation:

  ```sql
  CREATE TABLE gen_pub_tab (
      id int primary key,
      a  int,
      b  int GENERATED ALWAYS AS (a * 10) STORED
  );

  SELECT create_distributed_table('gen_pub_tab', 'id', colocate_with := 'none');
  ```

* Create two publications that exercise both `pubgencols` values:

  ```sql
  CREATE PUBLICATION pub_gen_cols_stored
      FOR TABLE gen_pub_tab
      WITH (publish = 'insert, update', publish_generated_columns = stored);

  CREATE PUBLICATION pub_gen_cols_none
      FOR TABLE gen_pub_tab
      WITH (publish = 'insert, update', publish_generated_columns = none);
  ```

* On coordinator and both workers, assert the catalog contents:

  ```sql
  SELECT pubname, pubgencols
  FROM pg_publication
  WHERE pubname IN ('pub_gen_cols_stored', 'pub_gen_cols_none')
  ORDER BY pubname;
  ```

  Expected on all three nodes:

  * `pub_gen_cols_stored | s`
  * `pub_gen_cols_none   | n`

This test verifies that:

* `pubgencols` is correctly set on the coordinator for both `stored` and
`none`.
* Citus propagates the setting unchanged to all workers for a
distributed table.
2025-12-01 09:17:57 +00:00
Mehmet YILMAZ 662b7248db
PG18 - Normalize window output and add filter (#8344)
fixes #8156


8b1b342544


* Extend `normalize.sed` to:

  * Rewrite auto-generated window names like `OVER w1` back to `OVER (?)` on:

    * `Sort Key: …`
    * `Group Key: …`
    * `Output: …`
  * Leave functional window specs like `OVER (PARTITION BY …)` untouched.

* Use `public.explain_filter(...)` around EXPLAINs in window-related tests to:

  * Avoid plan text churn from PG18 planner/EXPLAIN changes while still checking that we use the Citus executor and expected node types.

* Update expected outputs in:

  * `mixed_relkind_tests.out`
  * `multi_explain*.out`
  * `multi_outer_join_columns*.out`
  * `multi_subquery_window_functions.out`
  * `multi_test_helpers.out`
  * `window_functions.out`

  to match the filtered EXPLAIN output on PG18 while remaining compatible with older PG versions.
2025-11-19 22:48:24 +03:00
Mehmet YILMAZ c843cb2060
PG18: Normalize EXPLAIN “Actual Rows” decimals in normalize.sed (#8290)
fixes #8266 

* **Strip trivial `.0…`** tails early (e.g., `111111.00 → 111111`) for
text EXPLAIN.
* **Special-case for Seq Scan**: map sub-1 averages to `0` only on `Seq
Scan` lines (e.g., `0.50 → 0`) to match pre-PG18 behavior seen in
expected files.
* **General text EXPLAIN**: map sub-1 averages (`0.xxx`) to `1` for
non-Seq-Scan nodes (e.g., `Append`, `Custom Scan`) to align with
historic “at least one row observed” semantics in our expected plans.
* **Fallback normalizer across all formats**: convert any remaining
decimals to a placeholder `N.N`, then collapse to integer `N` for
text/YAML/JSON.



Ordered block added to `src/test/regress/bin/normalize.sed`:

```sed
# --- PG18 Actual Rows normalization ---
# New in PG18: Actual Rows in EXPLAIN output are now rounded to
# 1) 0.50 (and 0.5, 0.5000...) -> 0
s/(actual[[:space:]]*rows[[:space:]]*[=:][[:space:]]*)0\.50*/\10/gI
s/(actual[^)]*rows[[:space:]]*=[[:space:]]*)0\.50*/\10/gI

# 2) 0.51+ -> 1
s/(actual[[:space:]]*rows[[:space:]]*[=:][[:space:]]*)0\.(5[1-9][0-9]*|[6-9][0-9]*)/\11/gI
s/(actual[^)]*rows[[:space:]]*=[[:space:]]*)0\.(5[1-9][0-9]*|[6-9][0-9]*)/\11/gI

# 3) Strip trivial trailing ".0..." (6.00 -> 6)  [keep your existing cross-format rules]
s/(actual[[:space:]]*rows[[:space:]]*[=:][[:space:]]*)([0-9]+)\.0+/\1\2/gI
s/(actual[^)]*rows[[:space:]]*=[[:space:]]*)([0-9]+)\.0+/\1\2/gI

# 4) YAML/XML/JSON: strip trailing ".0..."
s/(Actual[[:space:]]+Rows:[[:space:]]*[0-9]+)\.0+/\1/gI
s/(<Actual-Rows>[0-9]+)\.0+(<\/Actual-Rows>)/\1\2/g
s/("Actual[[:space:]]+Rows":[[:space:]]*[0-9]+)\.0+/\1/gI

# 5) Placeholder cleanups (kept from existing rules; harmless if unused)
#    JSON placeholder cleanup: '"Actual Rows": N.N' -> N
s/("Actual[[:space:]]+Rows":[[:space:]]*)N\.N/\1N/gI
#    Text EXPLAIN collapse: "rows=N.N" -> "rows=N"
s/(rows[[:space:]]*=[[:space:]]*)N\.N/\1N/gI
#    YAML placeholder: "Actual Rows: N.N" -> "Actual Rows: N"
s/(Actual[[:space:]]+Rows:[[:space:]]*)N\.N/\1N/gI
# --- PG18 Actual Rows normalization ---
```

### Examples

**Before (PG18):**

```
Append (actual rows=0.60 loops=5)
Custom Scan (ColumnarScan) ... (actual rows=0.67 loops=3)
Seq Scan ... (actual rows=0.50 loops=2)
Custom Scan (ColumnarScan) ... (actual rows=111111.00 loops=1)
```

**After normalization:**

```
Append (actual rows=1 loops=5)
Custom Scan (ColumnarScan) ... (actual rows=1 loops=3)
Seq Scan ... (actual rows=0 loops=2)
Custom Scan (ColumnarScan) ... (actual rows=111111 loops=1)
```
2025-11-19 15:19:12 +00:00
Colm 4e47293f9f
PG18: syntax & semantics behavior in Citus, part 1. (#8335)
PG18: syntax & semantics behavior in Citus, part 1.
    
Includes PG18 tests for:
  - OLD/NEW support in RETURNING clause of DML queries (PG commit 80feb727c); see the sketch below
  - WITHOUT OVERLAPS in PRIMARY KEY and UNIQUE constraints (PG commit fc0438b4e)
  - COLUMNS clause in JSON_TABLE (PG commit bb766cd)
  - Foreign tables created with LIKE <table> clause (PG commit 302cf1575)
  - Foreign Key constraint with PERIOD clause (PG commit 89f908a6d)
  - COPY command REJECT_LIMIT option (PG commit 4ac2a9bec)
  - COPY TABLE TO on a materialized view (PG commit 534874fac)

Partially addresses issue #8250
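
As a sketch of the first item above (table name hypothetical), the OLD/NEW RETURNING support can be exercised like this:

```sql
CREATE TABLE kv (id int, v int);
SELECT create_distributed_table('kv', 'id');
INSERT INTO kv VALUES (1, 10);

-- PG18 allows referencing OLD and NEW in the RETURNING clause of DML
UPDATE kv SET v = v + 1 WHERE id = 1
RETURNING old.v AS old_v, new.v AS new_v;
```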
2025-11-17 11:08:30 +00:00
Mehmet YILMAZ accd01fbf6
PG18 - add alternative out for multi_mx_hide_shard_names test (#8333)
fixes #8279

PostgreSQL 18 planner changes (probably AIO and updated cost model) make
sequential scans cheaper, so the psql `\d table`-style query that uses a
regex on `pg_class.relname` no longer chooses an index scan. This causes
a plan difference in the `mx_hide_shard_names` regression test.

This patch adds an alternative out for the multi_mx_hide_shard_names
test.
2025-11-14 08:55:45 +00:00
eaydingol cf533ebae9
Add breaking change detection for minor version upgrades (#8334)
This PR introduces infrastructure and validation to detect breaking
changes during Citus minor version upgrades, designed to run in release
branches only.

**Breaking change detection:**

- [GUCs] Detects removed GUCs and changes to default values
- [UDFs] Detects removed functions and function signature changes
-- Supports backward-compatible function overloading (new optional
parameters allowed)
- [types] Detects removed data types
- [tables/views] Detects removed tables/views and removed/changed
columns
- New make targets for minor version upgrade tests
- Follow-up PRs will add test schedules with different upgrade scenarios

The test will be enabled in release branches (e.g., release-13) via the
new test-citus-minor-upgrade job shown below. It will not run on the
main branch.

Testing
Verified locally with sample breaking changes:
`make check-citus-minor-upgrade-local citus-old-version=v13.2.0`

**Test case 1:** Backward-compatible signature change (allowed)
```
-- Old: CREATE FUNCTION pg_catalog.citus_blocking_pids(pBlockedPid integer)
-- New: CREATE FUNCTION pg_catalog.citus_blocking_pids(pBlockedPid integer, pBlockedByPid integer DEFAULT NULL)
```
No breaking change detected (new parameter has DEFAULT)

**Test case 2:** Incompatible signature change (breaking)
```
-- Old: CREATE FUNCTION pg_catalog.citus_blocking_pids(pBlockedPid integer)
-- New: CREATE FUNCTION pg_catalog.citus_blocking_pids(pBlockedPid integer, pBlockedByPid integer)
```
Breaking change detected:
`UDF signature removed: pg_catalog.citus_blocking_pids(pblockedpid
integer) RETURNS integer[]`

**Test case 3:** GUC changes (breaking)

- Removed `citus.max_worker_nodes_tracked`
- Changed default value of `citus.max_shared_pool_size` from 0 to 4

Breaking change detected:

```
The default value of GUC citus.max_shared_pool_size was changed from 0 to 4
GUC citus.max_worker_nodes_tracked was removed
```

**Test case 4:** Table/view changes

- Dropped `pg_catalog.pg_dist_rebalance_strategy` and removed a column
from `pg_catalog.citus_lock_waits`

```
  - Column blocking_nodeid in table/view pg_catalog.citus_lock_waits was removed
  - Table/view pg_catalog.pg_dist_rebalance_strategy was removed
```

**Test case 5:** Remove a custom type

- Dropped `cluster_clock` and the objects that depend on it. In addition to
the dependent objects, the test shows:
```
 - Type pg_catalog.cluster_clock was removed
```

Sample new job for build and test workflow (for release branches):

```
  test-citus-minor-upgrade:
    name: PG17 - check-citus-minor-upgrade
    runs-on: ubuntu-latest
    container:
      image: "${{ needs.params.outputs.citusupgrade_image_name }}:${{ fromJson(needs.params.outputs.pg17_version).full }}${{ needs.params.outputs.image_suffix }}"
      options: --user root
    needs:
    - params
    - build
    env:
      citus_version: 13.2
    steps:
    - uses: actions/checkout@v4
    - uses: "./.github/actions/setup_extension"
      with:
        skip_installation: true
    - name: Install and test citus minor version upgrade
      run: |-
        gosu circleci \
          make -C src/test/regress \
            check-citus-minor-upgrade \
            bindir=/usr/lib/postgresql/${PG_MAJOR}/bin \
            citus-pre-tar=/install-pg${PG_MAJOR}-citus${citus_version}.tar \
            citus-post-tar=${GITHUB_WORKSPACE}/install-$PG_MAJOR.tar;
    - uses: "./.github/actions/save_logs_and_results"
      if: always()
      with:
        folder: ${{ env.PG_MAJOR }}_citus_minor_upgrade
    - uses: "./.github/actions/upload_coverage"
      if: always()
      with:
        flags: ${{ env.PG_MAJOR }}_citus_minor_upgrade
        codecov_token: ${{ secrets.CODECOV_TOKEN }}
```
2025-11-14 06:48:35 +00:00
Mehmet YILMAZ 8bba66f207
Fix EXPLAIN output in regression tests for consistency (#8332) 2025-11-13 11:16:33 +03:00
Mehmet YILMAZ f80fa1c83b
PG18 - Adjust columnar path tests for PG18 OR clause optimization (#8337)
fixes #8264

PostgreSQL 18 introduced a planner improvement (commit `ae4569161`) that
rewrites simple `OR` equality clauses into `= ANY(...)` forms, allowing
the use of a single index scan instead of multiple scans or a custom
scan.
This change affects the columnar path tests where queries like `a=0 OR
a=5` previously chose a Columnar or Seq Scan plan.

In this PR:

* Updated test expectations for `uses_custom_scan` and `uses_seq_scan`
to reflect the new index scan plan.

This keeps the test output consistent with PostgreSQL 18’s updated
planner behavior.
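
For illustration (table and column names are placeholders, assuming an index on `a`), the planner behavior the tests now expect is roughly:

```sql
EXPLAIN (COSTS OFF)
SELECT * FROM columnar_table WHERE a = 0 OR a = 5;
-- On PG18 this tends to plan as a single index scan with
--   Index Cond: (a = ANY ('{0,5}'::integer[]))
-- where pre-PG18 typically chose a ColumnarScan or Seq Scan here.
```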
2025-11-13 09:32:21 +03:00
Mehmet YILMAZ 4244bc8516
PG18: Normalize verbose CREATE SUBSCRIPTION connect errors (#8326)
fixes #8317


0d8bd0a72e

PG18 changed the wording of connection failures during `CREATE
SUBSCRIPTION` to include a subscription prefix and a verbose “connection
to server … failed:” preamble. This breaks one regression output
(`multi_move_mx`). This PR adds normalization rules to map PG18 output
back to the prior form so results are stable across PG15–PG18.

**What changes**
Add two rules in `src/test/regress/bin/normalize.sed`:

```sed
# PG18: drop 'subscription "<name>"' prefix
# remove when PG18 is the minimum supported version
s/^[[:space:]]*ERROR:[[:space:]]+subscription "[^"]+" could not connect to the publisher:[[:space:]]*/ERROR: could not connect to the publisher: /I

# PG18: drop verbose 'connection to server … failed:' preamble
s/^[[:space:]]*ERROR:[[:space:]]+could not connect to the publisher:[[:space:]]*connection to server .* failed:[[:space:]]*/ERROR: could not connect to the publisher: /I
```

**Before (PG18)**

```
ERROR:  subscription "subs_01" could not connect to the publisher:
        connection to server at "localhost" (::1), port 57637 failed:
        root certificate file "/non/existing/certificate.crt" does not exist
```

**After normalization**

```
ERROR:  could not connect to the publisher:
        root certificate file "/non/existing/certificate.crt" does not exist
```

**Why**
Maintain identical regression outputs across supported PG versions while
Citus still supports PG<18.
2025-11-10 08:01:47 +00:00
Mehmet YILMAZ b2356f1c85
PG18: Make EXPLAIN ANALYZE output stable by routing through explain_filter and hiding footers (#8325)
PostgreSQL 18 adds a new line to text EXPLAIN with ANALYZE (`Index
Searches: N`). That extra line both creates noise and bumps psql’s `(N
rows)` footer. This PR keeps ANALYZE (so statements still execute) while
removing the version-specific churn in our regress outputs.

### What changed

* **Use `explain_filter(...)` instead of raw text EXPLAIN**

  * In `local_shard_execution.sql` and `local_shard_execution_replicated.sql`, replace direct:

    ```sql
    EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) <stmt>;
    ```

    with:

    ```sql
    \pset footer off
    SELECT public.explain_filter('EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) <stmt>');
    \pset footer on
    ```
  * Expected files updated accordingly to show the `explain_filter` output block instead of raw EXPLAIN text.

* **Extend `explain_filter` to drop the PG18 line**

  * Filter now removes any `Index Searches: <number>` line before normalizing numeric fields, preventing the “N” version of the same line from sneaking in.

* **Keep suite-wide normalizer intact**
2025-11-10 10:43:11 +03:00
manaldush daa69bec8f
Keep temp reloid for columnar cases (#8309)
Fixes https://github.com/citusdata/citus/issues/8235

PG18 and PG latest minors ignore temporary relations in
`RelidByRelfilenumber` (`RelidByRelfilenode` in PG15)
Relevant PG commit:
https://github.com/postgres/postgres/commit/86831952

Here we keep the temp reloids instead of looking them up with
RelidByRelfilenumber: in some cases we can get the reloid directly from
the relation, and in other cases we keep it in some structures.

Note: there is still an outstanding issue with columnar temp tables in
concurrent sessions, that will be fixed in PR
https://github.com/citusdata/citus/pull/8252
2025-11-06 23:15:52 +03:00
Colm bc41e7b94f
PG18: fix query results diff in merge regress test. (#8323)
The `merge` regress test uses SQL functions which can be cached in PG18+
since commit
[0dca5d68d](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0dca5d68d7bebf2c1036fd84875533afef6df992).
Distributed plan's copy function did not include the
`sourceResultRepartitionColumnIndex` field, which is critical for MERGE
queries, and for cached distributed plans this field was always 0
leading to the problem (#8285). Ensuring it is copied fixes it. This was
an oversight in Citus, and not specific to PG18.
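
A minimal sketch of the kind of fix (the surrounding copy function and macro usage in Citus may differ in detail):

```c
/* in the DistributedPlan copy function: also copy the field MERGE relies on */
COPY_SCALAR_FIELD(sourceResultRepartitionColumnIndex);
```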
2025-11-06 12:31:43 +00:00
Mehmet YILMAZ b10aa02908
PG18: Stabilize EXPLAIN output by disabling ANALYZE in index-scan test (#8322)
fixes #8318 

PostgreSQL 18 started printing an extra line in `EXPLAIN (ANALYZE …)`
for index scans:

```
Index Searches: N
```

This makes our test output flap (extra line, footer row count changes)
while the intent of the test is simply to **prove we plan an index
scan**, not to assert runtime counters.

### What this PR does

Changes
  ```sql
  EXPLAIN (... analyze on, ...)
  ```

  to

  ```sql
  EXPLAIN (... analyze off, ...)
  ```

### Why this approach

* **Minimal change**: keeps the test’s purpose (“we exercise an index
scan”) without introducing normalization rules
* **Version-stable**: avoids PG18’s new text while remaining valid on
PG15–PG18.
* **No behavioral impact**: we still validate the plan node is an Index
Scan; we just don’t request execution.

### Before → After (essence)

Before (PG18):

```
Aggregate
  ->  Index Scan ... (actual rows=1 loops=1)
        Index Cond: ...
        Index Searches: 1
```

After:

```
Aggregate
  ->  Index Scan ...
        Index Cond: ...
```
2025-11-05 13:53:36 +00:00
Mehmet YILMAZ b63572d72f
PG18 deparser: map Vars through JOIN aliases (fixes whole-row join column names) (#8300)
fixes #8278

Please check issue:
https://github.com/citusdata/citus/issues/8278#issuecomment-3431707484


f4e7756ef9

### What PG18 changed

The SELECT that creates the diff has a **named join**:

```sql
(...) AS unsupported_join (x,y,z,t,e,f,q)
```

On PG17, `COUNT(unsupported_join.*)` stayed as a single whole-row Var
that referenced the **join alias**.

On PG18, the parser expands that whole-row Var **early** into a
`ROW(...)` of **base** columns:

```
ROW(a.user_id, a.item_id, a.buy_count,
    b.id, b.it_name, b.k_no,
    c.id, c.it_name, c.k_no)
```

But since the join is *named*, inner aliases `a/b/c` are hidden.
Referencing them later blows up with
“invalid reference to FROM-clause entry for table ‘a’”.

### What this PR changes

1. **Retarget at `RowExpr` deparse (not in `get_variable`)**

   * In `get_rule_expr()`’s `T_RowExpr` branch, each element `e` of `ROW(...)` is examined.
   * If `e` unwraps to a simple, same-level `Var` (`varlevelsup == 0`, `varattno > 0`) and there is a **named `RTE_JOIN`** with `joinaliasvars`, we **do not** change `varno/varattno`.
   * Instead, we build a copy of the Var and set **`varnosyn/varattnosyn`** to the matching join alias column (from `joinaliasvars`).
   * Then we deparse that Var via `get_rule_expr_toplevel(...)`, which naturally prints `join_alias.colname`.
   * Scope is limited to **query deparsing** (`dpns->plan == NULL`), exactly where PG18 expands whole-row vars into `ROW(...)` of base Vars.

2. **Helpers (PG18-only file)**

   * `unwrap_simple_var(Node*)`: strips trivial wrappers (`RelabelType`, `CoerceToDomain`, `CollateExpr`) to reveal a `Var`.
   * `var_matches_base(const Var*, int varno, AttrNumber attno)`: matches canonical or synonym identity.
   * `dpns_has_named_join(const deparse_namespace*)`: fast precheck for any named join with `joinaliasvars`.
   * `map_var_through_join_alias(...)`: scans `joinaliasvars` to locate the **JOIN RTE index + attno** for a 1:1 alias; the caller uses these to set `varnosyn/varattnosyn`.

3. **Safety and non-goals**

   * **No effect on plan deparsing** (`dpns->plan != NULL`).
   * **No change to semantic identity**: we leave `varno/varattno` untouched; only set `varnosyn/varattnosyn`.
   * Skip whole-row/system columns (`attno <= 0`) and non-simple join columns (computed expressions).
   * Works with named joins **with or without** an explicit column list (we rely on `joinaliasvars`, not the alias collist).

### Reproducer

```sql
CREATE TABLE distributed_table(user_id int, item_id int, buy_count int);
CREATE TABLE reference_table(id int, it_name varchar(25), k_no int);
SELECT create_distributed_table('distributed_table', 'user_id');

SELECT COUNT(unsupported_join.*)
FROM (distributed_table a
      LEFT JOIN reference_table b ON true
      RIGHT JOIN reference_table c ON true)
     AS unsupported_join (x,y,z,t,e,f,q)
JOIN (reference_table d JOIN reference_table e ON true) ON true;
```

**Before (PG18):** deparser emitted `ROW(a.user_id, …)` → `ERROR:
invalid reference to FROM-clause entry for table "a"`
**After:** deparser emits
`ROW(unsupported_join.x, ..., unsupported_join.k_no)` → runs
successfully.

Now maps to `unsupported_join.<auto_col_names>` and runs.
2025-11-05 11:08:58 +00:00
Colm 7a7a0ba9c7
PG18: Fix missing Planning Fast Path Query DEBUG messages (#8320)
With PG18's GROUP RTE, queries that should have been eligible for fast
path planning were skipped because the fast path planner allows exactly
one range table only. This fix extends that to account for a GROUP RTE.
2025-11-04 12:33:21 +00:00
Colm 5a71f0d1ca
PG18: Print names in order in tables are not colocated error detail. (#8319)
Fixes #8275 by printing the names in order so that in every message
`DETAIL: x and y are not co-located` x precedes (or is lexicographically
less than) y.
2025-11-03 18:56:17 +00:00
Naisila Puka fa7ca79c6f
Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n) (#8240)
TupleDescAttr is available since PG11.
https://github.com/postgres/postgres/commit/2cd7084
PG18 simply forces you to use it, but there is no need to
guard it with a PG18 version check.
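
A before/after sketch of the mechanical change (variable names illustrative):

```c
/* before: direct array access, rejected by PG18 headers */
/* Form_pg_attribute attr = tupleDescriptor->attrs[columnIndex]; */

/* after: accessor macro, available since PG11 */
Form_pg_attribute attr = TupleDescAttr(tupleDescriptor, columnIndex);
```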
2025-11-03 21:39:11 +03:00
Naisila Puka 94653c1f4e
PG18 - Exclude child fk constraints from output (#8307)
Fixes #8280 
Similar to
https://github.com/citusdata/citus/commit/432b69e
2025-11-03 16:38:56 +03:00
Mehmet YILMAZ be2fcda071
PG18 - Normalize PG18 EXPLAIN: hide “Storage … Maximum Storage …” line (#8292)
fixes #8267 

* Extend `src/test/regress/bin/normalize.sed` to drop the new PG18
EXPLAIN instrumentation line:

  ```
  Storage: <Memory|Disk|Memory and Disk>  Maximum Storage: <size>
  ```

  which appears under `Materialize`, some `CTE Scan`s, etc. when `ANALYZE` is on.

**Why**

* PG18 added storage usage reporting for materialization/tuplestore
nodes. It’s useful for humans but creates noisy, non-semantic diffs in
regression output. There’s no EXPLAIN flag to suppress it, so we
normalize in tests instead. This PR wires that normalization into our
sed pipeline.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1eff8279d

**How**

* Add a narrowly scoped sed rule that matches only lines starting with
`Storage:` (keeping `Sort Method`, `Hash`, `Buffers`, etc. intact). Use
ERE compatible with `sed -Ef` and Python `re` (no POSIX character
classes), e.g.:

  ```
  /^[ \t]*Storage:[ \t].*$/d
  ```
2025-11-03 15:59:00 +03:00
Mehmet YILMAZ 61b491f0f4
PG18: Add _plan_json() test helper and switch columnar index tests to JSON plan checks (#8299)
fixes #8263

* Introduces `columnar_test_helpers._plan_json(q text) -> jsonb`, which
runs `EXPLAIN (FORMAT JSON, COSTS OFF, ANALYZE OFF)` and returns the
plan as `jsonb`. This lets us assert on plan structure instead of text
matching.

* Updates `columnar_indexes.sql` tests to detect whether any index-based
scan is used by searching the JSON plan’s `"Node Type"` (e.g., `Index
Scan`, `Index Only Scan`, `Bitmap Index Scan`).

## Notable changes

* New helper in `src/test/regress/sql/columnar_test_helpers.sql`:

  * `EXECUTE format('EXPLAIN (FORMAT JSON, COSTS OFF, ANALYZE OFF) %s', q) INTO j;`
  * Returns the `jsonb` plan for downstream assertions.
  
  ```sql
  SELECT NOT jsonb_path_exists(
      columnar_test_helpers._plan_json('SELECT b FROM columnar_table WHERE b = 30000'),
      '$[*].Plan.** ? (@."Node Type" like_regex "^(Index|Bitmap Index).*Scan$")'
  ) AS uses_no_index_scan;
  ```

  to verify no index scan occurs at the partial index boundary (`b = 30000`) and that an index scan is used where expected (`b = 30001`).

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-03 15:34:28 +03:00
Mehmet YILMAZ 6251eab9b7
PG18: Make SSL tests resilient & validate TLSv1.3 cipher config (#8298)
fixes #8277 


https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=45188c2ea

PostgreSQL 18 + newer OpenSSL builds surface `ssl_ciphers` as a **rule
string** (e.g., `HIGH:MEDIUM:+3DES:!aNULL`) instead of an expanded
cipher list. Our tests hard-pinned the literal list and started failing
on PG18. Also, with TLS 1.3 in the picture, we need to assert that
cipher configuration is sane without coupling to OpenSSL’s expansion.

**What changed**

* **sql/ssl_by_default.sql**

  * Replace brittle `SHOW ssl_ciphers` string matching with invariant checks (see the sketch below):

    * non-empty ciphers: `current_setting('ssl_ciphers') <> ''`
    * looks like a rule/list: `position(':' in current_setting('ssl_ciphers')) > 0`
  * Run the same checks on **workers** via `run_command_on_workers`.
  * Keep existing validations for `ssl=on`, `sslmode=require` in `citus.node_conninfo`, and `pg_stat_ssl.ssl = true`.


* **expected/ssl_by_default.out**

  * Update expected output to booleans for the new checks (less diff-prone across PG/SSL variants).
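
A sketch of the coordinator-side invariant checks (the worker-side variant wraps the same expressions in `run_command_on_workers`):

```sql
SELECT current_setting('ssl_ciphers') <> '' AS ssl_ciphers_nonempty,
       position(':' in current_setting('ssl_ciphers')) > 0 AS ssl_ciphers_looks_like_rule;
```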
2025-11-03 14:51:39 +03:00
Naisila Puka e0570baad6
PG18 - Remove REINDEX SYSTEM test because of aio debug lines (#8305)
Fixes #8268 

PG18 added core async I/O infrastructure. Relevant PG commit:
https://github.com/postgres/postgres/commit/da72269

In the `pg16.sql` test, we tested the `REINDEX SYSTEM` command, which would
later cause I/O debug lines in a command of the `multi_insert_select.sql`
test, which runs _after_ the `pg16.sql` test.

"This is expected because the I/O subsystem is repopulating its internal
callback pool after heavy catalog I/O churn caused by REINDEX SYSTEM." -
chatgpt (seems right 😄)

This PR removes the `REINDEX SYSTEM` test as it's not crucial anyway.
`REINDEX DATABASE` is sufficient for the purpose presented in the
`pg16.sql` test.
2025-11-01 19:36:16 +00:00
Ivan Kush 503a2aba73
Move PushActiveSnapshot outside a for loop (#8253)
PR [8142](https://github.com/citusdata/citus/pull/8142) added a
`PushActiveSnapshot` call.

This commit moves it outside the for loop, as taking a snapshot is a
resource-heavy operation.
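
A sketch of the shape of the change (the loop body and helper below are illustrative, not the exact Citus code):

```c
/* push the snapshot once instead of pushing/popping it on every iteration */
PushActiveSnapshot(GetTransactionSnapshot());

ListCell *taskCell = NULL;
foreach(taskCell, taskList)
{
    Task *task = (Task *) lfirst(taskCell);
    ProcessTask(task);   /* hypothetical per-task work */
}

PopActiveSnapshot();
```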
2025-10-31 21:20:12 +03:00
Vinod Sridharan 86010de733
Update GUC setting to not crash with ASAN (#8301)
The GUC configuration for SkipAdvisoryLockPermissionChecks had
misconfigured the settings for GUC_SUPERUSER_ONLY for PGC_SUSET - when
PostgreSQL is running with ASAN, this fails when querying pg_settings due
to exceeding the size of the GucContext_Names array. Fix up this GUC
declaration so it does not crash with ASAN.
2025-10-31 10:21:58 +03:00
Colm 458299035b
PG18: fix regress test failures in subquery_in_targetlist.
The failing queries all have a GROUP BY, and the fix teaches the Citus recursive planner how to handle a PG18 GROUP range table in the outer query:
- In recursive query planning, don't recurse into subquery expressions in a GROUP BY clause
- Flatten references to a GROUP rte before creating the worker subquery in pushdown planning
- If a PARAM node points to a GROUP rte then tunnel through to the underlying expression
    
Fixes #8296.
2025-10-30 11:49:28 +00:00
Mehmet YILMAZ 188c182be4
PG18 - Enable NUMA syscalls in CI containers to fix PG18 numa.out regression test failures (#8258)
fixes #8246

PostgreSQL 18 introduced stricter NUMA page-inquiry permissions for the
`pg_shmem_allocations_numa` view.
Without the required kernel capabilities, the test fails with:

```
ERROR:  failed NUMA pages inquiry status: Operation not permitted
```

This PR updates our test containers to include the necessary privileges:

* Adds `--cap-add=SYS_NICE` and `--security-opt seccomp=unconfined`

When PostgreSQL’s new NUMA views (`pg_shmem_allocations_numa`,
`pg_buffercache_numa`) run, they call `move_pages()` to ask the kernel
which NUMA node holds each shared memory page.


https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8cc139bec

That syscall (`move_pages()`) requires `CAP_SYS_NICE` when inspecting
another process.

So: `--cap-add=SYS_NICE` grants the container permission to perform that
NUMA page query.


https://man7.org/linux/man-pages/man2/move_pages.2.html#:~:text=must%20be%20privileged%0A%20%20%20%20%20%20%20%20%20%20(-,CAP_SYS_NICE,-)%20or%20the%20real


`--security-opt seccomp=unconfined`

Docker containers still run under a seccomp filter, which is a kernel-level
sandbox that blocks many system calls entirely for safety.
The default Docker seccomp profile blocks `move_pages()` outright,
because it can expose kernel memory layout information.


https://docs.docker.com/engine/security/seccomp/#:~:text=You%20can%20pass-,unconfined,-to%20run%20a


**In combination**

Both flags are required for NUMA introspection inside a container:
- `SYS_NICE` → permission
- `seccomp=unconfined` → ability
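
For reference, a hedged example of how such a container could be started (image name and remaining flags are placeholders):

```
docker run \
  --cap-add=SYS_NICE \
  --security-opt seccomp=unconfined \
  <test-image> ...
```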
2025-10-27 21:00:32 +03:00
Mehmet YILMAZ dba9379ea5
PG18: Normalize EXPLAIN ANALYZE output for drop “Index Searches” line (#8291)
fixes #8265


0fbceae841

PostgreSQL 18 started printing an extra line in `EXPLAIN (ANALYZE …)`
for index scans:

```
Index Searches: N
```

**normalize.sed**: add a rule to remove the PG18-only line

 ```
 /^\s*Index Searches:\s*\d+\s*$/d
 ```
2025-10-27 14:37:20 +03:00
Mehmet YILMAZ 785a87c659
PG18: adapt multi_subquery_misc expected output to SQL-function plan cache (#8289)
0dca5d68d7
fixes #8153 

```diff
/citus/src/test/regress/expected/multi_subquery_misc.out

 -- should error out
 SELECT sql_subquery_test(1,1);
-ERROR:  could not create distributed plan
-DETAIL:  Possibly this is caused by the use of parameters in SQL functions, which is not supported in Citus.
-HINT:  Consider using PL/pgSQL functions instead.
-CONTEXT:  SQL function "sql_subquery_test" statement 1
+ sql_subquery_test 
+-------------------
+               307
+(1 row)
+

```

PostgreSQL 18 changes planner behavior for inlining/parameter handling
in SQL functions (pg18 commit `0dca5d68d`). As a result, a query in
`multi_subquery_misc` that previously failed to create a distributed
plan now succeeds. This PR updates the regression **expected** file to
reflect the new outcome on PG18+.

### What changed

* Updated `src/test/regress/expected/multi_subquery_misc.out`:

  * Replaced the previous error block:

    ```
    ERROR:  could not create distributed plan
    DETAIL:  Possibly this is caused by the use of parameters in SQL functions, which is not supported in Citus.
    HINT:   Consider using PL/pgSQL functions instead.
    CONTEXT: SQL function "sql_subquery_test" statement 1
    ```
  * with the actual successful result on PG18+:

    ```
    sql_subquery_test
    --------------------------------------------
    307
    (1 row)
    ```

* **PG < 18:** Behavior remains unchanged; the test still errors on
older versions.
* **PG ≥ 18:** The call succeeds; the updated expected output matches
actual results.

similar PR: https://github.com/citusdata/citus/pull/8184
2025-10-24 15:48:08 +03:00
Mehmet YILMAZ 95477e6d02
PG18 - Add BUFFERS OFF to remaining EXPLAIN calls (#8288)
fixes #8093 


c2a4078eba

- Enable buffer-usage reporting by default in `EXPLAIN ANALYZE` on
PostgreSQL 18 and above.
- Introduce the explicit `BUFFERS OFF` option in every existing
regression test to maintain pre-PG18 output consistency.
- This appends `BUFFERS OFF` to all `EXPLAIN(...)` calls in
src/test/regress/sql and the corresponding .out files.
2025-10-24 15:09:49 +03:00
ibrahim halatci 5fc4cea1ce
pin PostgreSQL server development package version to 17 (#8286)
DESCRIPTION: pin PostgreSQL server development package version to 17,
rather than the full dev package, which now pulls in 18; Citus does not
yet support PG18.
2025-10-21 11:32:32 +03:00
Colm bf959de39e
PG18: Fix diffs in EXPLAINs introduced by PR #8242 in pg18 goldfile (#8262) 2025-10-19 21:20:16 +01:00
Onur Tirtir 90f2ab6648
Actually deprecate mark_tables_colocated() 2025-10-17 11:57:36 +00:00
Colm 3ca66e1fcc
PG18: Fix "Unrecognized range table id" in INSERT .. SELECT planning (#8256)
The error `Unrecognized range table id` seen in regress test
`insert_select_into_local_tables` is a consequence of the INSERT ..
SELECT planner getting confused by a SELECT query with a GROUP BY and
hence a Group RTE, introduced in PG18 (commit 247dea89f). The solution
is to flatten the relevant parts of the SELECT query before preparing
the INSERT .. SELECT query tree for use by Citus.
2025-10-17 11:21:25 +01:00
Colm 5d71fca3b4
PG18 regress sanity: disable `enable_self_join_elimination` on queries (#8242)
.. involving Citus tables. Interim fix for #8217 to achieve regress
sanity with PG18. A complete fix will follow with PG18 feature
integration.
2025-10-17 10:25:33 +01:00
naisila 76f18624e5 PG18 - use systable_inplace_update_* in UpdateStripeMetadataRow
PG18 has removed heap_inplace_update(), which is crucial for
citus_columnar extension because we always want to update
stripe entries for columnar in-place.
Relevant PG18 commit:
https://github.com/postgres/postgres/commit/a07e03f

heap_inplace_update() has been replaced by
heap_inplace_update_and_unlock, which is used inside
systable_inplace_update_finish, which is used together with
systable_inplace_update_begin. This change has been back-patched
up to v12, which is enough for us since the oldest version
Citus supports is v15.

In PG<18, a deprecated heap_inplace_update() is retained,
however, let's start using the new functions because they are
better, and such that we don't need to wrap these changes in
PG18 version conditionals.

Basically, in this commit we replace the following:

SysScanDesc scanDescriptor = systable_beginscan(columnarStripes,
indexId, indexOk, &dirtySnapshot, 2, scanKey);
heap_inplace_update(columnarStripes, modifiedTuple);

with the following:

systable_inplace_update_begin(columnarStripes, indexId, indexOk,
NULL, 2, scanKey, &tuple, &state);
systable_inplace_update_finish(state, tuple);

For more understanding, it's best to refer to an example:
REL_18_0/src/backend/catalog/toasting.c#L349-L371
of how systable_inplace_update_begin and
systable_inplace_update_finish are used in PG18, because they
mirror the need of citus columnar.

Fixes #8207
2025-10-17 11:11:08 +03:00
naisila abd50a0bb8 Revert "PG18 - Adapt columnar stripe metadata updates (#8030)"
This reverts commit 5d805eb10b.
heap_inplace_update was incorrectly replaced by
CatalogTupleUpdate in 5d805eb. In Citus, we assume a stripe
entry with some columns set to null means that a write
is in-progress, because otherwise we wouldn't see a such row.
But this breaks when we use CatalogTupleUpdate because it
inserts a new version of the row, which leaves the
in-progress version behind. Among other things, this was
causing various issues in PG18 - check-columnar test.
2025-10-17 11:11:08 +03:00
eaydingol aa0ac0af60
Citus upgrade tests (#8237)
Expand the citus upgrade tests matrix:
PG15: v11.1.0 v11.3.0 v12.1.10 
PG16: v12.1.10

See https://github.com/citusdata/the-process/pull/174
2025-10-15 15:28:44 +03:00
Naisila Puka 432b69eb9d
PG18 - fix naming diffs of child FK constraints (#8247)
PG18 changed the names generated for child foreign key constraints.
https://github.com/postgres/postgres/commit/3db61db48

The test failures in the Citus regression suite all change the name of
a constraint from `'sensors%'` to `'%to_parent%_1'`: the naming is very
nice here because `to_parent` means that we have a foreign key to a
parent table.

To fix the diff, we exclude those constraints from the output. To verify
correctness, we still count the problematic constraints to make sure
they are there - we are simply removing them from the first output (we
add this count query right after the previous one)

Fixes #8126

Co-authored-by: Mehmet YILMAZ <mehmety87@gmail.com>
2025-10-13 13:33:38 +03:00
Naisila Puka f1dd976a14
Fix vanilla tests with domain creation (#8238)
Qualify create domain stmt after local execution, to avoid such diffs in
PG vanilla tests:

```diff
create domain d_fail as anyelement;
-ERROR:  "anyelement" is not a valid base type for a domain
+ERROR:  "pg_catalog.anyelement" is not a valid base type for a domain
```

These tests were newly added in PG18, however this is not new PG18
behavior, just some added tests.
https://github.com/postgres/postgres/commit/0172b4c94

Fixes #8042
2025-10-10 15:34:32 +03:00
Naisila Puka 351cb2044d
PG18 - define some EXPLAIN funcs and structs only in PG17 (#8239)
PG18 changed the visibility of various Explain Serialize functions and
structs to `extern`. Previously, for PG17 support, these were `static`,
so we had to copy paste their definitions from `explain.c` to Citus's
`multi_explain.c`.
Relevant PG18 commits:
https://github.com/postgres/postgres/commit/555960a0
https://github.com/postgres/postgres/commit/77cb08be

Now we don't need to define the following anymore in Citus, since they
are extern in PG18:
- typedef struct SerializeMetrics
- void ExplainIndentText(ExplainState *es);
- SerializeMetrics GetSerializationMetrics(DestReceiver *dest);
- typedef struct SerializeDestReceiver (this is not extern, however it
is only used by GetSerializationMetrics function)

This was incorrectly handled in
https://github.com/citusdata/citus/commit/9e42f3f2c
by wrapping these definitions and usages in PG17 only,
causing such diffs in PG18 (not able to see serialization at all):
```diff
citus/src/test/regress/expected/pg17.out

select public.explain_filter('explain (analyze,
serialize binary,buffers,timing) select * from int8_tbl i8');
...
              Planning Time: N.N ms
-             Serialization: time=N.N ms  output=NkB  format=binary
              Execution Time: N.N ms
  Planning Time: N.N ms
  Serialization: time=N.N ms  output=NkB  format=binary
  Execution Time: N.N ms
-(14 rows)
+(13 rows)
```
2025-10-10 15:05:47 +03:00
Naisila Puka 287abea661
PG18 compatibility - varreturningtype additions (#8231)
This PR solves the following diffs, originating from the addition of
`varreturningtype` field to the `Var` struct in PG18:
https://github.com/postgres/postgres/commit/80feb727c

Previously we didn't account for this new field (as it's new), so this
wouldn't allow the parser to correctly reconstruct the `Var` node
structure, but rather it would error out with `did not find '}' at end
of input node`:

```diff
 SELECT column_to_column_name(logicalrelid, partkey)
 FROM pg_dist_partition WHERE partkey IS NOT NULL ORDER BY 1 LIMIT 1;
- column_to_column_name
----------------------------------------------------------------------
- a
-(1 row)
-
+ERROR:  did not find '}' at end of input node
```

Solution follows precedent https://github.com/citusdata/citus/pull/7107,
when varnullingrels field was added to the `Var` struct in PG16.

The solution includes:
- Taking care of the `partkey` in `pg_dist_partition` table because it's
coming from the `Var` struct. This mainly includes fixing the upgrade
script to PG18, by saving all the `partkey` infos before upgrading to
PG18 (in `citus_prepare_pg_upgrade`), and then re-generating `partkey`
columns in `pg_dist_partition` (using `UPDATE`) after upgrading to PG18
(in `citus_finish_pg_upgrade`).
- Adding a normalize rule to fix output differences among PG versions.
Note that we need two normalize lines: one for PG15 since it doesn't
have `varnullingrels`, and one for PG16/PG17.
- Small trick on `metadata_sync_helpers` to use different text when
generating the `partkey`, based on the PG version.

Fixes #8189
2025-10-09 17:35:03 +03:00
Naisila Puka f0014cf0df
PG18 compatibility: misc output diffs pt2 (#8234)
3 minor changes to reduce some noise from the regression diffs.

1 - Reduce verbosity when ALTER EXTENSION fails
PG18 has improved reporting of errors in extension script files
Relevant PG commit:
https://github.com/postgres/postgres/commit/774171c4f
There was more context in PG18, so reducing verbosity
```
ALTER EXTENSION citus UPDATE TO '11.0-1';
 ERROR:  cstore_fdw tables are deprecated as of Citus 11.0
 HINT:  Install Citus 10.2 and convert your cstore_fdw tables to the
        columnar access method before upgrading further
 CONTEXT:  PL/pgSQL function inline_code_block line 4 at RAISE
+SQL statement "DO LANGUAGE plpgsql
+$$
+BEGIN
+    IF EXISTS (SELECT 1 FROM pg_dist_shard where shardstorage = 'c') THEN
+     RAISE EXCEPTION 'cstore_fdw tables are deprecated as of Citus 11.0'
+        USING HINT = 'Install Citus 10.2 and convert your cstore_fdw tables
                       to the columnar access method before upgrading further';
+ END IF;
+END;
+$$"
+extension script file "citus--10.2-5--11.0-1.sql", near line 532
```

2 - Fix backend type order in tests for PG18
PG18 added another backend type which messed up the order
in this test.
Adding a separate IF condition for PG18
Relevant PG commit:
https://github.com/postgres/postgres/commit/18d67a8d7d

3 - Ignore "DEBUG: find_in_path" lines in output
Relevant PG commit:
https://github.com/postgres/postgres/commit/4f7f7b0375
The new GUC extension_control_path specifies a path to look for
extension control files.
2025-10-09 16:50:41 +03:00
Naisila Puka d9652bf5f9
PG18 compatibility: misc output diffs (#8233)
6 minor changes to reduce some noise from the regression diffs.

1 - Add ORDER BY to fix subquery_in_where diff

2 - Disable buffers in explain analyze calls
Leftover work from
https://github.com/citusdata/citus/commit/f1f0b09f7

3 - Reduce verbosity to avoid diffs between PG versions
Relevant PG commit:
https://github.com/postgres/postgres/commit/0dca5d68d7
diff was:
```
CALL test_procedure_commit(2,5);
 ERROR:  COMMIT is not allowed in an SQL function
-CONTEXT:  SQL function "test_procedure_commit" during startup
+CONTEXT:  SQL function "test_procedure_commit" statement 2
```

4 - Rename array_sort to array_sort_citus since PG18 added array_sort
Relevant PG commit:
https://github.com/postgres/postgres/commit/6c12ae09f5a
Diff we were seeing in multi_array_agg, because the PG18 test was using
PG18's array_sort function instead:
```
-- Check that we return NULL in case there are no input rows to array_agg()
 SELECT array_sort(array_agg(l_orderkey))
     FROM lineitem WHERE l_orderkey < 0;
  array_sort
 ------------
- {}
+
 (1 row)
```

5 - Exclude not-null constraints from output to avoid diffs
PG18 has added pg_constraint rows for not-null constraints
Relevant PG commit
https://github.com/postgres/postgres/commit/14e87ffa5c
Remove them by condition contype <> 'n'

6 - Reduce verbosity to avoid md5 pwd deprecation warning in PG18
PG18 has deprecated MD5 passwords
Relevant PG commit:
https://github.com/postgres/postgres/commit/db6a4a985

Fixes #8154 
Fixes #8157
2025-10-08 13:23:55 +03:00
Naisila Puka 77d5807fd6
Update changelog with 12.1.9, 12.1.10, 13.0.5, 13.1.1 entries (#8224) 2025-10-07 22:33:12 +03:00
Naisila Puka 2a6414c727
PG18: use data-checksums by default in upgrades (#8230)
Checksums are now on by default in PG18: 04bec894a

Upgrade to PG18 fails with the following error:
`old cluster does not use data checksums but the new one does`

To overcome this error, we add the --data-checksums option so that
clusters with PG less than 18 also use data checksums.

Fixes #8229
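
Concretely, the pre-PG18 test clusters are now initialized with checksums enabled, roughly along the lines of (data directory and remaining options elided):

```
initdb --data-checksums -D <old_cluster_data_dir> ...
```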
2025-10-07 12:17:08 +03:00
Naisila Puka c5dde4b115
Fix crash on create statistics with non-RangeVar type pt2 (#8227)
Fixes #8225 
very similar to #8213 
Also the error message changed between pg18rc1 and pg18.0
2025-10-07 11:56:20 +03:00
Naisila Puka 5a3648b2cb
PG18: fix yet another unregistered snapshot crash (#8228)
PG18 added an assertion that a snapshot is active or registered before
it's used. Relevant PG commit
8076c00592

Fixes #8209
2025-10-07 11:31:41 +03:00
Mehmet YILMAZ d4dfdd765b
PG18 - Normalize \d+ output in PG18 by filtering “Not-null constraints” blocks (#8183)
DESCRIPTION: Normalize \d+ output in PG18 by filtering “Not-null
constraints” blocks
fixes #8095 


**PR Description**
Postgres 18 started representing column `NOT NULL` as named constraints
in `pg_constraint`, and `psql \d+` now prints them under a `Not-null
constraints:` section. This caused extra diffs in our regression tests.

14e87ffa5c

This PR updates the normalization rules to strip those sections during
diff filtering by adding two regex rules (sketched below):

* remove the `Not-null constraints:` header
* remove any indented constraint lines ending in `_not_null`
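
A rough approximation of what those two rules look like in `normalize.sed` (patterns illustrative, not copied verbatim from the file):

```sed
# drop the "Not-null constraints:" header that PG18's \d+ prints
/^Not-null constraints:$/d
# drop the indented constraint lines generated for NOT NULL columns
/^[[:space:]]+".*_not_null".*$/d
```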
2025-10-02 13:48:27 +03:00
Mehmet YILMAZ cec1848b13
PG18: adapt multi_sql_function expected output to SQL-function plan cache (#8184)
0dca5d68d7
fixes #8153 

```diff
diff -dU10 -w /__w/citus/citus/src/test/regress/expected/multi_sql_function.out /__w/citus/citus/src/test/regress/results/multi_sql_function.out
--- /__w/citus/citus/src/test/regress/expected/multi_sql_function.out.modified	2025-08-25 12:43:24.373634581 +0000
+++ /__w/citus/citus/src/test/regress/results/multi_sql_function.out.modified	2025-08-25 12:43:24.383634533 +0000
@@ -317,24 +317,25 @@
 $$ LANGUAGE SQL STABLE;
 INSERT INTO test_parameterized_sql VALUES(1, 1);
 -- all of them should fail
 SELECT * FROM test_parameterized_sql_function(1);
 ERROR:  cannot perform distributed planning on this query because parameterized queries for SQL functions referencing distributed tables are not supported
 HINT:  Consider using PL/pgSQL functions instead.
 SELECT (SELECT 1 FROM test_parameterized_sql limit 1) FROM test_parameterized_sql_function(1);
 ERROR:  cannot perform distributed planning on this query because parameterized queries for SQL functions referencing distributed tables are not supported
 HINT:  Consider using PL/pgSQL functions instead.
 SELECT test_parameterized_sql_function_in_subquery_where(1);
-ERROR:  could not create distributed plan
-DETAIL:  Possibly this is caused by the use of parameters in SQL functions, which is not supported in Citus.
-HINT:  Consider using PL/pgSQL functions instead.
-CONTEXT:  SQL function "test_parameterized_sql_function_in_subquery_where" statement 1
+ test_parameterized_sql_function_in_subquery_where 
+---------------------------------------------------
+                                                 1
+(1 row)
+
```

PG18 allows custom vs. generic plans for SQL functions; arguments can be
folded to consts, enabling more rewrites/optimizations (and in our
case, routable Citus plans).

It seems that PG18 rewrote how LANGUAGE SQL functions are planned/executed:
they now go through the plan cache (like PL/pgSQL does) and can produce
custom plans with the function arguments substituted as constants. That
means the call
SELECT test_parameterized_sql_function_in_subquery_where(1);
is planned with org_id_val = 1 baked in, so Citus no longer sees an
unresolved Param inside the function body and is able to build a
distributed plan instead of tripping the old “params in SQL functions”
error path.


**What’s in here**
- Update `expected/multi_sql_function.out` to reflect PG18 behavior
- Add `expected/multi_sql_function_0.out` as an alternate expected file
that retains the pre-PG18 error output for the same test
2025-10-01 16:03:21 +03:00
Naisila Puka bb840e58a7
Fix crash on create statistics with non-RangeVar type (#8213)
This crash has been there for a while but wasn't tested before pg18.

PG18 added this test:
CREATE STATISTICS tst ON a FROM (VALUES (x)) AS foo;

which tries to create statistics on a derived-on-the-fly table (which is
not allowed). However, Citus assumed we always have a valid table when
intercepting the CREATE STATISTICS command to check for Citus tables.
Added a check to return early if needed.
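A minimal sketch of the kind of guard this adds (the helper name is made
up; the real change lives in Citus' CREATE STATISTICS preprocessing):

```c
/*
 * Illustrative sketch: only a plain relation can be a Citus table, so
 * return early when CREATE STATISTICS targets anything else, e.g. a
 * derived table such as "(VALUES (x)) AS foo". Helper name is hypothetical.
 */
static bool
CreateStatsStmtTargetsPlainRelation(CreateStatsStmt *createStatsStmt)
{
	if (list_length(createStatsStmt->relations) != 1)
	{
		return false;
	}

	Node *relationNode = (Node *) linitial(createStatsStmt->relations);

	return IsA(relationNode, RangeVar);
}
```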

pg18 commit: https://github.com/postgres/postgres/commit/3eea4dc2c

Fixes #8212
2025-10-01 00:09:11 +03:00
Onur Tirtir 5eb1d93be1
Properly detect no-op shard-key updates via UPDATE / MERGE (#8214)
DESCRIPTION: Fixes a bug that causes allowing UPDATE / MERGE queries
that may change the distribution column value.

Fixes: #8087.

Probably as of #769, we were not properly checking if UPDATE
may change the distribution column.

In #769, we had these checks:
```c
	if (targetEntry->resno != column->varattno)
	{
		/* target entry of the form SET some_other_col = <x> */
		isColumnValueChanged = false;
	}
	else if (IsA(setExpr, Var))
	{
		Var *newValue = (Var *) setExpr;
		if (newValue->varattno == column->varattno)
		{
			/* target entry of the form SET col = table.col */
			isColumnValueChanged = false;
		}
	}
```

However, what we check in the "if" and in the "else if" is not so
different: both attempt to verify whether the SET expr of the target
entry points to the attno of the given column. So, in #5220, we even
removed the first check because it was redundant. Also see this PR
comment from #5220:
https://github.com/citusdata/citus/pull/5220#discussion_r699230597.
In #769, we probably wanted to first check whether both the SET expr
of the target entry and the given variable point to the same range
var entry, but this wasn't what the "if" was checking, so it was
removed.

As a result, in the cases mentioned in the linked issue, we were
incorrectly concluding that the SET expr of the target entry won't
change the given column just because it points to the same attno as
the given variable, regardless of which range var entries the column
and the SET expr point to. Then we also started using the same
function to check for such cases for the update action of MERGE, so
we had the same bug there as well.

So with this PR, we properly check for such cases by also comparing
varno in TargetEntryChangesValue(). However, some of the existing
tests then started failing where the SET expr doesn't directly assign
the column to itself but the "where" clause could actually imply that
the distribution column won't change. Even before, we were not
attempting to verify whether the "where" clause quals could imply a
no-op assignment for the SET expr in such cases, but that was not a
problem. This is because, in most cases, we were qualifying such SET
expressions as a no-op update as long as the SET expr's attno is the
same as the given column's. For this reason, to prevent regressions,
this PR also adds extra logic to understand whether the "where" clause
quals imply that the SET expr for the distribution key is a no-op.
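Conceptually, the stricter check compares the whole Var identity rather
than only the attribute number; a simplified sketch (not the exact body
of TargetEntryChangesValue()):

```c
/*
 * Simplified sketch: SET col = <expr> only leaves the column unchanged
 * when <expr> is a Var pointing to the *same* range table entry (varno)
 * and the same attribute (varattno) as the target column.
 */
if (IsA(setExpr, Var))
{
	Var *newValue = (Var *) setExpr;

	if (newValue->varno == column->varno &&
		newValue->varattno == column->varattno)
	{
		/* target entry of the form SET col = <same rel>.col */
		isColumnValueChanged = false;
	}
}
```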

Ideally, we should instead use the "relation restriction equivalence"
mechanism to understand if the "where" clause implies a no-op
update. This is because, for instance, right now we're not able to
deduce that the update is a no-op when the "where" clause transitively
implies a no-op update, as in the case where we're setting "column a"
to "column c" and the where clause looks like:
  "column a = column b AND column b = column c".
If this turns out to be a regression for some users, we can consider
doing it that way. Until then, as a workaround, we can suggest adding
additional quals to the "where" clause that would directly imply
equivalence.

Also, after fixing TargetEntryChangesValue(), we started successfully
deducing that the update action is a no-op for such MERGE queries:
```sql
MERGE INTO dist_1
USING dist_1 src
ON (dist_1.a = src.b)
WHEN MATCHED THEN UPDATE SET a = src.b;
```
However, we then started seeing the below error for the above query even
though the update is now qualified as a no-op update:
```
ERROR:  Unexpected column index of the source list
```
This was because of #8180, and #8201 fixed that.

In summary, with this PR:

* We disallow such queries,
  ```sql
  -- attno for dist_1.a, dist_1.b: 1, 2
  -- attno for dist_different_order_1.a, dist_different_order_1.b: 2, 1
  UPDATE dist_1 SET a = dist_different_order_1.b
  FROM dist_different_order_1
  WHERE dist_1.a = dist_different_order_1.a;

  -- attno for dist_1.a, dist_1.b: 1, 2
  -- but ON (..) doesn't imply a no-op update for SET expr
  MERGE INTO dist_1
  USING dist_1 src
  ON (dist_1.a = src.b)
  WHEN MATCHED THEN UPDATE SET a = src.a;
  ```

* .. and allow such queries,
  ```sql
  MERGE INTO dist_1
  USING dist_1 src
  ON (dist_1.a = src.b)
  WHEN MATCHED THEN UPDATE SET a = src.b;
  ```
2025-09-30 10:13:47 +00:00
Naisila Puka de045402f3
PG18 - register snapshot where needed (#8196)
Register and push snapshots as needed per the relevant PG18 commits

8076c00592
https://github.com/postgres/postgres/commit/706054b

`citus_split_shard_columnar_partitioned`, `multi_partitioning` tests are
handled.
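For context, the usual PostgreSQL pattern for registering and pushing a
snapshot around such code looks roughly like this (a generic sketch, not
the exact call sites touched by this PR):

```c
/* generic pattern: make sure a registered snapshot is pushed as active */
Snapshot snapshot = RegisterSnapshot(GetTransactionSnapshot());
PushActiveSnapshot(snapshot);

/* ... code that scans catalogs / partitions and relies on a snapshot ... */

PopActiveSnapshot();
UnregisterSnapshot(snapshot);
```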

Fixes #8195
2025-09-26 18:04:34 +03:00
Colm McHugh 81776fe190 Fix crash in Range Table identity check.
The range table entry array created by the Postgres planner for each
SELECT in a query may have NULL entries as of PG18. Add a NULL check
to skip over these when looking for matches in rte identities.
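A sketch of the added guard, assuming the lookup iterates an RTE array
such as PlannerInfo's simple_rte_array (illustrative only):

```c
for (int rteIndex = 1; rteIndex < plannerInfo->simple_rel_array_size; rteIndex++)
{
	RangeTblEntry *rangeTableEntry = plannerInfo->simple_rte_array[rteIndex];

	if (rangeTableEntry == NULL)
	{
		/* the PG18 planner may leave NULL slots in the array; skip them */
		continue;
	}

	/* ... existing rte-identity matching logic ... */
}
```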
2025-09-26 14:53:30 +01:00
Colm 80945212ae
PG18 regress sanity: update pg18 ruleutils with fix #7675 (#8216)
Fix deparsing of UPDATE statements with indirection (#7675) involved
changing the ruleutils of our supported Postgres versions. This means
that when integrating a new Postgres version we need to update its
ruleutils with the relevant parts of #7675; basically, PG ruleutils
needs to call the `citus_ruleutils.c` functions added by #7675.
2025-09-26 13:19:47 +01:00
Onur Tirtir 83b25e1fb1
Fix unexpected column index error for repartitioned merge (#8201)
DESCRIPTION: Fixes a bug that causes an unexpected error when executing
repartitioned merge.

Fixes #8180.

This was happening because of a bug in
SourceResultPartitionColumnIndex(). To fix it, this PR avoids using
DistributionColumnIndex() in SourceResultPartitionColumnIndex().
Instead, it invents FindTargetListEntryWithVarExprAttno(), which finds
the index of the target entry in the source query's target list that
can be used to repartition the source for a repartitioned merge. In
short, to find the source target entry that references the Var used in
the ON (..) clause and that references the source rte, we should check
the varattno of the underlying expr, which presumably is always a Var
for a repartitioned merge as we always wrap the source rte with a
subquery, where all target entries point to the columns of the original
source relation.

Using DistributionColumnIndex() prior to 13.0 wasn't causing such an
issue because prior to 13.0, the varattno of the underlying expr of
the source target entries was almost (*1) always equal to the resno of
the target entry, as we were including all target entries of the source
relation. However, starting with #7659, which was merged to main before
13.0, we started using CreateFilteredTargetListForRelation() instead of
CreateAllTargetListForRelation() to compute the target entry list for
the source rte, to fix another bug. We cannot revert to using
CreateAllTargetListForRelation() because that would re-introduce the
bug it helped fix, so we instead had to find a way to properly deal
with the "filtered target list"s, as in this commit. Plus (*1), even
before #7659, we would probably still fail when the source relation has
dropped attributes or such, because that would probably also cause such
a mismatch between the varattno of the underlying expr of the target
entry and its resno.
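A simplified sketch of what such a lookup does (helper name and details
are illustrative, not the exact new function):

```c
/*
 * Illustrative sketch: return the 0-based index of the first target entry
 * whose expression is a Var with the given varattno, or -1 if none.
 */
static int
FindSourceTargetEntryIndex(List *sourceTargetList, AttrNumber sourceVarAttno)
{
	int targetEntryIndex = 0;

	ListCell *targetEntryCell = NULL;
	foreach(targetEntryCell, sourceTargetList)
	{
		TargetEntry *targetEntry = (TargetEntry *) lfirst(targetEntryCell);

		if (IsA(targetEntry->expr, Var) &&
			((Var *) targetEntry->expr)->varattno == sourceVarAttno)
		{
			return targetEntryIndex;
		}

		targetEntryIndex++;
	}

	return -1;
}
```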
2025-09-23 11:17:51 +00:00
Colm b5e70f56ab Postgres 18: Fix regress tests caused by GROUP RTE. (#8206)
The change in `merge_planner.c` fixes _unrecognized range table entry_
diffs in merge regress tests (category 2 diffs in #7992), and the change
in `multi_router_planner.c` fixes _column reference ... is ambiguous_
diffs in `multi_insert_select` and `multi_insert_select_window`
(category 3 diffs in #7992). The edit to `common.py` enables standalone
regress tests with pg18 (e.g. `citus_tests/run_test.py merge`).
2025-09-22 16:13:59 +03:00
Colm d2ea4043d4 Postgres 18: fix 'column does not exist' errors in grouping regress tests. (#8199)
DESCRIPTION: Fix 'column does not exist' errors in grouping regress
tests.

Postgres 18's GROUP RTE was being ignored by query pushdown planning
when constructing the query tree for the worker subquery. The solution
is straightforward - ensure the worker subquery tree has the same
groupRTE property as the original query. Postgres ruleutils then does
the right thing when generating the pushed down query. Fixes category 1
in #7992.
2025-09-22 16:13:59 +03:00
Mehmet YILMAZ 10d62d50ea
Stabilize table_checks across PG15–PG18: switch to pg_constraint, remove dupes, exclude NOT NULL (#8140)
DESCRIPTION: Stabilize table_checks across PG15–PG18: switch to
pg_constraint, remove dupes, exclude NOT NULL

fixes #8138
fixes #8131 

**Problem**

```diff
diff -dU10 -w /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out
--- /__w/citus/citus/src/test/regress/expected/multi_create_table_constraints.out.modified	2025-08-18 12:26:51.991598284 +0000
+++ /__w/citus/citus/src/test/regress/results/multi_create_table_constraints.out.modified	2025-08-18 12:26:52.004598519 +0000
@@ -403,22 +403,30 @@
     relid = 'check_example_partition_col_key_365068'::regclass;
     Column     |  Type   |  Definition   
 ---------------+---------+---------------
  partition_col | integer | partition_col
 (1 row)
 
 SELECT "Constraint", "Definition" FROM table_checks WHERE relid='public.check_example_365068'::regclass;
              Constraint              |            Definition             
 -------------------------------------+-----------------------------------
  check_example_other_col_check       | CHECK other_col >= 100
+ check_example_other_col_check       | CHECK other_col >= 100
+ check_example_other_col_check       | CHECK other_col >= 100
+ check_example_other_col_check       | CHECK other_col >= 100
+ check_example_other_col_check       | CHECK other_col >= 100
  check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
-(2 rows)
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+ check_example_other_other_col_check | CHECK abs(other_other_col) >= 100
+(10 rows)
```

On PostgreSQL 18, `NOT NULL` is represented as a cataloged constraint
and surfaces through `information_schema.check_constraints`.
14e87ffa5c
Our helper view `table_checks` (built on
`information_schema.check_constraints` + `constraint_column_usage`)
started returning:

* Extra `…_not_null` rows (noise for our tests)
* Duplicate rows for real CHECKs due to the one-to-many join via
`constraint_column_usage`
* Occasional literal formatting differences (e.g., dates) coming from
the information\_schema deparser

### What changed

1. **Rewrite `table_checks` to use system catalogs directly**
We now select only expression-based, table-level constraints—excluding
NOT NULL—by filtering on `contype <> 'n'` and requiring `conbin IS NOT
NULL`. This yields the same effective set as real CHECKs while remaining
future-proof against non-CHECK constraint types.

```sql
CREATE OR REPLACE VIEW table_checks AS
SELECT
  c.conname AS "Constraint",
  'CHECK ' ||
  -- drop a single pair of outer parens if the deparser adds them
  regexp_replace(pg_get_expr(c.conbin, c.conrelid, true), '^\((.*)\)$', '\1')
    AS "Definition",
  c.conrelid AS relid
FROM pg_catalog.pg_constraint AS c
WHERE c.contype <> 'n'         -- drop NOT NULL (PG18)
  AND c.conbin IS NOT NULL     -- only expression-bearing constraints (i.e., CHECKs)
  AND c.conrelid <> 0          -- table-level only (exclude domains)
ORDER BY "Constraint", "Definition";
```

Why this filter?

* `contype <> 'n'` excludes PG18’s NOT NULL rows.
* `conbin IS NOT NULL` restricts to expression-backed constraints
(CHECKs); PK/UNIQUE/FK/EXCLUSION don’t have `conbin`.
* `conrelid <> 0` removes domain constraints.

2. **Add a PG18-specific regression test for `contype = 'n'`**
   New test (`pg18_not_null_constraints`) verifies:

* Coordinator tables have `n` rows for NOT NULL (columns `a`, `c`),
* A worker shard has matching `n` rows,
* Dropping a NOT NULL on the coordinator propagates to shards (count
goes from 2 → 1),
* `table_checks` *never* reports NOT NULL, but does report a real CHECK
added for the test.

---

### Why this works (PG15–PG18)

* **Stable source of truth:** Directly reads `pg_constraint` instead of
`information_schema`.
* **No duplicates:** Eliminates the `constraint_column_usage` join,
removing multiplicity.
* **No NOT NULL noise:** PG18’s `contype = 'n'` is filtered out by
design.
* **Deterministic text:** Uses `pg_get_expr` and strips a single outer
set of parentheses for consistent output.

---

### Impact on tests

* Removes spurious `…_not_null` entries and duplicate `checky_…` rows
(e.g., in `multi_name_lengths` and similar).
* Existing expected files stabilize without adding brittle
normalizations.
* New PG18 test asserts correct catalog behavior and Citus propagation
while remaining a no-op on earlier PG versions.

---
2025-09-22 15:50:32 +03:00
Naisila Puka b4cb1a94e9
Bump citus and citus_columnar to 14.0devel (#8170) 2025-09-19 12:54:55 +03:00
Naisila Puka becc02b398
Cleanup from dropping pg14 in merge isolation tests (#8204)
These alternative test outputs are redundant since we have dropped PG14
support on main.
2025-09-19 12:01:29 +03:00
eaydingol 360fbe3b99
Technical document update for outer join pushdown (#8200)
Outer join pushdown entry and an example.
2025-09-17 17:01:45 +03:00
Mehmet YILMAZ b58af1c8d5
PG18: stabilize constraint-name tests by filtering pg_constraint on contype (#8185)
14e87ffa5c

PostgreSQL 18 now records column `NOT NULL` constraints in
`pg_constraint` (`contype = 'n'`). That means queries that previously
listed “all constraints” for a relation now return extra rows, causing
noisy diffs in Citus regression tests. This PR narrows each catalog
probe to the specific constraint type under test
(PK/UNIQUE/EXCLUDE/CHECK), keeping results stable across PG15–PG18.

## What changed

* Update
`src/test/regress/sql/multi_alter_table_add_constraints_without_name.sql`
to:

* Add `AND con.contype IN ('p'|'u'|'x'|'c')` in each query, matching the
constraint just created.
  * Join namespace via `rel.relnamespace` for robustness.
* Refresh
`src/test/regress/expected/multi_alter_table_add_constraints_without_name.out`
to reflect the filtered results.

## Why

* PG18 adds named `NOT NULL` entries to `pg_constraint`, which
previously lived only in `pg_attribute`. Tests that select from
`pg_constraint` without filtering now see extra rows (e.g.,
`*_not_null`), breaking expectations. Filtering by `contype` validates
exactly what the test intends (PK/UNIQUE/EXCLUDE/CHECK
naming/propagation) and ignores unrelated `NOT NULL` rows.



```diff
diff -dU10 -w /__w/citus/citus/src/test/regress/expected/multi_alter_table_add_constraints_without_name.out /__w/citus/citus/src/test/regress/results/multi_alter_table_add_constraints_without_name.out
--- /__w/citus/citus/src/test/regress/expected/multi_alter_table_add_constraints_without_name.out.modified	2025-09-11 14:36:52.521254512 +0000
+++ /__w/citus/citus/src/test/regress/results/multi_alter_table_add_constraints_without_name.out.modified	2025-09-11 14:36:52.549254440 +0000
@@ -20,34 +20,36 @@
 
 ALTER TABLE AT_AddConstNoName.products ADD PRIMARY KEY(product_no);
 SELECT con.conname
     FROM pg_catalog.pg_constraint con
       INNER JOIN pg_catalog.pg_class rel ON rel.oid = con.conrelid
       INNER JOIN pg_catalog.pg_namespace nsp ON nsp.oid = connamespace
 	      WHERE rel.relname = 'products';
            conname            
 ------------------------------
  products_pkey
-(1 row)
+ products_product_no_not_null
+(2 rows)
 
 -- Check that the primary key name created on the coordinator is sent to workers and
 -- the constraints created for the shard tables conform to the <conname>_shardid naming scheme.
 \c - - :public_worker_1_host :worker_1_port
 SELECT con.conname
     FROM pg_catalog.pg_constraint con
       INNER JOIN pg_catalog.pg_class rel ON rel.oid = con.conrelid
       INNER JOIN pg_catalog.pg_namespace nsp ON nsp.oid = connamespace
 		WHERE rel.relname = 'products_5410000';
                conname                
 --------------------------------------
+ products_5410000_product_no_not_null
  products_pkey_5410000
-(1 row)
+(2 rows)
```

after pr:
https://github.com/citusdata/citus/actions/runs/17697415668/job/50298622183#step:5:265
2025-09-17 14:12:15 +03:00
Mehmet YILMAZ 4012e5938a
PG18 - normalize PG18 “RESTRICT” FK error wording to legacy form (#8188)
fixes #8186


086c84b23d

PG18 started emitting a more specific message for foreign-key violations when
the action is `RESTRICT` (SQLSTATE 23001), e.g.
`violates RESTRICT setting of foreign key constraint ...` and `Key (...)
is referenced from table ...`.
Older versions printed the generic FK text (SQLSTATE 23503), e.g.
`violates foreign key constraint ...` and `Key (...) is still referenced
from table ...`.

This change was causing noisy diffs in our regression tests (e.g.,
`multi_foreign_key.out`).
To keep a single set of expected files across PG15–PG18, this PR adds
two normalization rules to the test filter:

```sed
# PG18 FK wording -> legacy generic form
s/violates RESTRICT setting of foreign key constraint/violates foreign key constraint/g

# DETAIL line: "is referenced" -> "is still referenced"
s/\<is referenced from table\>/is still referenced from table/g
```

**Scope / impact**

* Test-only change; runtime behavior is unaffected.
* Keeps outputs stable across PG15–PG18 without version-splitting
expected files.
* Rules are narrowly targeted to the FK wording introduced in PG18.

with pr:
https://github.com/citusdata/citus/actions/runs/17698469722/job/50300960878#step:5:252
2025-09-17 10:46:36 +03:00
Mehmet YILMAZ 8bb8b2ce2d
Remove Code Climate coverage upload steps from GitHub Actions workflow (#8182)
DESCRIPTION: Remove Code Climate coverage upload steps from GitHub
Actions workflow

CI: remove Code Climate coverage reporting (cc-test-reporter) and
related jobs; keep Codecov as source of truth

* **Why**
Code Climate’s test-reporter has been archived; their download/API path
is no longer served, which breaks our CC upload step (`cc-test-reporter
…` ends up downloading HTML/404).

* **What changed**

* Drop the Code Climate formatting/artifact steps from the composite
action `.github/actions/upload_coverage/action.yml`.
* Delete the `upload-coverage` job that aggregated and pushed to Code
Climate (`cc-test-reporter sum-coverage` / `upload-coverage`).


* **Impact**

  * Codecov uploads remain; coverage stays visible via Codecov.
  * No test/build behavior change—only removes a failing reporter path.
2025-09-15 13:53:35 +03:00
Colm b7bfe42f1a
Document delayed fast path planning in README (#8176)
Added detailed explanation of delayed fast path planning in Citus 13.2,
including conditions and processes involved.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2025-09-09 11:54:13 +01:00
Onur Tirtir 0c658b73fc
Fix an assertion failure in Citus maintenance daemon that can happen in very slow systems (#8158)
Fixes #5808.

DESCRIPTION: Fixes an assertion failure in Citus maintenance daemon that
can happen in very slow systems.

Try running `make -C src/test/regress/ check-multi-1-vg` - while the
tests will exit with code 2 at least 50% of the time in the very early
stages of the test suite by producing a core dump on main, that won't be
the case on this branch, at least based on my trials :)
2025-09-04 12:13:57 +00:00
manaldush 2834fa26c9
Fix an undefined behavior for bit shift in citus_stat_tenants.c (#7954)
DESCRIPTION: Fixes an undefined behavior that could happen when
computing tenant score for citus_stat_tenants

Add a check for the shift size and reset the result to zero when the shift would overflow
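A minimal sketch of the guard (hypothetical helper; the actual variable
names in citus_stat_tenants.c differ):

```c
/*
 * Hypothetical helper: shifting a 64-bit value by 64 or more bits is
 * undefined behavior in C, so clamp the result to zero instead.
 */
static uint64
ShiftScoreSafely(uint64 score, int shiftCount)
{
	if (shiftCount >= (int) (sizeof(score) * 8))
	{
		return 0;
	}

	return score >> shiftCount;
}
```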

Fixes #7953.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2025-09-04 10:57:45 +00:00
Onur Tirtir 8ece8acac7
Check citus version in citus_promote_clone_and_rebalance (#8169) 2025-08-29 11:19:50 +03:00
Naisila Puka 0fd95d71e4
Order same frequency common values, and add test (#8167)
Added similar test to what @colm-mchugh tested in the original PR
https://github.com/citusdata/citus/pull/8026#discussion_r2279021218
2025-08-29 01:41:32 +03:00
Naisila Puka d5f0ec5cd1
Fix invalid input syntax for type bigint (#8166)
Fixes #8164
2025-08-29 01:01:18 +03:00
Naisila Puka 544b6c4716
Add GUC for queries with outer joins and pseudoconstant quals (#8163)
Users can turn on this GUC at their own risk.
2025-08-27 22:31:22 +03:00
Onur Tirtir 2e1de77744
Also use pid in valgrind logfile name (#8150)
Also use pid in valgrind logfile name to avoid overwriting the valgrind
logs due to the memory errors that can happen in different processes
concurrently:

(from https://valgrind.org/docs/manual/manual-core.html)
```
--log-file=<filename>
Specifies that Valgrind should send all of its messages to the specified file. If the file name is empty, it causes an abort. There are three special format specifiers that can be used in the file name.

%p is replaced with the current process ID. This is very useful for program that invoke multiple processes. WARNING: If you use --trace-children=yes and your program invokes multiple processes OR your program forks without calling exec afterwards, and you don't use this specifier (or the %q specifier below), the Valgrind output from all those processes will go into one file, possibly jumbled up, and possibly incomplete.
```

With this change, we'll start having lots of valgrind output files
generated under "src/test/regress" with the same prefix,
citus_valgrind_test_log.txt, by default, during valgrind tests, so it'll
look a bit ugly; but one can use `cat
src/test/regress/citus_valgrind_test_log.txt.[0-9]*` or such to combine
them into a single valgrind log file later.
2025-08-27 14:01:25 +00:00
Colm bb6eeb17cc
Fix bug in redundant WHERE clause detection. (#8162)
We also need to check the Postgres plan's range tables for relations used in Initplans.

DESCRIPTION: Fix a bug in redundant WHERE clause detection; we need to
additionally check the Postgres plan's range tables for the presence of
citus tables, to account for relations that are referenced from scalar
subqueries.

There is a fundamental flaw in 4139370, the assumption that, after
Postgres planning has completed, all tables used in a query can be
obtained by walking the query tree. This is not the case for scalar
subqueries, which will be referenced by `PARAM` nodes. The fix adds an
additional check of the Postgres plan range tables; if there is at least
one citus table in there we do not need to change the needs distributed
planning flag.
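A minimal sketch of such a check over the finished plan's range table
(illustrative; IsCitusTable() is Citus' existing metadata helper, but the
function name here is made up):

```c
/*
 * Illustrative sketch: walk the finished plan's range table and report
 * whether any plain relation in it is a Citus table.
 */
static bool
PlanRangeTableContainsCitusTable(PlannedStmt *plannedStmt)
{
	ListCell *rangeTableCell = NULL;
	foreach(rangeTableCell, plannedStmt->rtable)
	{
		RangeTblEntry *rangeTableEntry = (RangeTblEntry *) lfirst(rangeTableCell);

		if (rangeTableEntry->rtekind == RTE_RELATION &&
			IsCitusTable(rangeTableEntry->relid))
		{
			return true;
		}
	}

	return false;
}
```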

Fixes #8159
2025-08-27 13:32:02 +01:00
Colm 0a5cae19ed
In UPDATE deparse, check for a subscript before processing the targets. (#8155)
DESCRIPTION: Checking first for the presence of subscript ops avoids a
shallow copy of the target list for target lists where there are no
array or json subscripts.

Commit 0c1b31c fixed a bug in UPDATE statements with array or json
subscripting in the target list. This commit modifies that to first
check that the target list has a subscript and avoid a shallow copy of
the target list for UPDATE statements with no array/json subscripting.
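A sketch of the kind of pre-check this describes, assuming subscript
assignments show up as SubscriptingRef nodes at the top of a target
entry's expression (illustrative only):

```c
/*
 * Illustrative sketch: report whether any target entry carries an
 * array / json subscript assignment, so callers can skip the shallow
 * copy of the target list when there is none.
 */
static bool
TargetListHasSubscripting(List *targetList)
{
	ListCell *targetEntryCell = NULL;
	foreach(targetEntryCell, targetList)
	{
		TargetEntry *targetEntry = (TargetEntry *) lfirst(targetEntryCell);

		if (IsA(targetEntry->expr, SubscriptingRef))
		{
			return true;
		}
	}

	return false;
}
```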
2025-08-27 11:00:27 +00:00
Muhammad Usama 62e5fcfe09
Enhance clone node replication status messages (#8152)
- Downgrade replication lag reporting from NOTICE to DEBUG to reduce
noise and improve regression test stability.
- Add hints to certain replication status messages for better clarity.
- Update expected output files accordingly.
2025-08-26 21:48:07 +03:00
Naisila Puka ce7ddc0d3d
Bump PG versions to 17.6, 16.10, 15.14 (#8142)
Sister PR https://github.com/citusdata/the-process/pull/172

Fixes #8134 #8149
2025-08-25 15:34:13 +03:00
Naisila Puka aaa31376e0
Make columnar_chunk_filtering pass consecutive runs (#8147)
The test was not cleaning up after itself, therefore it failed consecutive runs.

Test locally with:
make check-columnar-minimal \
  EXTRA_TESTS='columnar_chunk_filtering columnar_chunk_filtering'
2025-08-25 14:35:37 +03:00
Onur Tirtir 439870f3a9
Fix incorrect usage of TupleDescSize() in #7950, #8120, #8124, #8121 and #8114 (#8146)
In #7950, #8120, #8124, #8121 and #8114, TupleDescSize() was used to
check whether the tuple length is `Natts_<catalog_table_name>`. However
this was wrong because TupleDescSize() returns the size of the
tupledesc, not the length of it (i.e., number of attributes).

Actually `TupleDescSize(tupleDesc) == Natts_<catalog_table_name>` was
always returning false but this didn't cause any problems because using
`tupleDesc->natts - 1` when `tupleDesc->natts ==
Natts_<catalog_table_name>` too had the same effect as using
`Anum_<column_added_later> - 1` in that case.

So this also makes me think of always returning `tupleDesc->natts -
1` (or `tupleDesc->natts - 2` if it's the second-to-last attribute), but
being more explicit seems more useful.

Even more, in the future we should probably switch to a different
implementation if / when we think of adding more columns to those
tables. We should probably scan non-dropped attributes of the relation,
enumerate them and return the attribute number of the one that we're
looking for, but seems this is not needed right now.
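To illustrate the mix-up (a sketch; `pgDistPartition` stands for an
already-opened catalog relation and is not taken from the original code):

```c
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);

/* wrong: TupleDescSize() is a size in bytes, not an attribute count */
bool wrongComparison = (TupleDescSize(tupleDescriptor) == Natts_pg_dist_partition);

/* right: the number of attributes lives in ->natts */
bool rightComparison = (tupleDescriptor->natts == Natts_pg_dist_partition);
```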
2025-08-22 11:46:06 +00:00
Onur Tirtir 785287c58f
Fix memory corruptions around pg_dist_node accessors after a Citus downgrade is followed by an upgrade (#8144)
Unlike what has been fixed in #7950, #8120, #8124, #8121 and #8114, this
was not an issue in older releases but is a potential issue to be
introduced by the current (13.2) release, because in one of the recent
commits (#8122) two columns have been added to pg_dist_node. In other
words, none of the older releases since we started supporting downgrades
added new columns to pg_dist_node.

The mentioned PR actually attempted to avoid these kinds of issues in
one of the code paths but not in some others.

So, this PR, avoids memory corruptions around pg_dist_node accessors in
a standardized way (as implemented in other example PRs) and in all
code-paths.
2025-08-22 14:07:44 +03:00
Mehmet YILMAZ 86b5bc6a20
Normalize Actual Rows output in regression tests for PG18 compatibility (#8141)
DESCRIPTION: Normalize Actual Rows output in regression tests for PG18
compatibility

PostgreSQL 18 changed `EXPLAIN ANALYZE` to always print fractional row
counts (e.g. `1.00` instead of `1`).
95dbd827f2
This caused diffs across multiple output formats in Citus regression
tests:

* Text EXPLAIN: `actual rows=50.00` vs `actual rows=50`
* YAML: `Actual Rows: 1.00` vs `Actual Rows: 1`
* XML: `<Actual-Rows>1.00</Actual-Rows>` vs
`<Actual-Rows>1</Actual-Rows>`
* JSON: `"Actual Rows": 1.00` vs `"Actual Rows": 1`
* Placeholders: `rows=N.N` vs `rows=N`

This patch extends `normalize.sed` to strip trailing `.0…` from `Actual
Rows` in all supported formats and collapses placeholder values back to
`N`. With these changes, regression tests produce stable output across
PG15–PG18.

No functional changes to Citus itself — only test normalization was
updated.
2025-08-21 17:47:46 +03:00
Mehmet YILMAZ f1f0b09f73
PG18 - Add BUFFERS OFF to EXPLAIN ANALYZE calls (#8101)
Relevant PG18 commit:
c2a4078eba
- Enable buffer-usage reporting by default in `EXPLAIN ANALYZE` on
PostgreSQL 18 and above.

Solution:
- Introduce the explicit `BUFFERS OFF` option in every existing
regression test to maintain pre-PG18 output consistency.
- This appends, `BUFFERS OFF` to all `EXPLAIN ANALYZE(...)` calls in
src/test/regress/sql and the corresponding .out files.

fixes #8093
2025-08-21 13:48:50 +03:00
Onur Tirtir 683ead9607
Add changelog for 13.2.0 (#8130) 2025-08-21 09:22:42 +00:00
Naisila Puka eaa609f510
Add citus_stats UDF (#8026)
DESCRIPTION: Add `citus_stats` UDF

This UDF acts on a Citus table, and provides `null_frac`,
`most_common_vals` and `most_common_freqs` for each column in the table,
based on the definitions of these columns in the Postgres view
`pg_stats`.

**Aggregated Views: pg_stats > citus_stats**

citus_stats is a **view** intended for use in **Citus**, a distributed
extension of PostgreSQL. It collects and returns **column-level
statistics** for a distributed table: specifically, the **most common
values**, their **frequencies**, and the **fraction of null values**,
like the pg_stats view does for regular Postgres tables.

**Use Case** 

This view is useful when: 

- You need **column-level insights** on a distributed table. 
- You're performing **query optimization**, **cardinality estimation**,
or **data profiling** across shards.

**What It Returns** 

A **table** with: 

| Column Name       | Data Type | Description                                                                  |
|-------------------|-----------|------------------------------------------------------------------------------|
| schemaname        | text      | Name of the schema containing the distributed table                         |
| tablename         | text      | Name of the distributed table                                                |
| attname           | text      | Name of the column (attribute)                                               |
| null_frac         | float4    | Estimated fraction of NULLs in the column across all shards                 |
| most_common_vals  | text[]    | Array of most common values for the column                                  |
| most_common_freqs | float4[]  | Array of corresponding frequencies (as fractions) of the most common values |

**Caveats** 
- The function assumes that the array of the most common values among
different shards will be the same, therefore it just adds everything up.
2025-08-19 23:17:13 +03:00
Colm bd0558fe39
Remove incorrect assertion from Postgres ruleutils. (#8136)
DESCRIPTION: Remove an assertion from Postgres ruleutils that was rendered meaningless by a previous Citus commit.

Fixes #8123. This has been present since 00068e0, which changed the code preceding the assert as follows:
```
#ifdef USE_ASSERT_CHECKING
-	while (i < colinfo->num_cols && colinfo->colnames[i] == NULL)
-		i++;
+	for (int col_index = 0; col_index < colinfo->num_cols; col_index++)
+	{
+		/*
+		 * In the above processing-loops, "i" advances only if
+		 * the column is not new, check if this is a new column.
+		 */
+		if (colinfo->is_new_col[col_index])
+			i++;
+	}
	Assert(i == colinfo->num_cols);
	Assert(j == nnewcolumns);
#endif
```

This commit altered both the loop condition and the incrementing of `i`. After analysis, the assert no longer makes sense.
2025-08-19 15:52:13 +01:00
Muhammad Usama be6668e440
Snapshot-Based Node Split – Foundation and Core Implementation (#8122)
**DESCRIPTION:**
This pull request introduces the foundation and core logic for the
snapshot-based node split feature in Citus. This feature enables
promoting a streaming replica (referred to as a clone in this feature
and UI) to a primary node and rebalancing shards between the original
and the newly promoted node without requiring a full data copy.

This significantly reduces rebalance times for scale-out operations
where the new node already contains a full copy of the data via
streaming replication.

Key Highlights:
**1. Replica (Clone) Registration & Management Infrastructure**

Introduces a new set of UDFs to register and manage clone nodes:

- citus_add_clone_node()
- citus_add_clone_node_with_nodeid()
- citus_remove_clone_node()
- citus_remove_clone_node_with_nodeid()

These functions allow administrators to register a streaming replica of
an existing worker node as a clone, making it eligible for later
promotion via snapshot-based split.

**2. Snapshot-Based Node Split (Core Implementation)**
New core UDF: 

- citus_promote_clone_and_rebalance()

This function implements the full workflow to promote a clone and
rebalance shards between the old and new primaries. Steps include:

1. Ensuring Exclusivity – Blocks any concurrent placement-changing
operations.
2. Blocking Writes – Temporarily blocks writes on the primary to ensure
consistency.
3. Replica Catch-up – Waits for the replica to be fully in sync.
4. Promotion – Promotes the replica to a primary using pg_promote.
5. Metadata Update – Updates metadata to reflect the newly promoted
primary node.
6. Shard Rebalancing – Redistributes shards between the old and new
primary nodes.


**3. Split Plan Preview**
A new helper UDF get_snapshot_based_node_split_plan() provides a preview
of the shard distribution post-split, without executing the promotion.

**Example:**

```
reb 63796> select * from pg_catalog.get_snapshot_based_node_split_plan('127.0.0.1',5433,'127.0.0.1',5453);
  table_name  | shardid | shard_size | placement_node 
--------------+---------+------------+----------------
 companies    |  102008 |          0 | Primary Node
 campaigns    |  102010 |          0 | Primary Node
 ads          |  102012 |          0 | Primary Node
 mscompanies  |  102014 |          0 | Primary Node
 mscampaigns  |  102016 |          0 | Primary Node
 msads        |  102018 |          0 | Primary Node
 mscompanies2 |  102020 |          0 | Primary Node
 mscampaigns2 |  102022 |          0 | Primary Node
 msads2       |  102024 |          0 | Primary Node
 companies    |  102009 |          0 | Clone Node
 campaigns    |  102011 |          0 | Clone Node
 ads          |  102013 |          0 | Clone Node
 mscompanies  |  102015 |          0 | Clone Node
 mscampaigns  |  102017 |          0 | Clone Node
 msads        |  102019 |          0 | Clone Node
 mscompanies2 |  102021 |          0 | Clone Node
 mscampaigns2 |  102023 |          0 | Clone Node
 msads2       |  102025 |          0 | Clone Node
(18 rows)

```
**4. Test Infrastructure Enhancements**

- Added a new test case scheduler for snapshot-based split scenarios.
- Enhanced pg_regress_multi.pl to support creating node backups with
slightly modified options to simulate real-world backup-based clone
creation.

### 5. Usage Guide
The snapshot-based node split can be performed using the following
workflow:

**- Take a Backup of the Worker Node**
Run pg_basebackup (or an equivalent tool) against the existing worker
node to create a physical backup.

`pg_basebackup -h <primary_worker_host> -p <port> -D /path/to/replica/data --write-recovery-conf`

**- Start the Replica Node**
Start PostgreSQL on the replica using the backup data directory,
ensuring it is configured as a streaming replica of the original worker
node.

**- Register the Backup Node as a Clone**
Mark the registered replica as a clone of its original worker node:

`SELECT * FROM citus_add_clone_node('<clone_host>', <clone_port>, '<primary_host>', <primary_port>);`

**- Promote and Rebalance the Clone**
Promote the clone to a primary and rebalance shards between it and the
original worker:

`SELECT * FROM citus_promote_clone_and_rebalance('clone_node_id');`

**- Drop Any Replication Slots from the Original Worker**
After promotion, clean up any unused replication slots from the original
worker:

`SELECT pg_drop_replication_slot('<slot_name>');`
2025-08-19 14:13:55 +03:00
Muhammad Usama f743b35fc2
Parallelize Shard Rebalancing & Unlock Concurrent Logical Shard Moves (#7983)
DESCRIPTION: Parallelizes shard rebalancing and removes the bottlenecks
that previously blocked concurrent logical-replication moves.
These improvements reduce rebalance windows, particularly for clusters
with large reference tables, and enable multiple shard transfers to run in parallel.

Motivation:
Citus’ shard rebalancer has some key performance bottlenecks:
**Sequential Movement of Reference Tables:**
Reference tables are often assumed to be small, but in real-world
deployments, they can grow significantly large. Previously, reference
table shards were transferred as a single unit, making the process
monolithic and time-consuming.
**No Parallelism Within a Colocation Group:**
Although Citus distributes data using colocated shards, shard
movements within the same colocation group were serialized. In
environments with hundreds of distributed tables colocated
together, this serialization significantly slowed down rebalance
operations.
**Excessive Locking:**
The rebalancer used restrictive locks and redundant logical replication
guards, further limiting concurrency.

The goal of this commit is to eliminate these inefficiencies and enable
maximum parallelism during rebalance, without compromising correctness
or compatibility, i.e., to parallelize shard rebalancing to reduce
rebalance time.

Feature Summary:

**1. Parallel Reference Table Rebalancing**
Each reference-table shard is now copied in its own background task.
Foreign key and other constraints are deferred until all shards are
copied.
For single shard movement without considering colocation, a new
internal-only UDF '`citus_internal_copy_single_shard_placement`' is
introduced to allow single-shard copy/move operations.
Since this function is internal, we do not allow users to call it
directly.

**Temporary Hack to Set Background Task Context** Background tasks
cannot currently set custom GUCs like application_name before executing
internal-only functions, so a 'citus_rebalancer ...' statement is used
as a prefix in the task command. This is a temporary hack to label
internal tasks until proper GUC injection support is added to the
background task executor.

**2. Changes in Locking Strategy**

- Drop the leftover replication lock that previously serialized shard
moves performed via logical replication. This lock was only needed when
we used to drop and recreate the subscriptions/publications before each
move. Since Citus now removes those objects later as part of the “unused
distributed objects” cleanup, shard moves via logical replication can
safely run in parallel without additional locking.

- Introduced a per-shard advisory lock to prevent concurrent operations
on the same shard while allowing maximum parallelism elsewhere.

- Change the lock mode in AcquirePlacementColocationLock from
ExclusiveLock to RowExclusiveLock to allow concurrent updates within the
same colocation group, while still preventing concurrent DDL operations.

**3. citus_rebalance_start() enhancements**
The citus_rebalance_start() function now accepts two new optional
parameters:

```
- parallel_transfer_colocated_shards BOOLEAN DEFAULT false,
- parallel_transfer_reference_tables BOOLEAN DEFAULT false
```
This ensures backward compatibility by preserving the existing behavior
and avoiding any disruption to user expectations and when both are set
to true, the rebalancer operates with full parallelism.

**Previous Rebalancer Behavior:**
`SELECT citus_rebalance_start(shard_transfer_mode := 'force_logical');`
This would:
- Start a single background task for replicating all reference tables.
- Then move all shards serially, one at a time.
```
Task 1: replicate_reference_tables()
         ↓
         Task 2: move_shard_1()
         ↓
         Task 3: move_shard_2()
         ↓
         Task 4: move_shard_3()
```
Slow and sequential. Reference table copy is a bottleneck. Colocated
shards must wait for each other.

**New Parallel Rebalancer:**
```
SELECT citus_rebalance_start(
        shard_transfer_mode := 'force_logical',
        parallel_transfer_colocated_shards := true,
        parallel_transfer_reference_tables := true
      );
```
This would:

- Schedule independent background tasks for each reference-table shard.
- Move colocated shards in parallel, while still maintaining dependency
order.
- Defer constraint application until all reference shards are in place.
```
Task 1: copy_ref_shard_1()
          Task 2: copy_ref_shard_2()
          Task 3: copy_ref_shard_3()
            → Task 4: apply_constraints()
          ↓
         Task 5: copy_shard_1()
         Task 6: copy_shard_2()
         Task 7: copy_shard_3()
         ↓
         Task 8-10: move_shard_1..3()
```
Each operation is scheduled independently and can run as soon as
dependencies are satisfied.
2025-08-18 17:44:14 +03:00
Karina 2095679dc8
Fix memory corruptions around pg_dist_object accessors after a Citus downgrade is followed by an upgrade (#8120)
DESCRIPTION: Fixes potential memory corruptions that could happen when
accessing pg_dist_object after a Citus downgrade is followed by a Citus
upgrade.

In case of Citus downgrade and further upgrade an undefined behavior may
be encountered. The reason is that Citus hardcoded the number of columns
in the extension's tables, but in case of downgrade and following update
some of these tables can have more columns, and some of them can be
marked as dropped.

This PR fixes all such tables using the approach introduced in #7950,
which solved the problem for the pg_dist_partition table.

See #7515 for a more thorough explanation.

---------

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2025-08-18 12:52:34 +00:00
Karina e15cc5c63b
Fix memory corruptions around columnar.stripe accessors after a Citus downgrade is followed by an upgrade (#8124)
DESCRIPTION: Fixes potential memory corruptions that could happen when
accessing columnar.stripe after a Citus downgrade is followed by a Citus
upgrade.

In case of Citus downgrade and further upgrade an undefined behavior may
be encountered. The reason is that Citus hardcoded the number of columns
in the extension's tables, but in case of downgrade and following update
some of these tables can have more columns, and some of them can be
marked as dropped.

This PR fixes all such tables using the approach introduced in
https://github.com/citusdata/citus/pull/7950, which solved the problem
for the pg_dist_partition table.

See https://github.com/citusdata/citus/issues/7515 for a more thorough
explanation.

---------

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2025-08-18 12:34:26 +00:00
Karina badaa21cb1
Fix memory corruptions around pg_dist_transaction accessors after a Citus downgrade is followed by an upgrade (#8121)
DESCRIPTION: Fixes potential memory corruptions that could happen when
accessing pg_dist_transaction after a Citus downgrade is followed by a
Citus upgrade.

In case of Citus downgrade and further upgrade an undefined behavior may
be encountered. The reason is that Citus hardcoded the number of columns
in the extension's tables, but in case of downgrade and following update
some of these tables can have more columns, and some of them can be
marked as dropped.

This PR fixes all such tables using the approach introduced in #7950,
which solved the problem for the pg_dist_partition table.

See #7515 for a more thorough explanation.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
2025-08-18 11:22:28 +00:00
eaydingol 8d929d3bf8
Push down recurring outer joins when possible (#7973)
DESCRIPTION: Adds support for pushing down LEFT/RIGHT outer joins having
a reference table in the outer side and a distributed table on the inner
side (e.g., <reference table> LEFT JOIN <distributed table>)

Partially addresses #6546 

1) `<outer:reference>` LEFT JOIN `<inner:distributed>` 
2) `<inner:distributed>` RIGHT JOIN `<outer:reference>` 
 
Previously, for outer joins of types (1) and (2), the distributed side
was computed recursively. This was necessary because, when the inner
side of a recurring outer join is a distributed table, it is not
possible to directly distribute the join; the preserved (outer and
recurring) side may generate rows with join keys that hash to different
shards.
 
To implement distributed planning while maintaining consistency with
global execution semantics, this PR restricts the outer side only to
those partition key values that route to the selected shard during
distributed shard query computation. This method is employed when the
following criteria are met (recursive planning is applied otherwise):

- The join type is (1) or (2) (lateral joins are not supported). 
- The outer side is a reference table. 
- The outer join qualifications include an equality condition between
the partition column of a distributed table and the recurring table.
- The join is not part of a chained join. 
- The “enable_recurring_outer_join_pushdown” GUC is enabled (default is
on).

---------

Co-authored-by: ebruaydingol <ebruaydingol@microsoft.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2025-08-18 14:03:44 +03:00
Onur Tirtir 87a1b631e8
Not automatically create citus_columnar when creating citus extension (#8081)
DESCRIPTION: Not automatically create citus_columnar when there are no
relations using it.

Previously, we were always creating citus_columnar when creating citus
with version >= 11.1, and we were doing it as follows:
* Detach SQL objects owned by old columnar, i.e., "drop" them from
citus, but not actually drop them from the database.
* "old columnar" is the one that we had before Citus 11.1 as part of
citus, i.e., before splitting the access method and its catalog into
citus_columnar.
* Create citus_columnar and attach the SQL objects left over from old
columnar to it so that we can continue supporting the columnar tables
that the user had before Citus 11.1 with citus_columnar.

First part is unchanged, however, now we don't create citus_columnar
automatically anymore if the user didn't have any relations using
columnar. For this reason, as of Citus 13.2, when these SQL objects are
not owned by an extension and there are no relations using columnar
access method, we drop these SQL objects when updating Citus to 13.2.

The net effect is still the same as if we automatically created
citus_columnar and user dropped citus_columnar later, so we should not
have any issues with dropping them.

(**Update:** Seems we've made some assumptions in citus, e.g.,
citus_finish_pg_upgrade() still assumes columnar metadata exists and
tries to apply some fixes for it, so this PR fixes them as well. See the
last section of this PR description.)

Also, ideally I was hoping to just remove some lines of code from
extension.c, where we decide to automatically create citus_columnar when
creating citus; however, this didn't happen to be the case for these
reasons:
* We still need to automatically create it for the servers using the
columnar access method.
* We need to clean up the leftover SQL objects from old columnar when
the above is not the case, otherwise we would have leftover SQL objects
from old columnar for no reason, and that would confuse users too.
* Old columnar cannot be used to create columnar tables properly, so we
should clean them up and let the user decide whether they want to create
citus_columnar when they really need it later.

---

Also made several changes in the test suite because similarly, we don't
always want to have citus_columnar created in citus tests anymore:
* Now, columnar specific test targets, which cover **41** test sql
files, always install columnar by default, by using
"--load-extension=citus_columnar".
* "--load-extension=citus_columnar" is not added to citus specific test
targets because by default we don't want to have citus_columnar created
during citus tests.
* Excluding citus_columnar specific tests, we have **601** sql files
that we have as citus tests and in **27** of them we manually create
citus_columnar at the very beginning of the test because these tests do
test some functionalities of citus together with columnar tables.

Also, before and after schedules for PG upgrade tests are now duplicated
so we have two versions of each: one with columnar tests and one
without. To choose between them, check-pg-upgrade now supports a
"test-with-columnar" option, which can be set to "true" or anything else
to logically indicate "false". In CI, we run the check-pg-upgrade test
target with both options. The purpose is to ensure we can test PG
upgrades where citus_columnar is not created in the cluster before the
upgrade as well.

Finally, added more tests to multi_extension.sql to test Citus upgrade
scenarios with / without columnar tables / citus_columnar extension.

---

Also, it seems citus_finish_pg_upgrade was assuming that citus_columnar
is always created, but actually we should have never made such an
assumption. To fix that, we moved columnar-specific post-PG-upgrade work
from citus to a new columnar UDF, columnar_finish_pg_upgrade. But to
avoid breaking existing customer / managed service scripts, we continue
to automatically perform post-PG-upgrade work for columnar within
citus_finish_pg_upgrade, but this time only if the columnar access
method exists.
2025-08-18 08:29:27 +01:00
ibrahim halatci 649050c676
results of extension compatibility testing (#8048)
List of extensions that are verified to be working with Citus, and some
special cases that need attention. Thanks to the efforts of @emelsimsek,
@m3hm3t, @alperkocatas, @eaydingol
2025-08-15 15:32:29 +03:00
ibrahim halatci cf9a4476e0
Merge branch 'main' into ihalatci-extension-compat-test-report 2025-08-13 19:27:45 +03:00
ibrahim halatci f73da1ed40
Refactor background worker setup for security improvements (#8078)
Enhance security by addressing a code scanning alert and refactoring the
background worker setup code for better maintainability and clarity.

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-08-13 19:25:31 +03:00
Mehmet YILMAZ 41883cea38
PG18 - unify psql headings to ‘List of relations’ (#8119)
fixes #8110 

This patch updates the `normalize.sed` script used in pg18 psql
regression tests:

- Replaces the headings “List of tables”, “List of indexes”, and “List
of sequences” with a single, uniform heading: “List of relations”.
2025-08-13 12:22:23 +03:00
Mehmet YILMAZ bfc6d1f440
PG18 - Adjust EXPLAIN's output for disabled nodes (#8108)
fixes #8097
2025-08-12 12:38:19 +03:00
Mehmet YILMAZ a6161f5a21
Fix CTE traversal for outer Vars in FindReferencedTableColumn (remove assert; correct parentQueryList handling) (#8106)
fixes #8105 

This change lets `FindReferencedTableColumn()` correctly resolve columns
through a CTE even when the expression comes from an outer query level
(`varlevelsup > 0`, `skipOuterVars = false`). Before, we hit an
`Assert(skipOuterVars)` in this path.

**Problem**

* Hitting a CTE after walking outer Vars triggered
`Assert(skipOuterVars)`.
* Cause: we modified `parentQueryList` in place and didn’t rebuild the
correct parent chain before recursing into the CTE, so the path was
considered unsafe.

**Fix**

* Remove the `Assert(skipOuterVars)` in the `RTE_CTE` branch.
* Find the CTE’s owning level via `ctelevelsup` and compute
`cteParentListIndex`.
* Rebuild a private parent list for recursion: `list_copy` →
`list_truncate` → `lappend(current query)`.
* Add a bounds check before indexing the CTE’s `targetList`.

**Why it works**


```diff
-parentQueryList = lappend(parentQueryList, query);
-FindReferencedTableColumn(targetEntry->expr, parentQueryList,
-                          cteQuery, column, rteContainingReferencedColumn,
-                          skipOuterVars);
+    /* hand a private, bounded parent list to the recursion */
+    List *newParent = list_copy(parentQueryList);
+    newParent = list_truncate(newParent, cteParentListIndex + 1);
+    newParent = lappend(newParent, query);
+
+    FindReferencedTableColumn(targetEntry->expr,
+                              newParent,
+                              cteQuery,
+                              column,
+                              rteContainingReferencedColumn,
+                              skipOuterVars);
+}


```
**Before:** We changed `parentQueryList` in place (`parentQueryList =
lappend(...)`) and didn’t trim it to the CTE’s owner level.

**After:** We copy the list, trim it to the CTE’s owner level, then
append the current query. This keeps the parent list accurate for the
current recursion and safe when following outer Vars.


**Example: Nested subquery referencing the CTE (two levels down)**

```
WITH c AS MATERIALIZED (SELECT user_id FROM raw_events_first)
SELECT 1
FROM raw_events_first t
WHERE EXISTS (
  SELECT 1
  FROM (SELECT user_id FROM c) c2
  WHERE c2.user_id = t.user_id
);
```

Levels:
Q0 = top SELECT
Q1 = EXISTS subquery
Q2 = inner (SELECT user_id FROM c)

When resolving c2.user_id inside Q2:

- parentQueryList is [Q0, Q1, Q2].
- `ctelevelsup`: 2


`cteParentListIndex = length(parentQueryList) - ctelevelsup - 1`

- Recurse into the CTE’s query with [Q0, Q2].


**Tests (added in `multi_insert_select`)**

* **T1:** Correlated subquery that references a CTE (one level down) 
Verifies that resolving through `RTE_CTE` after following an outer `Var`
succeeds, row count matches source table.
* **T2:** Nested subquery that references a CTE (two levels down) 
Exercises deeper recursion and confirms identical to T1.
* **T3:** Scalar subquery in a target list that reads from the outer CTE
Checks expected row count and that no NULLs are inserted.

These tests cover the cases that previously hit `Assert(skipOuterVars)`
and confirm CTE references while following outer Vars.
2025-08-12 11:49:50 +03:00
Karina 71d6328378
Fix memory corruptions around pg_dist_background_task accessors after a Citus downgrade is followed by an upgrade (#8114)
DESCRIPTION: Fixes potential memory corruptions that could happen when
accessing pg_dist_background_task after a Citus downgrade is followed by
a Citus upgrade.

In case of Citus downgrade and further upgrade an undefined behavior may
be encountered. The reason is that Citus hardcoded the number of columns
in the extension's tables, but in case of downgrade and following update
some of these tables can have more columns, and some of them can be
marked as dropped.

This PR fixes all such tables using the approach introduced in #7950,
which solved the problem for the pg_dist_partition table.

See #7515 for a more thorough explanation.

---------

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2025-08-11 18:34:06 +03:00
Mehmet YILMAZ 6b6d959fac
PG18 - pg17.sql Simplify step 10 verification to use COUNT(*) instead of SELECT * (#8111)
fixes #8096 

PostgreSQL 18 adds a `conenforced` flag allowing `CHECK` constraints to
be declared `NOT ENFORCED`.



ca87c415e2
```diff
@@ -1256,26 +1278,26 @@
  distributed_partitioned_table_id_partition_col_excl | x
 (2 rows)
 
 -- Step 9: Drop the exclusion constraints from both tables
 \c - - :master_host :master_port
 SET search_path TO pg17;
 ALTER TABLE distributed_partitioned_table DROP CONSTRAINT dist_exclude_named;
 ALTER TABLE local_partitioned_table DROP CONSTRAINT local_exclude_named;
 -- Step 10: Verify the constraints were dropped
 SELECT * FROM pg_constraint WHERE conname = 'dist_exclude_named' AND contype = 'x';
- oid | conname | connamespace | contype | condeferrable | condeferred | convalidated | conrelid | contypid | conindid | conparentid | confrelid | confupdtype | confdeltype | confmatchtype | conislocal | coninhcount | connoinherit | conkey | confkey | conpfeqop | conppeqop | conffeqop | confdelsetcols | conexclop | conbin
+ oid | conname | connamespace | contype | condeferrable | condeferred | conenforced | convalidated | conrelid | contypid | conindid | conparentid | confrelid | confupdtype | confdeltype | confmatchtype | conislocal | coninhcount | connoinherit | conperiod | conkey | confkey | conpfeqop | conppeqop | conffeqop | confdelsetcols | conexclop | conbin 
 -----+---------+--------------+---------+---------------+-------------+-------------+--------------+----------+----------+----------+-------------+-----------+-------------+-------------+---------------+------------+-------------+--------------+-----------+--------+---------+-----------+-----------+-----------+----------------+-----------+--------
 (0 rows)
 
 SELECT * FROM pg_constraint WHERE conname = 'local_exclude_named' AND contype = 'x';
- oid | conname | connamespace | contype | condeferrable | condeferred | convalidated | conrelid | contypid | conindid | conparentid | confrelid | confupdtype | confdeltype | confmatchtype | conislocal | coninhcount | connoinherit | conkey | confkey | conpfeqop | conppeqop | conffeqop | confdelsetcols | conexclop | conbin
+ oid | conname | connamespace | contype | condeferrable | condeferred | conenforced | convalidated | conrelid | contypid | conindid | conparentid | confrelid | confupdtype | confdeltype | confmatchtype | conislocal | coninhcount | connoinherit | conperiod | conkey | confkey | conpfeqop | conppeqop | conffeqop | confdelsetcols | conexclop | conbin 
 -----+---------+--------------+---------+---------------+-------------+-------------+--------------+----------+----------+----------+-------------+-----------+-------------+-------------+---------------+------------+-------------+--------------+-----------+--------+---------+-----------+-----------+-----------+----------------+-----------+--------
 (0 rows)
 
```

The purpose of step 10 is merely to confirm that the exclusion
constraints dist_exclude_named and local_exclude_named have been
dropped. There’s no need to pull back every column from pg_constraint—we
only care about whether any matching row remains.

- Reduces noise in the output
- Eliminates dependence on the full set of pg_constraint columns (which
can drift across Postgres versions)
- Resolves the pg18 regression diff without altering test expectations
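
A minimal sketch of the simplified check, assuming the same constraint names as in the diff above (the exact statement in pg17.sql may differ slightly):

```sql
-- Step 10 (simplified): only count matching rows instead of selecting
-- every pg_constraint column.
SELECT COUNT(*) FROM pg_constraint
WHERE conname = 'dist_exclude_named' AND contype = 'x';
SELECT COUNT(*) FROM pg_constraint
WHERE conname = 'local_exclude_named' AND contype = 'x';
```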
2025-08-08 13:46:11 +03:00
ibrahim halatci 26409f6400
Update EXTENSION_COMPATIBILITY.md 2025-08-07 14:57:45 +03:00
ibrahim halatci dbf0e647a9
Merge branch 'main' into ihalatci-extension-compat-test-report 2025-08-07 14:35:34 +03:00
eaydingol 3d8fd337e5
Check outer table partition column (#8092)
DESCRIPTION: Introduce a new check for pushing down a query that includes a
union and an outer join, to fix #8091.

In "SafeToPushdownUnionSubquery", we check whether the distribution column of
the outer relation is in the target list.
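
A hypothetical query shape of the kind this check concerns (table names, columns, and distribution keys are illustrative, not taken from the regression tests):

```sql
-- An outer join whose inner side is a UNION of subqueries; per the
-- description above, the new check verifies that the outer relation's
-- distribution column appears in the target list before the query is
-- considered safe to push down.
SELECT t.dist_col, u.val
FROM dist_table t
LEFT JOIN (
    SELECT dist_col, val FROM dist_table_a
    UNION
    SELECT dist_col, val FROM dist_table_b
) u ON t.dist_col = u.dist_col;
```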
2025-08-06 16:13:14 +03:00
manaldush f0789bd388
Fix memory corruptions that could happen when a Citus downgrade is followed by an upgrade (#7950)
DESCRIPTION: Fixes potential memory corruptions that could happen when a
Citus downgrade is followed by a Citus upgrade.

In case of a Citus downgrade followed by an upgrade, Citus crashes with a
core dump. The reason is that Citus hardcoded the number of columns in the
pg_dist_partition table, but after a downgrade and a subsequent upgrade the
table can have more columns, and some of them can be marked as dropped.

The patch addresses this by using tupleDescriptor->natts (the
Postgres-internal approach), as sketched below.
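
A minimal sketch of that approach, with illustrative variable names rather than the actual Citus code: iterate over the relation's tuple descriptor instead of assuming a fixed column count, and skip attributes that are marked as dropped.

```c
/* pgDistPartition and heapTuple are assumed to be an open catalog relation
 * and a tuple read from it; names are illustrative only. */
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *datumArray = palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isNullArray = palloc0(tupleDescriptor->natts * sizeof(bool));

heap_deform_tuple(heapTuple, tupleDescriptor, datumArray, isNullArray);

for (int attrIndex = 0; attrIndex < tupleDescriptor->natts; attrIndex++)
{
    Form_pg_attribute attribute = TupleDescAttr(tupleDescriptor, attrIndex);
    if (attribute->attisdropped)
    {
        /* column left over from a downgrade/upgrade cycle; ignore it */
        continue;
    }

    /* ... consume datumArray[attrIndex] / isNullArray[attrIndex] ... */
}
```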

Fixes #7933.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2025-08-05 10:03:35 +00:00
Onur Tirtir c183634207
Move "DROP FUNCTION" for older version of UDF to correct file (#8085)
We never update an older version of a SQL object for consistency across
release tags, so this commit moves "DROP FUNCTION .." for the older
version of "pg_catalog.worker_last_saved_explain_analyze();" to the
appropriate migration script.

See https://github.com/citusdata/citus/pull/8017.
2025-07-31 13:30:12 +03:00
Teja Mupparti 889aa92ac0
EXPLAIN ANALYZE - Prevent execution of the plan during the plan-print (#8017)
DESCRIPTION: Fixed a bug in EXPLAIN ANALYZE to prevent unintended (duplicate) execution of the (sub)plans during the explain phase.

Fixes #4212 

### 🐞 Bug #4212: Redundant (Subplan) Execution in the `EXPLAIN ANALYZE` codepath

#### 🔍 Background
In the standard PostgreSQL execution path, `ExplainOnePlan()` is
responsible for two distinct operations depending on whether `EXPLAIN
ANALYZE` is requested:

1. **Execute the plan**

   ```c
   if (es->analyze)
       ExecutorRun(queryDesc, direction, 0L, true);
   ```

2. **Print the plan tree** 

   ```c
   ExplainPrintPlan(es, queryDesc);
   ```

When printing the plan, the executor should **not run the plan again**.
Execution is only expected to happen once—at the top level when
`es->analyze = true`.

---

#### ⚠️ Issue in Citus

In Citus, `CustomScanMethods.ExplainCustomScan = CitusExplainScan` is the
custom scan explain callback used to print the explain information of a Citus
plan. This callback incorrectly performs **redundant execution** inside the
explain path of `ExplainPrintPlan()`:

```c
ExplainOnePlan()
  ExplainPrintPlan()
    ExplainNode()
      CitusExplainScan()
        if (distributedPlan->subPlanList != NIL)
        {
            ExplainSubPlans(distributedPlan, es);
            {
                PlannedStmt *plan = subPlan->plan;
                ExplainOnePlan(plan, ...);  // ⚠️ May re-execute subplan if es->analyze is true
            }
        }
```
This causes the subplans to be **executed again**, even though they have
already been executed during the top-level plan execution. This behavior
violates the expectation in PostgreSQL where `EXPLAIN ANALYZE` should
**execute each node exactly once** for analysis.

---
#### Fix (proposed)
Save the output of the subplans during `ExecuteSubPlans()`, and later reuse it
in `ExplainSubPlans()`, as sketched below.
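
A minimal sketch of the idea, with illustrative names rather than the actual Citus data structures:

```c
/*
 * Capture the EXPLAIN ANALYZE output while the subplan actually runs in
 * ExecuteSubPlans(), then print the saved text from ExplainSubPlans()
 * instead of calling ExplainOnePlan() again (which would re-execute it).
 */
typedef struct SavedSubPlanExplainOutput
{
    uint32 subPlanId;     /* identifies the subplan within the distributed plan */
    char *explainOutput;  /* EXPLAIN ANALYZE text captured during execution */
} SavedSubPlanExplainOutput;
```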
2025-07-30 11:29:50 -07:00
Mehmet YILMAZ f31bcb4219
PG18 - Assert("HaveRegisteredOrActiveSnapshot() fix for cluster creation (#8073)
fixes #8072 
fixes #8055 


706054b11b

Before the fix, when trying to create a cluster with assertions enabled:


`citus_dev make test1 --destroy`

```
TRAP: failed Assert("HaveRegisteredOrActiveSnapshot()"), File: "heapam.c", Line: 232, PID: 75572
postgres: citus citus [local] SELECT(ExceptionalCondition+0x6e)[0x5585e16123e6]
postgres: citus citus [local] SELECT(heap_insert+0x220)[0x5585e10709af]
postgres: citus citus [local] SELECT(simple_heap_insert+0x33)[0x5585e1071a20]
postgres: citus citus [local] SELECT(CatalogTupleInsert+0x32)[0x5585e1135843]
/home/citus/.pgenv/pgsql-18beta2/lib/citus.so(+0x11e0aa)[0x7fa26f1ca0aa]
/home/citus/.pgenv/pgsql-18beta2/lib/citus.so(+0x11b607)[0x7fa26f1c7607]
/home/citus/.pgenv/pgsql-18beta2/lib/citus.so(+0x11bf25)[0x7fa26f1c7f25]
/home/citus/.pgenv/pgsql-18beta2/lib/citus.so(+0x11d4e2)[0x7fa26f1c94e2]
postgres: citus citus [local] SELECT(+0x1c267d)[0x5585e10e967d]
postgres: citus citus [local] SELECT(+0x1c6ba0)[0x5585e10edba0]
postgres: citus citus [local] SELECT(+0x1c7b80)[0x5585e10eeb80]
postgres: citus citus [local] SELECT(CommitTransactionCommand+0xd)[0x5585e10eef0a]
postgres: citus citus [local] SELECT(+0x575b3d)[0x5585e149cb3d]
postgres: citus citus [local] SELECT(+0x5788ce)[0x5585e149f8ce]
postgres: citus citus [local] SELECT(PostgresMain+0xae7)[0x5585e14a2088]
postgres: citus citus [local] SELECT(BackendMain+0x51)[0x5585e149ab36]
postgres: citus citus [local] SELECT(postmaster_child_launch+0x101)[0x5585e13d6b32]
postgres: citus citus [local] SELECT(+0x4b273f)[0x5585e13d973f]
postgres: citus citus [local] SELECT(+0x4b49f3)[0x5585e13db9f3]
postgres: citus citus [local] SELECT(PostmasterMain+0x1089)[0x5585e13dcee2]
postgres: citus citus [local] SELECT(main+0x1d7)[0x5585e12e3428]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fa271421d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fa271421e40]

```
2025-07-29 15:52:36 +03:00
ibrahim halatci 6b9962c0c0
[doc] wrong code comments for function PopUnassignedPlacementExecution (#8079)
Fixes #7621

DESCRIPTION: function comment correction
2025-07-29 13:24:42 +03:00
dependabot[bot] 3e2b6f61fa
Bump certifi from 2024.2.2 to 2024.7.4 in /src/test/regress (#8076)
Bumps [certifi](https://github.com/certifi/python-certifi) from 2024.2.2
to 2024.7.4.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-25 20:48:36 +03:00
dependabot[bot] a2e3c797e8
Bump black from 23.11.0 to 24.3.0 in /.devcontainer (#8075)
Bumps [black](https://github.com/psf/black) from 23.11.0 to 24.3.0.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-25 17:54:28 +03:00
Colm f1160b0892
Fix assert failure introduced in 245a62df3e
The assert on the number of shards incorrectly used the value of
citus.shard_replication_factor; it should check the table's metadata
to determine the replication factor of its data, and not assume it is
the current GUC value.
2025-07-24 16:19:39 +03:00
Mehmet YILMAZ 9327df8446
Add PG 18Beta2 Build compatibility (#8060)
Fixes #8061 

Add PG 18Beta2 Build compatibility

Revert "Don't lock partitions pruned by initial pruning
Relevant PG commit:
1722d5eb05d8e5d2e064cd1798abcae4f296ca9d
https://github.com/postgres/postgres/commit/1722d5e
2025-07-23 15:15:55 +03:00
Colm 9ccf758bb8
Fix PG15 compiler error introduced in commit 245a62df3e (#8069)
Commit 245a62df3e included an assertion on a struct field that only exists
in PG16+, without a PG_VERSION_NUM check. This commit removes the
offending line of code. The same assertion is present later in the
function, guarded by the PG_VERSION_NUM check, so the removed line was
redundant.
2025-07-23 10:44:26 +01:00
Cédric Villemain 0c1b31cdb5
Fix UPDATE stmts with indirection & array/jsonb subscripting with more than 1 field (#7675)
DESCRIPTION: Fixes problematic UPDATE statements with indirection and array/jsonb subscripting with more than one field.

Fixes #4092, #7674 and #5621. Issues #7674 and #4092 involve an UPDATE with out-of-order columns and a sublink (SELECT) in the source, e.g. `UPDATE T SET (col3, col1, col4) = (SELECT 3, 1, 4)`, where an incorrect value could get written to a column because query deparsing generated an incorrect SQL statement. To address this, the fix adds an additional check to `ruleutils` to ensure that the target list of an UPDATE statement is in an order that allows deparsing to be done safely. The check is needed when the source of the UPDATE has a sublink, because the Postgres `rewrite` step will have put the target list in attribute order, but for deparsing to produce correct SQL text the target list needs to be in the order of the references (or `paramids`) to the target list of the sublink(s).

Issue #5621 involves an UPDATE with array/jsonb subscripting that can behave incorrectly with more than one field (see the sketch below), again because Citus query deparsing receives a post-`rewrite` query tree. The fix also adds a check to `ruleutils` to enable correct query deparsing of such an UPDATE.
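
Hypothetical examples of the #5621 pattern, with made-up table and column names (the actual repro in the issue may differ):

```sql
-- An UPDATE that subscripts the same target more than once.
UPDATE t SET arr[1] = 10, arr[2] = 20 WHERE id = 1;
UPDATE t SET payload['a'] = '"x"', payload['b'] = '"y"' WHERE id = 1;
```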

---------

Co-authored-by: Ibrahim Halatci <ihalatci@gmail.com>
Co-authored-by: Colm McHugh <colm.mchugh@gmail.com>
2025-07-22 17:49:26 +01:00
Colm 245a62df3e
Avoid query deparse and planning of shard query in local execution. (#8035)
DESCRIPTION: Avoid query deparse and planning of shard query in local execution. Adds the citus.enable_local_execution_local_plan GUC, which avoids unnecessary query deparsing and thereby improves the performance of fast-path queries targeting local shards.

If a fast path query resolves to a shard that is local to the node planning the query, a shortcut can be taken so that the OID of the shard is plugged into the parse tree, which is then planned by Postgres. In `local_executor.c` the task uses that plan instead of parsing and planning a shard query. How this is done: The fast path planner identifies if the shortcut is possible, and then the distributed planner checks, using `CheckAndBuildDelayedFastPathPlan()`, if a local plan can be generated or if the shard query should be generated.

This optimization is controlled by a GUC `citus.enable_local_execution_local_plan` which is on by default. A new
regress test `local_execution_local_plan` tests both row-sharding and schema sharding. Negative tests are added to
`local_shard_execution_dropped_column` to verify that the optimization is not taken when the shard is local but there is a difference between the shard and distributed table because of a dropped column.
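
A minimal usage sketch; the GUC name comes from the description above, while the table and query are illustrative:

```sql
-- On by default; shown here only to make the setting explicit.
SET citus.enable_local_execution_local_plan TO on;

-- A fast-path, single-shard query; when the shard is local to this node,
-- the local shard plan is used and no shard query text needs to be deparsed.
SELECT * FROM orders WHERE order_id = 42;
```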
2025-07-22 17:16:53 +01:00
ibrahim halatci 2ba566599f
Merge branch 'main' into ihalatci-extension-compat-test-report 2025-07-22 18:37:01 +03:00
dependabot[bot] c978de41b4
Bump black from 24.2.0 to 24.3.0 in /.devcontainer/src/test/regress (#8068)
Bumps [black](https://github.com/psf/black) from 24.2.0 to 24.3.0.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-22 18:30:20 +03:00
Naisila Puka 1a5db371f5
Point to new images (#8067)
Point to new images because of libpq symlink issue resurfacing
2025-07-22 16:59:20 +03:00
ibrahim halatci 5fc3db3cda
Merge branch 'main' into ihalatci-extension-compat-test-report 2025-07-19 08:12:38 +03:00
dependabot[bot] 194e6bb0d0
Bump certifi from 2024.2.2 to 2024.7.4 in /.devcontainer/src/test/regress (#7994)
Bumps [certifi](https://github.com/certifi/python-certifi) from 2024.2.2
to 2024.7.4.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-18 17:18:44 +03:00
dependabot[bot] 3da9096d53
Bump black from 24.2.0 to 24.3.0 in /src/test/regress (#8062)
Bumps [black](https://github.com/psf/black) from 24.2.0 to 24.3.0.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-18 15:48:59 +03:00
dependabot[bot] 16f375ff7d
Bump werkzeug from 2.3.7 to 3.0.6 in /.devcontainer/src/test/regress (#8039)
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.3.7 to
3.0.6.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-18 14:47:03 +03:00
Ibrahim Halatci 9c7f138b3e style correction 2025-07-18 08:40:10 +00:00
ibrahim halatci 261c97d151
Merge branch 'main' into ihalatci-extension-compat-test-report 2025-07-18 11:12:33 +03:00
SongYoungUk 743c9bbf87
fix #7715 - add assign hook for CDC library path adjustment (#8025)
DESCRIPTION: Automatically updates dynamic_library_path when CDC is
enabled

fix: #7715

According to the documentation and `pg_settings`, the context of the
`citus.enable_change_data_capture` parameter is user.

However, changing this parameter — even as a superuser — doesn't work as
expected: while the initial copy phase works correctly, subsequent
change events are not propagated.

This appears to be due to the fact that `dynamic_library_path` is only
updated to `$libdir/citus_decoders:$libdir` when the server is restarted
and the `_PG_init` function is invoked.

To address this, I added an `EnableChangeDataCaptureAssignHook` that
automatically updates `dynamic_library_path` at runtime when
`citus.enable_change_data_capture` is enabled, ensuring that the CDC
decoder libraries are properly loaded.

Note that `dynamic_library_path` is already a `superuser`-context
parameter in base PostgreSQL, so updating it from within the assign hook
should be safe and consistent with PostgreSQL’s configuration model.

If there’s any reason this approach might be problematic or if there’s a
preferred alternative, I’d appreciate any feedback.
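
For illustration, an assign hook of this shape could do it. This is a minimal sketch of the idea, not the exact Citus implementation; the body (the `SetConfigOption` call and its arguments other than the path value quoted above) is an assumption:

```c
/* minimal sketch, not the exact Citus code */
static void
EnableChangeDataCaptureAssignHook(bool newval, void *extra)
{
    if (newval)
    {
        /*
         * Prepend the CDC decoder directory so logical decoding can find
         * the decoder libraries without a server restart. The target GUC
         * is superuser-context in core PostgreSQL.
         */
        SetConfigOption("dynamic_library_path",
                        "$libdir/citus_decoders:$libdir",
                        PGC_SUSET, PGC_S_OVERRIDE);
    }
}
```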




cc. @jy-min

---------

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Co-authored-by: ibrahim halatci <ihalatci@gmail.com>
2025-07-18 11:07:17 +03:00
Mehmet YILMAZ a8900b57e6
PG18 - Strip decimal fractions from actual rows counts in normalize.sed (#8041)
Fixes #8040 

```
- Custom Scan (Citus Adaptive) (actual rows=0 loops=1)
+ Custom Scan (Citus Adaptive) (actual rows=0.00 loops=1)
```


Add a normalization rule to the pg_regress `normalize.sed` script that
strips any trailing decimal fraction from actual rows= counts (e.g.
turning `actual rows=0.00` into `actual rows=0`). This silences noise
diffs introduced by the new PostgreSQL 18 beta’s planner output.
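
A rule of roughly this shape does the stripping (hypothetical; the committed rule in `normalize.sed` may differ in detail):

```
# strip the fractional part from "actual rows=<n>.<dd>" in EXPLAIN output
s/\(actual rows=[0-9][0-9]*\)\.[0-9][0-9]*/\1/g
```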

commit b06bde5771
2025-07-17 15:38:06 +03:00
Mehmet YILMAZ 5d805eb10b
PG18 - Adapt columnar stripe metadata updates (#8030)
Fixes #8019

**Background / Problem**
- PostgreSQL 18 (commit
[a07e03f…](a07e03fd8f))
removed `heap_inplace_update()` and related helpers.
- Citus’ columnar writer relied on that API in
`UpdateStripeMetadataRow()` to patch the `columnar_stripe` catalog row
with the stripe file-offset, size, and row-count.
- Building the extension against PG 18 therefore failed at link-time
and, if stubbed out, left `file_offset = 0`, causing every insert to
abort with
`ERROR: attempted columnar write … to invalid logical offset: 0`



**Scope of This PR**

- Keep the fast-path on PG 12–17 (`heap_inplace_update()` unchanged).
- Switch to `CatalogTupleUpdate()` on PG 18+, matching core’s new
catalog-update API (see the sketch after this list).
- Bump the lock level from `AccessShareLock` → `RowExclusiveLock` when
the normal heap-update path is taken.
- No behavioral changes for users on PG ≤ 17
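
A minimal sketch of the version-guarded update in `UpdateStripeMetadataRow()`, assuming illustrative variable names (`columnarStripes` for the catalog relation and `modifiedTuple` for the updated row):

```c
#if PG_VERSION_NUM >= PG_VERSION_18
    /*
     * PG 18+: the in-place helper is gone; use the regular catalog-update
     * path, which requires RowExclusiveLock on the catalog relation.
     */
    CatalogTupleUpdate(columnarStripes, &modifiedTuple->t_self, modifiedTuple);
#else
    /* PG <= 17: keep the existing in-place fast path */
    heap_inplace_update(columnarStripes, modifiedTuple);
#endif
```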
2025-07-17 15:15:43 +03:00
Mehmet YILMAZ da24ede835
Support PostgreSQL 18’s new RTE kinds in Citus deparser (#8023)
Fixes #8020 

PostgreSQL 18 introduces two new, *pseudo* rangetable‐entry kinds that
Citus’ downstream deparser must recognize:

1. **Pulled-up shard RTE clones** (`CITUS_RTE_SHARD` with `relid ==
InvalidOid`)
2. **Grouping-step RTE** (`RTE_GROUP`, alias `*GROUP*`, not actually in
the FROM clause)

Without special handling, Citus crashes or emits invalid SQL when
running against PG 18beta1:

* **`ERROR: could not open relation with OID 0`**
Citus was unconditionally calling `relation_open(rte->relid,…)` on
entries whose `relid` is 0.
* **`ERROR: missing FROM-clause entry for table "*GROUP*"`**
Citus’ `set_rtable_names()` assigned the synthetic `*GROUP*` alias but
never printed a matching FROM item.

This PR teaches Citus’ `ruleutils_18.c` to skip catalog lookups for RTEs
without valid OIDs and to suppress the grouping-RTE alias, restoring
compatibility with both PG 17 and PG 18.

---

## Background

* **Upstream commit
[247dea8](247dea89f7)**
Introduced `RTE_GROUP` for the grouping step so that multiple subqueries
in `GROUP BY`/`HAVING` can be deduplicated and planned correctly.
* **Citus PR
[#6428](https://github.com/citusdata/citus/pull/6428)**
Added initial support for treating shard RTEs like real
relations—calling `relation_open()` to pick up renamed-column fixes.
Worked fine on PG 11–17, but PG 18’s pull-up logic clones those shard
RTEs with `relid=0`, leading to OID 0 crashes.

---

## Changes

1. **Guard `relation_open()`**
In `set_relation_column_names()`, only call `relation_open(rte->relid,
…)` when

   ```c
   OidIsValid(rte->relid)
   ```

Prevents the “could not open relation with OID 0” crash on both
pulled-up shards and synthetic RTEs.

2. **Handle pulled-up shards** (`CITUS_RTE_SHARD` with `relid=0`)
Copy column names directly from `rte->eref->colnames` instead of hitting
the catalog.

3. **Handle grouping RTE** (`RTE_GROUP`)

* **In `set_relation_column_names()`**: fallback to
`rte->eref->colnames` for `RTE_GROUP`.
   * **In `set_rtable_names()`**: explicitly assign

     ```c
     refname = NULL;  /* never show *GROUP* in FROM */
     ```

     so that no `*GROUP*` alias is ever printed.

   **Why this is required:**
PostgreSQL 18’s parser now represents the grouping step with a synthetic
RTE whose alias is always `*GROUP*`—and that RTE is **never** actually
listed in the `FROM` clause. If Citus’ deparser assigns and emits
`*GROUP*` as a table reference, the pushed-down SQL becomes:

   ```sql
SELECT *GROUP*.mygroupcol …  -- but there is no “*GROUP*” in the FROM list
   ```

   Workers then fail:

   ```
   ERROR: missing FROM-clause entry for table "*GROUP*"
   ```

By setting `refname = NULL` for `RTE_GROUP` in `set_rtable_names()`, the
deparser prints just the column name unqualified, exactly matching
upstream PG 18’s behavior and yielding valid SQL on the workers.

4. **Maintain existing behavior on PG 15–17**

* Shard RTEs *with* valid `relid` still open the catalog to pick up
renamed-column fixes.
   * No impact on other RTE kinds or versions prior to PG 18.

---
2025-07-17 13:15:31 +03:00
Mehmet YILMAZ 5005be31e6
PG18 - Handle PG18’s synthetic `RTE_GROUP` in `FindReferencedTableColumn` for correct GROUP BY pushdown (#8034)
Fixes #8032 

PostgreSQL 18 introduces a dedicated “grouping-step” range table entry
(`RTE_GROUP`) whose target columns are exactly the expressions in our
`GROUP BY` clause, rather than hiding them as `resjunk` items. In
Citus’s distributed planner, the function `FindReferencedTableColumn`
must be able to map from a `Var` referencing a grouped column back to
the underlying table column. Without special handling for `RTE_GROUP`,
queries that rely on pushdown of `GROUP BY` expressions can fail or
mis-identify their target columns.

This PR adds support for `RTE_GROUP` in Citus when built against PG 18
or later, ensuring that:

* Each grouped expression is correctly resolved.
* The pushdown planner can trace a `Var`’s `varattno` into the
corresponding `groupexprs` list.
* Existing behavior on PG < 18 is unchanged.

---

## What’s Changed

In **`src/backend/distributed/planner/multi_logical_optimizer.c`**,
inside `FindReferencedTableColumn` (the snippets below are assembled into a
single sketch after this list):

* **Under** `#if PG_VERSION_NUM >= PG_VERSION_18`
  Introduce an `else if` branch for

  ```c
  rangeTableEntry->rtekind == RTE_GROUP
  ```

* **Extraction of grouped expressions:**

  ```c
  List *groupexprs   = rangeTableEntry->groupexprs;
  AttrNumber groupIndex = candidateColumn->varattno - 1;
  ```

* **Safety check** to guard against malformed `Var` numbers:

  ```c
  if (groupIndex < 0 || groupIndex >= list_length(groupexprs))
      return;    /* malformed Var */
  ```

* **Recursive descent:**
  Fetch the corresponding expression from `groupexprs` and call

  ```c
  FindReferencedTableColumn(groupExpr, parentQueryList, query,
                            column, rteContainingReferencedColumn,
                            skipOuterVars);
  ```

so that the normal resolution logic applies to the underlying
expression.

* **Unchanged code path** for PG < 18 and for other `rtekind` values.
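
Putting the snippets above together, the new branch looks roughly like this (a sketch; the surrounding control flow and exact formatting may differ):

```c
#if PG_VERSION_NUM >= PG_VERSION_18
    else if (rangeTableEntry->rtekind == RTE_GROUP)
    {
        /* map the Var's attno back into the grouped-expression list */
        List *groupexprs = rangeTableEntry->groupexprs;
        AttrNumber groupIndex = candidateColumn->varattno - 1;

        if (groupIndex < 0 || groupIndex >= list_length(groupexprs))
        {
            return;             /* malformed Var */
        }

        /* resolve the underlying expression with the normal logic */
        Expr *groupExpr = (Expr *) list_nth(groupexprs, groupIndex);
        FindReferencedTableColumn(groupExpr, parentQueryList, query,
                                  column, rteContainingReferencedColumn,
                                  skipOuterVars);
    }
#endif
```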

---
2025-07-16 23:23:14 +03:00
Mehmet YILMAZ 9e42f3f2c4
Add PG 18Beta1 compatibility (Build + RuleUtils) (#7981)
This PR provides successful build against PG18Beta1. RuleUtils PR was
reviewed separately: #8010

## PG 18Beta1–related changes for building Citus


### TupleDesc / Attr layout

**What changed in PG:** Postgres consolidated the
`TupleDescData.attrs[]` array into a more compact representation. Direct
field access (tupdesc->attrs[i]) was replaced by the new
`TupleDescAttr()` API.

**Citus adaptation:** Everywhere we previously used
`tupdesc->attrs[...]`, we now call `TupleDescAttr(tupdesc, idx)` (or our
own `Attr()` macro) under a compatibility guard.
*
5983a4cffc

General Logic:

* Use `Attr(...)` in places where `columnar_version_compat.h` is
included. This avoids the need to sprinkle `#if PG_VERSION_NUM` guards
around each attribute access.

* Use `TupleDescAttr(tupdesc, i)` when the relevant PostgreSQL header is
already included and the additional macro indirection is unnecessary (a
minimal example of the pattern follows).
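
For illustration, the mechanical change looks like this (variable names are only illustrative):

```c
/* before: direct field access into the attrs[] array */
/* Form_pg_attribute attr = tupleDescriptor->attrs[columnIndex]; */

/* after: go through the accessor, which works on all supported versions */
Form_pg_attribute attr = TupleDescAttr(tupleDescriptor, columnIndex);
```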


### Collation‐aware `LIKE`

**What changed in PG:** The `textlike` operator now requires an explicit
collation, to avoid ambiguous‐collation errors. Core code switched from
`DirectFunctionCall2(textlike, ...)` to
`DirectFunctionCall2Coll(textlike, DEFAULT_COLLATION_OID, ...)`.

**Citus adaptation:** In `remote_commands.c` and any other LIKE call, we
now use `DirectFunctionCall2Coll(textlike, DEFAULT_COLLATION_OID, ...)`
and `#include <utils/pg_collation.h>`.
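
For example (datum variable names are illustrative):

```c
/* before: implicit collation, ambiguous on PG 18 */
/* Datum result = DirectFunctionCall2(textlike, leftDatum, rightDatum); */

/* after: pass the collation explicitly */
Datum result = DirectFunctionCall2Coll(textlike, DEFAULT_COLLATION_OID,
                                       leftDatum, rightDatum);
```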

*
85b7efa1cd

### Columnar storage API

* Adapt `columnar_relation_set_new_filelocator` (and related init
routines) for PG 18’s revised SMGR and storage-initialization hooks.
* Pull in the new headers (`explain_format.h`,
`columnar_version_compat.h`) so the columnar module compiles cleanly
against PG 18.
- heap_modify_tuple + heap_inplace_update only exist on PG < 18; on PG18
the in-place helper was removed upstream


-
a07e03fd8f

### OpenSSL / TLS integration

**What changed in PG:** Moved from the legacy `SSL_library_init()` to
`OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL)`, updated certificate
API calls (`X509_getm_notBefore`, `X509_getm_notAfter`), and
standardized on `TLS_method()`.

**Citus adaptation:** We now `#include <openssl/opensslv.h>` and use
`#if OPENSSL_VERSION_NUMBER >= 0x10100000L` to choose between
`OPENSSL_init_ssl()` and `SSL_library_init()`, and wrap
`X509_gmtime_adj()` calls around the new accessor functions.
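
The initialization guard is roughly the following (a sketch of the pattern described above):

```c
#include <openssl/opensslv.h>
#include <openssl/ssl.h>

#if OPENSSL_VERSION_NUMBER >= 0x10100000L
    /* OpenSSL 1.1.0+ / 3.x: explicit initialization, loads the config */
    OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL);
#else
    /* legacy OpenSSL < 1.1.0 */
    SSL_library_init();
#endif
```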

*
6c66b7443c


### Adapt `ExtractColumns()` to the new PG-18 `expandRTE()` signature

PostgreSQL 18
80feb727c8
added a fourth argument of type `VarReturningType` to `expandRTE()`, so
calls that used the old 7-parameter form no longer compile. This patch:

* Wraps the `expandRTE(...)` call in a `#if PG_VERSION_NUM >= 180000`
guard (see the sketch after this list).
* On PG 18+ passes the new `VAR_RETURNING_DEFAULT` argument before
`location`.
* On PG 15–17 continues to call the original 7-arg form.
* Adds the necessary includes (`parser/parse_relation.h` for `expandRTE`
and `VarReturningType`, and `pg_version_constants.h` for
`PG_VERSION_NUM`).
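
A sketch of the guarded call (the argument variables are illustrative):

```c
#if PG_VERSION_NUM >= 180000
    /* PG 18+: the VarReturningType argument comes before location */
    expandRTE(rte, rteIndex, sublevelsUp, VAR_RETURNING_DEFAULT,
              location, includeDroppedColumns, &columnNames, &columnVars);
#else
    /* PG 15–17: original 7-argument form */
    expandRTE(rte, rteIndex, sublevelsUp,
              location, includeDroppedColumns, &columnNames, &columnVars);
#endif
```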



### Adapt `ExecutorStart`/`ExecutorRun` hooks to PG-18’s new signatures

PostgreSQL 18
525392d572
changed the signatures of the executor hooks:

* `ExecutorStart_hook` now returns `bool` instead of `void`, and
* `ExecutorRun_hook` drops its old `run_once` argument.

This patch preserves Citus’s existing hook logic by:

1. **Adding two adapter functions** under `#if PG_VERSION_NUM >=
PG_VERSION_18`:

   * `citus_executor_start_adapter(QueryDesc *queryDesc, int eflags)`
Calls the old `CitusExecutorStart(queryDesc, eflags)` and then returns
`true` to satisfy the new hook’s `bool` return type.
* `citus_executor_run_adapter(QueryDesc *queryDesc, ScanDirection
direction, uint64 count)`
Calls the old `CitusExecutorRun(queryDesc, direction, count, true)`
(passing `true` for the dropped `run_once` argument), and returns
`void`.

2. **Installing the adapters** in `_PG_init()` instead of the original
hooks when building against PG 18+:

   ```c
   #if PG_VERSION_NUM >= PG_VERSION_18
       ExecutorStart_hook = citus_executor_start_adapter;
       ExecutorRun_hook   = citus_executor_run_adapter;
   #else
       ExecutorStart_hook = CitusExecutorStart;
       ExecutorRun_hook   = CitusExecutorRun;
   #endif
   ```
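
The adapters themselves are small; a sketch, reusing the existing `CitusExecutorStart`/`CitusExecutorRun` as described above:

```c
#if PG_VERSION_NUM >= PG_VERSION_18
static bool
citus_executor_start_adapter(QueryDesc *queryDesc, int eflags)
{
    CitusExecutorStart(queryDesc, eflags);
    return true;                /* new hook type expects a bool */
}

static void
citus_executor_run_adapter(QueryDesc *queryDesc, ScanDirection direction,
                           uint64 count)
{
    /* pass true for the run_once argument the new hook no longer carries */
    CitusExecutorRun(queryDesc, direction, count, true);
}
#endif
```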
   
### Adapt to PG-18’s removal of the “run\_once” flag from
ExecutorRun/PortalRun

PostgreSQL commit
[3eea7a0](3eea7a0c97)
rationalized the executor’s parallelism logic by moving the “execute a
plan only once” check into `ExecutePlan()` itself and dropping the old
`bool run_once` argument from the public APIs:

```diff
- void ExecutorRun(QueryDesc *queryDesc,
-                  ScanDirection direction,
-                  uint64 count,
-                  bool run_once);
+ void ExecutorRun(QueryDesc *queryDesc,
+                  ScanDirection direction,
+                  uint64 count);
```

(and similarly for `PortalRun()`).

To stay compatible across PG 15–18, Citus now:

1. **Updates all internal calls** to `ExecutorRun(...)` and
`PortalRun(...)`:

* On PG 18+, use the new three-argument form (`ExecutorRun(qd, dir,
count)`).
* On PG 15–17, keep the old four-arg form (`ExecutorRun(qd, dir, count,
true)`) under a `#if PG_VERSION_NUM < 180000` guard.

2. **Guards the dispatcher hooks** via the adapter functions (from the
earlier patch) so that Citus’s executor hooks continue to work under
both the old and new signatures.


### Adapt to PG-18’s shortened PortalRun signature

PostgreSQL 18’s refactoring (see commit
[3eea7a0](3eea7a0c97))
also removed the old run_once argument from the
public PortalRun() API. The signature changed from:



```diff
- bool PortalRun(Portal portal,
-                long count,
-                bool isTopLevel,
-                bool run_once,
-                DestReceiver *dest,
-                DestReceiver *altdest,
-                QueryCompletion *qc);
+ bool PortalRun(Portal portal,
+                long count,
+                bool isTopLevel,
+                DestReceiver *dest,
+                DestReceiver *altdest,
+                QueryCompletion *qc);
```

To support both versions in Citus, we:

1. **Version-guard each call** to `PortalRun()`:

   * **On PG 18+** invoke the new 6-argument form.
* **On PG 15–17** fall back to the legacy 7-argument form, passing
`true` for `run_once` (see the sketch below).
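
A sketch of a guarded call site (the portal, destination, and completion variables are illustrative):

```c
#if PG_VERSION_NUM >= PG_VERSION_18
    /* PG 18+: run_once is gone from the signature */
    (void) PortalRun(portal, FETCH_ALL, isTopLevel,
                     destReceiver, destReceiver, &queryCompletion);
#else
    /* PG 15–17: pass true for run_once, as before */
    (void) PortalRun(portal, FETCH_ALL, isTopLevel, true,
                     destReceiver, destReceiver, &queryCompletion);
#endif
```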
   
### Add support for PG-18’s new `plansource` argument in
`PortalDefineQuery`

PostgreSQL 18 extended the `PortalDefineQuery` API to carry a
`CachedPlanSource *plansource` pointer so that the portal machinery can
track cached‐plan invalidation (as introduced alongside deferred-locking
in commit
525392d572).
To remain compatible across PG 15–18, Citus now wraps its calls under a
version guard:

```diff
-   PortalDefineQuery(portal, NULL, sql, commandTag, plantree_list, NULL);
+#if PG_VERSION_NUM >= 180000
+   /* PG 18+: seven-arg signature (adds plansource) */
+   PortalDefineQuery(
+       portal,
+       NULL,            /* no prepared-stmt name */
+       sql,             /* the query text */
+       commandTag,      /* the CommandTag */
+       plantree_list,   /* List of PlannedStmt* */
+       NULL,            /* no CachedPlan */
+       NULL             /* no CachedPlanSource */
+   );
+#else
+   /* PG 15–17: six-arg signature */
+   PortalDefineQuery(
+       portal,
+       NULL,            /* no prepared-stmt name */
+       sql,             /* the query text */
+       commandTag,      /* the CommandTag */
+       plantree_list,   /* List of PlannedStmt* */
+       NULL             /* no CachedPlan */
+   );
+#endif
```


### Adapt ExecInitRangeTable() calls to PG-18’s new signature

PostgreSQL commit
[cbc127917e04a978a788b8bc9d35a70244396d5b](cbc127917e)
overhauled the planner API for range‐table initialization:

**PG 18+**: added a fourth `Bitmapset *unpruned_relids` argument to
support deferred partition pruning

In Citus’s `create_estate_for_relation()` (in `columnar_metadata.c`), we
now wrap the call in a compile‐time guard so that the code compiles
correctly on all supported PostgreSQL versions:

```
/* Prepare permission info on PG 16+ */
#if PG_VERSION_NUM >= PG_VERSION_16
    List *perminfos = NIL;
    addRTEPermissionInfo(&perminfos, rte);
#else
    List *perminfos = NIL;  /* unused on PG 15 */
#endif

/* Initialize the range table, with the right signature for each PG version */
#if PG_VERSION_NUM >= PG_VERSION_18
    /* PG 18+: four‐arg signature (adds unpruned_relids) */
    ExecInitRangeTable(
        estate,
        list_make1(rte),
        perminfos,
        NULL        /* unpruned_relids: not used by columnar */
    );
#elif PG_VERSION_NUM >= PG_VERSION_16
    /* PG 16–17: three‐arg signature (permInfos) */
    ExecInitRangeTable(
        estate,
        list_make1(rte),
        perminfos
    );
#else
    /* PG 15: two‐arg signature */
    ExecInitRangeTable(
        estate,
        list_make1(rte)
    );
#endif

estate->es_output_cid = GetCurrentCommandId(true);
```

### Adapt `pgstat_report_vacuum()` to PG-18’s new timestamp argument

PostgreSQL commit
[30a6ed0ce4bb18212ec38cdb537ea4b43bc99b83](30a6ed0ce4)
extended the `pgstat_report_vacuum()` API by adding a `TimestampTz
start_time` parameter at the end so that the VACUUM statistics collector
can record when the operation began:

```diff
/* PG ≤17: four-arg signature */
- void pgstat_report_vacuum(Oid tableoid,
-                           bool shared,
-                           double num_live_tuples,
-                           double num_dead_tuples);
+/* PG ≥18: five-arg signature adds a start_time */
+ void pgstat_report_vacuum(Oid tableoid,
+                           bool shared,
+                           double num_live_tuples,
+                           double num_dead_tuples,
+                           TimestampTz start_time);
```

To support both versions, we now wrap the call in `columnar_tableam.c`
with a version guard, supplying `GetCurrentTimestamp()` for PG-18+:

```c
#if PG_VERSION_NUM >= 180000
    /* PG 18+: include start_timestamp */
    pgstat_report_vacuum(
        RelationGetRelid(rel),
        rel->rd_rel->relisshared,
        Max(new_live_tuples, 0),  /* live tuples */
        0,                        /* dead tuples */
        GetCurrentTimestamp()     /* start time */
    );
#else
    /* PG 15–17: original signature */
    pgstat_report_vacuum(
        RelationGetRelid(rel),
        rel->rd_rel->relisshared,
        Max(new_live_tuples, 0),  /* live tuples */
        0                         /* dead tuples */
    );
#endif
```


### Adapt `ExecuteTaskPlan()` to PG-18’s expanded `CreateQueryDesc()`
signature

PostgreSQL 18 changed `CreateQueryDesc()` from an eight-argument to a
nine-argument call by inserting a `CachedPlan *cplan` parameter
immediately after the `PlannedStmt *plannedstmt` argument (see commit
525392d572).
To remain compatible with PG 15–17, Citus now wraps its invocation in
`local_executor.c` with a version guard:

```diff
-    /* PG15–17: eight-arg CreateQueryDesc without cached plan */
-    QueryDesc *queryDesc = CreateQueryDesc(
-        taskPlan,           /* PlannedStmt *plannedstmt */
-        queryString,        /* const char *sourceText */
-        GetActiveSnapshot(),/* Snapshot snapshot */
-        InvalidSnapshot,    /* Snapshot crosscheck_snapshot */
-        destReceiver,       /* DestReceiver *dest */
-        paramListInfo,      /* ParamListInfo params */
-        queryEnv,           /* QueryEnvironment *queryEnv */
-        0                   /* int instrument_options */
-    );
+#if PG_VERSION_NUM >= 180000
+    /* PG18+: nine-arg CreateQueryDesc with a CachedPlan slot */
+    QueryDesc *queryDesc = CreateQueryDesc(
+        taskPlan,           /* PlannedStmt *plannedstmt */
+        NULL,               /* CachedPlan *cplan (none) */
+        queryString,        /* const char *sourceText */
+        GetActiveSnapshot(),/* Snapshot snapshot */
+        InvalidSnapshot,    /* Snapshot crosscheck_snapshot */
+        destReceiver,       /* DestReceiver *dest */
+        paramListInfo,      /* ParamListInfo params */
+        queryEnv,           /* QueryEnvironment *queryEnv */
+        0                   /* int instrument_options */
+    );
+#else
+    /* PG15–17: eight-arg CreateQueryDesc without cached plan */
+    QueryDesc *queryDesc = CreateQueryDesc(
+        taskPlan,           /* PlannedStmt *plannedstmt */
+        queryString,        /* const char *sourceText */
+        GetActiveSnapshot(),/* Snapshot snapshot */
+        InvalidSnapshot,    /* Snapshot crosscheck_snapshot */
+        destReceiver,       /* DestReceiver *dest */
+        paramListInfo,      /* ParamListInfo params */
+        queryEnv,           /* QueryEnvironment *queryEnv */
+        0                   /* int instrument_options */
+    );
+#endif
```



### Adapt `RelationGetPrimaryKeyIndex()` to PG-18’s new “deferrable\_ok”
flag

PostgreSQL commit
14e87ffa5c
added a new Boolean `deferrable_ok` parameter to
`RelationGetPrimaryKeyIndex()` so that the lock manager can defer
unique‐constraint locks when requested. The API changed from:

```c
RelationGetPrimaryKeyIndex(Relation relation)
```

to:

```c
RelationGetPrimaryKeyIndex(Relation relation, bool deferrable_ok)
 ```
                
```diff
diff --git a/src/backend/distributed/metadata/node_metadata.c
b/src/backend/distributed/metadata/node_metadata.c
index e3a1b2c..f4d5e6f 100644
--- a/src/backend/distributed/metadata/node_metadata.c
+++ b/src/backend/distributed/metadata/node_metadata.c
@@ -2965,8 +2965,18 @@
     */
- Relation replicaIndex =
index_open(RelationGetPrimaryKeyIndex(pgDistNode),
-                                      AccessShareLock);
+    #if PG_VERSION_NUM >= PG_VERSION_18
+        /* PG 18+ adds a bool "deferrable_ok" parameter */
+        Relation replicaIndex =
+            index_open(
+                RelationGetPrimaryKeyIndex(pgDistNode, false),
+                AccessShareLock);
+    #else
+        Relation replicaIndex =
+            index_open(
+                RelationGetPrimaryKeyIndex(pgDistNode),
+                AccessShareLock);
+    #endif

     ScanKeyInit(&scanKey[0], Anum_pg_dist_node_nodename,
BTEqualStrategyNumber, F_TEXTEQ, CStringGetTextDatum(nodeName));

```
  
  ```diff
  diff --git a/src/backend/distributed/operations/node_protocol.c b/src/backend/distributed/operations/node_protocol.c
index e3a1b2c..f4d5e6f 100644
--- a/src/backend/distributed/operations/node_protocol.c
+++ b/src/backend/distributed/operations/node_protocol.c
@@ -746,7 +746,12 @@
     if (!OidIsValid(idxoid))
     {
-        idxoid = RelationGetPrimaryKeyIndex(rel);
+        /* Determine the index OID of the primary key (PG18 adds a second parameter) */
+#if PG_VERSION_NUM >= PG_VERSION_18
+        idxoid = RelationGetPrimaryKeyIndex(rel, false);
+#else
+        idxoid = RelationGetPrimaryKeyIndex(rel);
+#endif
     }

     return idxoid;

```
  
Because Citus has always taken the lock immediately—just as the old
two-arg call did—we pass `false` to keep that same immediate-lock
behavior. Passing `true` would switch to deferred locking, which we
don’t want.



### Adapt `ExplainOnePlan()` to PG-18’s expanded API

PostgreSQL 18 (commit
525392d572)
extended the `ExplainOnePlan()` function to carry the `CachedPlan *` and
`CachedPlanSource *` pointers plus an explicit `query_index`, letting
the EXPLAIN machinery track plan‐source invalidation. The old signature:

```c
/* PG ≤17 */
void
ExplainOnePlan(PlannedStmt *plannedstmt,
               IntoClause *into,
               struct ExplainState *es,
               const char *queryString,
               ParamListInfo params,
               QueryEnvironment *queryEnv,
               const instr_time *planduration,
               const BufferUsage *bufusage);
```

became, in PG 18:

```c
/* PG ≥18 */
void
ExplainOnePlan(PlannedStmt *plannedstmt,
               CachedPlan   *cplan,
               CachedPlanSource *plansource,
               int            query_index,
               IntoClause    *into,
               struct ExplainState *es,
               const char   *queryString,
               ParamListInfo params,
               QueryEnvironment *queryEnv,
               const instr_time *planduration,
               const BufferUsage *bufusage,
               const MemoryContextCounters *mem_counters);
```

To compile under both versions, Citus now wraps each call in
`multi_explain.c` with:

```c
#if PG_VERSION_NUM >= PG_VERSION_18
    /* PG 18+: pass NULL for the new cached‐plan fields and zero for query_index */
    ExplainOnePlan(
        plan,         /* PlannedStmt *plannedstmt */
        NULL,         /* CachedPlan *cplan */
        NULL,         /* CachedPlanSource *plansource */
        0,            /* query_index */
        into,         /* IntoClause *into */
        es,           /* ExplainState *es */
        queryString,  /* const char *queryString */
        params,       /* ParamListInfo params */
        NULL,         /* QueryEnvironment *queryEnv */
        &planduration,/* const instr_time *planduration */
        (es->buffers ? &bufusage : NULL),
        (es->memory  ? &mem_counters : NULL)
    );
#elif PG_VERSION_NUM >= PG_VERSION_17
    /* PG 17: same as before, plus passing mem_counters if enabled */
    ExplainOnePlan(
        plan,
        into,
        es,
        queryString,
        params,
        queryEnv,
        &planduration,
        (es->buffers ? &bufusage : NULL),
        (es->memory ? &mem_counters : NULL)
    );
#else
    /* PG 15–16: original seven-arg form */
    ExplainOnePlan(
        plan,
        into,
        es,
        queryString,
        params,
        queryEnv,
        &planduration,
        (es->buffers ? &bufusage : NULL)
    );
#endif
```


### Adapt to the unified “index interpretation” API in PG 18 (commit
a8025f544854)

PostgreSQL commit
a8025f5448
generalized the old btree‐specific operator‐interpretation API into a
single “index interpretation” interface:

* **Renamed type**:
  `OpBtreeInterpretation` → `OpIndexInterpretation`
* **Renamed function**:
`get_op_btree_interpretation(opno)` →
`get_op_index_interpretation(opno)`
* **Unified field**:
  Each interpretation now carries `cmptype` instead of `strategy`.

To build cleanly on PG 18 while still supporting PG 15–17, Citus’s
shard‐pruning code now wraps these changes:

```c
#include "pg_version_constants.h"

#if PG_VERSION_NUM >= PG_VERSION_18
/* On PG 18+ the btree‐only APIs vanished; alias them to the new generic versions */
typedef OpIndexInterpretation OpBtreeInterpretation;
#define get_op_btree_interpretation(opno)  get_op_index_interpretation(opno)
#define ROWCOMPARE_NE  COMPARE_NE
#endif

/* … later, when checking an interpretation … */
OpBtreeInterpretation *interp =
    (OpBtreeInterpretation *) lfirst(cell);

#if PG_VERSION_NUM >= PG_VERSION_18
    /* use cmptype on PG 18+ */
    if (interp->cmptype == ROWCOMPARE_NE)
#else
    /* use strategy on PG 15–17 */
    if (interp->strategy == ROWCOMPARE_NE)
#endif
{
    /* … */
}
```


### Adapt `create_foreignscan_path()` for PG-18’s revised signature

PostgreSQL commit
e222534679
reordered and removed a couple of parameters in the FDW‐path builder:

* **PG 15–17 signature (11 args)**

  ```c
  create_foreignscan_path(PlannerInfo   *root,
                          RelOptInfo    *rel,
                          PathTarget    *target,
                          double         rows,
                          Cost           startup_cost,
                          Cost           total_cost,
                          List          *pathkeys,
                          Relids         required_outer,
                          Path          *fdw_outerpath,
                          List          *fdw_restrictinfo,
                          List          *fdw_private);
  ```
* **PG 18+ signature (9 args)**

  ```c
  create_foreignscan_path(PlannerInfo   *root,
                          RelOptInfo    *rel,
                          PathTarget    *target,
                          double         rows,
                          int            disabled_nodes,
                          Cost           startup_cost,
                          Cost           total_cost,
                          Relids         required_outer,
                          Path          *fdw_outerpath,
                          List          *fdw_private);
  ```

To support both, Citus now defines a compatibility macro in
`pg_version_compat.h`:

```c
#include "nodes/bitmapset.h"   /* for Relids */
#include "nodes/pg_list.h"     /* for List */
#include "optimizer/pathnode.h" /* for create_foreignscan_path() */

#if PG_VERSION_NUM >= PG_VERSION_18

/* PG18+: drop pathkeys & fdw_restrictinfo, add disabled_nodes */
#define create_foreignscan_path_compat(a, b, c, d, e, f, g, h, i, j, k) \
    create_foreignscan_path(                                            \
        (a),          /* root */                                       \
        (b),          /* rel */                                        \
        (c),          /* target */                                     \
        (d),          /* rows */                                       \
        (0),          /* disabled_nodes (unused by Citus) */           \
        (e),          /* startup_cost */                              \
        (f),          /* total_cost */                                \
        (g),          /* required_outer */                            \
        (h),          /* fdw_outerpath */                             \
        (k)           /* fdw_private */                               \
    )

#else

/* PG15–17: original signature */
#define create_foreignscan_path_compat(a, b, c, d, e, f, g, h, i, j, k) \
    create_foreignscan_path(                                            \
        (a), (b), (c), (d),                                            \
        (e), (f),                                                      \
        (g), (h), (i), (j), (k)                                        \
    )
#endif
```

Now every call to `create_foreignscan_path_compat(...)`—even in tests
like `fake_fdw.c`—automatically picks the correct argument list for
PG 15 through PG 18.



### Drop the obsolete bitmap‐scan hooks on PG 18+

PostgreSQL commit
c3953226a0
cleaned up the `TableAmRoutine` API by removing the two bitmap‐scan
callback slots:

* `scan_bitmap_next_block`
* `scan_bitmap_next_tuple`

Since those hook‐slots no longer exist in PG 18, Citus now wraps their
NULL‐initialization in a `#if PG_VERSION_NUM < PG_VERSION_18` guard. On
PG 15–17 we still explicitly set them to `NULL` (to satisfy the old
struct layout), and on PG 18+ we omit them entirely:

```c

#if PG_VERSION_NUM < PG_VERSION_18
    /* PG 15–17 only: these fields were removed upstream in PG 18 */
    .scan_bitmap_next_block = NULL,
    .scan_bitmap_next_tuple = NULL,
#endif


```


### Adapt `vac_update_relstats()` invocation to PG-18’s new
“all\_frozen” argument

PostgreSQL commit
99f8f3fbbc
extended the `vac_update_relstats()` API by inserting a
`num_all_frozen_pages` parameter between the existing
`num_all_visible_pages` and `hasindex` arguments:

```diff
- /* PG ≤17: */
- void
- vac_update_relstats(Relation relation,
-                    BlockNumber num_pages,
-                    double     num_tuples,
-                    BlockNumber num_all_visible_pages,
-                    bool       hasindex,
-                    TransactionId frozenxid,
-                    MultiXactId  minmulti,
-                    bool      *frozenxid_updated,
-                    bool      *minmulti_updated,
-                    bool       in_outer_xact);
+ /* PG ≥18: adds num_all_frozen_pages */
+ void
+ vac_update_relstats(Relation    relation,
+                    BlockNumber num_pages,
+                    double      num_tuples,
+                    BlockNumber num_all_visible_pages,
+                    BlockNumber num_all_frozen_pages,
+                    bool        hasindex,
+                    TransactionId frozenxid,
+                    MultiXactId  minmulti,
+                    bool      *frozenxid_updated,
+                    bool      *minmulti_updated,
+                    bool       in_outer_xact);
```

To compile cleanly on both PG 15–17 and PG 18+, Citus wraps its call in
a version guard and supplies a zero placeholder for the new field:

```c
#if PG_VERSION_NUM >= 180000
    /* PG 18+: supply explicit “all_frozen” count */
    vac_update_relstats(
        rel,
        new_rel_pages,
        new_live_tuples,
        new_rel_allvisible,    /* allvisible */
        0,                     /* all_frozen */
        nindexes > 0,
        newRelFrozenXid,
        newRelminMxid,
        &frozenxid_updated,
        &minmulti_updated,
        false                  /* in_outer_xact */
    );
#else
    /* PG 15–17: original signature */
    vac_update_relstats(
        rel,
        new_rel_pages,
        new_live_tuples,
        new_rel_allvisible,
        nindexes > 0,
        newRelFrozenXid,
        newRelminMxid,
        &frozenxid_updated,
        &minmulti_updated,
        false                  /* in_outer_xact */
    );
#endif
```

**Why all_frozen = 0?**
Columnar storage never embeds transaction IDs in its pages, so it never
needs to track “all‐frozen” pages the way a heap does. Setting both
allvisible and allfrozen to zero simply tells Postgres “there are no
pages with the visibility or frozen‐status bits set,” matching our
existing behavior.

This change ensures Citus’s VACUUM‐statistic updates work unmodified
across all supported Postgres versions.
2025-07-16 15:30:41 +03:00
Ibrahim Halatci 7e4dd604d0 typo 2025-07-02 15:53:35 +03:00
Ibrahim Halatci d0644a9884 results of extension compatibility testing 2025-07-02 15:51:22 +03:00
Ibrahim Halatci 37ad5e6958 results of extension compatibility testing 2025-07-02 15:46:12 +03:00
dependabot[bot] 5deaf9a616
Bump werkzeug from 2.3.7 to 3.0.6 in /src/test/regress (#8003)
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.3.7 to
3.0.6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/releases">werkzeug's
releases</a>.</em></p>
<blockquote>
<h2>3.0.6</h2>
<p>This is the Werkzeug 3.0.6 security fix release, which fixes security
issues but does not otherwise change behavior and should not result in
breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.6/">https://pypi.org/project/Werkzeug/3.0.6/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/stable/changes/#version-3-0-6">https://werkzeug.palletsprojects.com/en/stable/changes/#version-3-0-6</a></p>
<ul>
<li>Fix how <code>max_form_memory_size</code> is applied when parsing
large non-file fields. <a
href="https://github.com/advisories/GHSA-q34m-jh98-gwm2">GHSA-q34m-jh98-gwm2</a></li>
<li><code>safe_join</code> catches certain paths on Windows that were
not caught by <code>ntpath.isabs</code> on Python &lt; 3.11. <a
href="https://github.com/advisories/GHSA-f9vj-2wh5-fj8j">GHSA-f9vj-2wh5-fj8j</a></li>
</ul>
<h2>3.0.5</h2>
<p>This is the Werkzeug 3.0.5 fix release, which fixes bugs but does not
otherwise change behavior and should not result in breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.5/">https://pypi.org/project/Werkzeug/3.0.5/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/stable/changes/#version-3-0-5">https://werkzeug.palletsprojects.com/en/stable/changes/#version-3-0-5</a>
Milestone: <a
href="https://github.com/pallets/werkzeug/milestone/37?closed=1">https://github.com/pallets/werkzeug/milestone/37?closed=1</a></p>
<ul>
<li>The Watchdog reloader ignores file closed no write events. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2945">#2945</a></li>
<li>Logging works with client addresses containing an IPv6 scope. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2952">#2952</a></li>
<li>Ignore invalid authorization parameters. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2955">#2955</a></li>
<li>Improve type annotation fore <code>SharedDataMiddleware</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2958">#2958</a></li>
<li>Compatibility with Python 3.13 when generating debugger pin and the
current UID does not have an associated name. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2957">#2957</a></li>
</ul>
<h2>3.0.4</h2>
<p>This is the Werkzeug 3.0.4 fix release, which fixes bugs but does not
otherwise change behavior and should not result in breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.4/">https://pypi.org/project/Werkzeug/3.0.4/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-4">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-4</a>
Milestone: <a
href="https://github.com/pallets/werkzeug/milestone/36?closed=1">https://github.com/pallets/werkzeug/milestone/36?closed=1</a></p>
<ul>
<li>Restore behavior where parsing
<code>multipart/x-www-form-urlencoded</code> data with
invalid UTF-8 bytes in the body results in no form data parsed rather
than a
413 error. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2930">#2930</a></li>
<li>Improve <code>parse_options_header</code> performance when parsing
unterminated
quoted string values. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2904">#2904</a></li>
<li>Debugger pin auth is synchronized across threads/processes when
tracking
failed entries. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2916">#2916</a></li>
<li>Dev server handles unexpected <code>SSLEOFError</code> due to issue
in Python &lt; 3.13.
<a
href="https://redirect.github.com/pallets/werkzeug/issues/2926">#2926</a></li>
<li>Debugger pin auth works when the URL already contains a query
string.
<a
href="https://redirect.github.com/pallets/werkzeug/issues/2918">#2918</a></li>
</ul>
<h2>3.0.3</h2>
<p>This is the Werkzeug 3.0.3 security release, which fixes security
issues and bugs but does not otherwise change behavior and should not
result in breaking changes.</p>
<p>PyPI: <a
href="https://pypi.org/project/Werkzeug/3.0.3/">https://pypi.org/project/Werkzeug/3.0.3/</a>
Changes: <a
href="https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3">https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3</a>
Milestone: <a
href="https://github.com/pallets/werkzeug/milestone/35?closed=1">https://github.com/pallets/werkzeug/milestone/35?closed=1</a></p>
<ul>
<li>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified hostname when running the dev
server, to make debugger requests. Additional hosts can be added by
using the debugger middleware directly. The debugger UI makes requests
using the full URL rather than only the path. GHSA-2g68-c3qc-8985</li>
<li>Make reloader more robust when <code>&quot;&quot;</code> is in
<code>sys.path</code>. <a
href="https://redirect.github.com/pallets/werkzeug/issues/2823">#2823</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pallets/werkzeug/blob/main/CHANGES.rst">werkzeug's
changelog</a>.</em></p>
<blockquote>
<h2>Version 3.0.6</h2>
<p>Released 2024-10-25</p>
<ul>
<li>Fix how <code>max_form_memory_size</code> is applied when parsing
large non-file
fields. :ghsa:<code>q34m-jh98-gwm2</code></li>
<li><code>safe_join</code> catches certain paths on Windows that were
not caught by
<code>ntpath.isabs</code> on Python &lt; 3.11.
:ghsa:<code>f9vj-2wh5-fj8j</code></li>
</ul>
<h2>Version 3.0.5</h2>
<p>Released 2024-10-24</p>
<ul>
<li>The Watchdog reloader ignores file closed no write events.
:issue:<code>2945</code></li>
<li>Logging works with client addresses containing an IPv6 scope
:issue:<code>2952</code></li>
<li>Ignore invalid authorization parameters.
:issue:<code>2955</code></li>
<li>Improve type annotation fore <code>SharedDataMiddleware</code>.
:issue:<code>2958</code></li>
<li>Compatibility with Python 3.13 when generating debugger pin and the
current
UID does not have an associated name. :issue:<code>2957</code></li>
</ul>
<h2>Version 3.0.4</h2>
<p>Released 2024-08-21</p>
<ul>
<li>Restore behavior where parsing
<code>multipart/x-www-form-urlencoded</code> data with
invalid UTF-8 bytes in the body results in no form data parsed rather
than a
413 error. :issue:<code>2930</code></li>
<li>Improve <code>parse_options_header</code> performance when parsing
unterminated
quoted string values. :issue:<code>2904</code></li>
<li>Debugger pin auth is synchronized across threads/processes when
tracking
failed entries. :issue:<code>2916</code></li>
<li>Dev server handles unexpected <code>SSLEOFError</code> due to issue
in Python &lt; 3.13.
:issue:<code>2926</code></li>
<li>Debugger pin auth works when the URL already contains a query
string.
:issue:<code>2918</code></li>
</ul>
<h2>Version 3.0.3</h2>
<p>Released 2024-05-05</p>
<ul>
<li>Only allow <code>localhost</code>, <code>.localhost</code>,
<code>127.0.0.1</code>, or the specified
hostname when running the dev server, to make debugger requests.
Additional
hosts can be added by using the debugger middleware directly. The
debugger</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="5eaefc3996"><code>5eaefc3</code></a>
release version 3.0.6</li>
<li><a
href="2767bcb10a"><code>2767bcb</code></a>
Merge commit from fork</li>
<li><a
href="87cc78a25f"><code>87cc78a</code></a>
catch special absolute path on Windows Python &lt; 3.11</li>
<li><a
href="50cfeebcb0"><code>50cfeeb</code></a>
Merge commit from fork</li>
<li><a
href="8760275afb"><code>8760275</code></a>
apply max_form_memory_size another level up in the parser</li>
<li><a
href="8d6a12e2af"><code>8d6a12e</code></a>
start version 3.0.6</li>
<li><a
href="a7b121abc7"><code>a7b121a</code></a>
release version 3.0.5 (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2961">#2961</a>)</li>
<li><a
href="9caf72ac06"><code>9caf72a</code></a>
release version 3.0.5</li>
<li><a
href="e28a2451e9"><code>e28a245</code></a>
catch OSError from getpass.getuser (<a
href="https://redirect.github.com/pallets/werkzeug/issues/2960">#2960</a>)</li>
<li><a
href="e6b4cce97e"><code>e6b4cce</code></a>
catch OSError from getpass.getuser</li>
<li>Additional commits viewable in <a
href="https://github.com/pallets/werkzeug/compare/2.3.7...3.0.6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=werkzeug&package-manager=pip&previous-version=2.3.7&new-version=3.0.6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.


---


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-26 18:30:16 +03:00
dependabot[bot] c36072064a
Bump cryptography from 42.0.3 to 44.0.1 in /.devcontainer/src/test/regress (#8038)
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.3
to 44.0.1.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst">cryptography's
changelog</a>.</em></p>
<blockquote>
<p>44.0.1 - 2025-02-11</p>
<pre><code>
* Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL
3.4.1.
* We now build ``armv7l`` ``manylinux`` wheels and publish them to PyPI.
* We now build ``manylinux_2_34`` wheels and publish them to PyPI.
<p>.. _v44-0-0:</p>
<p>44.0.0 - 2024-11-27
</code></pre></p>
<ul>
<li><strong>BACKWARDS INCOMPATIBLE:</strong> Dropped support for
LibreSSL &lt; 3.9.</li>
<li>Deprecated Python 3.7 support. Python 3.7 is no longer supported by
the
Python core team. Support for Python 3.7 will be removed in a future
<code>cryptography</code> release.</li>
<li>Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL
3.4.0.</li>
<li>macOS wheels are now built against the macOS 10.13 SDK. Users on
older
versions of macOS should upgrade, or they will need to build
<code>cryptography</code> themselves.</li>
<li>Enforce the :rfc:<code>5280</code> requirement that extended key
usage extensions must
not be empty.</li>
<li>Added support for timestamp extraction to the
:class:<code>~cryptography.fernet.MultiFernet</code> class.</li>
<li>Relax the Authority Key Identifier requirements on root CA
certificates
during X.509 verification to allow fields permitted by
:rfc:<code>5280</code> but
forbidden by the CA/Browser BRs.</li>
<li>Added support for
:class:<code>~cryptography.hazmat.primitives.kdf.argon2.Argon2id</code>
when using OpenSSL 3.2.0+.</li>
<li>Added support for the
:class:<code>~cryptography.x509.Admissions</code> certificate
extension.</li>
<li>Added basic support for PKCS7 decryption (including S/MIME 3.2) via

:func:<code>~cryptography.hazmat.primitives.serialization.pkcs7.pkcs7_decrypt_der</code>,

:func:<code>~cryptography.hazmat.primitives.serialization.pkcs7.pkcs7_decrypt_pem</code>,
and

:func:<code>~cryptography.hazmat.primitives.serialization.pkcs7.pkcs7_decrypt_smime</code>.</li>
</ul>
<p>.. _v43-0-3:</p>
<p>43.0.3 - 2024-10-18</p>
<pre><code>
* Fixed release metadata for ``cryptography-vectors``
<p>.. _v43-0-2:</p>
<p>43.0.2 - 2024-10-18
</code></pre></p>
<ul>
<li>Fixed compilation when using LibreSSL 4.0.0.</li>
</ul>
<p>.. _v43-0-1:</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="adaaaed77d"><code>adaaaed</code></a>
Bump for 44.0.1 release (<a
href="https://redirect.github.com/pyca/cryptography/issues/12441">#12441</a>)</li>
<li><a
href="ccc61dabe3"><code>ccc61da</code></a>
[backport] test and build on armv7l (<a
href="https://redirect.github.com/pyca/cryptography/issues/12420">#12420</a>)
(<a
href="https://redirect.github.com/pyca/cryptography/issues/12431">#12431</a>)</li>
<li><a
href="f299a48153"><code>f299a48</code></a>
remove deprecated call (<a
href="https://redirect.github.com/pyca/cryptography/issues/12052">#12052</a>)</li>
<li><a
href="439eb0594a"><code>439eb05</code></a>
Bump version for 44.0.0 (<a
href="https://redirect.github.com/pyca/cryptography/issues/12051">#12051</a>)</li>
<li><a
href="2c5ad4d8dc"><code>2c5ad4d</code></a>
chore(deps): bump maturin from 1.7.4 to 1.7.5 in /.github/requirements
(<a
href="https://redirect.github.com/pyca/cryptography/issues/12050">#12050</a>)</li>
<li><a
href="d23968addd"><code>d23968a</code></a>
chore(deps): bump libc from 0.2.165 to 0.2.166 (<a
href="https://redirect.github.com/pyca/cryptography/issues/12049">#12049</a>)</li>
<li><a
href="133c0e02ed"><code>133c0e0</code></a>
Bump x509-limbo and/or wycheproof in CI (<a
href="https://redirect.github.com/pyca/cryptography/issues/12047">#12047</a>)</li>
<li><a
href="f2259d7aa0"><code>f2259d7</code></a>
Bump BoringSSL and/or OpenSSL in CI (<a
href="https://redirect.github.com/pyca/cryptography/issues/12046">#12046</a>)</li>
<li><a
href="e201c870b8"><code>e201c87</code></a>
fixed metadata in changelog (<a
href="https://redirect.github.com/pyca/cryptography/issues/12044">#12044</a>)</li>
<li><a
href="c6104cc366"><code>c6104cc</code></a>
Prohibit Python 3.9.0, 3.9.1 -- they have a bug that causes errors (<a
href="https://redirect.github.com/pyca/cryptography/issues/12045">#12045</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/pyca/cryptography/compare/42.0.3...44.0.1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=cryptography&package-manager=pip&previous-version=42.0.3&new-version=44.0.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.


---


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-26 17:28:51 +03:00
dependabot[bot] c350c7be46
Bump tornado from 6.4 to 6.5 in /.devcontainer/src/test/regress (#8037)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4 to 6.5.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst">tornado's
changelog</a>.</em></p>
<blockquote>
<h1>Release notes</h1>
<p>.. toctree::
:maxdepth: 2</p>
<p>releases/v6.5.1
releases/v6.5.0
releases/v6.4.2
releases/v6.4.1
releases/v6.4.0
releases/v6.3.3
releases/v6.3.2
releases/v6.3.1
releases/v6.3.0
releases/v6.2.0
releases/v6.1.0
releases/v6.0.4
releases/v6.0.3
releases/v6.0.2
releases/v6.0.1
releases/v6.0.0
releases/v5.1.1
releases/v5.1.0
releases/v5.0.2
releases/v5.0.1
releases/v5.0.0
releases/v4.5.3
releases/v4.5.2
releases/v4.5.1
releases/v4.5.0
releases/v4.4.3
releases/v4.4.2
releases/v4.4.1
releases/v4.4.0
releases/v4.3.0
releases/v4.2.1
releases/v4.2.0
releases/v4.1.0
releases/v4.0.2
releases/v4.0.1
releases/v4.0.0
releases/v3.2.2
releases/v3.2.1
releases/v3.2.0
releases/v3.1.1
releases/v3.1.0
releases/v3.0.2
releases/v3.0.1
releases/v3.0.0</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="ab5f354312"><code>ab5f354</code></a>
Merge pull request <a
href="https://redirect.github.com/tornadoweb/tornado/issues/3498">#3498</a>
from bdarnell/final-6.5</li>
<li><a
href="3623024dfc"><code>3623024</code></a>
Final release notes for 6.5.0</li>
<li><a
href="b39b892bf7"><code>b39b892</code></a>
Merge pull request <a
href="https://redirect.github.com/tornadoweb/tornado/issues/3497">#3497</a>
from bdarnell/multipart-log-spam</li>
<li><a
href="cc61050e8f"><code>cc61050</code></a>
httputil: Raise errors instead of logging in multipart/form-data
parsing</li>
<li><a
href="ae4a4e4fea"><code>ae4a4e4</code></a>
asyncio: Preserve contextvars across SelectorThread on Windows (<a
href="https://redirect.github.com/tornadoweb/tornado/issues/3479">#3479</a>)</li>
<li><a
href="197ff13f76"><code>197ff13</code></a>
Merge pull request <a
href="https://redirect.github.com/tornadoweb/tornado/issues/3496">#3496</a>
from bdarnell/undeprecate-set-event-loop</li>
<li><a
href="c3d906c4ad"><code>c3d906c</code></a>
requirements: Upgrade tox to 4.26.0</li>
<li><a
href="a83897732e"><code>a838977</code></a>
testing: Remove deprecation warning filter for set_event_loop</li>
<li><a
href="d8e0d36eba"><code>d8e0d36</code></a>
build: Fix free-threaded build, mark speedups module as no-GIL</li>
<li><a
href="bfe7489485"><code>bfe7489</code></a>
Merge pull request <a
href="https://redirect.github.com/tornadoweb/tornado/issues/3492">#3492</a>
from bdarnell/relnotes-6.5</li>
<li>Additional commits viewable in <a
href="https://github.com/tornadoweb/tornado/compare/v6.4.0...v6.5.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=tornado&package-manager=pip&previous-version=6.4&new-version=6.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.


---


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-26 17:00:18 +03:00
ibrahim halatci 8587de850b
Filter out upload coverage action for PRs from forks (#8033)
Add a condition to skip the coverage upload action for PRs from forks, since
the necessary secret is not available to them and the resulting failure
breaks the whole pipeline.
2025-06-26 14:12:39 +03:00
naisila 4cd8bb1b67 Bump Citus version to 13.2devel 2025-06-24 16:21:48 +02:00
naisila 4456913801 Add Changelog entries for 13.1.0, 13.0.4, 12.1.8
13.1.0 https://github.com/citusdata/citus/pull/8006
13.0.4 https://github.com/citusdata/citus/pull/8005
12.1.8 https://github.com/citusdata/citus/pull/8004
2025-06-24 16:21:48 +02:00
Onur Tirtir 55a0d1f730
Add skip_qualify_public param to shard_name() to allow qualifying for "public" schema (#8014)
DESCRIPTION: Adds skip_qualify_public param to `shard_name()` UDF to
allow qualifying for "public" schema when needed.
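
For illustration, a hedged sketch of how the new parameter might be called; the shard ID, the flag's default value, and the example outputs are assumptions rather than details taken from the PR:

```sql
-- Hypothetical example: 102008 is a made-up shard ID.
-- Default call, as before this change:
SELECT shard_name('public.my_table'::regclass, 102008);

-- New parameter: the flag controls whether the "public" schema prefix is
-- skipped in the returned name (the default value and the exact output,
-- e.g. my_table_102008 vs. public.my_table_102008, are assumptions here).
SELECT shard_name('public.my_table'::regclass, 102008, skip_qualify_public := false);
```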
2025-06-02 10:15:32 +03:00
dependabot[bot] 5e37fe0c46
Bump cryptography from 42.0.3 to 44.0.1 in /src/test/regress (#7996)
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.3
to 44.0.1.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-28 20:48:29 +03:00
dependabot[bot] e8c3179b4d
Bump tornado from 6.4.2 to 6.5.1 in /src/test/regress (#8001)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.2 to
6.5.1.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-28 17:45:11 +03:00
dependabot[bot] 92dc7f36fc
Bump jinja2 from 3.1.3 to 3.1.6 in /src/test/regress (#8002)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.6.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-28 17:01:42 +03:00
dependabot[bot] 98d95a9b9d
Bump jinja2 from 3.1.3 to 3.1.6 in /.devcontainer/src/test/regress (#7995)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.6.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-28 15:36:56 +03:00
dependabot[bot] c7f5e2b975
Bump tornado from 6.4 to 6.4.2 in /src/test/regress (#7984)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4 to
6.4.2.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: ibrahim halatci <ihalatci@gmail.com>
2025-05-26 10:59:59 +03:00
ibrahim halatci 282523549e
Bumped codeql version to v3 (#7999)
DESCRIPTION: Bumped codeql version to v3
2025-05-23 14:13:33 +03:00
Naisila Puka c98341e4ed
Bump PG versions to 17.5, 16.9, 15.13 (#7986)
Nontrivial bump because of the following PG 15.13 commit:
https://github.com/postgres/postgres/commit/317aba70e

Previously, when views were converted to RTE_SUBQUERY, the relid
would be cleared in PG 15. With this patch, the relid is retained.
Therefore, we add a check on the relkind and rtekind to identify
the converted views in 15.13.
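
To make the "converted view" scenario concrete, here is a hedged SQL sketch; the table and view names are made up, and it only illustrates the situation in which the planner turns a view over a distributed table into a subquery RTE that Citus must still recognize:

```sql
-- Hypothetical example: a plain view on top of a distributed table.
CREATE TABLE orders (order_id bigint, customer_id bigint);
SELECT create_distributed_table('orders', 'customer_id');

CREATE VIEW recent_orders AS
    SELECT * FROM orders WHERE order_id > 1000;

-- While planning this query, PostgreSQL rewrites recent_orders into an
-- RTE_SUBQUERY; as of 15.13 that RTE keeps the view's relid, so Citus
-- checks relkind/rtekind instead of relying on a cleared relid.
SELECT count(*) FROM recent_orders;
```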

Sister PR: https://github.com/citusdata/the-process/pull/164
Using the dev image SHA because I encountered the libpq
symlink issue again with "-v219b87c".
2025-05-22 14:08:03 +02:00
Onur Tirtir 8d2fbca8ef
Fix unsafe memory access in citus_unmark_object_distributed() (#7985)
_Since we've never released a Citus version that contains the commit
that introduced this bug (see #7461), we don't need a DESCRIPTION line
that shows up in the release changelog._

From the 8 valgrind test targets run for release-13.1 with PG 17.5, we got
1344 stack traces, and all but one of them were about the unsafe memory
access shown below, because this is a very hot code path that we execute
via our drop trigger.

On main, even `make -C src/test/regress/ check-base-vg` dumps this stack
trace with PG 16/17 to src/test/regress/citus_valgrind_test_log.txt when
executing "multi_cluster_management"; with this PR, that is no longer the
case.

```c
==27337== VALGRINDERROR-BEGIN
==27337== Conditional jump or move depends on uninitialised value(s)
==27337==    at 0x7E26B68: citus_unmark_object_distributed (home/onurctirtir/citus/src/backend/distributed/metadata/distobject.c:113)
==27337==    by 0x7E26CC7: master_unmark_object_distributed (home/onurctirtir/citus/src/backend/distributed/metadata/distobject.c:153)
==27337==    by 0x4BD852: ExecInterpExpr (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execExprInterp.c:758)
==27337==    by 0x4BFD00: ExecInterpExprStillValid (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execExprInterp.c:1870)
==27337==    by 0x51D82C: ExecEvalExprSwitchContext (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/../../../src/include/executor/executor.h:355)
==27337==    by 0x51D8A4: ExecProject (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/../../../src/include/executor/executor.h:389)
==27337==    by 0x51DADB: ExecResult (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/nodeResult.c:136)
==27337==    by 0x4D72ED: ExecProcNodeFirst (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execProcnode.c:464)
==27337==    by 0x4CA394: ExecProcNode (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/../../../src/include/executor/executor.h:273)
==27337==    by 0x4CD34C: ExecutePlan (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execMain.c:1670)
==27337==    by 0x4CAA7C: standard_ExecutorRun (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execMain.c:365)
==27337==    by 0x7E1E475: CitusExecutorRun (home/onurctirtir/citus/src/backend/distributed/executor/multi_executor.c:238)
==27337==  Uninitialised value was created by a heap allocation
==27337==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==27337==    by 0x9AB1F7: AllocSetContextCreateInternal (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/utils/mmgr/aset.c:438)
==27337==    by 0x4E0D56: CreateExprContextInternal (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execUtils.c:261)
==27337==    by 0x4E0E3E: CreateExprContext (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execUtils.c:311)
==27337==    by 0x4E10D9: ExecAssignExprContext (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execUtils.c:490)
==27337==    by 0x51EE09: ExecInitSeqScan (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/nodeSeqscan.c:147)
==27337==    by 0x4D6CE1: ExecInitNode (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execProcnode.c:210)
==27337==    by 0x5243C7: ExecInitSubqueryScan (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/nodeSubqueryscan.c:126)
==27337==    by 0x4D6DD9: ExecInitNode (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execProcnode.c:250)
==27337==    by 0x4F05B2: ExecInitAppend (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/nodeAppend.c:223)
==27337==    by 0x4D6C46: ExecInitNode (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/execProcnode.c:182)
==27337==    by 0x52003D: ExecInitSetOp (home/onurctirtir/.pgenv/src/postgresql-16.2/src/backend/executor/nodeSetOp.c:530)
==27337== 
==27337== VALGRINDERROR-END
```
2025-05-20 15:22:35 +03:00
Alper Kocatas 088ba75057
Add citus_nodes view (#7968)
DESCRIPTION: Adds a `citus_nodes` view that displays the node name, port,
role, and active status for the nodes in the cluster.

This PR adds the `citus_nodes` view. The view is created in the `citus`
schema, exposed through `pg_catalog`, and displays the node name, port,
role, and active status of each node recorded in the `pg_dist_node` table.

`SELECT` on the view is granted to the `PUBLIC` role, and the view's
schema is set to `pg_catalog`.

Test cases were added to the `multi_cluster_management` tests.

structs.py was modified to add whitespace as `citus_indent` required.
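
A minimal usage sketch of the new view; the column names below are assumed from the description above and are not verified against the actual view definition:

```sql
-- Hypothetical query against the new view; column names are assumptions.
SELECT nodename, nodeport, "role", active
FROM citus_nodes
ORDER BY "role", nodename, nodeport;
```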

---------

Co-authored-by: Alper Kocatas <alperkocatas@microsoft.com>
2025-05-14 15:05:12 +03:00
438 changed files with 42127 additions and 4627 deletions

View File

@@ -73,7 +73,7 @@ USER citus
# build postgres versions separately for effective parrallelism and caching of already built versions when changing only certain versions
FROM base AS pg15
-RUN MAKEFLAGS="-j $(nproc)" pgenv build 15.12
+RUN MAKEFLAGS="-j $(nproc)" pgenv build 15.14
RUN rm .pgenv/src/*.tar*
RUN make -C .pgenv/src/postgresql-*/ clean
RUN make -C .pgenv/src/postgresql-*/src/include install
@@ -85,7 +85,7 @@ RUN cp -r .pgenv/src .pgenv/pgsql-* .pgenv/config .pgenv-staging/
RUN rm .pgenv-staging/config/default.conf
FROM base AS pg16
-RUN MAKEFLAGS="-j $(nproc)" pgenv build 16.8
+RUN MAKEFLAGS="-j $(nproc)" pgenv build 16.10
RUN rm .pgenv/src/*.tar*
RUN make -C .pgenv/src/postgresql-*/ clean
RUN make -C .pgenv/src/postgresql-*/src/include install
@@ -97,7 +97,7 @@ RUN cp -r .pgenv/src .pgenv/pgsql-* .pgenv/config .pgenv-staging/
RUN rm .pgenv-staging/config/default.conf
FROM base AS pg17
-RUN MAKEFLAGS="-j $(nproc)" pgenv build 17.4
+RUN MAKEFLAGS="-j $(nproc)" pgenv build 17.6
RUN rm .pgenv/src/*.tar*
RUN make -C .pgenv/src/postgresql-*/ clean
RUN make -C .pgenv/src/postgresql-*/src/include install
@@ -216,7 +216,7 @@ COPY --chown=citus:citus .psqlrc .
RUN sudo chown --from=root:root citus:citus -R ~
# sets default pg version
-RUN pgenv switch 17.4
+RUN pgenv switch 17.6
# make connecting to the coordinator easy
ENV PGPORT=9700

View File

@@ -2,6 +2,8 @@
"image": "ghcr.io/citusdata/citus-devcontainer:main",
"runArgs": [
"--cap-add=SYS_PTRACE",
+"--cap-add=SYS_NICE", // allow NUMA page inquiry
+"--security-opt=seccomp=unconfined", // unblocks move_pages() in the container
"--ulimit=core=-1",
],
"forwardPorts": [

View File

@@ -1,4 +1,4 @@
-black==23.11.0
+black==24.3.0
click==8.1.7
isort==5.12.0
mypy-extensions==1.0.0

View File

@@ -16,7 +16,7 @@ pytest-timeout = "*"
pytest-xdist = "*"
pytest-repeat = "*"
pyyaml = "*"
-werkzeug = "==2.3.7"
+werkzeug = "==3.0.6"
[dev-packages]
black = "*"

View File

@@ -1,7 +1,7 @@
{
"_meta": {
"hash": {
-"sha256": "f8db86383082539f626f1402e720f5f2e3f9718b44a8f26110cf9f52e7ca46bc"
+"sha256": "bdfddfee81a47cfb42e76936d229e94f5d3cee75f612b7beb2d3008b06d6427b"
},
"pipfile-spec": 6,
"requires": {
@@ -119,69 +119,85 @@
},
"certifi": {
"hashes": [
"sha256:0569859f95fc761b18b45ef421b1290a0f65f147e92a1e5eb3e635f9a5e4e66f",
"sha256:dc383c07b76109f368f6106eee2b593b04a011ea4d55f652c6ca24a754d1cdd1"
"sha256:5a1e7645bc0ec61a09e26c36f6106dd4cf40c6db3a1fb6352b0244e7fb057c7b",
"sha256:c198e21b1289c2ab85ee4e67bb4b4ef3ead0892059901a8d5b622f24a1101e90"
],
"index": "pypi",
"markers": "python_version >= '3.6'",
"version": "==2024.2.2"
"version": "==2024.7.4"
},
"cffi": {
"hashes": [
"sha256:0c9ef6ff37e974b73c25eecc13952c55bceed9112be2d9d938ded8e856138bcc",
"sha256:131fd094d1065b19540c3d72594260f118b231090295d8c34e19a7bbcf2e860a",
"sha256:1b8ebc27c014c59692bb2664c7d13ce7a6e9a629be20e54e7271fa696ff2b417",
"sha256:2c56b361916f390cd758a57f2e16233eb4f64bcbeee88a4881ea90fca14dc6ab",
"sha256:2d92b25dbf6cae33f65005baf472d2c245c050b1ce709cc4588cdcdd5495b520",
"sha256:31d13b0f99e0836b7ff893d37af07366ebc90b678b6664c955b54561fc36ef36",
"sha256:32c68ef735dbe5857c810328cb2481e24722a59a2003018885514d4c09af9743",
"sha256:3686dffb02459559c74dd3d81748269ffb0eb027c39a6fc99502de37d501faa8",
"sha256:582215a0e9adbe0e379761260553ba11c58943e4bbe9c36430c4ca6ac74b15ed",
"sha256:5b50bf3f55561dac5438f8e70bfcdfd74543fd60df5fa5f62d94e5867deca684",
"sha256:5bf44d66cdf9e893637896c7faa22298baebcd18d1ddb6d2626a6e39793a1d56",
"sha256:6602bc8dc6f3a9e02b6c22c4fc1e47aa50f8f8e6d3f78a5e16ac33ef5fefa324",
"sha256:673739cb539f8cdaa07d92d02efa93c9ccf87e345b9a0b556e3ecc666718468d",
"sha256:68678abf380b42ce21a5f2abde8efee05c114c2fdb2e9eef2efdb0257fba1235",
"sha256:68e7c44931cc171c54ccb702482e9fc723192e88d25a0e133edd7aff8fcd1f6e",
"sha256:6b3d6606d369fc1da4fd8c357d026317fbb9c9b75d36dc16e90e84c26854b088",
"sha256:748dcd1e3d3d7cd5443ef03ce8685043294ad6bd7c02a38d1bd367cfd968e000",
"sha256:7651c50c8c5ef7bdb41108b7b8c5a83013bfaa8a935590c5d74627c047a583c7",
"sha256:7b78010e7b97fef4bee1e896df8a4bbb6712b7f05b7ef630f9d1da00f6444d2e",
"sha256:7e61e3e4fa664a8588aa25c883eab612a188c725755afff6289454d6362b9673",
"sha256:80876338e19c951fdfed6198e70bc88f1c9758b94578d5a7c4c91a87af3cf31c",
"sha256:8895613bcc094d4a1b2dbe179d88d7fb4a15cee43c052e8885783fac397d91fe",
"sha256:88e2b3c14bdb32e440be531ade29d3c50a1a59cd4e51b1dd8b0865c54ea5d2e2",
"sha256:8f8e709127c6c77446a8c0a8c8bf3c8ee706a06cd44b1e827c3e6a2ee6b8c098",
"sha256:9cb4a35b3642fc5c005a6755a5d17c6c8b6bcb6981baf81cea8bfbc8903e8ba8",
"sha256:9f90389693731ff1f659e55c7d1640e2ec43ff725cc61b04b2f9c6d8d017df6a",
"sha256:a09582f178759ee8128d9270cd1344154fd473bb77d94ce0aeb2a93ebf0feaf0",
"sha256:a6a14b17d7e17fa0d207ac08642c8820f84f25ce17a442fd15e27ea18d67c59b",
"sha256:a72e8961a86d19bdb45851d8f1f08b041ea37d2bd8d4fd19903bc3083d80c896",
"sha256:abd808f9c129ba2beda4cfc53bde801e5bcf9d6e0f22f095e45327c038bfe68e",
"sha256:ac0f5edd2360eea2f1daa9e26a41db02dd4b0451b48f7c318e217ee092a213e9",
"sha256:b29ebffcf550f9da55bec9e02ad430c992a87e5f512cd63388abb76f1036d8d2",
"sha256:b2ca4e77f9f47c55c194982e10f058db063937845bb2b7a86c84a6cfe0aefa8b",
"sha256:b7be2d771cdba2942e13215c4e340bfd76398e9227ad10402a8767ab1865d2e6",
"sha256:b84834d0cf97e7d27dd5b7f3aca7b6e9263c56308ab9dc8aae9784abb774d404",
"sha256:b86851a328eedc692acf81fb05444bdf1891747c25af7529e39ddafaf68a4f3f",
"sha256:bcb3ef43e58665bbda2fb198698fcae6776483e0c4a631aa5647806c25e02cc0",
"sha256:c0f31130ebc2d37cdd8e44605fb5fa7ad59049298b3f745c74fa74c62fbfcfc4",
"sha256:c6a164aa47843fb1b01e941d385aab7215563bb8816d80ff3a363a9f8448a8dc",
"sha256:d8a9d3ebe49f084ad71f9269834ceccbf398253c9fac910c4fd7053ff1386936",
"sha256:db8e577c19c0fda0beb7e0d4e09e0ba74b1e4c092e0e40bfa12fe05b6f6d75ba",
"sha256:dc9b18bf40cc75f66f40a7379f6a9513244fe33c0e8aa72e2d56b0196a7ef872",
"sha256:e09f3ff613345df5e8c3667da1d918f9149bd623cd9070c983c013792a9a62eb",
"sha256:e4108df7fe9b707191e55f33efbcb2d81928e10cea45527879a4749cbe472614",
"sha256:e6024675e67af929088fda399b2094574609396b1decb609c55fa58b028a32a1",
"sha256:e70f54f1796669ef691ca07d046cd81a29cb4deb1e5f942003f401c0c4a2695d",
"sha256:e715596e683d2ce000574bae5d07bd522c781a822866c20495e52520564f0969",
"sha256:e760191dd42581e023a68b758769e2da259b5d52e3103c6060ddc02c9edb8d7b",
"sha256:ed86a35631f7bfbb28e108dd96773b9d5a6ce4811cf6ea468bb6a359b256b1e4",
"sha256:ee07e47c12890ef248766a6e55bd38ebfb2bb8edd4142d56db91b21ea68b7627",
"sha256:fa3a0128b152627161ce47201262d3140edb5a5c3da88d73a1b790a959126956",
"sha256:fcc8eb6d5902bb1cf6dc4f187ee3ea80a1eba0a89aba40a5cb20a5087d961357"
"sha256:045d61c734659cc045141be4bae381a41d89b741f795af1dd018bfb532fd0df8",
"sha256:0984a4925a435b1da406122d4d7968dd861c1385afe3b45ba82b750f229811e2",
"sha256:0e2b1fac190ae3ebfe37b979cc1ce69c81f4e4fe5746bb401dca63a9062cdaf1",
"sha256:0f048dcf80db46f0098ccac01132761580d28e28bc0f78ae0d58048063317e15",
"sha256:1257bdabf294dceb59f5e70c64a3e2f462c30c7ad68092d01bbbfb1c16b1ba36",
"sha256:1c39c6016c32bc48dd54561950ebd6836e1670f2ae46128f67cf49e789c52824",
"sha256:1d599671f396c4723d016dbddb72fe8e0397082b0a77a4fab8028923bec050e8",
"sha256:28b16024becceed8c6dfbc75629e27788d8a3f9030691a1dbf9821a128b22c36",
"sha256:2bb1a08b8008b281856e5971307cc386a8e9c5b625ac297e853d36da6efe9c17",
"sha256:30c5e0cb5ae493c04c8b42916e52ca38079f1b235c2f8ae5f4527b963c401caf",
"sha256:31000ec67d4221a71bd3f67df918b1f88f676f1c3b535a7eb473255fdc0b83fc",
"sha256:386c8bf53c502fff58903061338ce4f4950cbdcb23e2902d86c0f722b786bbe3",
"sha256:3edc8d958eb099c634dace3c7e16560ae474aa3803a5df240542b305d14e14ed",
"sha256:45398b671ac6d70e67da8e4224a065cec6a93541bb7aebe1b198a61b58c7b702",
"sha256:46bf43160c1a35f7ec506d254e5c890f3c03648a4dbac12d624e4490a7046cd1",
"sha256:4ceb10419a9adf4460ea14cfd6bc43d08701f0835e979bf821052f1805850fe8",
"sha256:51392eae71afec0d0c8fb1a53b204dbb3bcabcb3c9b807eedf3e1e6ccf2de903",
"sha256:5da5719280082ac6bd9aa7becb3938dc9f9cbd57fac7d2871717b1feb0902ab6",
"sha256:610faea79c43e44c71e1ec53a554553fa22321b65fae24889706c0a84d4ad86d",
"sha256:636062ea65bd0195bc012fea9321aca499c0504409f413dc88af450b57ffd03b",
"sha256:6883e737d7d9e4899a8a695e00ec36bd4e5e4f18fabe0aca0efe0a4b44cdb13e",
"sha256:6b8b4a92e1c65048ff98cfe1f735ef8f1ceb72e3d5f0c25fdb12087a23da22be",
"sha256:6f17be4345073b0a7b8ea599688f692ac3ef23ce28e5df79c04de519dbc4912c",
"sha256:706510fe141c86a69c8ddc029c7910003a17353970cff3b904ff0686a5927683",
"sha256:72e72408cad3d5419375fc87d289076ee319835bdfa2caad331e377589aebba9",
"sha256:733e99bc2df47476e3848417c5a4540522f234dfd4ef3ab7fafdf555b082ec0c",
"sha256:7596d6620d3fa590f677e9ee430df2958d2d6d6de2feeae5b20e82c00b76fbf8",
"sha256:78122be759c3f8a014ce010908ae03364d00a1f81ab5c7f4a7a5120607ea56e1",
"sha256:805b4371bf7197c329fcb3ead37e710d1bca9da5d583f5073b799d5c5bd1eee4",
"sha256:85a950a4ac9c359340d5963966e3e0a94a676bd6245a4b55bc43949eee26a655",
"sha256:8f2cdc858323644ab277e9bb925ad72ae0e67f69e804f4898c070998d50b1a67",
"sha256:9755e4345d1ec879e3849e62222a18c7174d65a6a92d5b346b1863912168b595",
"sha256:98e3969bcff97cae1b2def8ba499ea3d6f31ddfdb7635374834cf89a1a08ecf0",
"sha256:a08d7e755f8ed21095a310a693525137cfe756ce62d066e53f502a83dc550f65",
"sha256:a1ed2dd2972641495a3ec98445e09766f077aee98a1c896dcb4ad0d303628e41",
"sha256:a24ed04c8ffd54b0729c07cee15a81d964e6fee0e3d4d342a27b020d22959dc6",
"sha256:a45e3c6913c5b87b3ff120dcdc03f6131fa0065027d0ed7ee6190736a74cd401",
"sha256:a9b15d491f3ad5d692e11f6b71f7857e7835eb677955c00cc0aefcd0669adaf6",
"sha256:ad9413ccdeda48c5afdae7e4fa2192157e991ff761e7ab8fdd8926f40b160cc3",
"sha256:b2ab587605f4ba0bf81dc0cb08a41bd1c0a5906bd59243d56bad7668a6fc6c16",
"sha256:b62ce867176a75d03a665bad002af8e6d54644fad99a3c70905c543130e39d93",
"sha256:c03e868a0b3bc35839ba98e74211ed2b05d2119be4e8a0f224fba9384f1fe02e",
"sha256:c59d6e989d07460165cc5ad3c61f9fd8f1b4796eacbd81cee78957842b834af4",
"sha256:c7eac2ef9b63c79431bc4b25f1cd649d7f061a28808cbc6c47b534bd789ef964",
"sha256:c9c3d058ebabb74db66e431095118094d06abf53284d9c81f27300d0e0d8bc7c",
"sha256:ca74b8dbe6e8e8263c0ffd60277de77dcee6c837a3d0881d8c1ead7268c9e576",
"sha256:caaf0640ef5f5517f49bc275eca1406b0ffa6aa184892812030f04c2abf589a0",
"sha256:cdf5ce3acdfd1661132f2a9c19cac174758dc2352bfe37d98aa7512c6b7178b3",
"sha256:d016c76bdd850f3c626af19b0542c9677ba156e4ee4fccfdd7848803533ef662",
"sha256:d01b12eeeb4427d3110de311e1774046ad344f5b1a7403101878976ecd7a10f3",
"sha256:d63afe322132c194cf832bfec0dc69a99fb9bb6bbd550f161a49e9e855cc78ff",
"sha256:da95af8214998d77a98cc14e3a3bd00aa191526343078b530ceb0bd710fb48a5",
"sha256:dd398dbc6773384a17fe0d3e7eeb8d1a21c2200473ee6806bb5e6a8e62bb73dd",
"sha256:de2ea4b5833625383e464549fec1bc395c1bdeeb5f25c4a3a82b5a8c756ec22f",
"sha256:de55b766c7aa2e2a3092c51e0483d700341182f08e67c63630d5b6f200bb28e5",
"sha256:df8b1c11f177bc2313ec4b2d46baec87a5f3e71fc8b45dab2ee7cae86d9aba14",
"sha256:e03eab0a8677fa80d646b5ddece1cbeaf556c313dcfac435ba11f107ba117b5d",
"sha256:e221cf152cff04059d011ee126477f0d9588303eb57e88923578ace7baad17f9",
"sha256:e31ae45bc2e29f6b2abd0de1cc3b9d5205aa847cafaecb8af1476a609a2f6eb7",
"sha256:edae79245293e15384b51f88b00613ba9f7198016a5948b5dddf4917d4d26382",
"sha256:f1e22e8c4419538cb197e4dd60acc919d7696e5ef98ee4da4e01d3f8cfa4cc5a",
"sha256:f3a2b4222ce6b60e2e8b337bb9596923045681d71e5a082783484d845390938e",
"sha256:f6a16c31041f09ead72d69f583767292f750d24913dadacf5756b966aacb3f1a",
"sha256:f75c7ab1f9e4aca5414ed4d8e5c0e303a34f4421f8a0d47a4d019ceff0ab6af4",
"sha256:f79fc4fc25f1c8698ff97788206bb3c2598949bfe0fef03d299eb1b5356ada99",
"sha256:f7f5baafcc48261359e14bcd6d9bff6d4b28d9103847c9e136694cb0501aef87",
"sha256:fc48c783f9c87e60831201f2cce7f3b2e4846bf4d8728eabe54d60700b318a0b"
],
"markers": "platform_python_implementation != 'PyPy'",
"version": "==1.16.0"
"markers": "python_version >= '3.8'",
"version": "==1.17.1"
},
"click": {
"hashes": [
@ -202,42 +218,41 @@
},
"cryptography": {
"hashes": [
"sha256:04859aa7f12c2b5f7e22d25198ddd537391f1695df7057c8700f71f26f47a129",
"sha256:069d2ce9be5526a44093a0991c450fe9906cdf069e0e7cd67d9dee49a62b9ebe",
"sha256:0d3ec384058b642f7fb7e7bff9664030011ed1af8f852540c76a1317a9dd0d20",
"sha256:0fab2a5c479b360e5e0ea9f654bcebb535e3aa1e493a715b13244f4e07ea8eec",
"sha256:0fea01527d4fb22ffe38cd98951c9044400f6eff4788cf52ae116e27d30a1ba3",
"sha256:1b797099d221df7cce5ff2a1d272761d1554ddf9a987d3e11f6459b38cd300fd",
"sha256:1e935c2900fb53d31f491c0de04f41110351377be19d83d908c1fd502ae8daa5",
"sha256:20100c22b298c9eaebe4f0b9032ea97186ac2555f426c3e70670f2517989543b",
"sha256:20180da1b508f4aefc101cebc14c57043a02b355d1a652b6e8e537967f1e1b46",
"sha256:25b09b73db78facdfd7dd0fa77a3f19e94896197c86e9f6dc16bce7b37a96504",
"sha256:2619487f37da18d6826e27854a7f9d4d013c51eafb066c80d09c63cf24505306",
"sha256:2eb6368d5327d6455f20327fb6159b97538820355ec00f8cc9464d617caecead",
"sha256:35772a6cffd1f59b85cb670f12faba05513446f80352fe811689b4e439b5d89e",
"sha256:39d5c93e95bcbc4c06313fc6a500cee414ee39b616b55320c1904760ad686938",
"sha256:3d96ea47ce6d0055d5b97e761d37b4e84195485cb5a38401be341fabf23bc32a",
"sha256:4dcab7c25e48fc09a73c3e463d09ac902a932a0f8d0c568238b3696d06bf377b",
"sha256:5fbf0f3f0fac7c089308bd771d2c6c7b7d53ae909dce1db52d8e921f6c19bb3a",
"sha256:6c25e1e9c2ce682d01fc5e2dde6598f7313027343bd14f4049b82ad0402e52cd",
"sha256:762f3771ae40e111d78d77cbe9c1035e886ac04a234d3ee0856bf4ecb3749d54",
"sha256:90147dad8c22d64b2ff7331f8d4cddfdc3ee93e4879796f837bdbb2a0b141e0c",
"sha256:935cca25d35dda9e7bd46a24831dfd255307c55a07ff38fd1a92119cffc34857",
"sha256:93fbee08c48e63d5d1b39ab56fd3fdd02e6c2431c3da0f4edaf54954744c718f",
"sha256:9541c69c62d7446539f2c1c06d7046aef822940d248fa4b8962ff0302862cc1f",
"sha256:c23f03cfd7d9826cdcbad7850de67e18b4654179e01fe9bc623d37c2638eb4ef",
"sha256:c3d1f5a1d403a8e640fa0887e9f7087331abb3f33b0f2207d2cc7f213e4a864c",
"sha256:d1998e545081da0ab276bcb4b33cce85f775adb86a516e8f55b3dac87f469548",
"sha256:d5cf11bc7f0b71fb71af26af396c83dfd3f6eed56d4b6ef95d57867bf1e4ba65",
"sha256:db0480ffbfb1193ac4e1e88239f31314fe4c6cdcf9c0b8712b55414afbf80db4",
"sha256:de4ae486041878dc46e571a4c70ba337ed5233a1344c14a0790c4c4be4bbb8b4",
"sha256:de5086cd475d67113ccb6f9fae6d8fe3ac54a4f9238fd08bfdb07b03d791ff0a",
"sha256:df34312149b495d9d03492ce97471234fd9037aa5ba217c2a6ea890e9166f151",
"sha256:ead69ba488f806fe1b1b4050febafdbf206b81fa476126f3e16110c818bac396"
"sha256:00918d859aa4e57db8299607086f793fa7813ae2ff5a4637e318a25ef82730f7",
"sha256:1e8d181e90a777b63f3f0caa836844a1182f1f265687fac2115fcf245f5fbec3",
"sha256:1f9a92144fa0c877117e9748c74501bea842f93d21ee00b0cf922846d9d0b183",
"sha256:21377472ca4ada2906bc313168c9dc7b1d7ca417b63c1c3011d0c74b7de9ae69",
"sha256:24979e9f2040c953a94bf3c6782e67795a4c260734e5264dceea65c8f4bae64a",
"sha256:2a46a89ad3e6176223b632056f321bc7de36b9f9b93b2cc1cccf935a3849dc62",
"sha256:322eb03ecc62784536bc173f1483e76747aafeb69c8728df48537eb431cd1911",
"sha256:436df4f203482f41aad60ed1813811ac4ab102765ecae7a2bbb1dbb66dcff5a7",
"sha256:4f422e8c6a28cf8b7f883eb790695d6d45b0c385a2583073f3cec434cc705e1a",
"sha256:53f23339864b617a3dfc2b0ac8d5c432625c80014c25caac9082314e9de56f41",
"sha256:5fed5cd6102bb4eb843e3315d2bf25fede494509bddadb81e03a859c1bc17b83",
"sha256:610a83540765a8d8ce0f351ce42e26e53e1f774a6efb71eb1b41eb01d01c3d12",
"sha256:6c8acf6f3d1f47acb2248ec3ea261171a671f3d9428e34ad0357148d492c7864",
"sha256:6f76fdd6fd048576a04c5210d53aa04ca34d2ed63336d4abd306d0cbe298fddf",
"sha256:72198e2b5925155497a5a3e8c216c7fb3e64c16ccee11f0e7da272fa93b35c4c",
"sha256:887143b9ff6bad2b7570da75a7fe8bbf5f65276365ac259a5d2d5147a73775f2",
"sha256:888fcc3fce0c888785a4876ca55f9f43787f4c5c1cc1e2e0da71ad481ff82c5b",
"sha256:8e6a85a93d0642bd774460a86513c5d9d80b5c002ca9693e63f6e540f1815ed0",
"sha256:94f99f2b943b354a5b6307d7e8d19f5c423a794462bde2bf310c770ba052b1c4",
"sha256:9b336599e2cb77b1008cb2ac264b290803ec5e8e89d618a5e978ff5eb6f715d9",
"sha256:a2d8a7045e1ab9b9f803f0d9531ead85f90c5f2859e653b61497228b18452008",
"sha256:b8272f257cf1cbd3f2e120f14c68bff2b6bdfcc157fafdee84a1b795efd72862",
"sha256:bf688f615c29bfe9dfc44312ca470989279f0e94bb9f631f85e3459af8efc009",
"sha256:d9c5b9f698a83c8bd71e0f4d3f9f839ef244798e5ffe96febfa9714717db7af7",
"sha256:dd7c7e2d71d908dc0f8d2027e1604102140d84b155e658c20e8ad1304317691f",
"sha256:df978682c1504fc93b3209de21aeabf2375cb1571d4e61907b3e7a2540e83026",
"sha256:e403f7f766ded778ecdb790da786b418a9f2394f36e8cc8b796cc056ab05f44f",
"sha256:eb3889330f2a4a148abead555399ec9a32b13b7c8ba969b72d8e500eb7ef84cd",
"sha256:f4daefc971c2d1f82f03097dc6f216744a6cd2ac0f04c68fb935ea2ba2a0d420",
"sha256:f51f5705ab27898afda1aaa430f34ad90dc117421057782022edf0600bec5f14",
"sha256:fd0ee90072861e276b0ff08bd627abec29e32a53b2be44e41dbcdf87cbee2b00"
],
"index": "pypi",
"markers": "python_version >= '3.7'",
"version": "==42.0.3"
"markers": "python_version >= '3.7' and python_full_version not in '3.9.0, 3.9.1'",
"version": "==44.0.1"
},
"docopt": {
"hashes": [
@ -329,11 +344,12 @@
},
"jinja2": {
"hashes": [
"sha256:7d6d50dd97d52cbc355597bd845fabfbac3f551e1f99619e39a35ce8c370b5fa",
"sha256:ac8bd6544d4bb2c9792bf3a159e80bba8fda7f07e81bc3aed565432d5925ba90"
"sha256:0137fb05990d35f1275a587e9aee6d56da821fc83491a0fb838183be43f66d6d",
"sha256:85ece4451f492d0c13c5dd7c13a64681a86afae63a5f347908daf103ce6d2f67"
],
"index": "pypi",
"markers": "python_version >= '3.7'",
"version": "==3.1.3"
"version": "==3.1.6"
},
"kaitaistruct": {
"hashes": [
@ -353,69 +369,70 @@
},
"markupsafe": {
"hashes": [
"sha256:00e046b6dd71aa03a41079792f8473dc494d564611a8f89bbbd7cb93295ebdcf",
"sha256:075202fa5b72c86ad32dc7d0b56024ebdbcf2048c0ba09f1cde31bfdd57bcfff",
"sha256:0e397ac966fdf721b2c528cf028494e86172b4feba51d65f81ffd65c63798f3f",
"sha256:17b950fccb810b3293638215058e432159d2b71005c74371d784862b7e4683f3",
"sha256:1f3fbcb7ef1f16e48246f704ab79d79da8a46891e2da03f8783a5b6fa41a9532",
"sha256:2174c595a0d73a3080ca3257b40096db99799265e1c27cc5a610743acd86d62f",
"sha256:2b7c57a4dfc4f16f7142221afe5ba4e093e09e728ca65c51f5620c9aaeb9a617",
"sha256:2d2d793e36e230fd32babe143b04cec8a8b3eb8a3122d2aceb4a371e6b09b8df",
"sha256:30b600cf0a7ac9234b2638fbc0fb6158ba5bdcdf46aeb631ead21248b9affbc4",
"sha256:397081c1a0bfb5124355710fe79478cdbeb39626492b15d399526ae53422b906",
"sha256:3a57fdd7ce31c7ff06cdfbf31dafa96cc533c21e443d57f5b1ecc6cdc668ec7f",
"sha256:3c6b973f22eb18a789b1460b4b91bf04ae3f0c4234a0a6aa6b0a92f6f7b951d4",
"sha256:3e53af139f8579a6d5f7b76549125f0d94d7e630761a2111bc431fd820e163b8",
"sha256:4096e9de5c6fdf43fb4f04c26fb114f61ef0bf2e5604b6ee3019d51b69e8c371",
"sha256:4275d846e41ecefa46e2015117a9f491e57a71ddd59bbead77e904dc02b1bed2",
"sha256:4c31f53cdae6ecfa91a77820e8b151dba54ab528ba65dfd235c80b086d68a465",
"sha256:4f11aa001c540f62c6166c7726f71f7573b52c68c31f014c25cc7901deea0b52",
"sha256:5049256f536511ee3f7e1b3f87d1d1209d327e818e6ae1365e8653d7e3abb6a6",
"sha256:58c98fee265677f63a4385256a6d7683ab1832f3ddd1e66fe948d5880c21a169",
"sha256:598e3276b64aff0e7b3451b72e94fa3c238d452e7ddcd893c3ab324717456bad",
"sha256:5b7b716f97b52c5a14bffdf688f971b2d5ef4029127f1ad7a513973cfd818df2",
"sha256:5dedb4db619ba5a2787a94d877bc8ffc0566f92a01c0ef214865e54ecc9ee5e0",
"sha256:619bc166c4f2de5caa5a633b8b7326fbe98e0ccbfacabd87268a2b15ff73a029",
"sha256:629ddd2ca402ae6dbedfceeba9c46d5f7b2a61d9749597d4307f943ef198fc1f",
"sha256:656f7526c69fac7f600bd1f400991cc282b417d17539a1b228617081106feb4a",
"sha256:6ec585f69cec0aa07d945b20805be741395e28ac1627333b1c5b0105962ffced",
"sha256:72b6be590cc35924b02c78ef34b467da4ba07e4e0f0454a2c5907f473fc50ce5",
"sha256:7502934a33b54030eaf1194c21c692a534196063db72176b0c4028e140f8f32c",
"sha256:7a68b554d356a91cce1236aa7682dc01df0edba8d043fd1ce607c49dd3c1edcf",
"sha256:7b2e5a267c855eea6b4283940daa6e88a285f5f2a67f2220203786dfa59b37e9",
"sha256:823b65d8706e32ad2df51ed89496147a42a2a6e01c13cfb6ffb8b1e92bc910bb",
"sha256:8590b4ae07a35970728874632fed7bd57b26b0102df2d2b233b6d9d82f6c62ad",
"sha256:8dd717634f5a044f860435c1d8c16a270ddf0ef8588d4887037c5028b859b0c3",
"sha256:8dec4936e9c3100156f8a2dc89c4b88d5c435175ff03413b443469c7c8c5f4d1",
"sha256:97cafb1f3cbcd3fd2b6fbfb99ae11cdb14deea0736fc2b0952ee177f2b813a46",
"sha256:a17a92de5231666cfbe003f0e4b9b3a7ae3afb1ec2845aadc2bacc93ff85febc",
"sha256:a549b9c31bec33820e885335b451286e2969a2d9e24879f83fe904a5ce59d70a",
"sha256:ac07bad82163452a6884fe8fa0963fb98c2346ba78d779ec06bd7a6262132aee",
"sha256:ae2ad8ae6ebee9d2d94b17fb62763125f3f374c25618198f40cbb8b525411900",
"sha256:b91c037585eba9095565a3556f611e3cbfaa42ca1e865f7b8015fe5c7336d5a5",
"sha256:bc1667f8b83f48511b94671e0e441401371dfd0f0a795c7daa4a3cd1dde55bea",
"sha256:bec0a414d016ac1a18862a519e54b2fd0fc8bbfd6890376898a6c0891dd82e9f",
"sha256:bf50cd79a75d181c9181df03572cdce0fbb75cc353bc350712073108cba98de5",
"sha256:bff1b4290a66b490a2f4719358c0cdcd9bafb6b8f061e45c7a2460866bf50c2e",
"sha256:c061bb86a71b42465156a3ee7bd58c8c2ceacdbeb95d05a99893e08b8467359a",
"sha256:c8b29db45f8fe46ad280a7294f5c3ec36dbac9491f2d1c17345be8e69cc5928f",
"sha256:ce409136744f6521e39fd8e2a24c53fa18ad67aa5bc7c2cf83645cce5b5c4e50",
"sha256:d050b3361367a06d752db6ead6e7edeb0009be66bc3bae0ee9d97fb326badc2a",
"sha256:d283d37a890ba4c1ae73ffadf8046435c76e7bc2247bbb63c00bd1a709c6544b",
"sha256:d9fad5155d72433c921b782e58892377c44bd6252b5af2f67f16b194987338a4",
"sha256:daa4ee5a243f0f20d528d939d06670a298dd39b1ad5f8a72a4275124a7819eff",
"sha256:db0b55e0f3cc0be60c1f19efdde9a637c32740486004f20d1cff53c3c0ece4d2",
"sha256:e61659ba32cf2cf1481e575d0462554625196a1f2fc06a1c777d3f48e8865d46",
"sha256:ea3d8a3d18833cf4304cd2fc9cbb1efe188ca9b5efef2bdac7adc20594a0e46b",
"sha256:ec6a563cff360b50eed26f13adc43e61bc0c04d94b8be985e6fb24b81f6dcfdf",
"sha256:f5dfb42c4604dddc8e4305050aa6deb084540643ed5804d7455b5df8fe16f5e5",
"sha256:fa173ec60341d6bb97a89f5ea19c85c5643c1e7dedebc22f5181eb73573142c5",
"sha256:fa9db3f79de01457b03d4f01b34cf91bc0048eb2c3846ff26f66687c2f6d16ab",
"sha256:fce659a462a1be54d2ffcacea5e3ba2d74daa74f30f5f143fe0c58636e355fdd",
"sha256:ffee1f21e5ef0d712f9033568f8344d5da8cc2869dbd08d87c84656e6a2d2f68"
"sha256:0bff5e0ae4ef2e1ae4fdf2dfd5b76c75e5c2fa4132d05fc1b0dabcd20c7e28c4",
"sha256:0f4ca02bea9a23221c0182836703cbf8930c5e9454bacce27e767509fa286a30",
"sha256:1225beacc926f536dc82e45f8a4d68502949dc67eea90eab715dea3a21c1b5f0",
"sha256:131a3c7689c85f5ad20f9f6fb1b866f402c445b220c19fe4308c0b147ccd2ad9",
"sha256:15ab75ef81add55874e7ab7055e9c397312385bd9ced94920f2802310c930396",
"sha256:1a9d3f5f0901fdec14d8d2f66ef7d035f2157240a433441719ac9a3fba440b13",
"sha256:1c99d261bd2d5f6b59325c92c73df481e05e57f19837bdca8413b9eac4bd8028",
"sha256:1e084f686b92e5b83186b07e8a17fc09e38fff551f3602b249881fec658d3eca",
"sha256:2181e67807fc2fa785d0592dc2d6206c019b9502410671cc905d132a92866557",
"sha256:2cb8438c3cbb25e220c2ab33bb226559e7afb3baec11c4f218ffa7308603c832",
"sha256:3169b1eefae027567d1ce6ee7cae382c57fe26e82775f460f0b2778beaad66c0",
"sha256:3809ede931876f5b2ec92eef964286840ed3540dadf803dd570c3b7e13141a3b",
"sha256:38a9ef736c01fccdd6600705b09dc574584b89bea478200c5fbf112a6b0d5579",
"sha256:3d79d162e7be8f996986c064d1c7c817f6df3a77fe3d6859f6f9e7be4b8c213a",
"sha256:444dcda765c8a838eaae23112db52f1efaf750daddb2d9ca300bcae1039adc5c",
"sha256:48032821bbdf20f5799ff537c7ac3d1fba0ba032cfc06194faffa8cda8b560ff",
"sha256:4aa4e5faecf353ed117801a068ebab7b7e09ffb6e1d5e412dc852e0da018126c",
"sha256:52305740fe773d09cffb16f8ed0427942901f00adedac82ec8b67752f58a1b22",
"sha256:569511d3b58c8791ab4c2e1285575265991e6d8f8700c7be0e88f86cb0672094",
"sha256:57cb5a3cf367aeb1d316576250f65edec5bb3be939e9247ae594b4bcbc317dfb",
"sha256:5b02fb34468b6aaa40dfc198d813a641e3a63b98c2b05a16b9f80b7ec314185e",
"sha256:6381026f158fdb7c72a168278597a5e3a5222e83ea18f543112b2662a9b699c5",
"sha256:6af100e168aa82a50e186c82875a5893c5597a0c1ccdb0d8b40240b1f28b969a",
"sha256:6c89876f41da747c8d3677a2b540fb32ef5715f97b66eeb0c6b66f5e3ef6f59d",
"sha256:6e296a513ca3d94054c2c881cc913116e90fd030ad1c656b3869762b754f5f8a",
"sha256:70a87b411535ccad5ef2f1df5136506a10775d267e197e4cf531ced10537bd6b",
"sha256:7e94c425039cde14257288fd61dcfb01963e658efbc0ff54f5306b06054700f8",
"sha256:846ade7b71e3536c4e56b386c2a47adf5741d2d8b94ec9dc3e92e5e1ee1e2225",
"sha256:88416bd1e65dcea10bc7569faacb2c20ce071dd1f87539ca2ab364bf6231393c",
"sha256:88b49a3b9ff31e19998750c38e030fc7bb937398b1f78cfa599aaef92d693144",
"sha256:8c4e8c3ce11e1f92f6536ff07154f9d49677ebaaafc32db9db4620bc11ed480f",
"sha256:8e06879fc22a25ca47312fbe7c8264eb0b662f6db27cb2d3bbbc74b1df4b9b87",
"sha256:9025b4018f3a1314059769c7bf15441064b2207cb3f065e6ea1e7359cb46db9d",
"sha256:93335ca3812df2f366e80509ae119189886b0f3c2b81325d39efdb84a1e2ae93",
"sha256:9778bd8ab0a994ebf6f84c2b949e65736d5575320a17ae8984a77fab08db94cf",
"sha256:9e2d922824181480953426608b81967de705c3cef4d1af983af849d7bd619158",
"sha256:a123e330ef0853c6e822384873bef7507557d8e4a082961e1defa947aa59ba84",
"sha256:a904af0a6162c73e3edcb969eeeb53a63ceeb5d8cf642fade7d39e7963a22ddb",
"sha256:ad10d3ded218f1039f11a75f8091880239651b52e9bb592ca27de44eed242a48",
"sha256:b424c77b206d63d500bcb69fa55ed8d0e6a3774056bdc4839fc9298a7edca171",
"sha256:b5a6b3ada725cea8a5e634536b1b01c30bcdcd7f9c6fff4151548d5bf6b3a36c",
"sha256:ba8062ed2cf21c07a9e295d5b8a2a5ce678b913b45fdf68c32d95d6c1291e0b6",
"sha256:ba9527cdd4c926ed0760bc301f6728ef34d841f405abf9d4f959c478421e4efd",
"sha256:bbcb445fa71794da8f178f0f6d66789a28d7319071af7a496d4d507ed566270d",
"sha256:bcf3e58998965654fdaff38e58584d8937aa3096ab5354d493c77d1fdd66d7a1",
"sha256:c0ef13eaeee5b615fb07c9a7dadb38eac06a0608b41570d8ade51c56539e509d",
"sha256:cabc348d87e913db6ab4aa100f01b08f481097838bdddf7c7a84b7575b7309ca",
"sha256:cdb82a876c47801bb54a690c5ae105a46b392ac6099881cdfb9f6e95e4014c6a",
"sha256:cfad01eed2c2e0c01fd0ecd2ef42c492f7f93902e39a42fc9ee1692961443a29",
"sha256:d16a81a06776313e817c951135cf7340a3e91e8c1ff2fac444cfd75fffa04afe",
"sha256:d8213e09c917a951de9d09ecee036d5c7d36cb6cb7dbaece4c71a60d79fb9798",
"sha256:e07c3764494e3776c602c1e78e298937c3315ccc9043ead7e685b7f2b8d47b3c",
"sha256:e17c96c14e19278594aa4841ec148115f9c7615a47382ecb6b82bd8fea3ab0c8",
"sha256:e444a31f8db13eb18ada366ab3cf45fd4b31e4db1236a4448f68778c1d1a5a2f",
"sha256:e6a2a455bd412959b57a172ce6328d2dd1f01cb2135efda2e4576e8a23fa3b0f",
"sha256:eaa0a10b7f72326f1372a713e73c3f739b524b3af41feb43e4921cb529f5929a",
"sha256:eb7972a85c54febfb25b5c4b4f3af4dcc731994c7da0d8a0b4a6eb0640e1d178",
"sha256:ee55d3edf80167e48ea11a923c7386f4669df67d7994554387f84e7d8b0a2bf0",
"sha256:f3818cb119498c0678015754eba762e0d61e5b52d34c8b13d770f0719f7b1d79",
"sha256:f8b3d067f2e40fe93e1ccdd6b2e1d16c43140e76f02fb1319a05cf2b79d99430",
"sha256:fcabf5ff6eea076f859677f5f0b6b5c1a51e70a376b0579e0eadef8db48c6b50"
],
"markers": "python_version >= '3.7'",
"version": "==2.1.5"
"markers": "python_version >= '3.9'",
"version": "==3.0.2"
},
"mitmproxy": {
"editable": true,
@ -561,10 +578,11 @@
},
"pycparser": {
"hashes": [
"sha256:8ee45429555515e1f6b185e78100aea234072576aa43ab53aefcae078162fca9",
"sha256:e644fdec12f7872f86c58ff790da456218b10f863970249516d60a5eaca77206"
"sha256:491c8be9c040f5390f5bf44a5b07752bd07f56edf992381b05c701439eec10f6",
"sha256:c3702b6d3dd8c7abc1afa565d7e63d53a1d0bd86cdc24edd75470f4de499cfcc"
],
"version": "==2.21"
"markers": "python_version >= '3.8'",
"version": "==2.22"
},
"pyopenssl": {
"hashes": [
@ -772,20 +790,22 @@
},
"tornado": {
"hashes": [
"sha256:02ccefc7d8211e5a7f9e8bc3f9e5b0ad6262ba2fbb683a6443ecc804e5224ce0",
"sha256:10aeaa8006333433da48dec9fe417877f8bcc21f48dda8d661ae79da357b2a63",
"sha256:27787de946a9cffd63ce5814c33f734c627a87072ec7eed71f7fc4417bb16263",
"sha256:6f8a6c77900f5ae93d8b4ae1196472d0ccc2775cc1dfdc9e7727889145c45052",
"sha256:71ddfc23a0e03ef2df1c1397d859868d158c8276a0603b96cf86892bff58149f",
"sha256:72291fa6e6bc84e626589f1c29d90a5a6d593ef5ae68052ee2ef000dfd273dee",
"sha256:88b84956273fbd73420e6d4b8d5ccbe913c65d31351b4c004ae362eba06e1f78",
"sha256:e43bc2e5370a6a8e413e1e1cd0c91bedc5bd62a74a532371042a18ef19e10579",
"sha256:f0251554cdd50b4b44362f73ad5ba7126fc5b2c2895cc62b14a1c2d7ea32f212",
"sha256:f7894c581ecdcf91666a0912f18ce5e757213999e183ebfc2c3fdbf4d5bd764e",
"sha256:fd03192e287fbd0899dd8f81c6fb9cbbc69194d2074b38f384cb6fa72b80e9c2"
"sha256:007f036f7b661e899bd9ef3fa5f87eb2cb4d1b2e7d67368e778e140a2f101a7a",
"sha256:03576ab51e9b1677e4cdaae620d6700d9823568b7939277e4690fe4085886c55",
"sha256:119c03f440a832128820e87add8a175d211b7f36e7ee161c631780877c28f4fb",
"sha256:231f2193bb4c28db2bdee9e57bc6ca0cd491f345cd307c57d79613b058e807e0",
"sha256:542e380658dcec911215c4820654662810c06ad872eefe10def6a5e9b20e9633",
"sha256:7c625b9d03f1fb4d64149c47d0135227f0434ebb803e2008040eb92906b0105a",
"sha256:9a0d8d2309faf015903080fb5bdd969ecf9aa5ff893290845cf3fd5b2dd101bc",
"sha256:9ac1cbe1db860b3cbb251e795c701c41d343f06a96049d6274e7c77559117e41",
"sha256:ab75fe43d0e1b3a5e3ceddb2a611cb40090dd116a84fc216a07a298d9e000471",
"sha256:c70c0a26d5b2d85440e4debd14a8d0b463a0cf35d92d3af05f5f1ffa8675c826",
"sha256:f81067dad2e4443b015368b24e802d0083fecada4f0a4572fdb72fc06e54a9a6",
"sha256:fd20c816e31be1bbff1f7681f970bbbd0bb241c364220140228ba24242bcdc59"
],
"markers": "python_version >= '3.8'",
"version": "==6.4"
"index": "pypi",
"markers": "python_version >= '3.9'",
"version": "==6.5"
},
"typing-extensions": {
"hashes": [
@ -803,12 +823,12 @@
},
"werkzeug": {
"hashes": [
"sha256:2b8c0e447b4b9dbcc85dd97b6eeb4dcbaf6c8b6c3be0bd654e25553e0a2157d8",
"sha256:effc12dba7f3bd72e605ce49807bbe692bd729c3bb122a3b91747a6ae77df528"
"sha256:1bc0c2310d2fbb07b1dd1105eba2f7af72f322e1e455f2f93c993bee8c8a5f17",
"sha256:a8dd59d4de28ca70471a34cba79bed5f7ef2e036a76b3ab0835474246eb41f8d"
],
"index": "pypi",
"markers": "python_version >= '3.8'",
"version": "==2.3.7"
"version": "==3.0.6"
},
"wsproto": {
"hashes": [
@ -884,40 +904,40 @@
},
"black": {
"hashes": [
"sha256:057c3dc602eaa6fdc451069bd027a1b2635028b575a6c3acfd63193ced20d9c8",
"sha256:08654d0797e65f2423f850fc8e16a0ce50925f9337fb4a4a176a7aa4026e63f8",
"sha256:163baf4ef40e6897a2a9b83890e59141cc8c2a98f2dda5080dc15c00ee1e62cd",
"sha256:1e08fb9a15c914b81dd734ddd7fb10513016e5ce7e6704bdd5e1251ceee51ac9",
"sha256:4dd76e9468d5536abd40ffbc7a247f83b2324f0c050556d9c371c2b9a9a95e31",
"sha256:4f9de21bafcba9683853f6c96c2d515e364aee631b178eaa5145fc1c61a3cc92",
"sha256:61a0391772490ddfb8a693c067df1ef5227257e72b0e4108482b8d41b5aee13f",
"sha256:6981eae48b3b33399c8757036c7f5d48a535b962a7c2310d19361edeef64ce29",
"sha256:7e53a8c630f71db01b28cd9602a1ada68c937cbf2c333e6ed041390d6968faf4",
"sha256:810d445ae6069ce64030c78ff6127cd9cd178a9ac3361435708b907d8a04c693",
"sha256:93601c2deb321b4bad8f95df408e3fb3943d85012dddb6121336b8e24a0d1218",
"sha256:992e451b04667116680cb88f63449267c13e1ad134f30087dec8527242e9862a",
"sha256:9db528bccb9e8e20c08e716b3b09c6bdd64da0dd129b11e160bf082d4642ac23",
"sha256:a0057f800de6acc4407fe75bb147b0c2b5cbb7c3ed110d3e5999cd01184d53b0",
"sha256:ba15742a13de85e9b8f3239c8f807723991fbfae24bad92d34a2b12e81904982",
"sha256:bce4f25c27c3435e4dace4815bcb2008b87e167e3bf4ee47ccdc5ce906eb4894",
"sha256:ca610d29415ee1a30a3f30fab7a8f4144e9d34c89a235d81292a1edb2b55f540",
"sha256:d533d5e3259720fdbc1b37444491b024003e012c5173f7d06825a77508085430",
"sha256:d84f29eb3ee44859052073b7636533ec995bd0f64e2fb43aeceefc70090e752b",
"sha256:e37c99f89929af50ffaf912454b3e3b47fd64109659026b678c091a4cd450fb2",
"sha256:e8a6ae970537e67830776488bca52000eaa37fa63b9988e8c487458d9cd5ace6",
"sha256:faf2ee02e6612577ba0181f4347bcbcf591eb122f7841ae5ba233d12c39dcb4d"
"sha256:2818cf72dfd5d289e48f37ccfa08b460bf469e67fb7c4abb07edc2e9f16fb63f",
"sha256:41622020d7120e01d377f74249e677039d20e6344ff5851de8a10f11f513bf93",
"sha256:4acf672def7eb1725f41f38bf6bf425c8237248bb0804faa3965c036f7672d11",
"sha256:4be5bb28e090456adfc1255e03967fb67ca846a03be7aadf6249096100ee32d0",
"sha256:4f1373a7808a8f135b774039f61d59e4be7eb56b2513d3d2f02a8b9365b8a8a9",
"sha256:56f52cfbd3dabe2798d76dbdd299faa046a901041faf2cf33288bc4e6dae57b5",
"sha256:65b76c275e4c1c5ce6e9870911384bff5ca31ab63d19c76811cb1fb162678213",
"sha256:65c02e4ea2ae09d16314d30912a58ada9a5c4fdfedf9512d23326128ac08ac3d",
"sha256:6905238a754ceb7788a73f02b45637d820b2f5478b20fec82ea865e4f5d4d9f7",
"sha256:79dcf34b33e38ed1b17434693763301d7ccbd1c5860674a8f871bd15139e7837",
"sha256:7bb041dca0d784697af4646d3b62ba4a6b028276ae878e53f6b4f74ddd6db99f",
"sha256:7d5e026f8da0322b5662fa7a8e752b3fa2dac1c1cbc213c3d7ff9bdd0ab12395",
"sha256:9f50ea1132e2189d8dff0115ab75b65590a3e97de1e143795adb4ce317934995",
"sha256:a0c9c4a0771afc6919578cec71ce82a3e31e054904e7197deacbc9382671c41f",
"sha256:aadf7a02d947936ee418777e0247ea114f78aff0d0959461057cae8a04f20597",
"sha256:b5991d523eee14756f3c8d5df5231550ae8993e2286b8014e2fdea7156ed0959",
"sha256:bf21b7b230718a5f08bd32d5e4f1db7fc8788345c8aea1d155fc17852b3410f5",
"sha256:c45f8dff244b3c431b36e3224b6be4a127c6aca780853574c00faf99258041eb",
"sha256:c7ed6668cbbfcd231fa0dc1b137d3e40c04c7f786e626b405c62bcd5db5857e4",
"sha256:d7de8d330763c66663661a1ffd432274a2f92f07feeddd89ffd085b5744f85e7",
"sha256:e19cb1c6365fd6dc38a6eae2dcb691d7d83935c10215aef8e6c38edee3f77abd",
"sha256:e2af80566f43c85f5797365077fb64a393861a3730bd110971ab7a0c94e873e7"
],
"index": "pypi",
"markers": "python_version >= '3.8'",
"version": "==24.2.0"
"version": "==24.3.0"
},
"click": {
"hashes": [
"sha256:6a7a62563bbfabfda3a38f3023a1db4a35978c0abd76f6c9605ecd6554d6d9b1",
"sha256:8458d7b1287c5fb128c90e23381cf99dcde74beaf6c7ff6384ce84d6fe090adb"
"sha256:63c132bbbed01578a06712a2d1f497bb62d9c1c0d329b7903a866228027263b2",
"sha256:ed53c9d8990d83c2a27deae68e4ee337473f6330c040a31d4225c9574d16096a"
],
"markers": "python_version >= '3.6'",
"version": "==8.0.4"
"markers": "python_version >= '3.7'",
"version": "==8.1.8"
},
"flake8": {
"hashes": [
@ -956,19 +976,19 @@
},
"mypy-extensions": {
"hashes": [
"sha256:4392f6c0eb8a5668a69e23d168ffa70f0be9ccfd32b5cc2d26a34ae5b844552d",
"sha256:75dbf8955dc00442a438fc4d0666508a9a97b6bd41aa2f0ffe9d2f2725af0782"
"sha256:1be4cccdb0f2482337c4743e60421de3a356cd97508abadd57d47403e94f5505",
"sha256:52e68efc3284861e772bbcd66823fde5ae21fd2fdb51c62a211403730b916558"
],
"markers": "python_version >= '3.5'",
"version": "==1.0.0"
"markers": "python_version >= '3.8'",
"version": "==1.1.0"
},
"packaging": {
"hashes": [
"sha256:048fb0e9405036518eaaf48a55953c750c11e1a1b68e0dd1a9d62ed0c092cfc5",
"sha256:8c491190033a9af7e1d931d0b5dacc2ef47509b34dd0de67ed209b5203fc88c7"
"sha256:29572ef2b1f17581046b3a2227d5c611fb25ec70ca1ba8554b24b0e69331a484",
"sha256:d443872c98d677bf60f6a1f2f8c1cb748e8fe762d2bf9d3148b5599295b0fc4f"
],
"markers": "python_version >= '3.7'",
"version": "==23.2"
"markers": "python_version >= '3.8'",
"version": "==25.0"
},
"pathspec": {
"hashes": [
@ -980,11 +1000,11 @@
},
"platformdirs": {
"hashes": [
"sha256:0614df2a2f37e1a662acbd8e2b25b92ccf8632929bc6d43467e17fe89c75e068",
"sha256:ef0cc731df711022c174543cb70a9b5bd22e5a9337c8624ef2c2ceb8ddad8768"
"sha256:3d512d96e16bcb959a814c9f348431070822a6496326a4be0911c40b5a74c2bc",
"sha256:ff7059bb7eb1179e2685604f4aaf157cfd9535242bd23742eadc3c13542139b4"
],
"markers": "python_version >= '3.8'",
"version": "==4.2.0"
"markers": "python_version >= '3.9'",
"version": "==4.3.8"
},
"pycodestyle": {
"hashes": [
@ -1004,19 +1024,49 @@
},
"tomli": {
"hashes": [
"sha256:939de3e7a6161af0c887ef91b7d41a53e7c5a1ca976325f429cb46ea9bc30ecc",
"sha256:de526c12914f0c550d15924c62d72abc48d6fe7364aa87328337a31007fe8a4f"
"sha256:023aa114dd824ade0100497eb2318602af309e5a55595f76b626d6d9f3b7b0a6",
"sha256:02abe224de6ae62c19f090f68da4e27b10af2b93213d36cf44e6e1c5abd19fdd",
"sha256:286f0ca2ffeeb5b9bd4fcc8d6c330534323ec51b2f52da063b11c502da16f30c",
"sha256:2d0f2fdd22b02c6d81637a3c95f8cd77f995846af7414c5c4b8d0545afa1bc4b",
"sha256:33580bccab0338d00994d7f16f4c4ec25b776af3ffaac1ed74e0b3fc95e885a8",
"sha256:400e720fe168c0f8521520190686ef8ef033fb19fc493da09779e592861b78c6",
"sha256:40741994320b232529c802f8bc86da4e1aa9f413db394617b9a256ae0f9a7f77",
"sha256:465af0e0875402f1d226519c9904f37254b3045fc5084697cefb9bdde1ff99ff",
"sha256:4a8f6e44de52d5e6c657c9fe83b562f5f4256d8ebbfe4ff922c495620a7f6cea",
"sha256:4e340144ad7ae1533cb897d406382b4b6fede8890a03738ff1683af800d54192",
"sha256:678e4fa69e4575eb77d103de3df8a895e1591b48e740211bd1067378c69e8249",
"sha256:6972ca9c9cc9f0acaa56a8ca1ff51e7af152a9f87fb64623e31d5c83700080ee",
"sha256:7fc04e92e1d624a4a63c76474610238576942d6b8950a2d7f908a340494e67e4",
"sha256:889f80ef92701b9dbb224e49ec87c645ce5df3fa2cc548664eb8a25e03127a98",
"sha256:8d57ca8095a641b8237d5b079147646153d22552f1c637fd3ba7f4b0b29167a8",
"sha256:8dd28b3e155b80f4d54beb40a441d366adcfe740969820caf156c019fb5c7ec4",
"sha256:9316dc65bed1684c9a98ee68759ceaed29d229e985297003e494aa825ebb0281",
"sha256:a198f10c4d1b1375d7687bc25294306e551bf1abfa4eace6650070a5c1ae2744",
"sha256:a38aa0308e754b0e3c67e344754dff64999ff9b513e691d0e786265c93583c69",
"sha256:a92ef1a44547e894e2a17d24e7557a5e85a9e1d0048b0b5e7541f76c5032cb13",
"sha256:ac065718db92ca818f8d6141b5f66369833d4a80a9d74435a268c52bdfa73140",
"sha256:b82ebccc8c8a36f2094e969560a1b836758481f3dc360ce9a3277c65f374285e",
"sha256:c954d2250168d28797dd4e3ac5cf812a406cd5a92674ee4c8f123c889786aa8e",
"sha256:cb55c73c5f4408779d0cf3eef9f762b9c9f147a77de7b258bef0a5628adc85cc",
"sha256:cd45e1dc79c835ce60f7404ec8119f2eb06d38b1deba146f07ced3bbc44505ff",
"sha256:d3f5614314d758649ab2ab3a62d4f2004c825922f9e370b29416484086b264ec",
"sha256:d920f33822747519673ee656a4b6ac33e382eca9d331c87770faa3eef562aeb2",
"sha256:db2b95f9de79181805df90bedc5a5ab4c165e6ec3fe99f970d0e302f384ad222",
"sha256:e59e304978767a54663af13c07b3d1af22ddee3bb2fb0618ca1593e4f593a106",
"sha256:e85e99945e688e32d5a35c1ff38ed0b3f41f43fad8df0bdf79f72b2ba7bc5272",
"sha256:ece47d672db52ac607a3d9599a9d48dcb2f2f735c6c2d1f34130085bb12b112a",
"sha256:f4039b9cbc3048b2416cc57ab3bda989a6fcf9b36cf8937f01a6e731b64f80d7"
],
"markers": "python_version < '3.11'",
"version": "==2.0.1"
"markers": "python_version >= '3.8'",
"version": "==2.2.1"
},
"typing-extensions": {
"hashes": [
"sha256:23478f88c37f27d76ac8aee6c905017a143b0b1b886c3c9f66bc2fd94f9f5783",
"sha256:af72aea155e91adfc61c3ae9e0e342dbc0cba726d6cba4b6c72c1f34e47291cd"
"sha256:38b39f4aeeab64884ce9f74c94263ef78f3c22467c8724005483154c26648d36",
"sha256:d1e1e3b58374dc93031d6eda2420a48ea44a36c2b4766a4fdeb3710755731d76"
],
"markers": "python_version >= '3.8'",
"version": "==4.9.0"
"markers": "python_version >= '3.9'",
"version": "==4.14.1"
}
}
}

.gitattributes vendored
View File

@ -28,6 +28,7 @@ src/backend/distributed/utils/citus_outfuncs.c -citus-style
src/backend/distributed/deparser/ruleutils_15.c -citus-style
src/backend/distributed/deparser/ruleutils_16.c -citus-style
src/backend/distributed/deparser/ruleutils_17.c -citus-style
src/backend/distributed/deparser/ruleutils_18.c -citus-style
src/backend/distributed/commands/index_pg_source.c -citus-style
src/include/distributed/citus_nodes.h -citus-style

View File

@ -13,15 +13,3 @@ runs:
token: ${{ inputs.codecov_token }}
verbose: true
gcov: true
- name: Create codeclimate coverage
run: |-
lcov --directory . --capture --output-file lcov.info
lcov --remove lcov.info -o lcov.info '/usr/*'
sed "s=^SF:$PWD/=SF:=g" -i lcov.info # relative pats are required by codeclimate
mkdir -p /tmp/codeclimate
cc-test-reporter format-coverage -t lcov -o /tmp/codeclimate/${{ inputs.flags }}.json lcov.info
shell: bash
- uses: actions/upload-artifact@v4.6.0
with:
path: "/tmp/codeclimate/*.json"
name: codeclimate-${{ inputs.flags }}

View File

@ -31,12 +31,12 @@ jobs:
pgupgrade_image_name: "ghcr.io/citusdata/pgupgradetester"
style_checker_image_name: "ghcr.io/citusdata/stylechecker"
style_checker_tools_version: "0.8.18"
sql_snapshot_pg_version: "17.4"
image_suffix: "-veab367a"
pg15_version: '{ "major": "15", "full": "15.12" }'
pg16_version: '{ "major": "16", "full": "16.8" }'
pg17_version: '{ "major": "17", "full": "17.4" }'
upgrade_pg_versions: "15.12-16.8-17.4"
sql_snapshot_pg_version: "17.6"
image_suffix: "-va20872f"
pg15_version: '{ "major": "15", "full": "15.14" }'
pg16_version: '{ "major": "16", "full": "16.10" }'
pg17_version: '{ "major": "17", "full": "17.6" }'
upgrade_pg_versions: "15.14-16.10-17.6"
steps:
# Since GHA jobs need at least one step we use a noop step here.
- name: Set up parameters
@ -153,6 +153,7 @@ jobs:
- check-isolation
- check-operations
- check-follower-cluster
- check-add-backup-node
- check-columnar
- check-columnar-isolation
- check-enterprise
@ -224,10 +225,16 @@ jobs:
runs-on: ubuntu-latest
container:
image: "${{ matrix.image_name }}:${{ fromJson(matrix.pg_version).full }}${{ needs.params.outputs.image_suffix }}"
options: --user root --dns=8.8.8.8
options: >-
--user root
--dns=8.8.8.8
--cap-add=SYS_NICE
--security-opt seccomp=unconfined
# Due to Github creates a default network for each job, we need to use
# --dns= to have similar DNS settings as our other CI systems or local
# machines. Otherwise, we may see different results.
# and grant caps so PG18's NUMA introspection (pg_shmem_allocations_numa -> move_pages)
# doesn't fail with EPERM in CI (see the query sketch after this block).
needs:
- params
- build
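
For context on the extra container capabilities granted above, the query below is the kind of PG18 NUMA introspection the comment refers to. This is only an illustrative sketch; the exact failure mode depends on the container runtime.

```sql
-- pg_shmem_allocations_numa is new in PostgreSQL 18. Reading it makes the
-- backend call move_pages(2), which is why the CI container needs the extra
-- capabilities mentioned in the workflow comment above; without them the
-- query can fail with EPERM.
SELECT * FROM pg_shmem_allocations_numa LIMIT 10;
```
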
@ -331,7 +338,15 @@ jobs:
make -C src/test/regress \
check-pg-upgrade \
old-bindir=/usr/lib/postgresql/${{ env.old_pg_major }}/bin \
new-bindir=/usr/lib/postgresql/${{ env.new_pg_major }}/bin
new-bindir=/usr/lib/postgresql/${{ env.new_pg_major }}/bin \
test-with-columnar=false
gosu circleci \
make -C src/test/regress \
check-pg-upgrade \
old-bindir=/usr/lib/postgresql/${{ env.old_pg_major }}/bin \
new-bindir=/usr/lib/postgresql/${{ env.new_pg_major }}/bin \
test-with-columnar=true
- name: Copy pg_upgrade logs for newData dir
run: |-
mkdir -p /tmp/pg_upgrade_newData_logs
@ -349,14 +364,20 @@ jobs:
flags: ${{ env.old_pg_major }}_${{ env.new_pg_major }}_upgrade
codecov_token: ${{ secrets.CODECOV_TOKEN }}
test-citus-upgrade:
name: PG${{ fromJson(needs.params.outputs.pg15_version).major }} - check-citus-upgrade
name: PG${{ fromJson(matrix.pg_version).major }} - check-citus-upgrade
runs-on: ubuntu-latest
container:
image: "${{ needs.params.outputs.citusupgrade_image_name }}:${{ fromJson(needs.params.outputs.pg15_version).full }}${{ needs.params.outputs.image_suffix }}"
image: "${{ needs.params.outputs.citusupgrade_image_name }}:${{ fromJson(matrix.pg_version).full }}${{ needs.params.outputs.image_suffix }}"
options: --user root
needs:
- params
- build
strategy:
fail-fast: false
matrix:
pg_version:
- ${{ needs.params.outputs.pg15_version }}
- ${{ needs.params.outputs.pg16_version }}
steps:
- uses: actions/checkout@v4
- uses: "./.github/actions/setup_extension"
@ -365,7 +386,7 @@ jobs:
- name: Install and test citus upgrade
run: |-
# run make check-citus-upgrade for all citus versions
# the image has ${CITUS_VERSIONS} set with all verions it contains the binaries of
# the image has ${CITUS_VERSIONS} set with all versions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
@ -376,7 +397,7 @@ jobs:
citus-post-tar=${GITHUB_WORKSPACE}/install-$PG_MAJOR.tar; \
done;
# run make check-citus-upgrade-mixed for all citus versions
# the image has ${CITUS_VERSIONS} set with all verions it contains the binaries of
# the image has ${CITUS_VERSIONS} set with all versions it contains the binaries of
for citus_version in ${CITUS_VERSIONS}; do \
gosu circleci \
make -C src/test/regress \
@ -395,29 +416,6 @@ jobs:
with:
flags: ${{ env.PG_MAJOR }}_citus_upgrade
codecov_token: ${{ secrets.CODECOV_TOKEN }}
upload-coverage:
if: always()
env:
CC_TEST_REPORTER_ID: ${{ secrets.CC_TEST_REPORTER_ID }}
runs-on: ubuntu-latest
container:
image: ${{ needs.params.outputs.test_image_name }}:${{ fromJson(needs.params.outputs.pg17_version).full }}${{ needs.params.outputs.image_suffix }}
needs:
- params
- test-citus
- test-arbitrary-configs
- test-citus-upgrade
- test-pg-upgrade
steps:
- uses: actions/download-artifact@v4.1.8
with:
pattern: codeclimate*
path: codeclimate
merge-multiple: true
- name: Upload coverage results to Code Climate
run: |-
cc-test-reporter sum-coverage codeclimate/*.json -o total.json
cc-test-reporter upload-coverage -i total.json
ch_benchmark:
name: CH Benchmark
if: startsWith(github.ref, 'refs/heads/ch_benchmark/')
@ -485,10 +483,14 @@ jobs:
tests=${detected_changes}
# split the tests to be skipped --today we only skip upgrade tests
# and snapshot based node addition tests.
# snapshot based node addition tests are not flaky; they are skipped because they
# promote the streaming replica (clone) to a PostgreSQL primary node, which is a
# one-way operation
skipped_tests=""
not_skipped_tests=""
for test in $tests; do
if [[ $test =~ ^src/test/regress/sql/upgrade_ ]]; then
if [[ $test =~ ^src/test/regress/sql/upgrade_ ]] || [[ $test =~ ^src/test/regress/sql/multi_add_node_from_backup ]]; then
skipped_tests="$skipped_tests $test"
else
not_skipped_tests="$not_skipped_tests $test"

View File

@ -24,7 +24,7 @@ jobs:
uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v2
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
@ -60,8 +60,7 @@ jobs:
libzstd-dev \
libzstd1 \
lintian \
postgresql-server-dev-15 \
postgresql-server-dev-all \
postgresql-server-dev-17 \
python3-pip \
python3-setuptools \
wget \
@ -76,4 +75,4 @@ jobs:
sudo make install-all
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v2
uses: github/codeql-action/analyze@v3

View File

@ -1,3 +1,241 @@
### citus v13.1.1 (Oct 1st, 2025) ###
* Adds support for latest PG minors: 14.19, 15.14, 16.10 (#8142)
* Fixes an assertion failure when an expression in the query references
a CTE (#8106)
* Fixes a bug that causes an unexpected error when executing
repartitioned MERGE (#8201)
* Fixes a bug that incorrectly allowed UPDATE / MERGE queries that may
change the distribution column value (#8214)
* Updates dynamic_library_path automatically when CDC is enabled (#8025)
### citus v13.0.5 (Oct 1st, 2025) ###
* Adds support for latest PG minors: 14.19, 15.14, 16.10 (#7986, #8142)
* Fixes a bug that causes an unexpected error when executing
repartitioned MERGE (#8201)
* Fixes a bug that incorrectly allowed UPDATE / MERGE queries that may
change the distribution column value (#8214)
* Fixes a bug in redundant WHERE clause detection (#8162)
* Updates dynamic_library_path automatically when CDC is enabled (#8025)
### citus v12.1.10 (Oct 1, 2025) ###
* Adds support for latest PG minors: 14.19, 15.14, 16.10 (#7986, #8142)
* Fixes a bug that incorrectly allowed UPDATE / MERGE queries that may
change the distribution column value (#8214)
* Fixes an assertion failure that happens when querying a view that is
defined on distributed tables (#8136)
### citus v12.1.9 (Sep 3, 2025) ###
* Adds a GUC for queries with outer joins and pseudoconstant quals (#8163)
* Updates dynamic_library_path automatically when CDC is enabled (#7715)
### citus v13.2.0 (August 18, 2025) ###
* Adds `citus_add_clone_node()`, `citus_add_clone_node_with_nodeid()`,
`citus_remove_clone_node()` and `citus_remove_clone_node_with_nodeid()`
UDFs to support snapshot-based node splits. This feature allows promoting
a streaming replica (clone) to a primary node and rebalancing shards
between the original and newly promoted node without requiring a full data
copy. This greatly reduces rebalance times for scale-out operations when
the new node already has the data via streaming replication (#8122)
* Improves performance of shard rebalancer by parallelizing moves and removing
bottlenecks that blocked concurrent logical-replication transfers. This
reduces rebalance windows especially for clusters with large reference
tables and allows multiple shard transfers to run in parallel (#7983)
* Adds citus.enable_recurring_outer_join_pushdown GUC (enabled by default)
to allow pushing down LEFT/RIGHT outer joins having a reference table in
the outer side and a distributed table on the inner side (e.g.,
\<reference table\> LEFT JOIN \<distributed table\>) (#7973; see the sketch after this list)
* Adds citus.enable_local_fast_path_query_optimization (enabled by default)
GUC to avoid unnecessary query deparsing to improve performance of
fast-path queries targeting local shards (#8035)
* Adds `citus_stats()` UDF that can be used to retrieve distributed `pg_stats`
for the provided Citus table. (#8026)
* Avoids automatically creating citus_columnar when there are no relations
using it (#8081)
* Makes sure to check if the distribution key is in the target list before
pushing down a query with a union and an outer join (#8092)
* Fixes a bug in EXPLAIN ANALYZE to prevent unintended (duplicate) execution
of the (sub)plans during the explain phase (#8017)
* Fixes potential memory corruptions that could happen when accessing
various catalog tables after a Citus downgrade is followed by a Citus
upgrade (#7950, #8120, #8124, #8121, #8114, #8146)
* Fixes UPDATE statements with indirection and array/jsonb subscripting with
more than one field (#7675)
* Fixes an assertion failure that happens when an expression in the query
references a CTE (#8106)
* Fixes an assertion failure that happens when querying a view that is
defined on distributed tables (#8136)
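
As a rough illustration of the recurring outer join pushdown GUC listed above, here is a minimal sketch; `ref` and `dist` are hypothetical tables, and the actual plan shape depends on the schema and data.

```sql
-- Hypothetical setup: a reference table on the outer side of a LEFT JOIN and
-- a distributed table on the inner side, the shape described in the entry above.
CREATE TABLE ref (id int PRIMARY KEY);
SELECT create_reference_table('ref');
CREATE TABLE dist (id bigint, ref_id int);
SELECT create_distributed_table('dist', 'id');

-- With the GUC enabled (the default per the changelog), such joins can be
-- pushed down to the workers instead of being planned on the coordinator.
SET citus.enable_recurring_outer_join_pushdown TO on;
EXPLAIN
SELECT r.id, count(d.id)
FROM ref r
LEFT JOIN dist d ON r.id = d.ref_id
GROUP BY r.id;
```
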
### citus v13.1.0 (May 30th, 2025) ###
* Adds `citus_stat_counters` view that can be used to query
stat counters that Citus collects while the feature is enabled, which is
controlled by citus.enable_stat_counters. `citus_stat_counters()` can be
used to query the stat counters for the provided database oid and
`citus_stat_counters_reset()` can be used to reset them for the provided
database oid or for the current database if nothing or 0 is provided (#7917; see the usage sketch after this list)
* Adds `citus_nodes` view that displays the node name, port, role, and "active"
for nodes in the cluster (#7968)
* Adds `citus_is_primary_node()` UDF to determine if the current node is a
primary node in the cluster (#7720)
* Adds support for propagating `GRANT/REVOKE` rights on table columns (#7918)
* Adds support for propagating `REASSIGN OWNED BY` commands (#7319)
* Adds support for propagating `CREATE`/`DROP` database from all nodes (#7240,
#7253, #7359)
* Propagates `SECURITY LABEL ON ROLE` statement from any node (#7508)
* Adds support for issuing role management commands from worker nodes (#7278)
* Adds support for propagating `ALTER USER RENAME` commands (#7204)
* Adds support for propagating `ALTER DATABASE <db_name> SET ..` commands
(#7181)
* Adds support for propagating `SECURITY LABEL` on tables and columns (#7956)
* Adds support for propagating `COMMENT ON <database>/<role>` commands (#7388)
* Moves some of the internal citus functions from `pg_catalog` to
`citus_internal` schema (#7473, #7470, #7466, #7456, #7450)
* Adjusts `max_prepared_transactions` only when it's set to default on PG >= 16
(#7712)
* Adds skip_qualify_public param to shard_name() UDF to allow qualifying for
"public" schema when needed (#8014)
* Allows `citus_*_size` on indexes on distributed tables (#7271)
* Allows `GRANT ADMIN` to now also be `INHERIT` or `SET` in support of PG16
* Makes sure `worker_copy_table_to_node` errors out with Citus tables (#7662)
* Adds information to explain output when using
`citus.explain_distributed_queries=false` (#7412)
* Logs username in the failed connection message (#7432)
* Makes sure to avoid incorrectly pushing-down the outer joins between
distributed tables and recurring relations (like reference tables, local
tables and `VALUES(..)` etc.) prior to PG 17 (#7937)
* Prevents incorrectly pushing `nextval()` call down to workers to avoid using
incorrect sequence value for some types of `INSERT .. SELECT`s (#7976)
* Makes sure to prevent `INSERT INTO ... SELECT` queries involving subfield or
sublink, to avoid crashes (#7912)
* Makes sure to take improvement_threshold into account
in `citus_add_rebalance_strategy()` (#7247)
* Makes sure to disallow creating a replicated distributed
table concurrently (#7219)
* Fixes a bug that causes omitting `CASCADE` clause for the commands sent to
workers for `REVOKE` commands on tables (#7958)
* Fixes an issue detected using address sanitizer (#7948, #7949)
* Fixes a bug in deparsing of shard query in case of "output-table column" name
conflict (#7932)
* Fixes a crash in columnar custom scan that happens when a columnar table is
used in a join (#7703)
* Fixes `MERGE` command when insert value does not have source distributed
column (#7627)
* Fixes performance issue when using `\d tablename` on a server with many
tables (#7577)
* Fixes performance issue in `GetForeignKeyOids` on systems with many
constraints (#7580)
* Fixes performance issue when distributing a table that depends on an
extension (#7574)
* Fixes performance issue when creating distributed tables if many already
exist (#7575)
* Fixes a crash caused by some form of `ALTER TABLE ADD COLUMN` statements. When
adding multiple columns, if one of the `ADD COLUMN` statements contains a
`FOREIGN` constraint omitting the referenced
columns in the statement, a `SEGFAULT` occurs (#7522)
* Fixes assertion failure in maintenance daemon during Citus upgrades (#7537)
* Fixes segmentation fault when using `CASE WHEN` in `DO` block functions
(#7554)
* Fixes undefined behavior in `master_disable_node` due to argument mismatch
(#7492)
* Fixes incorrect propagating of `GRANTED BY` and `CASCADE/RESTRICT` clauses
for `REVOKE` statements (#7451)
* Fixes the incorrect column count after `ALTER TABLE` (#7379)
* Fixes timeout when underlying socket is changed for an inter-node connection
(#7377)
* Fixes memory leaks (#7441, #7440)
* Fixes leaking of memory and memory contexts when tracking foreign keys between
Citus tables (#7236)
* Fixes a potential segfault for background rebalancer (#7694)
* Fixes potential `NULL` dereference in causal clocks (#7704)
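
A minimal usage sketch for the observability additions above; the object names are taken from the changelog entries, but their exact column lists and signatures are not spelled out there.

```sql
-- Stat counters: enable collection, inspect, then reset for the current database.
SET citus.enable_stat_counters TO on;
SELECT * FROM citus_stat_counters;
SELECT citus_stat_counters_reset();

-- Cluster topology helpers from the same release.
SELECT * FROM citus_nodes;          -- node name, port, role, and "active"
SELECT citus_is_primary_node();     -- is the current node a primary node?
```
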
### citus v13.0.4 (May 29th, 2025) ###
* Fixes an issue detected using address sanitizer (#7966)
* Error out for queries with outer joins and pseudoconstant quals in versions
prior to PG 17 (#7937)
### citus v12.1.8 (May 29, 2025) ###
* Fixes a crash in left outer joins that can happen when there is an
aggregate on a column from the inner side of the join (#7904)
* Fixes an issue detected using address sanitizer (#7965)
* Fixes a crash when executing a prepared CALL, which is not pure SQL but
available with some drivers like npgsql and pgjdbc (#7288)
### citus v13.0.3 (March 20th, 2025) ###
* Fixes a version bump issue in 13.0.2
@ -100,9 +338,8 @@
* Allows overwriting host name for all inter-node connections by
supporting "host" parameter in citus.node_conninfo (#7541)
* Changes the order in which the locks are acquired for the target and
reference tables, when a modify request is initiated from a worker
node that is not the "FirstWorkerNode" (#7542)
* Avoids distributed deadlocks by changing the order in which the locks are
acquired for the target and reference tables (#7542)
* Fixes a performance issue when distributing a table that depends on an
extension (#7574)

View File

@ -0,0 +1,78 @@
The table below was created with Citus 12.1.7 on PG16
| Extension Name | Works as Expected | Notes |
|:-----------------------------|:--------------------|:--------|
| address_standardizer | Yes | |
| address_standardizer_data_us | Yes | |
| age | Partially | Works fine side by side, but graph data cannot be distributed. |
| amcheck | Yes | |
| anon | Partially | Cannot anonymize distributed tables. It is possible to anonymize local tables. |
| auto_explain | No | [Issue #6448](https://github.com/citusdata/citus/issues/6448) |
| azure | Yes | |
| azure_ai | Yes | |
| azure_storage | Yes | |
| bloom | Yes | |
| btree_gin | Yes | |
| btree_gist | Yes | |
| citext | Yes | |
| citus_columnar | Yes | |
| cube | Yes | |
| dblink | Yes | |
| dict_int | Yes | |
| dict_xsyn | Yes | |
| earthdistance | Yes | |
| fuzzystrmatch | Yes | |
| hll | Yes | |
| hstore | Yes | |
| hypopg | Partially | Hypopg can work on local tables and individual shards, however, when we create a hypothetical index on a distributed table, citus does not propagate the index creation command to worker nodes, and thus, hypothetical index is not used in explain statements. |
| intagg | Yes | |
| intarray | Yes | |
| isn | Yes | |
| lo | Partially | Extension relies on triggers, but Citus does not support triggers over distributed tables |
| login_hook | Yes | |
| ltree | Yes | |
| oracle_fdw | Yes | |
| orafce | Yes | |
| pageinspect | Yes | |
| pg_buffercache | Yes | |
| pg_cron | Yes | |
| pg_diskann | Yes | |
| pg_failover_slots | To be tested | |
| pg_freespacemap | Partially | Users can set citus.override_table_visibility='off'; to get accurate calculation of free space map. |
| pg_hint_plan | Partially | Works fine side by side, but hints are ignored for distributed queries |
| pg_partman | Yes | |
| pg_prewarm | Partially | In order to prewarm distributed tables, set "citus.override_table_visibility" to off, and run prewarm for each shard. This needs to be done at each node; see the sketch after this table. |
| pg_repack | Partially | Extension relies on triggers, but Citus does not support triggers over distributed tables. It works fine on local tables. |
| pg_squeeze | Partially | It can work on local tables, but it is not aware of distributed tables. Users can set citus.override_table_visibility='off'; and then run pg_squeeze for each shard. This needs to be done at each node. |
| pg_stat_statements | Yes | |
| pg_trgm | Yes | |
| pg_visibility | Partially | In order to get visibility map of a distributed table, customers can run the functions for shard tables. |
| pgaadauth | Yes | |
| pgaudit | Yes | |
| pgcrypto | Yes | |
| pglogical | No | |
| pgrowlocks | Partially | It works only with individual shards, not with distributed table names. |
| pgstattuple | Yes | |
| plpgsql | Yes | |
| plv8 | Yes | |
| postgis | Yes | |
| postgis_raster | Yes | |
| postgis_sfcgal | Yes | |
| postgis_tiger_geocoder | No | |
| postgis_topology | No | |
| postgres_fdw | Yes | |
| postgres_protobuf | Yes | |
| semver | Yes | |
| session_variable | No | |
| sslinfo | Yes | |
| tablefunc | Yes | |
| tdigest | Yes | |
| tds_fdw | Yes | |
| timescaledb | No | [Known to be incompatible with Citus](https://www.citusdata.com/blog/2021/10/22/how-to-scale-postgres-for-time-series-data-with-citus/#:~:text=Postgres%E2%80%99%20built-in%20partitioning) |
| topn | Yes | |
| tsm_system_rows | Yes | |
| tsm_system_time | Yes | |
| unaccent | Yes | |
| uuid-ossp | Yes | |
| vector (aka pg_vector) | Yes | |
| wal2json | To be tested | |
| xml2 | To be tested | |
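
One way to act on the "run it per shard" advice in the pg_prewarm row (and the similar pg_squeeze / pg_visibility notes) is Citus's `run_command_on_shards()` helper, which replaces `%s` with each shard's name and runs the command on the node holding that shard placement. A minimal sketch, assuming a hypothetical distributed table `dist_table`:

```sql
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Prewarm every shard of dist_table on whichever node hosts it; %s is replaced
-- with the shard name by run_command_on_shards().
SELECT * FROM run_command_on_shards('dist_table', $cmd$SELECT pg_prewarm('%s')$cmd$);
```
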

configure vendored
View File

@ -1,6 +1,6 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.69 for Citus 13.1devel.
# Generated by GNU Autoconf 2.69 for Citus 14.0devel.
#
#
# Copyright (C) 1992-1996, 1998-2012 Free Software Foundation, Inc.
@ -579,8 +579,8 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='Citus'
PACKAGE_TARNAME='citus'
PACKAGE_VERSION='13.1devel'
PACKAGE_STRING='Citus 13.1devel'
PACKAGE_VERSION='14.0devel'
PACKAGE_STRING='Citus 14.0devel'
PACKAGE_BUGREPORT=''
PACKAGE_URL=''
@ -1262,7 +1262,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures Citus 13.1devel to adapt to many kinds of systems.
\`configure' configures Citus 14.0devel to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@ -1324,7 +1324,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of Citus 13.1devel:";;
short | recursive ) echo "Configuration of Citus 14.0devel:";;
esac
cat <<\_ACEOF
@ -1429,7 +1429,7 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
Citus configure 13.1devel
Citus configure 14.0devel
generated by GNU Autoconf 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
@ -1912,7 +1912,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by Citus $as_me 13.1devel, which was
It was created by Citus $as_me 14.0devel, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ $0 $@
@ -5393,7 +5393,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by Citus $as_me 13.1devel, which was
This file was extended by Citus $as_me 14.0devel, which was
generated by GNU Autoconf 2.69. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@ -5455,7 +5455,7 @@ _ACEOF
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`"
ac_cs_version="\\
Citus config.status 13.1devel
Citus config.status 14.0devel
configured by $0, generated by GNU Autoconf 2.69,
with options \\"\$ac_cs_config\\"

View File

@ -5,7 +5,7 @@
# everyone needing autoconf installed, the resulting files are checked
# into the SCM.
AC_INIT([Citus], [13.1devel])
AC_INIT([Citus], [14.0devel])
AC_COPYRIGHT([Copyright (c) Citus Data, Inc.])
# we'll need sed and awk for some of the version commands

View File

@ -1,6 +1,6 @@
# Columnar extension
comment = 'Citus Columnar extension'
default_version = '12.2-1'
default_version = '14.0-1'
module_pathname = '$libdir/citus_columnar'
relocatable = false
schema = pg_catalog

View File

@ -21,6 +21,13 @@
#include "catalog/pg_am.h"
#include "catalog/pg_statistic.h"
#include "commands/defrem.h"
#include "columnar/columnar_version_compat.h"
#if PG_VERSION_NUM >= PG_VERSION_18
#include "commands/explain_format.h"
#endif
#include "executor/executor.h" /* for ExecInitExprWithParams(), ExecEvalExpr() */
#include "nodes/execnodes.h" /* for ExprState, ExprContext, etc. */
#include "nodes/extensible.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
@ -1549,8 +1556,7 @@ ColumnarPerStripeScanCost(RelOptInfo *rel, Oid relationId, int numberOfColumnsRe
ereport(ERROR, (errmsg("could not open relation with OID %u", relationId)));
}
List *stripeList = StripesForRelfilelocator(RelationPhysicalIdentifier_compat(
relation));
List *stripeList = StripesForRelfilelocator(relation);
RelationClose(relation);
uint32 maxColumnCount = 0;
@ -1607,8 +1613,7 @@ ColumnarTableStripeCount(Oid relationId)
ereport(ERROR, (errmsg("could not open relation with OID %u", relationId)));
}
List *stripeList = StripesForRelfilelocator(RelationPhysicalIdentifier_compat(
relation));
List *stripeList = StripesForRelfilelocator(relation);
int stripeCount = list_length(stripeList);
RelationClose(relation);

View File

@ -106,7 +106,9 @@ static void GetHighestUsedAddressAndId(uint64 storageId,
uint64 *highestUsedAddress,
uint64 *highestUsedId);
static StripeMetadata * UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId,
bool *update, Datum *newValues);
uint64 fileOffset, uint64 dataLength,
uint64 rowCount, uint64 chunkCount);
static List * ReadDataFileStripeList(uint64 storageId, Snapshot snapshot);
static StripeMetadata * BuildStripeMetadata(Relation columnarStripes,
HeapTuple heapTuple);
@ -123,7 +125,7 @@ static Oid ColumnarChunkGroupRelationId(void);
static Oid ColumnarChunkIndexRelationId(void);
static Oid ColumnarChunkGroupIndexRelationId(void);
static Oid ColumnarNamespaceId(void);
static uint64 LookupStorageId(RelFileLocator relfilelocator);
static uint64 LookupStorageId(Oid relationId, RelFileLocator relfilelocator);
static uint64 GetHighestUsedRowNumber(uint64 storageId);
static void DeleteStorageFromColumnarMetadataTable(Oid metadataTableId,
AttrNumber storageIdAtrrNumber,
@ -183,6 +185,8 @@ typedef FormData_columnar_options *Form_columnar_options;
#define Anum_columnar_stripe_chunk_count 8
#define Anum_columnar_stripe_first_row_number 9
static int GetFirstRowNumberAttrIndexInColumnarStripe(TupleDesc tupleDesc);
/* constants for columnar.chunk_group */
#define Natts_columnar_chunkgroup 4
#define Anum_columnar_chunkgroup_storageid 1
@ -602,7 +606,7 @@ ReadColumnarOptions(Oid regclass, ColumnarOptions *options)
* of columnar.chunk.
*/
void
SaveStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
SaveStripeSkipList(Oid relid, RelFileLocator relfilelocator, uint64 stripe,
StripeSkipList *chunkList,
TupleDesc tupleDescriptor)
{
@ -610,11 +614,17 @@ SaveStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
uint32 chunkIndex = 0;
uint32 columnCount = chunkList->columnCount;
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(relid, relfilelocator);
Oid columnarChunkOid = ColumnarChunkRelationId();
Relation columnarChunk = table_open(columnarChunkOid, RowExclusiveLock);
ModifyState *modifyState = StartModifyRelation(columnarChunk);
bool pushed_snapshot = false;
if (!ActiveSnapshotSet())
{
PushActiveSnapshot(GetTransactionSnapshot());
pushed_snapshot = true;
}
for (columnIndex = 0; columnIndex < columnCount; columnIndex++)
{
for (chunkIndex = 0; chunkIndex < chunkList->chunkCount; chunkIndex++)
@ -645,20 +655,25 @@ SaveStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
{
values[Anum_columnar_chunk_minimum_value - 1] =
PointerGetDatum(DatumToBytea(chunk->minimumValue,
&tupleDescriptor->attrs[columnIndex]));
TupleDescAttr(tupleDescriptor,
columnIndex)));
values[Anum_columnar_chunk_maximum_value - 1] =
PointerGetDatum(DatumToBytea(chunk->maximumValue,
&tupleDescriptor->attrs[columnIndex]));
TupleDescAttr(tupleDescriptor,
columnIndex)));
}
else
{
nulls[Anum_columnar_chunk_minimum_value - 1] = true;
nulls[Anum_columnar_chunk_maximum_value - 1] = true;
}
InsertTupleAndEnforceConstraints(modifyState, values, nulls);
}
}
if (pushed_snapshot)
{
PopActiveSnapshot();
}
FinishModifyRelation(modifyState);
table_close(columnarChunk, RowExclusiveLock);
@ -669,10 +684,10 @@ SaveStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
* SaveChunkGroups saves the metadata for given chunk groups in columnar.chunk_group.
*/
void
SaveChunkGroups(RelFileLocator relfilelocator, uint64 stripe,
SaveChunkGroups(Oid relid, RelFileLocator relfilelocator, uint64 stripe,
List *chunkGroupRowCounts)
{
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(relid, relfilelocator);
Oid columnarChunkGroupOid = ColumnarChunkGroupRelationId();
Relation columnarChunkGroup = table_open(columnarChunkGroupOid, RowExclusiveLock);
ModifyState *modifyState = StartModifyRelation(columnarChunkGroup);
@ -705,7 +720,7 @@ SaveChunkGroups(RelFileLocator relfilelocator, uint64 stripe,
* ReadStripeSkipList fetches chunk metadata for a given stripe.
*/
StripeSkipList *
ReadStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
ReadStripeSkipList(Relation rel, uint64 stripe,
TupleDesc tupleDescriptor,
uint32 chunkCount, Snapshot snapshot)
{
@ -714,7 +729,8 @@ ReadStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
uint32 columnCount = tupleDescriptor->natts;
ScanKeyData scanKey[2];
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(RelationPrecomputeOid(rel),
RelationPhysicalIdentifier_compat(rel));
Oid columnarChunkOid = ColumnarChunkRelationId();
Relation columnarChunk = table_open(columnarChunkOid, AccessShareLock);
@ -803,9 +819,9 @@ ReadStripeSkipList(RelFileLocator relfilelocator, uint64 stripe,
datumArray[Anum_columnar_chunk_maximum_value - 1]);
chunk->minimumValue =
ByteaToDatum(minValue, &tupleDescriptor->attrs[columnIndex]);
ByteaToDatum(minValue, TupleDescAttr(tupleDescriptor, columnIndex));
chunk->maximumValue =
ByteaToDatum(maxValue, &tupleDescriptor->attrs[columnIndex]);
ByteaToDatum(maxValue, TupleDescAttr(tupleDescriptor, columnIndex));
chunk->hasMinMax = true;
}
@ -942,10 +958,12 @@ StripeMetadataLookupRowNumber(Relation relation, uint64 rowNumber, Snapshot snap
strategyNumber = BTGreaterStrategyNumber;
procedure = F_INT8GT;
}
ScanKeyInit(&scanKey[1], Anum_columnar_stripe_first_row_number,
strategyNumber, procedure, Int64GetDatum(rowNumber));
Relation columnarStripes = table_open(ColumnarStripeRelationId(), AccessShareLock);
TupleDesc tupleDesc = RelationGetDescr(columnarStripes);
ScanKeyInit(&scanKey[1], GetFirstRowNumberAttrIndexInColumnarStripe(tupleDesc) + 1,
strategyNumber, procedure, Int64GetDatum(rowNumber));
Oid indexId = ColumnarStripeFirstRowNumberIndexRelationId();
bool indexOk = OidIsValid(indexId);
@ -1210,9 +1228,13 @@ static void
InsertEmptyStripeMetadataRow(uint64 storageId, uint64 stripeId, uint32 columnCount,
uint32 chunkGroupRowCount, uint64 firstRowNumber)
{
bool nulls[Natts_columnar_stripe] = { false };
Oid columnarStripesOid = ColumnarStripeRelationId();
Relation columnarStripes = table_open(columnarStripesOid, RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(columnarStripes);
Datum *values = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *nulls = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
Datum values[Natts_columnar_stripe] = { 0 };
values[Anum_columnar_stripe_storageid - 1] =
UInt64GetDatum(storageId);
values[Anum_columnar_stripe_stripe - 1] =
@ -1221,7 +1243,7 @@ InsertEmptyStripeMetadataRow(uint64 storageId, uint64 stripeId, uint32 columnCou
UInt32GetDatum(columnCount);
values[Anum_columnar_stripe_chunk_row_count - 1] =
UInt32GetDatum(chunkGroupRowCount);
values[Anum_columnar_stripe_first_row_number - 1] =
values[GetFirstRowNumberAttrIndexInColumnarStripe(tupleDescriptor)] =
UInt64GetDatum(firstRowNumber);
/* stripe has no rows yet, so initialize rest of the columns accordingly */
@ -1234,9 +1256,6 @@ InsertEmptyStripeMetadataRow(uint64 storageId, uint64 stripeId, uint32 columnCou
values[Anum_columnar_stripe_chunk_count - 1] =
UInt32GetDatum(0);
Oid columnarStripesOid = ColumnarStripeRelationId();
Relation columnarStripes = table_open(columnarStripesOid, RowExclusiveLock);
ModifyState *modifyState = StartModifyRelation(columnarStripes);
InsertTupleAndEnforceConstraints(modifyState, values, nulls);
@ -1244,6 +1263,9 @@ InsertEmptyStripeMetadataRow(uint64 storageId, uint64 stripeId, uint32 columnCou
FinishModifyRelation(modifyState);
table_close(columnarStripes, RowExclusiveLock);
pfree(values);
pfree(nulls);
}
@ -1252,11 +1274,26 @@ InsertEmptyStripeMetadataRow(uint64 storageId, uint64 stripeId, uint32 columnCou
* of the given relfilenode.
*/
List *
StripesForRelfilelocator(RelFileLocator relfilelocator)
StripesForRelfilelocator(Relation rel)
{
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(RelationPrecomputeOid(rel),
RelationPhysicalIdentifier_compat(rel));
return ReadDataFileStripeList(storageId, GetTransactionSnapshot());
/*
* PG18 requires snapshot to be active or registered before it's used
* Without this, we hit
* Assert(snapshot->regd_count > 0 || snapshot->active_count > 0);
* when reading columnar stripes.
* Relevant PG18 commit:
* 8076c00592e40e8dbd1fce7a98b20d4bf075e4ba
*/
Snapshot snapshot = RegisterSnapshot(GetTransactionSnapshot());
List *readDataFileStripeList = ReadDataFileStripeList(storageId, snapshot);
UnregisterSnapshot(snapshot);
return readDataFileStripeList;
}
@ -1269,9 +1306,10 @@ StripesForRelfilelocator(RelFileLocator relfilelocator)
* returns 0.
*/
uint64
GetHighestUsedAddress(RelFileLocator relfilelocator)
GetHighestUsedAddress(Relation rel)
{
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(RelationPrecomputeOid(rel),
RelationPhysicalIdentifier_compat(rel));
uint64 highestUsedAddress = 0;
uint64 highestUsedId = 0;
@ -1281,6 +1319,24 @@ GetHighestUsedAddress(RelFileLocator relfilelocator)
}
/*
* In case if relid hasn't been defined yet, we should use RelidByRelfilenumber
* to get correct relid value.
*
* Now it is basically used for temp rels, because since PG18(it was backpatched
* through PG13) RelidByRelfilenumber skip temp relations and we should use
* alternative ways to get relid value in case of temp objects.
*/
Oid
ColumnarRelationId(Oid relid, RelFileLocator relfilelocator)
{
return OidIsValid(relid) ? relid : RelidByRelfilenumber(RelationTablespace_compat(
relfilelocator),
RelationPhysicalIdentifierNumber_compat(
relfilelocator));
}
/*
* GetHighestUsedAddressAndId returns the highest used address and id for
* the given relfilenode across all active and inactive transactions.
@ -1354,19 +1410,8 @@ CompleteStripeReservation(Relation rel, uint64 stripeId, uint64 sizeBytes,
uint64 resLogicalStart = ColumnarStorageReserveData(rel, sizeBytes);
uint64 storageId = ColumnarStorageGetStorageId(rel, false);
bool update[Natts_columnar_stripe] = { false };
update[Anum_columnar_stripe_file_offset - 1] = true;
update[Anum_columnar_stripe_data_length - 1] = true;
update[Anum_columnar_stripe_row_count - 1] = true;
update[Anum_columnar_stripe_chunk_count - 1] = true;
Datum newValues[Natts_columnar_stripe] = { 0 };
newValues[Anum_columnar_stripe_file_offset - 1] = Int64GetDatum(resLogicalStart);
newValues[Anum_columnar_stripe_data_length - 1] = Int64GetDatum(sizeBytes);
newValues[Anum_columnar_stripe_row_count - 1] = UInt64GetDatum(rowCount);
newValues[Anum_columnar_stripe_chunk_count - 1] = Int32GetDatum(chunkCount);
return UpdateStripeMetadataRow(storageId, stripeId, update, newValues);
return UpdateStripeMetadataRow(storageId, stripeId, resLogicalStart,
sizeBytes, rowCount, chunkCount);
}
@ -1377,12 +1422,9 @@ CompleteStripeReservation(Relation rel, uint64 stripeId, uint64 sizeBytes,
* of stripe metadata should be updated according to modifications done.
*/
static StripeMetadata *
UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId, bool *update,
Datum *newValues)
UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId, uint64 fileOffset,
uint64 dataLength, uint64 rowCount, uint64 chunkCount)
{
SnapshotData dirtySnapshot;
InitDirtySnapshot(dirtySnapshot);
ScanKeyData scanKey[2];
ScanKeyInit(&scanKey[0], Anum_columnar_stripe_storageid,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(storageId));
@ -1392,11 +1434,15 @@ UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId, bool *update,
Oid columnarStripesOid = ColumnarStripeRelationId();
Relation columnarStripes = table_open(columnarStripesOid, AccessShareLock);
TupleDesc tupleDescriptor = RelationGetDescr(columnarStripes);
Oid indexId = ColumnarStripePKeyIndexRelationId();
bool indexOk = OidIsValid(indexId);
SysScanDesc scanDescriptor = systable_beginscan(columnarStripes, indexId, indexOk,
&dirtySnapshot, 2, scanKey);
void *state;
HeapTuple tuple;
systable_inplace_update_begin(columnarStripes, indexId, indexOk, NULL,
2, scanKey, &tuple, &state);
static bool loggedSlowMetadataAccessWarning = false;
if (!indexOk && !loggedSlowMetadataAccessWarning)
@ -1405,8 +1451,7 @@ UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId, bool *update,
loggedSlowMetadataAccessWarning = true;
}
HeapTuple oldTuple = systable_getnext(scanDescriptor);
if (!HeapTupleIsValid(oldTuple))
if (!HeapTupleIsValid(tuple))
{
ereport(ERROR, (errmsg("attempted to modify an unexpected stripe, "
"columnar storage with id=" UINT64_FORMAT
@ -1415,34 +1460,44 @@ UpdateStripeMetadataRow(uint64 storageId, uint64 stripeId, bool *update,
}
/*
* heap_inplace_update already doesn't allow changing size of the original
* systable_inplace_update_finish already doesn't allow changing size of the original
* tuple, so we don't allow setting any Datum's to NULL values.
*/
bool newNulls[Natts_columnar_stripe] = { false };
TupleDesc tupleDescriptor = RelationGetDescr(columnarStripes);
HeapTuple modifiedTuple = heap_modify_tuple(oldTuple, tupleDescriptor,
newValues, newNulls, update);
heap_inplace_update(columnarStripes, modifiedTuple);
Datum *newValues = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *newNulls = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
bool *update = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
/*
* Existing tuple now contains modifications, because we used
* heap_inplace_update().
*/
HeapTuple newTuple = oldTuple;
update[Anum_columnar_stripe_file_offset - 1] = true;
update[Anum_columnar_stripe_data_length - 1] = true;
update[Anum_columnar_stripe_row_count - 1] = true;
update[Anum_columnar_stripe_chunk_count - 1] = true;
newValues[Anum_columnar_stripe_file_offset - 1] = Int64GetDatum(fileOffset);
newValues[Anum_columnar_stripe_data_length - 1] = Int64GetDatum(dataLength);
newValues[Anum_columnar_stripe_row_count - 1] = UInt64GetDatum(rowCount);
newValues[Anum_columnar_stripe_chunk_count - 1] = Int32GetDatum(chunkCount);
tuple = heap_modify_tuple(tuple,
tupleDescriptor,
newValues,
newNulls,
update);
systable_inplace_update_finish(state, tuple);
/*
* Must not pass modifiedTuple, because BuildStripeMetadata expects a real
* heap tuple with MVCC fields.
*/
StripeMetadata *modifiedStripeMetadata = BuildStripeMetadata(columnarStripes,
newTuple);
tuple);
CommandCounterIncrement();
systable_endscan(scanDescriptor);
heap_freetuple(tuple);
table_close(columnarStripes, AccessShareLock);
pfree(newValues);
pfree(newNulls);
pfree(update);
/* return StripeMetadata object built from modified tuple */
return modifiedStripeMetadata;
}
@ -1506,10 +1561,12 @@ BuildStripeMetadata(Relation columnarStripes, HeapTuple heapTuple)
{
Assert(RelationGetRelid(columnarStripes) == ColumnarStripeRelationId());
Datum datumArray[Natts_columnar_stripe];
bool isNullArray[Natts_columnar_stripe];
heap_deform_tuple(heapTuple, RelationGetDescr(columnarStripes),
datumArray, isNullArray);
TupleDesc tupleDescriptor = RelationGetDescr(columnarStripes);
Datum *datumArray = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isNullArray = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(heapTuple, tupleDescriptor, datumArray, isNullArray);
StripeMetadata *stripeMetadata = palloc0(sizeof(StripeMetadata));
stripeMetadata->id = DatumGetInt64(datumArray[Anum_columnar_stripe_stripe - 1]);
@ -1526,7 +1583,10 @@ BuildStripeMetadata(Relation columnarStripes, HeapTuple heapTuple)
stripeMetadata->rowCount = DatumGetInt64(
datumArray[Anum_columnar_stripe_row_count - 1]);
stripeMetadata->firstRowNumber = DatumGetUInt64(
datumArray[Anum_columnar_stripe_first_row_number - 1]);
datumArray[GetFirstRowNumberAttrIndexInColumnarStripe(tupleDescriptor)]);
pfree(datumArray);
pfree(isNullArray);
/*
* If there is unflushed data in a parent transaction, then we would
@ -1552,7 +1612,7 @@ BuildStripeMetadata(Relation columnarStripes, HeapTuple heapTuple)
* metadata tables.
*/
void
DeleteMetadataRows(RelFileLocator relfilelocator)
DeleteMetadataRows(Relation rel)
{
/*
* During a restore for binary upgrade, metadata tables and indexes may or
@ -1563,7 +1623,8 @@ DeleteMetadataRows(RelFileLocator relfilelocator)
return;
}
uint64 storageId = LookupStorageId(relfilelocator);
uint64 storageId = LookupStorageId(RelationPrecomputeOid(rel),
RelationPhysicalIdentifier_compat(rel));
DeleteStorageFromColumnarMetadataTable(ColumnarStripeRelationId(),
Anum_columnar_stripe_storageid,
@ -1727,12 +1788,37 @@ create_estate_for_relation(Relation rel)
rte->relkind = rel->rd_rel->relkind;
rte->rellockmode = AccessShareLock;
/* Prepare permission info on PG 16+ */
#if PG_VERSION_NUM >= PG_VERSION_16
List *perminfos = NIL;
addRTEPermissionInfo(&perminfos, rte);
ExecInitRangeTable(estate, list_make1(rte), perminfos);
#endif
/* Initialize the range table, with the right signature for each PG version */
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG 18+ needs four arguments (unpruned_relids) */
ExecInitRangeTable(
estate,
list_make1(rte),
perminfos,
NULL /* unpruned_relids: not used by columnar */
);
#elif PG_VERSION_NUM >= PG_VERSION_16
/* PG 16-17: three-arg signature (permInfos) */
ExecInitRangeTable(
estate,
list_make1(rte),
perminfos
);
#else
ExecInitRangeTable(estate, list_make1(rte));
/* PG 15: two-arg signature */
ExecInitRangeTable(
estate,
list_make1(rte)
);
#endif
estate->es_output_cid = GetCurrentCommandId(true);
@ -1937,13 +2023,11 @@ ColumnarNamespaceId(void)
* false if the relation doesn't have a meta page yet.
*/
static uint64
LookupStorageId(RelFileLocator relfilelocator)
LookupStorageId(Oid relid, RelFileLocator relfilelocator)
{
Oid relationId = RelidByRelfilenumber(RelationTablespace_compat(relfilelocator),
RelationPhysicalIdentifierNumber_compat(
relfilelocator));
relid = ColumnarRelationId(relid, relfilelocator);
Relation relation = relation_open(relationId, AccessShareLock);
Relation relation = relation_open(relid, AccessShareLock);
uint64 storageId = ColumnarStorageGetStorageId(relation, false);
table_close(relation, AccessShareLock);
@ -2049,3 +2133,23 @@ GetHighestUsedRowNumber(uint64 storageId)
return highestRowNumber;
}
/*
* GetFirstRowNumberAttrIndexInColumnarStripe returns attrnum for first_row_number attr.
*
* first_row_number attr was added to table columnar.stripe using alter operation after
* the version where Citus started supporting downgrades, and it's only column that we've
* introduced to columnar.stripe since then.
*
* And in case of a downgrade + upgrade, tupleDesc->natts becomes greater than
* Natts_columnar_stripe and when this happens, then we know that attrnum first_row_number is
* not Anum_columnar_stripe_first_row_number anymore but tupleDesc->natts - 1.
*/
static int
GetFirstRowNumberAttrIndexInColumnarStripe(TupleDesc tupleDesc)
{
return tupleDesc->natts == Natts_columnar_stripe
? (Anum_columnar_stripe_first_row_number - 1)
: tupleDesc->natts - 1;
}

View File

@ -986,8 +986,7 @@ ColumnarTableRowCount(Relation relation)
{
ListCell *stripeMetadataCell = NULL;
uint64 totalRowCount = 0;
List *stripeList = StripesForRelfilelocator(RelationPhysicalIdentifier_compat(
relation));
List *stripeList = StripesForRelfilelocator(relation);
foreach(stripeMetadataCell, stripeList)
{
@ -1015,8 +1014,7 @@ LoadFilteredStripeBuffers(Relation relation, StripeMetadata *stripeMetadata,
bool *projectedColumnMask = ProjectedColumnMask(columnCount, projectedColumnList);
StripeSkipList *stripeSkipList = ReadStripeSkipList(RelationPhysicalIdentifier_compat(
relation),
StripeSkipList *stripeSkipList = ReadStripeSkipList(relation,
stripeMetadata->id,
tupleDescriptor,
stripeMetadata->chunkCount,

View File

@ -872,7 +872,7 @@ columnar_relation_set_new_filelocator(Relation rel,
RelationPhysicalIdentifier_compat(rel)),
GetCurrentSubTransactionId());
DeleteMetadataRows(RelationPhysicalIdentifier_compat(rel));
DeleteMetadataRows(rel);
}
*freezeXid = RecentXmin;
@ -897,7 +897,7 @@ columnar_relation_nontransactional_truncate(Relation rel)
NonTransactionDropWriteState(RelationPhysicalIdentifierNumber_compat(relfilelocator));
/* Delete old relfilenode metadata */
DeleteMetadataRows(relfilelocator);
DeleteMetadataRows(rel);
/*
* No need to set new relfilenode, since the table was created in this
@ -960,8 +960,7 @@ columnar_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
ColumnarOptions columnarOptions = { 0 };
ReadColumnarOptions(OldHeap->rd_id, &columnarOptions);
ColumnarWriteState *writeState = ColumnarBeginWrite(RelationPhysicalIdentifier_compat(
NewHeap),
ColumnarWriteState *writeState = ColumnarBeginWrite(NewHeap,
columnarOptions,
targetDesc);
@ -1012,7 +1011,7 @@ NeededColumnsList(TupleDesc tupdesc, Bitmapset *attr_needed)
for (int i = 0; i < tupdesc->natts; i++)
{
if (tupdesc->attrs[i].attisdropped)
if (TupleDescAttr(tupdesc, i)->attisdropped)
{
continue;
}
@ -1036,8 +1035,7 @@ NeededColumnsList(TupleDesc tupdesc, Bitmapset *attr_needed)
static uint64
ColumnarTableTupleCount(Relation relation)
{
List *stripeList = StripesForRelfilelocator(RelationPhysicalIdentifier_compat(
relation));
List *stripeList = StripesForRelfilelocator(relation);
uint64 tupleCount = 0;
ListCell *lc = NULL;
@ -1121,10 +1119,27 @@ columnar_vacuum_rel(Relation rel, VacuumParams *params,
bool frozenxid_updated;
bool minmulti_updated;
/* for PG 18+, vac_update_relstats gained a new “all_frozen” param */
#if PG_VERSION_NUM >= PG_VERSION_18
/* all frozen pages are always 0, because columnar stripes never store XIDs */
BlockNumber new_rel_allfrozen = 0;
vac_update_relstats(rel, new_rel_pages, new_live_tuples,
new_rel_allvisible, /* allvisible */
new_rel_allfrozen, /* all_frozen */
nindexes > 0,
newRelFrozenXid, newRelminMxid,
&frozenxid_updated, &minmulti_updated,
false);
#else
vac_update_relstats(rel, new_rel_pages, new_live_tuples,
new_rel_allvisible, nindexes > 0,
newRelFrozenXid, newRelminMxid,
&frozenxid_updated, &minmulti_updated, false);
&frozenxid_updated, &minmulti_updated,
false);
#endif
#else
TransactionId oldestXmin;
TransactionId freezeLimit;
@ -1187,10 +1202,19 @@ columnar_vacuum_rel(Relation rel, VacuumParams *params,
#endif
#endif
#if PG_VERSION_NUM >= PG_VERSION_18
pgstat_report_vacuum(RelationGetRelid(rel),
rel->rd_rel->relisshared,
Max(new_live_tuples, 0), /* live tuples */
0, /* dead tuples */
GetCurrentTimestamp()); /* start time */
#else
pgstat_report_vacuum(RelationGetRelid(rel),
rel->rd_rel->relisshared,
Max(new_live_tuples, 0),
0);
#endif
pgstat_progress_end_command();
}
@ -1202,7 +1226,6 @@ static void
LogRelationStats(Relation rel, int elevel)
{
ListCell *stripeMetadataCell = NULL;
RelFileLocator relfilelocator = RelationPhysicalIdentifier_compat(rel);
StringInfo infoBuf = makeStringInfo();
int compressionStats[COMPRESSION_COUNT] = { 0 };
@ -1213,19 +1236,23 @@ LogRelationStats(Relation rel, int elevel)
uint64 droppedChunksWithData = 0;
uint64 totalDecompressedLength = 0;
List *stripeList = StripesForRelfilelocator(relfilelocator);
List *stripeList = StripesForRelfilelocator(rel);
int stripeCount = list_length(stripeList);
foreach(stripeMetadataCell, stripeList)
{
StripeMetadata *stripe = lfirst(stripeMetadataCell);
StripeSkipList *skiplist = ReadStripeSkipList(relfilelocator, stripe->id,
Snapshot snapshot = RegisterSnapshot(GetTransactionSnapshot());
StripeSkipList *skiplist = ReadStripeSkipList(rel, stripe->id,
RelationGetDescr(rel),
stripe->chunkCount,
GetTransactionSnapshot());
snapshot);
UnregisterSnapshot(snapshot);
for (uint32 column = 0; column < skiplist->columnCount; column++)
{
bool attrDropped = tupdesc->attrs[column].attisdropped;
bool attrDropped = TupleDescAttr(tupdesc, column)->attisdropped;
for (uint32 chunk = 0; chunk < skiplist->chunkCount; chunk++)
{
ColumnChunkSkipNode *skipnode =
@ -1355,8 +1382,7 @@ TruncateColumnar(Relation rel, int elevel)
* new stripes be added beyond highestPhysicalAddress while
* we're truncating.
*/
uint64 newDataReservation = Max(GetHighestUsedAddress(
RelationPhysicalIdentifier_compat(rel)) + 1,
uint64 newDataReservation = Max(GetHighestUsedAddress(rel) + 1,
ColumnarFirstLogicalOffset);
BlockNumber old_rel_pages = smgrnblocks(RelationGetSmgr(rel), MAIN_FORKNUM);
@ -2124,7 +2150,7 @@ ColumnarTableDropHook(Oid relid)
Relation rel = table_open(relid, AccessExclusiveLock);
RelFileLocator relfilelocator = RelationPhysicalIdentifier_compat(rel);
DeleteMetadataRows(relfilelocator);
DeleteMetadataRows(rel);
DeleteColumnarTableOptions(rel->rd_id, true);
MarkRelfilenumberDropped(RelationPhysicalIdentifierNumber_compat(relfilelocator),
@ -2564,8 +2590,13 @@ static const TableAmRoutine columnar_am_methods = {
.relation_estimate_size = columnar_estimate_rel_size,
#if PG_VERSION_NUM < PG_VERSION_18
/* these two fields were removed in PG18 */
.scan_bitmap_next_block = NULL,
.scan_bitmap_next_tuple = NULL,
#endif
.scan_sample_next_block = columnar_scan_sample_next_block,
.scan_sample_next_tuple = columnar_scan_sample_next_tuple
};
@ -2603,7 +2634,7 @@ detoast_values(TupleDesc tupleDesc, Datum *orig_values, bool *isnull)
for (int i = 0; i < tupleDesc->natts; i++)
{
if (!isnull[i] && tupleDesc->attrs[i].attlen == -1 &&
if (!isnull[i] && TupleDescAttr(tupleDesc, i)->attlen == -1 &&
VARATT_IS_EXTENDED(values[i]))
{
/* make a copy */

View File

@ -48,6 +48,12 @@ struct ColumnarWriteState
FmgrInfo **comparisonFunctionArray;
RelFileLocator relfilelocator;
/*
* We can't rely on RelidByRelfilenumber for temp tables since
* PG18(it was backpatched through PG13).
*/
Oid temp_relid;
MemoryContext stripeWriteContext;
MemoryContext perTupleContext;
StripeBuffers *stripeBuffers;
@ -93,10 +99,12 @@ static StringInfo CopyStringInfo(StringInfo sourceString);
* data load operation.
*/
ColumnarWriteState *
ColumnarBeginWrite(RelFileLocator relfilelocator,
ColumnarBeginWrite(Relation rel,
ColumnarOptions options,
TupleDesc tupleDescriptor)
{
RelFileLocator relfilelocator = RelationPhysicalIdentifier_compat(rel);
/* get comparison function pointers for each of the columns */
uint32 columnCount = tupleDescriptor->natts;
FmgrInfo **comparisonFunctionArray = palloc0(columnCount * sizeof(FmgrInfo *));
@ -134,6 +142,7 @@ ColumnarBeginWrite(RelFileLocator relfilelocator,
ColumnarWriteState *writeState = palloc0(sizeof(ColumnarWriteState));
writeState->relfilelocator = relfilelocator;
writeState->temp_relid = RelationPrecomputeOid(rel);
writeState->options = options;
writeState->tupleDescriptor = CreateTupleDescCopy(tupleDescriptor);
writeState->comparisonFunctionArray = comparisonFunctionArray;
@ -183,10 +192,9 @@ ColumnarWriteRow(ColumnarWriteState *writeState, Datum *columnValues, bool *colu
writeState->stripeSkipList = stripeSkipList;
writeState->compressionBuffer = makeStringInfo();
Oid relationId = RelidByRelfilenumber(RelationTablespace_compat(
writeState->relfilelocator),
RelationPhysicalIdentifierNumber_compat(
writeState->relfilelocator));
Oid relationId = ColumnarRelationId(writeState->temp_relid,
writeState->relfilelocator);
Relation relation = relation_open(relationId, NoLock);
writeState->emptyStripeReservation =
ReserveEmptyStripe(relation, columnCount, chunkRowCount,
@ -404,10 +412,9 @@ FlushStripe(ColumnarWriteState *writeState)
elog(DEBUG1, "Flushing Stripe of size %d", stripeBuffers->rowCount);
Oid relationId = RelidByRelfilenumber(RelationTablespace_compat(
writeState->relfilelocator),
RelationPhysicalIdentifierNumber_compat(
writeState->relfilelocator));
Oid relationId = ColumnarRelationId(writeState->temp_relid,
writeState->relfilelocator);
Relation relation = relation_open(relationId, NoLock);
/*
@ -499,10 +506,12 @@ FlushStripe(ColumnarWriteState *writeState)
}
}
SaveChunkGroups(writeState->relfilelocator,
SaveChunkGroups(writeState->temp_relid,
writeState->relfilelocator,
stripeMetadata->id,
writeState->chunkGroupRowCounts);
SaveStripeSkipList(writeState->relfilelocator,
SaveStripeSkipList(writeState->temp_relid,
writeState->relfilelocator,
stripeMetadata->id,
stripeSkipList, tupleDescriptor);

View File

@ -0,0 +1,3 @@
-- citus_columnar--12.2-1--13.2-1.sql
#include "udfs/columnar_finish_pg_upgrade/13.2-1.sql"

View File

@ -0,0 +1,2 @@
-- citus_columnar--13.2-1--14.0-1
-- bump version to 14.0-1

View File

@ -0,0 +1,3 @@
-- citus_columnar--13.2-1--12.2-1.sql
DROP FUNCTION IF EXISTS pg_catalog.columnar_finish_pg_upgrade();

View File

@ -0,0 +1,2 @@
-- citus_columnar--14.0-1--13.2-1
-- downgrade version to 13.2-1

View File

@ -0,0 +1,13 @@
CREATE OR REPLACE FUNCTION pg_catalog.columnar_finish_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
BEGIN
-- set dependencies for columnar table access method
PERFORM columnar_internal.columnar_ensure_am_depends_catalog();
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.columnar_finish_pg_upgrade()
IS 'perform tasks to properly complete a Postgres upgrade for columnar extension';

View File

@ -0,0 +1,13 @@
CREATE OR REPLACE FUNCTION pg_catalog.columnar_finish_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
BEGIN
-- set dependencies for columnar table access method
PERFORM columnar_internal.columnar_ensure_am_depends_catalog();
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.columnar_finish_pg_upgrade()
IS 'perform tasks to properly complete a Postgres upgrade for columnar extension';

View File

@ -191,8 +191,7 @@ columnar_init_write_state(Relation relation, TupleDesc tupdesc,
ReadColumnarOptions(tupSlotRelationId, &columnarOptions);
SubXidWriteState *stackEntry = palloc0(sizeof(SubXidWriteState));
stackEntry->writeState = ColumnarBeginWrite(RelationPhysicalIdentifier_compat(
relation),
stackEntry->writeState = ColumnarBeginWrite(relation,
columnarOptions,
tupdesc);
stackEntry->subXid = currentSubXid;

View File

@ -355,6 +355,15 @@ DEBUG: Total number of commands sent over the session 8: 1 to node localhost:97
(0 rows)
```
### Delaying the Fast Path Plan
As of Citus 13.2, if it can be determined at plan time that a fast path query targets a local shard, a shortcut can be taken that avoids deparsing and then re-parsing/planning the shard query. Citus must be in MX mode and the shard must be local to the Citus node processing the query. If so, the OID of the distributed table is replaced by the OID of the shard in the parse tree. The parse tree is then given to the Postgres planner, which returns a plan that is stored in the distributed plan's task. That plan can be reused by the local executor (described in the next section), avoiding the need to deparse and plan the shard query on each execution.
We call this delayed fast path planning: for a query that is eligible for fast path planning, the call to `FastPathPlanner()` is deferred when the following properties hold:
- The query is a SELECT or UPDATE on a distributed table (schema or column sharded) or Citus managed local table
- The query has no volatile functions
If so, `FastPathRouterQuery()` sets a flag indicating that building the fast path plan should be delayed until after the worker job has been created. At that point the router planner uses `CheckAndBuildDelayedFastPathPlan()` to check that the task's shard placement is local (and not a dummy placement) and that the metadata of the shard table and distributed table are consistent (no DDL in progress on the distributed table). If so, the parse tree, with the OID of the distributed table replaced by the OID of the shard table, is fed to `standard_planner()` and the resulting plan is saved in the task. Otherwise, if the worker job has been marked for deferred pruning, the shard is not local, or the shard is local but it is not safe to swap OIDs, `CheckAndBuildDelayedFastPathPlan()` calls `FastPathPlanner()` to ensure a complete plan context. Reference tables are not currently supported, but this may be relaxed for SELECT statements in the future. Delayed fast path planning can be disabled by turning off `citus.enable_local_fast_path_query_optimization` (it is on by default).
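A minimal sketch of a query shape that qualifies, assuming a hash-distributed `orders_table` whose distribution column is `order_id` (the table, column, and constant are illustrative only, not taken from the regression tests):
```sql
-- Hypothetical setup: orders_table is hash-distributed on order_id and the
-- matching shard lives on the node running the query (MX mode).
-- Both statements filter the distribution column with a constant and use no
-- volatile functions, so they are fast path eligible; with
-- citus.enable_local_fast_path_query_optimization on, the router planner can
-- swap the shard OID into the parse tree and hand it to standard_planner(),
-- skipping per-execution deparse and re-planning of the shard query.
SELECT * FROM orders_table WHERE order_id = 42;
UPDATE orders_table SET status = 'shipped' WHERE order_id = 42;
```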
## Router Planner in Citus
@ -788,14 +797,13 @@ WHERE l.user_id = o.user_id AND o.primary_key = 55;
### Ref table LEFT JOIN distributed table JOINs via recursive planning
### Outer joins between reference and distributed tables
Very much like local-distributed table joins, Citus can't push down queries formatted as:
In general, when the outer side of an outer join is a recurring tuple (e.g., reference table, intermediate results, or set returning functions), it is not safe to push down the join.
```sql
"... ref_table LEFT JOIN distributed_table ..."
"... distributed_table RIGHT JOIN ref_table ..."
```
This is the case when the outer side is a recurring tuple (e.g., reference table, intermediate results, or set returning functions).
In these situations, Citus recursively plans the "distributed" part of the join. Even though it may seem excessive to recursively plan a distributed table, remember that Citus pushes down the filters and projections. Functions involved here include `RequiredAttrNumbersForRelation()` and `ReplaceRTERelationWithRteSubquery()`.
The core function handling this logic is `RecursivelyPlanRecurringTupleOuterJoinWalker()`. There are likely numerous optimizations possible (e.g., first pushing down an inner JOIN then an outer join), but these have not been implemented due to their complexity.
@ -819,6 +827,45 @@ DEBUG: Wrapping relation "orders_table" "o" to a subquery
DEBUG: generating subplan 45_1 for subquery SELECT order_id, status FROM public.orders_table o WHERE true
```
As of Citus 13.2, under certain conditions, Citus can push down these types of LEFT and RIGHT outer joins by injecting constraints—derived from the shard intervals of distributed tables—into shard queries for the reference table. The eligibility rules for pushdown are defined in `CanPushdownRecurringOuterJoin()`, while the logic for computing and injecting the constraints is implemented in `UpdateWhereClauseToPushdownRecurringOuterJoin()`.
#### Example Query
In the example below, Citus pushes down the query by injecting interval constraints on the reference table. The injected constraints are visible in the EXPLAIN output.
```sql
SELECT pc.category_name, count(pt.product_id)
FROM product_categories pc
LEFT JOIN products_table pt ON pc.category_id = pt.product_id
GROUP BY pc.category_name;
```
#### Debug Messages
```
DEBUG: Router planner cannot handle multi-shard select queries
DEBUG: a push down safe left join with recurring left side
```
#### Explain Output
```
HashAggregate
Group Key: remote_scan.category_name
-> Custom Scan (Citus Adaptive)
Task Count: 32
Tasks Shown: One of 32
-> Task
Node: host=localhost port=9701 dbname=ebru
-> HashAggregate
Group Key: pc.category_name
-> Hash Right Join
Hash Cond: (pt.product_id = pc.category_id)
-> Seq Scan on products_table_102072 pt
-> Hash
-> Seq Scan on product_categories_102106 pc
Filter: ((category_id IS NULL) OR ((btint4cmp('-2147483648'::integer, hashint8((category_id)::bigint)) < 0) AND (btint4cmp(hashint8((category_id)::bigint), '-2013265921'::integer) <= 0)))
```
### Recursive Planning When FROM Clause has Reference Table (or Recurring Tuples)
This section discusses a specific scenario in Citus's recursive query planning: handling queries where the main query's `FROM` clause is recurring, but there are subqueries in the `SELECT` or `WHERE` clauses involving distributed tables.
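For illustration, a hedged sketch of that shape, assuming `ref_table` is a reference table and `dist_table` is a distributed table (both names hypothetical): the main query's `FROM` clause is recurring, while the `WHERE` clause contains a subquery over a distributed table, so the subquery must be planned recursively into an intermediate result.
```sql
-- Hypothetical relations: ref_table is a reference table, dist_table is
-- hash-distributed. The outer FROM is recurring (reference table), and the
-- WHERE clause holds a subquery against a distributed table, so Citus plans
-- the subquery recursively and substitutes an intermediate result for it.
SELECT r.key
FROM ref_table r
WHERE r.key IN (SELECT d.key FROM dist_table d);
```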

View File

@ -346,12 +346,12 @@ CdcIsReferenceTableViaCatalog(Oid relationId)
return false;
}
Datum datumArray[Natts_pg_dist_partition];
bool isNullArray[Natts_pg_dist_partition];
Relation pgDistPartition = table_open(DistPartitionRelationId(), AccessShareLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *datumArray = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isNullArray = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(partitionTuple, tupleDescriptor, datumArray, isNullArray);
if (isNullArray[Anum_pg_dist_partition_partmethod - 1] ||
@ -363,6 +363,8 @@ CdcIsReferenceTableViaCatalog(Oid relationId)
*/
heap_freetuple(partitionTuple);
table_close(pgDistPartition, NoLock);
pfree(datumArray);
pfree(isNullArray);
return false;
}
@ -374,6 +376,8 @@ CdcIsReferenceTableViaCatalog(Oid relationId)
heap_freetuple(partitionTuple);
table_close(pgDistPartition, NoLock);
pfree(datumArray);
pfree(isNullArray);
/*
* A table is a reference table when its partition method is 'none'

View File

@ -1,6 +1,6 @@
# Citus extension
comment = 'Citus distributed database'
default_version = '13.1-1'
default_version = '14.0-1'
module_pathname = '$libdir/citus'
relocatable = false
schema = pg_catalog

View File

@ -1927,14 +1927,10 @@ GetNonGeneratedStoredColumnNameList(Oid relationId)
for (int columnIndex = 0; columnIndex < tupleDescriptor->natts; columnIndex++)
{
Form_pg_attribute currentColumn = TupleDescAttr(tupleDescriptor, columnIndex);
if (currentColumn->attisdropped)
{
/* skip dropped columns */
continue;
}
if (currentColumn->attgenerated == ATTRIBUTE_GENERATED_STORED)
if (IsDroppedOrGenerated(currentColumn))
{
/* skip dropped or generated columns */
continue;
}

View File

@ -19,6 +19,7 @@
#include "nodes/parsenodes.h"
#include "tcop/utility.h"
#include "distributed/citus_depended_object.h"
#include "distributed/commands.h"
#include "distributed/commands/utility_hook.h"
#include "distributed/deparser.h"
@ -63,6 +64,13 @@ PostprocessCreateDistributedObjectFromCatalogStmt(Node *stmt, const char *queryS
return NIL;
}
if (ops->qualify && DistOpsValidityState(stmt, ops) ==
ShouldQualifyAfterLocalCreation)
{
/* qualify the statement after local creation */
ops->qualify(stmt);
}
List *addresses = GetObjectAddressListFromParseTree(stmt, false, true);
/* the code-path only supports a single object */

View File

@ -175,8 +175,9 @@ static bool DistributionColumnUsesNumericColumnNegativeScale(TupleDesc relationD
static int numeric_typmod_scale(int32 typmod);
static bool is_valid_numeric_typmod(int32 typmod);
static bool DistributionColumnUsesGeneratedStoredColumn(TupleDesc relationDesc,
Var *distributionColumn);
static void DistributionColumnIsGeneratedCheck(TupleDesc relationDesc,
Var *distributionColumn,
const char *relationName);
static bool CanUseExclusiveConnections(Oid relationId, bool localTableEmpty);
static uint64 DoCopyFromLocalTableIntoShards(Relation distributedRelation,
DestReceiver *copyDest,
@ -2103,13 +2104,10 @@ EnsureRelationCanBeDistributed(Oid relationId, Var *distributionColumn,
/* verify target relation is not distributed by a generated stored column
*/
if (distributionMethod != DISTRIBUTE_BY_NONE &&
DistributionColumnUsesGeneratedStoredColumn(relationDesc, distributionColumn))
if (distributionMethod != DISTRIBUTE_BY_NONE)
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot distribute relation: %s", relationName),
errdetail("Distribution column must not use GENERATED ALWAYS "
"AS (...) STORED.")));
DistributionColumnIsGeneratedCheck(relationDesc, distributionColumn,
relationName);
}
/* verify target relation is not distributed by a column of type numeric with negative scale */
@ -2829,9 +2827,7 @@ TupleDescColumnNameList(TupleDesc tupleDescriptor)
Form_pg_attribute currentColumn = TupleDescAttr(tupleDescriptor, columnIndex);
char *columnName = NameStr(currentColumn->attname);
if (currentColumn->attisdropped ||
currentColumn->attgenerated == ATTRIBUTE_GENERATED_STORED
)
if (IsDroppedOrGenerated(currentColumn))
{
continue;
}
@ -2893,22 +2889,43 @@ DistributionColumnUsesNumericColumnNegativeScale(TupleDesc relationDesc,
/*
* DistributionColumnUsesGeneratedStoredColumn returns whether a given relation uses
* GENERATED ALWAYS AS (...) STORED on distribution column
* DistributionColumnIsGeneratedCheck throws an error if a given relation uses
* GENERATED ALWAYS AS (...) STORED | VIRTUAL on distribution column
*/
static bool
DistributionColumnUsesGeneratedStoredColumn(TupleDesc relationDesc,
Var *distributionColumn)
static void
DistributionColumnIsGeneratedCheck(TupleDesc relationDesc,
Var *distributionColumn,
const char *relationName)
{
Form_pg_attribute attributeForm = TupleDescAttr(relationDesc,
distributionColumn->varattno - 1);
if (attributeForm->attgenerated == ATTRIBUTE_GENERATED_STORED)
switch (attributeForm->attgenerated)
{
return true;
}
case ATTRIBUTE_GENERATED_STORED:
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot distribute relation: %s", relationName),
errdetail("Distribution column must not use GENERATED ALWAYS "
"AS (...) STORED.")));
break;
}
return false;
#if PG_VERSION_NUM >= PG_VERSION_18
case ATTRIBUTE_GENERATED_VIRTUAL:
{
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot distribute relation: %s", relationName),
errdetail("Distribution column must not use GENERATED ALWAYS "
"AS (...) VIRTUAL.")));
break;
}
#endif
default:
{
break;
}
}
}

View File

@ -25,6 +25,12 @@
#include "utils/lsyscache.h"
#include "utils/syscache.h"
#include "pg_version_constants.h"
#if PG_VERSION_NUM < PG_VERSION_17
#include "catalog/pg_am_d.h"
#endif
#include "citus_version.h"
#include "columnar/columnar.h"
@ -52,6 +58,10 @@ static void MarkExistingObjectDependenciesDistributedIfSupported(void);
static List * GetAllViews(void);
static bool ShouldPropagateExtensionCommand(Node *parseTree);
static bool IsAlterExtensionSetSchemaCitus(Node *parseTree);
static bool HasAnyRelationsUsingOldColumnar(void);
static Oid GetOldColumnarAMIdIfExists(void);
static bool AccessMethodDependsOnAnyExtensions(Oid accessMethodId);
static bool HasAnyRelationsUsingAccessMethod(Oid accessMethodId);
static Node * RecreateExtensionStmt(Oid extensionOid);
static List * GenerateGrantCommandsOnExtensionDependentFDWs(Oid extensionId);
@ -783,7 +793,8 @@ PreprocessCreateExtensionStmtForCitusColumnar(Node *parsetree)
/*citus version >= 11.1 requires install citus_columnar first*/
if (versionNumber >= 1110 && !CitusHasBeenLoaded())
{
if (get_extension_oid("citus_columnar", true) == InvalidOid)
if (get_extension_oid("citus_columnar", true) == InvalidOid &&
(versionNumber < 1320 || HasAnyRelationsUsingOldColumnar()))
{
CreateExtensionWithVersion("citus_columnar", NULL);
}
@ -894,9 +905,10 @@ PreprocessAlterExtensionCitusStmtForCitusColumnar(Node *parseTree)
double newVersionNumber = GetExtensionVersionNumber(pstrdup(newVersion));
/*alter extension citus update to version >= 11.1-1, and no citus_columnar installed */
if (newVersionNumber >= 1110 && citusColumnarOid == InvalidOid)
if (newVersionNumber >= 1110 && citusColumnarOid == InvalidOid &&
(newVersionNumber < 1320 || HasAnyRelationsUsingOldColumnar()))
{
/*it's upgrade citus to 11.1-1 or further version */
/*it's upgrade citus to 11.1-1 or further version and there are relations using old columnar */
CreateExtensionWithVersion("citus_columnar", CITUS_COLUMNAR_INTERNAL_VERSION);
}
else if (newVersionNumber < 1110 && citusColumnarOid != InvalidOid)
@ -911,7 +923,8 @@ PreprocessAlterExtensionCitusStmtForCitusColumnar(Node *parseTree)
int versionNumber = (int) (100 * strtod(CITUS_MAJORVERSION, NULL));
if (versionNumber >= 1110)
{
if (citusColumnarOid == InvalidOid)
if (citusColumnarOid == InvalidOid &&
(versionNumber < 1320 || HasAnyRelationsUsingOldColumnar()))
{
CreateExtensionWithVersion("citus_columnar",
CITUS_COLUMNAR_INTERNAL_VERSION);
@ -921,6 +934,117 @@ PreprocessAlterExtensionCitusStmtForCitusColumnar(Node *parseTree)
}
/*
* HasAnyRelationsUsingOldColumnar returns true if there are any relations
* using the old columnar access method.
*/
static bool
HasAnyRelationsUsingOldColumnar(void)
{
Oid oldColumnarAMId = GetOldColumnarAMIdIfExists();
return OidIsValid(oldColumnarAMId) &&
HasAnyRelationsUsingAccessMethod(oldColumnarAMId);
}
/*
* GetOldColumnarAMIdIfExists returns the oid of the old columnar access
* method, i.e., the columnar access method that we had as part of "citus"
* extension before we split it into "citus_columnar" at version 11.1, if
* it exists. Otherwise, it returns InvalidOid.
*
* We know that it's "old columnar" only if the access method doesn't depend
* on any extensions. This is because, in citus--11.0-4--11.1-1.sql, we
* detach the columnar objects (including the access method) from citus
* in preparation for splitting of the columnar into a separate extension.
*/
static Oid
GetOldColumnarAMIdIfExists(void)
{
Oid columnarAMId = get_am_oid("columnar", true);
if (OidIsValid(columnarAMId) && !AccessMethodDependsOnAnyExtensions(columnarAMId))
{
return columnarAMId;
}
return InvalidOid;
}
/*
* AccessMethodDependsOnAnyExtensions returns true if the access method
* with the given accessMethodId depends on any extensions.
*/
static bool
AccessMethodDependsOnAnyExtensions(Oid accessMethodId)
{
ScanKeyData key[3];
Relation pgDepend = table_open(DependRelationId, AccessShareLock);
ScanKeyInit(&key[0],
Anum_pg_depend_classid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(AccessMethodRelationId));
ScanKeyInit(&key[1],
Anum_pg_depend_objid,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(accessMethodId));
ScanKeyInit(&key[2],
Anum_pg_depend_objsubid,
BTEqualStrategyNumber, F_INT4EQ,
Int32GetDatum(0));
SysScanDesc scan = systable_beginscan(pgDepend, DependDependerIndexId, true,
NULL, 3, key);
bool result = false;
HeapTuple heapTuple = NULL;
while (HeapTupleIsValid(heapTuple = systable_getnext(scan)))
{
Form_pg_depend dependForm = (Form_pg_depend) GETSTRUCT(heapTuple);
if (dependForm->refclassid == ExtensionRelationId)
{
result = true;
break;
}
}
systable_endscan(scan);
table_close(pgDepend, AccessShareLock);
return result;
}
/*
* HasAnyRelationsUsingAccessMethod returns true if there are any relations
* using the access method with the given accessMethodId.
*/
static bool
HasAnyRelationsUsingAccessMethod(Oid accessMethodId)
{
ScanKeyData key[1];
Relation pgClass = table_open(RelationRelationId, AccessShareLock);
ScanKeyInit(&key[0],
Anum_pg_class_relam,
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(accessMethodId));
SysScanDesc scan = systable_beginscan(pgClass, InvalidOid, false, NULL, 1, key);
bool result = HeapTupleIsValid(systable_getnext(scan));
systable_endscan(scan);
table_close(pgClass, AccessShareLock);
return result;
}
/*
* PostprocessAlterExtensionCitusStmtForCitusColumnar process the case when upgrade citus
* to version that support citus_columnar, or downgrade citus to lower version that
@ -959,7 +1083,7 @@ PostprocessAlterExtensionCitusStmtForCitusColumnar(Node *parseTree)
{
/*alter extension citus update, need upgrade citus_columnar from Y to Z*/
int versionNumber = (int) (100 * strtod(CITUS_MAJORVERSION, NULL));
if (versionNumber >= 1110)
if (versionNumber >= 1110 && citusColumnarOid != InvalidOid)
{
char *curColumnarVersion = get_extension_version(citusColumnarOid);
if (strcmp(curColumnarVersion, CITUS_COLUMNAR_INTERNAL_VERSION) == 0)

View File

@ -769,13 +769,16 @@ UpdateFunctionDistributionInfo(const ObjectAddress *distAddress,
const bool indexOK = true;
ScanKeyData scanKey[3];
Datum values[Natts_pg_dist_object];
bool isnull[Natts_pg_dist_object];
bool replace[Natts_pg_dist_object];
Relation pgDistObjectRel = table_open(DistObjectRelationId(), RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistObjectRel);
Datum *values = palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = palloc0(tupleDescriptor->natts * sizeof(bool));
bool *replace = palloc0(tupleDescriptor->natts * sizeof(bool));
int forseDelegationIndex = GetForceDelegationAttrIndexInPgDistObject(tupleDescriptor);
/* scan pg_dist_object for classid = $1 AND objid = $2 AND objsubid = $3 via index */
ScanKeyInit(&scanKey[0], Anum_pg_dist_object_classid, BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(distAddress->classId));
@ -797,12 +800,7 @@ UpdateFunctionDistributionInfo(const ObjectAddress *distAddress,
distAddress->objectId, distAddress->objectSubId)));
}
memset(values, 0, sizeof(values));
memset(isnull, 0, sizeof(isnull));
memset(replace, 0, sizeof(replace));
replace[Anum_pg_dist_object_distribution_argument_index - 1] = true;
if (distribution_argument_index != NULL)
{
values[Anum_pg_dist_object_distribution_argument_index - 1] = Int32GetDatum(
@ -825,16 +823,15 @@ UpdateFunctionDistributionInfo(const ObjectAddress *distAddress,
isnull[Anum_pg_dist_object_colocationid - 1] = true;
}
replace[Anum_pg_dist_object_force_delegation - 1] = true;
replace[forseDelegationIndex] = true;
if (forceDelegation != NULL)
{
values[Anum_pg_dist_object_force_delegation - 1] = BoolGetDatum(
*forceDelegation);
isnull[Anum_pg_dist_object_force_delegation - 1] = false;
values[forseDelegationIndex] = BoolGetDatum(*forceDelegation);
isnull[forseDelegationIndex] = false;
}
else
{
isnull[Anum_pg_dist_object_force_delegation - 1] = true;
isnull[forseDelegationIndex] = true;
}
heapTuple = heap_modify_tuple(heapTuple, tupleDescriptor, values, isnull, replace);
@ -849,6 +846,10 @@ UpdateFunctionDistributionInfo(const ObjectAddress *distAddress,
table_close(pgDistObjectRel, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
if (EnableMetadataSync)
{
List *objectAddressList = list_make1((ObjectAddress *) distAddress);

View File

@ -854,8 +854,11 @@ PostprocessIndexStmt(Node *node, const char *queryString)
table_close(relation, NoLock);
index_close(indexRelation, NoLock);
PushActiveSnapshot(GetTransactionSnapshot());
/* mark index as invalid, in-place (cannot be rolled back) */
index_set_state_flags(indexRelationId, INDEX_DROP_CLEAR_VALID);
PopActiveSnapshot();
/* re-open a transaction command from here on out */
CommitTransactionCommand();
@ -1370,8 +1373,11 @@ MarkIndexValid(IndexStmt *indexStmt)
schemaId);
Relation indexRelation = index_open(indexRelationId, RowExclusiveLock);
PushActiveSnapshot(GetTransactionSnapshot());
/* mark index as valid, in-place (cannot be rolled back) */
index_set_state_flags(indexRelationId, INDEX_CREATE_SET_VALID);
PopActiveSnapshot();
table_close(relation, NoLock);
index_close(indexRelation, NoLock);

View File

@ -350,7 +350,6 @@ static void LogLocalCopyToRelationExecution(uint64 shardId);
static void LogLocalCopyToFileExecution(uint64 shardId);
static void ErrorIfMergeInCopy(CopyStmt *copyStatement);
/* exports for SQL callable functions */
PG_FUNCTION_INFO_V1(citus_text_send_as_jsonb);
@ -484,9 +483,7 @@ CopyToExistingShards(CopyStmt *copyStatement, QueryCompletion *completionTag)
Form_pg_attribute currentColumn = TupleDescAttr(tupleDescriptor, columnIndex);
char *columnName = NameStr(currentColumn->attname);
if (currentColumn->attisdropped ||
currentColumn->attgenerated == ATTRIBUTE_GENERATED_STORED
)
if (IsDroppedOrGenerated(currentColumn))
{
continue;
}
@ -804,9 +801,7 @@ CanUseBinaryCopyFormat(TupleDesc tupleDescription)
{
Form_pg_attribute currentColumn = TupleDescAttr(tupleDescription, columnIndex);
if (currentColumn->attisdropped ||
currentColumn->attgenerated == ATTRIBUTE_GENERATED_STORED
)
if (IsDroppedOrGenerated(currentColumn))
{
continue;
}
@ -1316,9 +1311,7 @@ TypeArrayFromTupleDescriptor(TupleDesc tupleDescriptor)
for (int columnIndex = 0; columnIndex < columnCount; columnIndex++)
{
Form_pg_attribute attr = TupleDescAttr(tupleDescriptor, columnIndex);
if (attr->attisdropped ||
attr->attgenerated == ATTRIBUTE_GENERATED_STORED
)
if (IsDroppedOrGenerated(attr))
{
typeArray[columnIndex] = InvalidOid;
}
@ -1486,9 +1479,7 @@ AppendCopyRowData(Datum *valueArray, bool *isNullArray, TupleDesc rowDescriptor,
value = CoerceColumnValue(value, &columnCoercionPaths[columnIndex]);
}
if (currentColumn->attisdropped ||
currentColumn->attgenerated == ATTRIBUTE_GENERATED_STORED
)
if (IsDroppedOrGenerated(currentColumn))
{
continue;
}
@ -1607,9 +1598,7 @@ AvailableColumnCount(TupleDesc tupleDescriptor)
{
Form_pg_attribute currentColumn = TupleDescAttr(tupleDescriptor, columnIndex);
if (!currentColumn->attisdropped &&
currentColumn->attgenerated != ATTRIBUTE_GENERATED_STORED
)
if (!IsDroppedOrGenerated(currentColumn))
{
columnCount++;
}
@ -3049,7 +3038,7 @@ CitusCopySelect(CopyStmt *copyStatement)
for (int i = 0; i < tupleDescriptor->natts; i++)
{
Form_pg_attribute attr = &tupleDescriptor->attrs[i];
Form_pg_attribute attr = TupleDescAttr(tupleDescriptor, i);
if (attr->attisdropped ||
attr->attgenerated
@ -3999,3 +3988,20 @@ UnclaimCopyConnections(List *connectionStateList)
UnclaimConnection(connectionState->connection);
}
}
/*
* IsDroppedOrGenerated - helper function for determining if an attribute is
* dropped or generated. Used by COPY and Citus DDL to skip such columns.
*/
inline bool
IsDroppedOrGenerated(Form_pg_attribute attr)
{
/*
* If the "is dropped" flag is true or the generated column flag
* is not the default nul character (in which case its value is 's'
* for ATTRIBUTE_GENERATED_STORED or possibly 'v' with PG18+ for
* ATTRIBUTE_GENERATED_VIRTUAL) then return true.
*/
return attr->attisdropped || (attr->attgenerated != '\0');
}

View File

@ -196,6 +196,27 @@ BuildCreatePublicationStmt(Oid publicationId)
-1);
createPubStmt->options = lappend(createPubStmt->options, pubViaRootOption);
/* WITH (publish_generated_columns = ...) option (PG18+) */
#if PG_VERSION_NUM >= PG_VERSION_18
if (publicationForm->pubgencols == 's') /* stored */
{
DefElem *pubGenColsOption =
makeDefElem("publish_generated_columns",
(Node *) makeString("stored"),
-1);
createPubStmt->options =
lappend(createPubStmt->options, pubGenColsOption);
}
else if (publicationForm->pubgencols != 'n') /* 'n' = none (default) */
{
ereport(ERROR,
(errmsg("unexpected pubgencols value '%c' for publication %u",
publicationForm->pubgencols, publicationId)));
}
#endif
/* WITH (publish = 'insert, update, delete, truncate') option */
List *publishList = NIL;

View File

@ -177,8 +177,7 @@ ExtractDefaultColumnsAndOwnedSequences(Oid relationId, List **columnNameList,
{
Form_pg_attribute attributeForm = TupleDescAttr(tupleDescriptor, attributeIndex);
if (attributeForm->attisdropped ||
attributeForm->attgenerated == ATTRIBUTE_GENERATED_STORED)
if (IsDroppedOrGenerated(attributeForm))
{
/* skip dropped columns and columns with GENERATED AS ALWAYS expressions */
continue;

View File

@ -69,7 +69,15 @@ PreprocessCreateStatisticsStmt(Node *node, const char *queryString,
{
CreateStatsStmt *stmt = castNode(CreateStatsStmt, node);
RangeVar *relation = (RangeVar *) linitial(stmt->relations);
Node *relationNode = (Node *) linitial(stmt->relations);
if (!IsA(relationNode, RangeVar))
{
return NIL;
}
RangeVar *relation = (RangeVar *) relationNode;
Oid relationId = RangeVarGetRelid(relation, ShareUpdateExclusiveLock, false);
if (!IsCitusTable(relationId) || !ShouldPropagate())

View File

@ -48,21 +48,27 @@ typedef struct CitusVacuumParams
#endif
} CitusVacuumParams;
/*
* Information we track per VACUUM/ANALYZE target relation.
*/
typedef struct CitusVacuumRelation
{
VacuumRelation *vacuumRelation;
Oid relationId;
} CitusVacuumRelation;
/* Local functions forward declarations for processing distributed table commands */
static bool IsDistributedVacuumStmt(List *vacuumRelationIdList);
static bool IsDistributedVacuumStmt(List *vacuumRelationList);
static List * VacuumTaskList(Oid relationId, CitusVacuumParams vacuumParams,
List *vacuumColumnList);
static char * DeparseVacuumStmtPrefix(CitusVacuumParams vacuumParams);
static char * DeparseVacuumColumnNames(List *columnNameList);
static List * VacuumColumnList(VacuumStmt *vacuumStmt, int relationIndex);
static List * ExtractVacuumTargetRels(VacuumStmt *vacuumStmt);
static void ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt, List *relationIdList,
static void ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt, List *relationList,
CitusVacuumParams vacuumParams);
static void ExecuteUnqualifiedVacuumTasks(VacuumStmt *vacuumStmt,
CitusVacuumParams vacuumParams);
static CitusVacuumParams VacuumStmtParams(VacuumStmt *vacstmt);
static List * VacuumRelationIdList(VacuumStmt *vacuumStmt, CitusVacuumParams
vacuumParams);
static List * VacuumRelationList(VacuumStmt *vacuumStmt, CitusVacuumParams vacuumParams);
/*
* PostprocessVacuumStmt processes vacuum statements that may need propagation to
@ -97,7 +103,7 @@ PostprocessVacuumStmt(Node *node, const char *vacuumCommand)
* when no table is specified propagate the command as it is;
* otherwise, only propagate when there is at least 1 citus table
*/
List *relationIdList = VacuumRelationIdList(vacuumStmt, vacuumParams);
List *vacuumRelationList = VacuumRelationList(vacuumStmt, vacuumParams);
if (list_length(vacuumStmt->rels) == 0)
{
@ -105,11 +111,11 @@ PostprocessVacuumStmt(Node *node, const char *vacuumCommand)
ExecuteUnqualifiedVacuumTasks(vacuumStmt, vacuumParams);
}
else if (IsDistributedVacuumStmt(relationIdList))
else if (IsDistributedVacuumStmt(vacuumRelationList))
{
/* there is at least 1 citus table specified */
ExecuteVacuumOnDistributedTables(vacuumStmt, relationIdList,
ExecuteVacuumOnDistributedTables(vacuumStmt, vacuumRelationList,
vacuumParams);
}
@ -120,39 +126,58 @@ PostprocessVacuumStmt(Node *node, const char *vacuumCommand)
/*
* VacuumRelationIdList returns the oid of the relations in the given vacuum statement.
* VacuumRelationList returns the list of relations in the given vacuum statement,
* along with their resolved Oids (if they can be locked).
*/
static List *
VacuumRelationIdList(VacuumStmt *vacuumStmt, CitusVacuumParams vacuumParams)
VacuumRelationList(VacuumStmt *vacuumStmt, CitusVacuumParams vacuumParams)
{
LOCKMODE lockMode = (vacuumParams.options & VACOPT_FULL) ? AccessExclusiveLock :
ShareUpdateExclusiveLock;
bool skipLocked = (vacuumParams.options & VACOPT_SKIP_LOCKED);
List *vacuumRelationList = ExtractVacuumTargetRels(vacuumStmt);
List *relationList = NIL;
List *relationIdList = NIL;
RangeVar *vacuumRelation = NULL;
foreach_declared_ptr(vacuumRelation, vacuumRelationList)
VacuumRelation *vacuumRelation = NULL;
foreach_declared_ptr(vacuumRelation, vacuumStmt->rels)
{
Oid relationId = InvalidOid;
/*
* If skip_locked option is enabled, we are skipping that relation
* if the lock for it is currently not available; else, we get the lock.
* if the lock for it is currently not available; otherwise, we get the lock.
*/
Oid relationId = RangeVarGetRelidExtended(vacuumRelation,
if (vacuumRelation->relation)
{
relationId = RangeVarGetRelidExtended(vacuumRelation->relation,
lockMode,
skipLocked ? RVR_SKIP_LOCKED : 0, NULL,
NULL);
}
else if (OidIsValid(vacuumRelation->oid))
{
/* fall back to the Oid directly when provided */
if (!skipLocked || ConditionalLockRelationOid(vacuumRelation->oid, lockMode))
{
if (!skipLocked)
{
LockRelationOid(vacuumRelation->oid, lockMode);
}
relationId = vacuumRelation->oid;
}
}
if (OidIsValid(relationId))
{
relationIdList = lappend_oid(relationIdList, relationId);
CitusVacuumRelation *relation = palloc(sizeof(CitusVacuumRelation));
relation->vacuumRelation = vacuumRelation;
relation->relationId = relationId;
relationList = lappend(relationList, relation);
}
}
return relationIdList;
return relationList;
}
@ -161,12 +186,13 @@ VacuumRelationIdList(VacuumStmt *vacuumStmt, CitusVacuumParams vacuumParams)
* otherwise, it returns false.
*/
static bool
IsDistributedVacuumStmt(List *vacuumRelationIdList)
IsDistributedVacuumStmt(List *vacuumRelationList)
{
Oid relationId = InvalidOid;
foreach_declared_oid(relationId, vacuumRelationIdList)
CitusVacuumRelation *vacuumRelation = NULL;
foreach_declared_ptr(vacuumRelation, vacuumRelationList)
{
if (OidIsValid(relationId) && IsCitusTable(relationId))
if (OidIsValid(vacuumRelation->relationId) &&
IsCitusTable(vacuumRelation->relationId))
{
return true;
}
@ -181,24 +207,31 @@ IsDistributedVacuumStmt(List *vacuumRelationIdList)
* if they are citus tables.
*/
static void
ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt, List *relationIdList,
ExecuteVacuumOnDistributedTables(VacuumStmt *vacuumStmt, List *relationList,
CitusVacuumParams vacuumParams)
{
int relationIndex = 0;
Oid relationId = InvalidOid;
foreach_declared_oid(relationId, relationIdList)
CitusVacuumRelation *vacuumRelationEntry = NULL;
foreach_declared_ptr(vacuumRelationEntry, relationList)
{
Oid relationId = vacuumRelationEntry->relationId;
VacuumRelation *vacuumRelation = vacuumRelationEntry->vacuumRelation;
RangeVar *relation = vacuumRelation->relation;
if (relation != NULL && !relation->inh)
{
/* ONLY specified, so don't recurse to shard placements */
continue;
}
if (IsCitusTable(relationId))
{
List *vacuumColumnList = VacuumColumnList(vacuumStmt, relationIndex);
List *vacuumColumnList = vacuumRelation->va_cols;
List *taskList = VacuumTaskList(relationId, vacuumParams, vacuumColumnList);
/* local execution is not implemented for VACUUM commands */
bool localExecutionSupported = false;
ExecuteUtilityTaskList(taskList, localExecutionSupported);
}
relationIndex++;
}
}
@ -484,39 +517,6 @@ DeparseVacuumColumnNames(List *columnNameList)
}
/*
* VacuumColumnList returns list of columns from relation
* in the vacuum statement at specified relationIndex.
*/
static List *
VacuumColumnList(VacuumStmt *vacuumStmt, int relationIndex)
{
VacuumRelation *vacuumRelation = (VacuumRelation *) list_nth(vacuumStmt->rels,
relationIndex);
return vacuumRelation->va_cols;
}
/*
* ExtractVacuumTargetRels returns list of target
* relations from vacuum statement.
*/
static List *
ExtractVacuumTargetRels(VacuumStmt *vacuumStmt)
{
List *vacuumList = NIL;
VacuumRelation *vacuumRelation = NULL;
foreach_declared_ptr(vacuumRelation, vacuumStmt->rels)
{
vacuumList = lappend(vacuumList, vacuumRelation->relation);
}
return vacuumList;
}
/*
* VacuumStmtParams returns a CitusVacuumParams based on the supplied VacuumStmt.
*/

View File

@ -14,6 +14,7 @@
#include "miscadmin.h"
#include "pgstat.h"
#include "catalog/pg_collation.h"
#include "lib/stringinfo.h"
#include "storage/latch.h"
#include "utils/builtins.h"
@ -371,8 +372,9 @@ CommandMatchesLogGrepPattern(const char *command)
if (GrepRemoteCommands && strnlen(GrepRemoteCommands, NAMEDATALEN) > 0)
{
Datum boolDatum =
DirectFunctionCall2(textlike, CStringGetTextDatum(command),
CStringGetTextDatum(GrepRemoteCommands));
DirectFunctionCall2Coll(textlike, DEFAULT_COLLATION_OID,
CStringGetTextDatum(command),
CStringGetTextDatum(GrepRemoteCommands));
return DatumGetBool(boolDatum);
}

View File

@ -11,6 +11,7 @@
#include "postgres.h"
#include "utils/elog.h"
#include "utils/memutils.h" /* for TopTransactionContext */
#include "distributed/connection_management.h"
#include "distributed/error_codes.h"

View File

@ -82,6 +82,7 @@ static void AppendStorageParametersToString(StringInfo stringBuffer,
List *optionList);
static const char * convert_aclright_to_string(int aclright);
static void simple_quote_literal(StringInfo buf, const char *val);
static SubscriptingRef * TargetEntryExprFindSubsRef(Expr *expr);
static void AddVacuumParams(ReindexStmt *reindexStmt, StringInfo buffer);
static void process_acl_items(Acl *acl, const char *relationName,
const char *attributeName, List **defs);
@ -470,6 +471,13 @@ pg_get_tableschemadef_string(Oid tableRelationId, IncludeSequenceDefaults
appendStringInfo(&buffer, " GENERATED ALWAYS AS (%s) STORED",
defaultString);
}
#if PG_VERSION_NUM >= PG_VERSION_18
else if (attributeForm->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL)
{
appendStringInfo(&buffer, " GENERATED ALWAYS AS (%s) VIRTUAL",
defaultString);
}
#endif
else
{
Oid seqOid = GetSequenceOid(tableRelationId, defaultValue->adnum);
@ -546,6 +554,13 @@ pg_get_tableschemadef_string(Oid tableRelationId, IncludeSequenceDefaults
appendStringInfoString(&buffer, "(");
appendStringInfoString(&buffer, checkString);
appendStringInfoString(&buffer, ")");
#if PG_VERSION_NUM >= PG_VERSION_18
if (!checkConstraint->ccenforced)
{
appendStringInfoString(&buffer, " NOT ENFORCED");
}
#endif
}
/* close create table's outer parentheses */
@ -1715,3 +1730,317 @@ RoleSpecString(RoleSpec *spec, bool withQuoteIdentifier)
}
}
}
/*
* Recursively search an expression for a Param and return its paramid
* Intended for indirection management: UPDATE SET () = (SELECT )
* Does not cover all cases, only those supported by Citus.
*/
static int
GetParamId(Node *expr)
{
int paramid = 0;
if (expr == NULL)
{
return paramid;
}
/* If it's a Param, return its paramid */
if (IsA(expr, Param))
{
Param *param = (Param *) expr;
paramid = param->paramid;
}
/* If it's a FuncExpr, search in arguments */
else if (IsA(expr, FuncExpr))
{
FuncExpr *func = (FuncExpr *) expr;
ListCell *lc;
foreach(lc, func->args)
{
paramid = GetParamId((Node *) lfirst(lc));
if (paramid != 0)
{
break; /* Stop at the first valid paramid */
}
}
}
return paramid;
}
/*
* list_sort comparator to sort target list by paramid (in MULTIEXPR)
* Intended for indirection management: UPDATE SET () = (SELECT )
*/
static int
target_list_cmp(const ListCell *a, const ListCell *b)
{
TargetEntry *tleA = lfirst(a);
TargetEntry *tleB = lfirst(b);
/*
* Deal with resjunk entries: sublinks are marked resjunk and are
* placed at the end of the target list, so this logic keeps them
* grouped there.
*/
if (tleA->resjunk || tleB->resjunk)
{
return tleA->resjunk - tleB->resjunk;
}
int la = GetParamId((Node *) tleA->expr);
int lb = GetParamId((Node *) tleB->expr);
/*
* Should be looking at legitimate param ids
*/
Assert(la > 0);
Assert(lb > 0);
/*
* Return -1, 0 or 1 depending on whether la is less than,
* equal to, or greater than lb
*/
return (la > lb) - (la < lb);
}
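The `(la > lb) - (la < lb)` return above is the standard overflow-safe three-way comparison: returning `la - lb` instead could overflow for extreme values. A standalone sketch of the same idiom in plain C (no PostgreSQL dependencies, names are illustrative only):

```c
#include <stdio.h>
#include <stdlib.h>

/* Three-way compare without overflow: (a > b) - (a < b) yields -1, 0 or 1. */
static int
int_cmp(const void *a, const void *b)
{
	int la = *(const int *) a;
	int lb = *(const int *) b;

	return (la > lb) - (la < lb);
}

int
main(void)
{
	int paramIds[] = { 3, 1, 2 };

	qsort(paramIds, 3, sizeof(paramIds[0]), int_cmp);
	printf("%d %d %d\n", paramIds[0], paramIds[1], paramIds[2]);	/* prints 1 2 3 */
	return 0;
}
```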
/*
* Used by get_update_query_targetlist_def() (in ruleutils) to reorder the target
* list on the left side of the update:
* SET () = (SELECT )
* Reordering only the SELECT side does not work; consider a case like:
* SET (col_1, col_3) = (SELECT 1, 3), (col_2) = (SELECT 2)
* Without ensure_update_targetlist_in_param_order(), this will lead to an incorrect
* deparsed query:
* SET (col_1, col_2) = (SELECT 1, 3), (col_3) = (SELECT 2)
*/
void
ensure_update_targetlist_in_param_order(List *targetList)
{
bool need_to_sort_target_list = false;
int previous_paramid = 0;
ListCell *l;
foreach(l, targetList)
{
TargetEntry *tle = (TargetEntry *) lfirst(l);
if (!tle->resjunk)
{
int paramid = GetParamId((Node *) tle->expr);
if (paramid < previous_paramid)
{
need_to_sort_target_list = true;
break;
}
previous_paramid = paramid;
}
}
if (need_to_sort_target_list)
{
list_sort(targetList, target_list_cmp);
}
}
/*
* isSubsRef checks whether a given node is a SubscriptingRef, or whether
* one can be reached through an implicit coercion.
*/
static
bool
isSubsRef(Node *node)
{
if (node == NULL)
{
return false;
}
if (IsA(node, CoerceToDomain))
{
CoerceToDomain *coerceToDomain = (CoerceToDomain *) node;
if (coerceToDomain->coercionformat != COERCE_IMPLICIT_CAST)
{
/* not an implicit coercion, cannot reach to a SubscriptingRef */
return false;
}
node = (Node *) coerceToDomain->arg;
}
return (IsA(node, SubscriptingRef));
}
/*
* checkTlistForSubsRef checks whether any target entry in the list contains a
* SubscriptingRef, directly or behind an implicit coercion. Used by
* ExpandMergedSubscriptingRefEntries() to identify whether any target entries
* need to be expanded; if not, the original target list is preserved.
*/
static
bool
checkTlistForSubsRef(List *targetEntryList)
{
ListCell *tgtCell = NULL;
foreach(tgtCell, targetEntryList)
{
TargetEntry *targetEntry = (TargetEntry *) lfirst(tgtCell);
Expr *expr = targetEntry->expr;
if (isSubsRef((Node *) expr))
{
return true;
}
}
return false;
}
/*
* ExpandMergedSubscriptingRefEntries takes a list of target entries and expands
* each one that references a SubscriptingRef node that indicates multiple (field)
* updates on the same attribute, which is applicable for array/json types atm.
*/
List *
ExpandMergedSubscriptingRefEntries(List *targetEntryList)
{
List *newTargetEntryList = NIL;
ListCell *tgtCell = NULL;
if (!checkTlistForSubsRef(targetEntryList))
{
/* No subscripting refs found, return original list */
return targetEntryList;
}
foreach(tgtCell, targetEntryList)
{
TargetEntry *targetEntry = (TargetEntry *) lfirst(tgtCell);
List *expandedTargetEntries = NIL;
Expr *expr = targetEntry->expr;
while (expr)
{
SubscriptingRef *subsRef = TargetEntryExprFindSubsRef(expr);
if (!subsRef)
{
break;
}
/*
* Remove refexpr from the SubscriptingRef that we are about to
* wrap in a new TargetEntry and save it for the next one.
*/
Expr *refexpr = subsRef->refexpr;
subsRef->refexpr = NULL;
/*
* Wrap the Expr that holds SubscriptingRef (directly or indirectly)
* in a new TargetEntry; note that it doesn't have a refexpr anymore.
*/
TargetEntry *newTargetEntry = copyObject(targetEntry);
newTargetEntry->expr = expr;
expandedTargetEntries = lappend(expandedTargetEntries, newTargetEntry);
/* now inspect the refexpr that the SubscriptingRef at hand was holding */
expr = refexpr;
}
if (expandedTargetEntries == NIL)
{
/* return original entry since it doesn't hold a SubscriptingRef node */
newTargetEntryList = lappend(newTargetEntryList, targetEntry);
}
else
{
/*
* Need to concat expanded target list entries in reverse order
* to preserve ordering of the original target entry list.
*/
List *reversedTgtEntries = NIL;
ListCell *revCell = NULL;
foreach(revCell, expandedTargetEntries)
{
TargetEntry *tgtEntry = (TargetEntry *) lfirst(revCell);
reversedTgtEntries = lcons(tgtEntry, reversedTgtEntries);
}
newTargetEntryList = list_concat(newTargetEntryList, reversedTgtEntries);
}
}
return newTargetEntryList;
}
/*
* TargetEntryExprFindSubsRef searches the given Expr (assuming that it is part
* of a target list entry) to see whether it holds a SubscriptingRef node
* directly (i.e., itself) or indirectly (e.g., behind some level of coercions).
*
* Returns the original SubscriptingRef node on success or NULL otherwise.
*
* Note that it wouldn't add much value to use expression_tree_walker here
* since we are only interested in a subset of the fields of a few certain
* node types.
*/
static SubscriptingRef *
TargetEntryExprFindSubsRef(Expr *expr)
{
Node *node = (Node *) expr;
while (node)
{
if (IsA(node, FieldStore))
{
/*
* ModifyPartialQuerySupported doesn't allow INSERT/UPDATE via
* FieldStore. If we decide supporting such commands, then we
* should take the first element of "newvals" list into account
* here. This is because, to support such commands, we will need
* to expand merged FieldStore into separate target entries too.
*
* For this reason, this block is not reachable at the moment, and we need to
* uncomment the following if we decide to support such commands.
*
* """
* FieldStore *fieldStore = (FieldStore *) node;
* node = (Node *) linitial(fieldStore->newvals);
* """
*/
ereport(ERROR, (errmsg("unexpectedly got FieldStore object when "
"generating shard query")));
}
else if (IsA(node, CoerceToDomain))
{
CoerceToDomain *coerceToDomain = (CoerceToDomain *) node;
if (coerceToDomain->coercionformat != COERCE_IMPLICIT_CAST)
{
/* not an implicit coercion, cannot reach to a SubscriptingRef */
break;
}
node = (Node *) coerceToDomain->arg;
}
else if (IsA(node, SubscriptingRef))
{
return (SubscriptingRef *) node;
}
else
{
/* got a node that we are not interested in */
break;
}
}
return NULL;
}

View File

@ -649,13 +649,18 @@ AppendAlterTableCmdAddColumn(StringInfo buf, AlterTableCmd *alterTableCmd,
}
else if (constraint->contype == CONSTR_GENERATED)
{
char attgenerated = 's';
appendStringInfo(buf, " GENERATED %s AS (%s) STORED",
char attgenerated = ATTRIBUTE_GENERATED_STORED;
#if PG_VERSION_NUM >= PG_VERSION_18
attgenerated = constraint->generated_kind;
#endif
appendStringInfo(buf, " GENERATED %s AS (%s) %s",
GeneratedWhenStr(constraint->generated_when),
DeparseRawExprForColumnDefault(relationId, typeOid, typmod,
columnDefinition->colname,
attgenerated,
constraint->raw_expr));
constraint->raw_expr),
(attgenerated == ATTRIBUTE_GENERATED_STORED ? "STORED" :
"VIRTUAL"));
}
else if (constraint->contype == CONSTR_CHECK ||
constraint->contype == CONSTR_PRIMARY ||

View File

@ -34,7 +34,14 @@ QualifyCreateStatisticsStmt(Node *node)
{
CreateStatsStmt *stmt = castNode(CreateStatsStmt, node);
RangeVar *relation = (RangeVar *) linitial(stmt->relations);
Node *relationNode = (Node *) linitial(stmt->relations);
if (!IsA(relationNode, RangeVar))
{
return;
}
RangeVar *relation = (RangeVar *) relationNode;
if (relation->schemaname == NULL)
{

View File

@ -1568,7 +1568,6 @@ set_join_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
if (colinfo->is_new_col[col_index])
i++;
}
Assert(i == colinfo->num_cols);
Assert(j == nnewcolumns);
#endif
@ -3509,6 +3508,8 @@ get_update_query_targetlist_def(Query *query, List *targetList,
SubLink *cur_ma_sublink;
List *ma_sublinks;
targetList = ExpandMergedSubscriptingRefEntries(targetList);
/*
* Prepare to deal with MULTIEXPR assignments: collect the source SubLinks
* into a list. We expect them to appear, in ID order, in resjunk tlist
@ -3532,6 +3533,8 @@ get_update_query_targetlist_def(Query *query, List *targetList,
}
}
}
ensure_update_targetlist_in_param_order(targetList);
}
next_ma_cell = list_head(ma_sublinks);
cur_ma_sublink = NULL;

View File

@ -1585,7 +1585,6 @@ set_join_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
if (colinfo->is_new_col[col_index])
i++;
}
Assert(i == colinfo->num_cols);
Assert(j == nnewcolumns);
#endif
@ -3525,6 +3524,8 @@ get_update_query_targetlist_def(Query *query, List *targetList,
SubLink *cur_ma_sublink;
List *ma_sublinks;
targetList = ExpandMergedSubscriptingRefEntries(targetList);
/*
* Prepare to deal with MULTIEXPR assignments: collect the source SubLinks
* into a list. We expect them to appear, in ID order, in resjunk tlist
@ -3548,6 +3549,8 @@ get_update_query_targetlist_def(Query *query, List *targetList,
}
}
}
ensure_update_targetlist_in_param_order(targetList);
}
next_ma_cell = list_head(ma_sublinks);
cur_ma_sublink = NULL;

View File

@ -1599,7 +1599,6 @@ set_join_column_names(deparse_namespace *dpns, RangeTblEntry *rte,
if (colinfo->is_new_col[col_index])
i++;
}
Assert(i == colinfo->num_cols);
Assert(j == nnewcolumns);
#endif
@ -3542,6 +3541,8 @@ get_update_query_targetlist_def(Query *query, List *targetList,
SubLink *cur_ma_sublink;
List *ma_sublinks;
targetList = ExpandMergedSubscriptingRefEntries(targetList);
/*
* Prepare to deal with MULTIEXPR assignments: collect the source SubLinks
* into a list. We expect them to appear, in ID order, in resjunk tlist
@ -3565,6 +3566,8 @@ get_update_query_targetlist_def(Query *query, List *targetList,
}
}
}
ensure_update_targetlist_in_param_order(targetList);
}
next_ma_cell = list_head(ma_sublinks);
cur_ma_sublink = NULL;

File diff suppressed because it is too large

View File

@ -760,7 +760,7 @@ AdaptiveExecutorPreExecutorRun(CitusScanState *scanState)
*/
LockPartitionsForDistributedPlan(distributedPlan);
ExecuteSubPlans(distributedPlan);
ExecuteSubPlans(distributedPlan, RequestedForExplainAnalyze(scanState));
scanState->finishedPreScan = true;
}
@ -3804,7 +3804,7 @@ PopAssignedPlacementExecution(WorkerSession *session)
/*
* PopAssignedPlacementExecution finds an executable task from the queue of assigned tasks.
* PopUnassignedPlacementExecution finds an executable task from the queue of unassigned tasks.
*/
static TaskPlacementExecution *
PopUnassignedPlacementExecution(WorkerPool *workerPool)

View File

@ -682,11 +682,13 @@ RegenerateTaskForFasthPathQuery(Job *workerJob)
}
bool isLocalTableModification = false;
bool delayedFastPath = false;
GenerateSingleShardRouterTaskList(workerJob,
relationShardList,
placementList,
shardId,
isLocalTableModification);
isLocalTableModification,
delayedFastPath);
}

View File

@ -42,6 +42,7 @@
#include "distributed/merge_planner.h"
#include "distributed/metadata_cache.h"
#include "distributed/multi_executor.h"
#include "distributed/multi_explain.h"
#include "distributed/multi_partitioning_utils.h"
#include "distributed/multi_physical_planner.h"
#include "distributed/multi_router_planner.h"
@ -121,7 +122,7 @@ NonPushableInsertSelectExecScan(CustomScanState *node)
bool binaryFormat =
CanUseBinaryCopyFormatForTargetList(selectQuery->targetList);
ExecuteSubPlans(distSelectPlan);
ExecuteSubPlans(distSelectPlan, RequestedForExplainAnalyze(scanState));
/*
* We have a separate directory for each transaction, so choosing

View File

@ -313,6 +313,7 @@ ExecuteLocalTaskListExtended(List *taskList,
{
int taskNumParams = numParams;
Oid *taskParameterTypes = parameterTypes;
int taskType = GetTaskQueryType(task);
if (task->parametersInQueryStringResolved)
{
@ -330,7 +331,7 @@ ExecuteLocalTaskListExtended(List *taskList,
* for concatenated strings, we set queryStringList so that we can access
* each query string.
*/
if (GetTaskQueryType(task) == TASK_QUERY_TEXT_LIST)
if (taskType == TASK_QUERY_TEXT_LIST)
{
List *queryStringList = task->taskQuery.data.queryStringList;
totalRowsProcessed +=
@ -342,22 +343,31 @@ ExecuteLocalTaskListExtended(List *taskList,
continue;
}
Query *shardQuery = ParseQueryString(TaskQueryString(task),
taskParameterTypes,
taskNumParams);
if (taskType != TASK_QUERY_LOCAL_PLAN)
{
Query *shardQuery = ParseQueryString(TaskQueryString(task),
taskParameterTypes,
taskNumParams);
int cursorOptions = CURSOR_OPT_PARALLEL_OK;
int cursorOptions = CURSOR_OPT_PARALLEL_OK;
/*
* Although the shardQuery is local to this node, we prefer planner()
* over standard_planner(). The primary reason is that Citus itself
* is not very tolerant of standard_planner() calls that don't go through
* distributed_planner(), because of the way that restriction hooks are
* implemented. So, let planner() call distributed_planner(), which
* eventually calls standard_planner().
*/
localPlan = planner(shardQuery, NULL, cursorOptions, paramListInfo);
/*
* Although the shardQuery is local to this node, we prefer planner()
* over standard_planner(). The primary reason is that Citus itself
* is not very tolerant of standard_planner() calls that don't go through
* distributed_planner(), because of the way that restriction hooks are
* implemented. So, let planner() call distributed_planner(), which
* eventually calls standard_planner().
*/
localPlan = planner(shardQuery, NULL, cursorOptions, paramListInfo);
}
else
{
ereport(DEBUG2, (errmsg(
"Local executor: Using task's cached local plan for task %u",
task->taskId)));
localPlan = TaskQueryLocalPlan(task);
}
}
char *shardQueryString = NULL;
@ -754,14 +764,29 @@ ExecuteTaskPlan(PlannedStmt *taskPlan, char *queryString,
localPlacementIndex) :
CreateDestReceiver(DestNone);
/* Create a QueryDesc for the query */
QueryDesc *queryDesc = CreateQueryDesc(taskPlan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
destReceiver, paramListInfo,
queryEnv, 0);
QueryDesc *queryDesc = CreateQueryDesc(
taskPlan, /* PlannedStmt *plannedstmt */
queryString, /* const char *sourceText */
GetActiveSnapshot(), /* Snapshot snapshot */
InvalidSnapshot, /* Snapshot crosscheck_snapshot */
destReceiver, /* DestReceiver *dest */
paramListInfo, /* ParamListInfo params */
queryEnv, /* QueryEnvironment *queryEnv */
0 /* int instrument_options */
);
ExecutorStart(queryDesc, eflags);
/* run the plan: count = 0 (all rows) */
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG 18+ dropped the “execute_once” boolean */
ExecutorRun(queryDesc, scanDirection, 0L);
#else
/* PG 17 and earlier still expect the fourth execute_once argument */
ExecutorRun(queryDesc, scanDirection, 0L, true);
#endif
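The same `PG_VERSION_NUM >= PG_VERSION_18` gate around `ExecutorRun` reappears in `CitusExecutorRun` further below. A minimal compatibility-macro sketch, assuming only the two call shapes shown in this diff and Citus' `PG_VERSION_18` constant (the wrapper name and header path are hypothetical, not part of this change):

```c
#include "postgres.h"
#include "executor/executor.h"
#include "pg_version_constants.h"	/* assumed header providing PG_VERSION_18 */

/*
 * Hypothetical wrapper, not part of this PR: forwards to ExecutorRun with or
 * without the execute_once flag, so call sites look identical on every version.
 */
#if PG_VERSION_NUM >= PG_VERSION_18
#define ExecutorRunCompat(queryDesc, direction, count, execute_once) \
	ExecutorRun((queryDesc), (direction), (count))
#else
#define ExecutorRunCompat(queryDesc, direction, count, execute_once) \
	ExecutorRun((queryDesc), (direction), (count), (execute_once))
#endif
```

With such a wrapper, the call above could read `ExecutorRunCompat(queryDesc, scanDirection, 0L, true);` on all supported versions.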
/*
* We'll set the executorState->es_processed later, for now only remember

View File

@ -23,6 +23,7 @@
#include "distributed/merge_executor.h"
#include "distributed/merge_planner.h"
#include "distributed/multi_executor.h"
#include "distributed/multi_explain.h"
#include "distributed/multi_partitioning_utils.h"
#include "distributed/multi_router_planner.h"
#include "distributed/repartition_executor.h"
@ -132,7 +133,7 @@ ExecuteSourceAtWorkerAndRepartition(CitusScanState *scanState)
ereport(DEBUG1, (errmsg("Executing subplans of the source query and "
"storing the results at the respective node(s)")));
ExecuteSubPlans(distSourcePlan);
ExecuteSubPlans(distSourcePlan, RequestedForExplainAnalyze(scanState));
/*
* We have a separate directory for each transaction, so choosing

View File

@ -235,7 +235,20 @@ CitusExecutorRun(QueryDesc *queryDesc,
/* postgres will switch here again and will restore back on its own */
MemoryContextSwitchTo(oldcontext);
standard_ExecutorRun(queryDesc, direction, count, execute_once);
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG18+ drops the “execute_once” argument */
standard_ExecutorRun(queryDesc,
direction,
count);
#else
/* PG17-: original four-arg signature */
standard_ExecutorRun(queryDesc,
direction,
count,
execute_once);
#endif
}
if (totalTime)
@ -675,7 +688,7 @@ ExecuteQueryIntoDestReceiver(Query *query, ParamListInfo params, DestReceiver *d
* ExecutePlanIntoDestReceiver executes a query plan and sends results to the given
* DestReceiver, returning the number of rows processed.
*/
void
uint64
ExecutePlanIntoDestReceiver(PlannedStmt *queryPlan, ParamListInfo params,
DestReceiver *dest)
{
@ -688,16 +701,44 @@ ExecutePlanIntoDestReceiver(PlannedStmt *queryPlan, ParamListInfo params,
/* don't display the portal in pg_cursors, it is for internal use only */
portal->visible = false;
PortalDefineQuery(portal,
NULL,
"",
CMDTAG_SELECT,
list_make1(queryPlan),
NULL);
PortalDefineQuery(
portal,
NULL, /* no prepared statement name */
"", /* query text */
CMDTAG_SELECT, /* command tag */
list_make1(queryPlan),/* list of PlannedStmt* */
NULL /* no CachedPlan */
);
PortalStart(portal, params, eflags, GetActiveSnapshot());
PortalRun(portal, count, false, true, dest, dest, NULL);
QueryCompletion qc = { 0 };
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG 18+: six-arg signature (drop the run_once bool) */
PortalRun(portal,
count, /* how many rows to fetch */
false, /* isTopLevel */
dest, /* DestReceiver *dest */
dest, /* DestReceiver *altdest */
&qc); /* QueryCompletion *qc */
#else
/* PG 17-: original seven-arg signature */
PortalRun(portal,
count, /* how many rows to fetch */
false, /* isTopLevel */
true, /* run_once */
dest, /* DestReceiver *dest */
dest, /* DestReceiver *altdest */
&qc); /* QueryCompletion *qc */
#endif
PortalDrop(portal, false);
return qc.nprocessed;
}
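The `PortalRun` call shape changes in the same way here and in `worker_partition_query_result` below. A hedged sketch of an analogous wrapper, mirroring only the two branches shown above (the macro name is hypothetical, not part of this change):

```c
#include "postgres.h"
#include "tcop/pquery.h"	/* PortalRun */

/* Hypothetical wrapper, not part of this PR: hides the dropped run_once flag on PG 18+. */
#if PG_VERSION_NUM >= PG_VERSION_18
#define PortalRunCompat(portal, count, isTopLevel, run_once, dest, altdest, qc) \
	PortalRun((portal), (count), (isTopLevel), (dest), (altdest), (qc))
#else
#define PortalRunCompat(portal, count, isTopLevel, run_once, dest, altdest, qc) \
	PortalRun((portal), (count), (isTopLevel), (run_once), (dest), (altdest), (qc))
#endif
```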

View File

@ -242,7 +242,27 @@ worker_partition_query_result(PG_FUNCTION_ARGS)
allowNullPartitionColumnValues);
/* execute the query */
PortalRun(portal, FETCH_ALL, false, true, dest, dest, NULL);
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG18+: drop the “run_once” bool */
PortalRun(portal,
FETCH_ALL, /* count */
false, /* isTopLevel */
dest, /* dest receiver */
dest, /* alternative dest */
NULL); /* QueryCompletion *qc */
#else
/* PG 15-17: original seven-arg signature */
PortalRun(portal,
FETCH_ALL, /* count */
false, /* isTopLevel */
true, /* run_once */
dest, /* dest receiver */
dest, /* alternative dest */
NULL); /* QueryCompletion *qc */
#endif
/* construct the output result */
TupleDesc returnTupleDesc = NULL;
@ -295,8 +315,15 @@ StartPortalForQueryExecution(const char *queryString)
/* don't display the portal in pg_cursors, it is for internal use only */
portal->visible = false;
PortalDefineQuery(portal, NULL, queryString, CMDTAG_SELECT,
list_make1(queryPlan), NULL);
PortalDefineQuery(
portal,
NULL,
queryString,
CMDTAG_SELECT,
list_make1(queryPlan),
NULL /* no CachedPlan */
);
int eflags = 0;
PortalStart(portal, NULL, eflags, GetActiveSnapshot());

View File

@ -30,13 +30,22 @@ int MaxIntermediateResult = 1048576; /* maximum size in KB the intermediate resu
/* when this is true, we enforce intermediate result size limit in all executors */
int SubPlanLevel = 0;
/*
* SubPlanExplainAnalyzeContext is both a memory context for storing
* subplans' EXPLAIN ANALYZE output and a flag indicating that execution
* is running under EXPLAIN ANALYZE for subplans.
*/
MemoryContext SubPlanExplainAnalyzeContext = NULL;
SubPlanExplainOutputData *SubPlanExplainOutput;
extern uint8 TotalExplainOutputCapacity;
extern uint8 NumTasksOutput;
/*
* ExecuteSubPlans executes a list of subplans from a distributed plan
* by sequentially executing each plan from the top.
*/
void
ExecuteSubPlans(DistributedPlan *distributedPlan)
ExecuteSubPlans(DistributedPlan *distributedPlan, bool explainAnalyzeEnabled)
{
uint64 planId = distributedPlan->planId;
List *subPlanList = distributedPlan->subPlanList;
@ -47,6 +56,19 @@ ExecuteSubPlans(DistributedPlan *distributedPlan)
return;
}
/*
* If the root DistributedPlan has EXPLAIN ANALYZE enabled,
* its subplans should also have EXPLAIN ANALYZE enabled.
*/
if (explainAnalyzeEnabled)
{
SubPlanExplainAnalyzeContext = GetMemoryChunkContext(distributedPlan);
}
else
{
SubPlanExplainAnalyzeContext = NULL;
}
HTAB *intermediateResultsHash = MakeIntermediateResultHTAB();
RecordSubplanExecutionsOnNodes(intermediateResultsHash, distributedPlan);
@ -79,7 +101,23 @@ ExecuteSubPlans(DistributedPlan *distributedPlan)
TimestampTz startTimestamp = GetCurrentTimestamp();
ExecutePlanIntoDestReceiver(plannedStmt, params, copyDest);
uint64 nprocessed;
PG_TRY();
{
nprocessed =
ExecutePlanIntoDestReceiver(plannedStmt, params, copyDest);
}
PG_CATCH();
{
SubPlanExplainAnalyzeContext = NULL;
SubPlanExplainOutput = NULL;
TotalExplainOutputCapacity = 0;
NumTasksOutput = 0;
PG_RE_THROW();
}
PG_END_TRY();
/*
* EXPLAIN ANALYZE instrumentations. Calculating these are very light-weight,
@ -94,10 +132,24 @@ ExecuteSubPlans(DistributedPlan *distributedPlan)
subPlan->durationMillisecs += durationMicrosecs * MICRO_TO_MILLI_SECOND;
subPlan->bytesSentPerWorker = RemoteFileDestReceiverBytesSent(copyDest);
subPlan->ntuples = nprocessed;
subPlan->remoteWorkerCount = list_length(remoteWorkerNodeList);
subPlan->writeLocalFile = entry->writeLocalFile;
SubPlanLevel--;
/*
* Save the EXPLAIN ANALYZE output(s) for later extraction in ExplainSubPlans().
* Because the SubPlan context isn't available during distributed execution,
* pass the pointer as a global variable in SubPlanExplainOutput.
*/
subPlan->totalExplainOutput = SubPlanExplainOutput;
subPlan->numTasksOutput = NumTasksOutput;
SubPlanExplainOutput = NULL;
TotalExplainOutputCapacity = 0;
NumTasksOutput = 0;
FreeExecutorState(estate);
}
SubPlanExplainAnalyzeContext = NULL;
}

View File

@ -1249,8 +1249,9 @@ IsObjectAddressOwnedByCitus(const ObjectAddress *objectAddress)
return false;
}
bool ownedByCitus = extObjectAddress.objectId == citusId;
bool ownedByCitusColumnar = extObjectAddress.objectId == citusColumnarId;
bool ownedByCitus = OidIsValid(citusId) && extObjectAddress.objectId == citusId;
bool ownedByCitusColumnar = OidIsValid(citusColumnarId) &&
extObjectAddress.objectId == citusColumnarId;
return ownedByCitus || ownedByCitusColumnar;
}

View File

@ -109,13 +109,20 @@ citus_unmark_object_distributed(PG_FUNCTION_ARGS)
Oid classid = PG_GETARG_OID(0);
Oid objid = PG_GETARG_OID(1);
int32 objsubid = PG_GETARG_INT32(2);
/*
* The SQL function master_unmark_object_distributed doesn't take a
* 4th argument, but the SQL function citus_unmark_object_distributed
* does, as its checkobjectexistence argument. For this reason, we only
* read the 4th argument when this C function is called with 4
* arguments.
*/
bool checkObjectExistence = true;
if (!PG_ARGISNULL(3))
if (PG_NARGS() == 4)
{
checkObjectExistence = PG_GETARG_BOOL(3);
}
ObjectAddress address = { 0 };
ObjectAddressSubSet(address, classid, objid, objsubid);
@ -673,11 +680,9 @@ UpdateDistributedObjectColocationId(uint32 oldColocationId,
HeapTuple heapTuple;
while (HeapTupleIsValid(heapTuple = systable_getnext(scanDescriptor)))
{
Datum values[Natts_pg_dist_object];
bool isnull[Natts_pg_dist_object];
bool replace[Natts_pg_dist_object];
memset(replace, 0, sizeof(replace));
Datum *values = palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = palloc0(tupleDescriptor->natts * sizeof(bool));
bool *replace = palloc0(tupleDescriptor->natts * sizeof(bool));
replace[Anum_pg_dist_object_colocationid - 1] = true;
@ -691,6 +696,10 @@ UpdateDistributedObjectColocationId(uint32 oldColocationId,
CatalogTupleUpdate(pgDistObjectRel, &heapTuple->t_self, heapTuple);
CitusInvalidateRelcacheByRelid(DistObjectRelationId());
pfree(values);
pfree(isnull);
pfree(replace);
}
systable_endscan(scanDescriptor);
@ -776,3 +785,23 @@ DistributedSequenceList(void)
relation_close(pgDistObjectRel, AccessShareLock);
return distributedSequenceList;
}
/*
* GetForceDelegationAttrIndexInPgDistObject returns the zero-based attribute index
* of the force_delegation attr.
*
* The force_delegation attr was added to pg_dist_object via an ALTER operation after
* the version where Citus started supporting downgrades, and it is the only column
* we have introduced to pg_dist_object since then.
*
* After a downgrade + upgrade, tupleDesc->natts becomes greater than
* Natts_pg_dist_object; when that happens, the force_delegation index is no longer
* Anum_pg_dist_object_force_delegation - 1 but tupleDesc->natts - 1.
*/
int
GetForceDelegationAttrIndexInPgDistObject(TupleDesc tupleDesc)
{
return tupleDesc->natts == Natts_pg_dist_object
? (Anum_pg_dist_object_force_delegation - 1)
: tupleDesc->natts - 1;
}
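A minimal caller sketch, mirroring the `LookupDistObjectCacheEntry` and `SendDistObjectCommands` call sites elsewhere in this diff. The function name is hypothetical, and `GetForceDelegationAttrIndexInPgDistObject` is assumed to be declared in a Citus header; the sketch only illustrates why the dynamic index keeps the lookup correct when extra post-downgrade columns are present:

```c
#include "postgres.h"
#include "access/htup_details.h"
#include "utils/rel.h"

/* Hypothetical helper, not part of this PR. */
static bool
TupleGetForceDelegation(Relation pgDistObjectRel, HeapTuple heapTuple)
{
	TupleDesc tupleDesc = RelationGetDescr(pgDistObjectRel);
	Datum *values = (Datum *) palloc(tupleDesc->natts * sizeof(Datum));
	bool *isnull = (bool *) palloc(tupleDesc->natts * sizeof(bool));

	heap_deform_tuple(heapTuple, tupleDesc, values, isnull);

	/* Anum_pg_dist_object_force_delegation - 1 normally, natts - 1 after downgrade + upgrade */
	int forceDelegationIndex = GetForceDelegationAttrIndexInPgDistObject(tupleDesc);
	bool forceDelegation = !isnull[forceDelegationIndex] &&
						   DatumGetBool(values[forceDelegationIndex]);

	pfree(values);
	pfree(isnull);
	return forceDelegation;
}
```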

View File

@ -221,6 +221,7 @@ typedef struct MetadataCacheData
Oid textCopyFormatId;
Oid primaryNodeRoleId;
Oid secondaryNodeRoleId;
Oid unavailableNodeRoleId;
Oid pgTableIsVisibleFuncId;
Oid citusTableIsVisibleFuncId;
Oid distAuthinfoRelationId;
@ -320,9 +321,10 @@ static void CachedRelationNamespaceLookup(const char *relationName, Oid relnames
static void CachedRelationNamespaceLookupExtended(const char *relationName,
Oid renamespace, Oid *cachedOid,
bool missing_ok);
static ShardPlacement * ResolveGroupShardPlacement(
GroupShardPlacement *groupShardPlacement, CitusTableCacheEntry *tableEntry,
int shardIndex);
static ShardPlacement * ResolveGroupShardPlacement(GroupShardPlacement *
groupShardPlacement,
CitusTableCacheEntry *tableEntry,
int shardIndex);
static Oid LookupEnumValueId(Oid typeId, char *valueName);
static void InvalidateCitusTableCacheEntrySlot(CitusTableCacheEntrySlot *cacheSlot);
static void InvalidateDistTableCache(void);
@ -521,6 +523,32 @@ IsCitusTableTypeCacheEntry(CitusTableCacheEntry *tableEntry, CitusTableType tabl
}
/*
* IsFirstShard returns true if the given shardId is the first shard.
*/
bool
IsFirstShard(CitusTableCacheEntry *tableEntry, uint64 shardId)
{
if (tableEntry == NULL || tableEntry->sortedShardIntervalArray == NULL)
{
return false;
}
if (tableEntry->sortedShardIntervalArray[0]->shardId == INVALID_SHARD_ID)
{
return false;
}
if (shardId == tableEntry->sortedShardIntervalArray[0]->shardId)
{
return true;
}
else
{
return false;
}
}
/*
* HasDistributionKey returns true if given Citus table has a distribution key.
*/
@ -729,12 +757,13 @@ PartitionMethodViaCatalog(Oid relationId)
return DISTRIBUTE_BY_INVALID;
}
Datum datumArray[Natts_pg_dist_partition];
bool isNullArray[Natts_pg_dist_partition];
Relation pgDistPartition = table_open(DistPartitionRelationId(), AccessShareLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *datumArray = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isNullArray = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(partitionTuple, tupleDescriptor, datumArray, isNullArray);
if (isNullArray[Anum_pg_dist_partition_partmethod - 1])
@ -742,6 +771,8 @@ PartitionMethodViaCatalog(Oid relationId)
/* partition method cannot be NULL, still let's make sure */
heap_freetuple(partitionTuple);
table_close(pgDistPartition, NoLock);
pfree(datumArray);
pfree(isNullArray);
return DISTRIBUTE_BY_INVALID;
}
@ -750,6 +781,8 @@ PartitionMethodViaCatalog(Oid relationId)
heap_freetuple(partitionTuple);
table_close(pgDistPartition, NoLock);
pfree(datumArray);
pfree(isNullArray);
return partitionMethodChar;
}
@ -768,12 +801,12 @@ PartitionColumnViaCatalog(Oid relationId)
return NULL;
}
Datum datumArray[Natts_pg_dist_partition];
bool isNullArray[Natts_pg_dist_partition];
Relation pgDistPartition = table_open(DistPartitionRelationId(), AccessShareLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *datumArray = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isNullArray = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(partitionTuple, tupleDescriptor, datumArray, isNullArray);
if (isNullArray[Anum_pg_dist_partition_partkey - 1])
@ -781,6 +814,8 @@ PartitionColumnViaCatalog(Oid relationId)
/* partition key cannot be NULL, still let's make sure */
heap_freetuple(partitionTuple);
table_close(pgDistPartition, NoLock);
pfree(datumArray);
pfree(isNullArray);
return NULL;
}
@ -795,6 +830,8 @@ PartitionColumnViaCatalog(Oid relationId)
heap_freetuple(partitionTuple);
table_close(pgDistPartition, NoLock);
pfree(datumArray);
pfree(isNullArray);
return partitionColumn;
}
@ -813,12 +850,13 @@ ColocationIdViaCatalog(Oid relationId)
return INVALID_COLOCATION_ID;
}
Datum datumArray[Natts_pg_dist_partition];
bool isNullArray[Natts_pg_dist_partition];
Relation pgDistPartition = table_open(DistPartitionRelationId(), AccessShareLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *datumArray = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isNullArray = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(partitionTuple, tupleDescriptor, datumArray, isNullArray);
if (isNullArray[Anum_pg_dist_partition_colocationid - 1])
@ -826,6 +864,8 @@ ColocationIdViaCatalog(Oid relationId)
/* colocation id cannot be NULL, still let's make sure */
heap_freetuple(partitionTuple);
table_close(pgDistPartition, NoLock);
pfree(datumArray);
pfree(isNullArray);
return INVALID_COLOCATION_ID;
}
@ -834,6 +874,8 @@ ColocationIdViaCatalog(Oid relationId)
heap_freetuple(partitionTuple);
table_close(pgDistPartition, NoLock);
pfree(datumArray);
pfree(isNullArray);
return colocationId;
}
@ -1690,8 +1732,11 @@ LookupDistObjectCacheEntry(Oid classid, Oid objid, int32 objsubid)
if (HeapTupleIsValid(pgDistObjectTup))
{
Datum datumArray[Natts_pg_dist_object];
bool isNullArray[Natts_pg_dist_object];
Datum *datumArray = palloc(pgDistObjectTupleDesc->natts * sizeof(Datum));
bool *isNullArray = palloc(pgDistObjectTupleDesc->natts * sizeof(bool));
int forceDelegationIndex =
GetForceDelegationAttrIndexInPgDistObject(pgDistObjectTupleDesc);
heap_deform_tuple(pgDistObjectTup, pgDistObjectTupleDesc, datumArray,
isNullArray);
@ -1706,7 +1751,10 @@ LookupDistObjectCacheEntry(Oid classid, Oid objid, int32 objsubid)
DatumGetInt32(datumArray[Anum_pg_dist_object_colocationid - 1]);
cacheEntry->forceDelegation =
DatumGetBool(datumArray[Anum_pg_dist_object_force_delegation - 1]);
DatumGetBool(datumArray[forceDelegationIndex]);
pfree(datumArray);
pfree(isNullArray);
}
else
{
@ -1741,10 +1789,11 @@ BuildCitusTableCacheEntry(Oid relationId)
}
MemoryContext oldContext = NULL;
Datum datumArray[Natts_pg_dist_partition];
bool isNullArray[Natts_pg_dist_partition];
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *datumArray = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isNullArray = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(distPartitionTuple, tupleDescriptor, datumArray, isNullArray);
CitusTableCacheEntry *cacheEntry =
@ -1797,7 +1846,7 @@ BuildCitusTableCacheEntry(Oid relationId)
cacheEntry->replicationModel = DatumGetChar(replicationModelDatum);
}
if (isNullArray[Anum_pg_dist_partition_autoconverted - 1])
if (isNullArray[GetAutoConvertedAttrIndexInPgDistPartition(tupleDescriptor)])
{
/*
* We don't expect this to happen, but set it to false (the default value)
@ -1808,7 +1857,7 @@ BuildCitusTableCacheEntry(Oid relationId)
else
{
cacheEntry->autoConverted = DatumGetBool(
datumArray[Anum_pg_dist_partition_autoconverted - 1]);
datumArray[GetAutoConvertedAttrIndexInPgDistPartition(tupleDescriptor)]);
}
heap_freetuple(distPartitionTuple);
@ -1852,6 +1901,9 @@ BuildCitusTableCacheEntry(Oid relationId)
table_close(pgDistPartition, NoLock);
pfree(datumArray);
pfree(isNullArray);
cacheEntry->isValid = true;
return cacheEntry;
@ -3550,6 +3602,20 @@ SecondaryNodeRoleId(void)
}
/* return the Oid of the 'unavailable' nodeRole enum value */
Oid
UnavailableNodeRoleId(void)
{
if (!MetadataCache.unavailableNodeRoleId)
{
MetadataCache.unavailableNodeRoleId = LookupStringEnumValueId("noderole",
"unavailable");
}
return MetadataCache.unavailableNodeRoleId;
}
Oid
CitusJobStatusScheduledId(void)
{
@ -4367,6 +4433,8 @@ InitializeWorkerNodeCache(void)
workerNode->isActive = currentNode->isActive;
workerNode->nodeRole = currentNode->nodeRole;
workerNode->shouldHaveShards = currentNode->shouldHaveShards;
workerNode->nodeprimarynodeid = currentNode->nodeprimarynodeid;
workerNode->nodeisclone = currentNode->nodeisclone;
strlcpy(workerNode->nodeCluster, currentNode->nodeCluster, NAMEDATALEN);
newWorkerNodeArray[workerNodeIndex++] = workerNode;
@ -5011,10 +5079,13 @@ CitusTableTypeIdList(CitusTableType citusTableType)
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
HeapTuple heapTuple = systable_getnext(scanDescriptor);
Datum *datumArray = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isNullArray = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
while (HeapTupleIsValid(heapTuple))
{
bool isNullArray[Natts_pg_dist_partition];
Datum datumArray[Natts_pg_dist_partition];
memset(datumArray, 0, tupleDescriptor->natts * sizeof(Datum));
memset(isNullArray, 0, tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(heapTuple, tupleDescriptor, datumArray, isNullArray);
Datum partMethodDatum = datumArray[Anum_pg_dist_partition_partmethod - 1];
@ -5038,6 +5109,9 @@ CitusTableTypeIdList(CitusTableType citusTableType)
heapTuple = systable_getnext(scanDescriptor);
}
pfree(datumArray);
pfree(isNullArray);
systable_endscan(scanDescriptor);
table_close(pgDistPartition, AccessShareLock);

View File

@ -58,6 +58,7 @@
#include "distributed/argutils.h"
#include "distributed/backend_data.h"
#include "distributed/background_worker_utils.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/colocation_utils.h"
#include "distributed/commands.h"
@ -573,13 +574,17 @@ FetchRelationIdFromPgPartitionHeapTuple(HeapTuple heapTuple, TupleDesc tupleDesc
{
Assert(heapTuple->t_tableOid == DistPartitionRelationId());
bool isNullArray[Natts_pg_dist_partition];
Datum datumArray[Natts_pg_dist_partition];
Datum *datumArray = (Datum *) palloc(tupleDesc->natts * sizeof(Datum));
bool *isNullArray = (bool *) palloc(tupleDesc->natts * sizeof(bool));
heap_deform_tuple(heapTuple, tupleDesc, datumArray, isNullArray);
Datum relationIdDatum = datumArray[Anum_pg_dist_partition_logicalrelid - 1];
Oid relationId = DatumGetObjectId(relationIdDatum);
pfree(datumArray);
pfree(isNullArray);
return relationId;
}
@ -814,7 +819,7 @@ NodeListInsertCommand(List *workerNodeList)
appendStringInfo(nodeListInsertCommand,
"INSERT INTO pg_dist_node (nodeid, groupid, nodename, nodeport, "
"noderack, hasmetadata, metadatasynced, isactive, noderole, "
"nodecluster, shouldhaveshards) VALUES ");
"nodecluster, shouldhaveshards, nodeisclone, nodeprimarynodeid) VALUES ");
/* iterate over the worker nodes, add the values */
WorkerNode *workerNode = NULL;
@ -824,13 +829,14 @@ NodeListInsertCommand(List *workerNodeList)
char *metadataSyncedString = workerNode->metadataSynced ? "TRUE" : "FALSE";
char *isActiveString = workerNode->isActive ? "TRUE" : "FALSE";
char *shouldHaveShards = workerNode->shouldHaveShards ? "TRUE" : "FALSE";
char *nodeiscloneString = workerNode->nodeisclone ? "TRUE" : "FALSE";
Datum nodeRoleOidDatum = ObjectIdGetDatum(workerNode->nodeRole);
Datum nodeRoleStringDatum = DirectFunctionCall1(enum_out, nodeRoleOidDatum);
char *nodeRoleString = DatumGetCString(nodeRoleStringDatum);
appendStringInfo(nodeListInsertCommand,
"(%d, %d, %s, %d, %s, %s, %s, %s, '%s'::noderole, %s, %s)",
"(%d, %d, %s, %d, %s, %s, %s, %s, '%s'::noderole, %s, %s, %s, %d)",
workerNode->nodeId,
workerNode->groupId,
quote_literal_cstr(workerNode->workerName),
@ -841,7 +847,9 @@ NodeListInsertCommand(List *workerNodeList)
isActiveString,
nodeRoleString,
quote_literal_cstr(workerNode->nodeCluster),
shouldHaveShards);
shouldHaveShards,
nodeiscloneString,
workerNode->nodeprimarynodeid);
processedWorkerNodeCount++;
if (processedWorkerNodeCount != workerCount)
@ -875,9 +883,11 @@ NodeListIdempotentInsertCommand(List *workerNodeList)
"hasmetadata = EXCLUDED.hasmetadata, "
"isactive = EXCLUDED.isactive, "
"noderole = EXCLUDED.noderole, "
"nodecluster = EXCLUDED.nodecluster ,"
"nodecluster = EXCLUDED.nodecluster, "
"metadatasynced = EXCLUDED.metadatasynced, "
"shouldhaveshards = EXCLUDED.shouldhaveshards";
"shouldhaveshards = EXCLUDED.shouldhaveshards, "
"nodeisclone = EXCLUDED.nodeisclone, "
"nodeprimarynodeid = EXCLUDED.nodeprimarynodeid";
appendStringInfoString(nodeInsertIdempotentCommand, onConflictStr);
return nodeInsertIdempotentCommand->data;
}
@ -3152,37 +3162,26 @@ MetadataSyncSigAlrmHandler(SIGNAL_ARGS)
BackgroundWorkerHandle *
SpawnSyncNodeMetadataToNodes(Oid database, Oid extensionOwner)
{
BackgroundWorker worker;
BackgroundWorkerHandle *handle = NULL;
char workerName[BGW_MAXLEN];
/* Configure a worker. */
memset(&worker, 0, sizeof(worker));
SafeSnprintf(worker.bgw_name, BGW_MAXLEN,
SafeSnprintf(workerName, BGW_MAXLEN,
"Citus Metadata Sync: %u/%u",
database, extensionOwner);
worker.bgw_flags =
BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
worker.bgw_start_time = BgWorkerStart_ConsistentState;
/* don't restart, we manage restarts from maintenance daemon */
worker.bgw_restart_time = BGW_NEVER_RESTART;
strcpy_s(worker.bgw_library_name, sizeof(worker.bgw_library_name), "citus");
strcpy_s(worker.bgw_function_name, sizeof(worker.bgw_library_name),
"SyncNodeMetadataToNodesMain");
worker.bgw_main_arg = ObjectIdGetDatum(MyDatabaseId);
memcpy_s(worker.bgw_extra, sizeof(worker.bgw_extra), &extensionOwner,
sizeof(Oid));
worker.bgw_notify_pid = MyProcPid;
if (!RegisterDynamicBackgroundWorker(&worker, &handle))
{
return NULL;
}
pid_t pid;
WaitForBackgroundWorkerStartup(handle, &pid);
return handle;
CitusBackgroundWorkerConfig config = {
.workerName = workerName,
.functionName = "SyncNodeMetadataToNodesMain",
.mainArg = ObjectIdGetDatum(MyDatabaseId),
.extensionOwner = extensionOwner,
.needsNotification = true,
.waitForStartup = false,
.restartTime = CITUS_BGW_NEVER_RESTART,
.startTime = CITUS_BGW_DEFAULT_START_TIME,
.workerType = NULL, /* use default */
.extraData = NULL,
.extraDataSize = 0
};
return RegisterCitusBackgroundWorker(&config);
}
@ -5241,7 +5240,7 @@ SendDistObjectCommands(MetadataSyncContext *context)
bool forceDelegationIsNull = false;
Datum forceDelegationDatum =
heap_getattr(nextTuple,
Anum_pg_dist_object_force_delegation,
GetForceDelegationAttrIndexInPgDistObject(tupleDesc) + 1,
tupleDesc,
&forceDelegationIsNull);
bool forceDelegation = DatumGetBool(forceDelegationDatum);

View File

@ -812,6 +812,7 @@ GenerateSizeQueryOnMultiplePlacements(List *shardIntervalList,
{
partitionedShardNames = lappend(partitionedShardNames, quotedShardName);
}
/* for non-partitioned tables, we will use Postgres' size functions */
else
{
@ -1919,23 +1920,22 @@ InsertIntoPgDistPartition(Oid relationId, char distributionMethod,
{
char *distributionColumnString = NULL;
Datum newValues[Natts_pg_dist_partition];
bool newNulls[Natts_pg_dist_partition];
/* open system catalog and insert new tuple */
Relation pgDistPartition = table_open(DistPartitionRelationId(), RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *newValues = (Datum *) palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *newNulls = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
/* form new tuple for pg_dist_partition */
memset(newValues, 0, sizeof(newValues));
memset(newNulls, false, sizeof(newNulls));
newValues[Anum_pg_dist_partition_logicalrelid - 1] =
ObjectIdGetDatum(relationId);
newValues[Anum_pg_dist_partition_partmethod - 1] =
CharGetDatum(distributionMethod);
newValues[Anum_pg_dist_partition_colocationid - 1] = UInt32GetDatum(colocationId);
newValues[Anum_pg_dist_partition_repmodel - 1] = CharGetDatum(replicationModel);
newValues[Anum_pg_dist_partition_autoconverted - 1] = BoolGetDatum(autoConverted);
newValues[GetAutoConvertedAttrIndexInPgDistPartition(tupleDescriptor)] =
BoolGetDatum(autoConverted);
/* set partkey column to NULL for reference tables */
if (distributionMethod != DISTRIBUTE_BY_NONE)
@ -1951,7 +1951,7 @@ InsertIntoPgDistPartition(Oid relationId, char distributionMethod,
newNulls[Anum_pg_dist_partition_partkey - 1] = true;
}
HeapTuple newTuple = heap_form_tuple(RelationGetDescr(pgDistPartition), newValues,
HeapTuple newTuple = heap_form_tuple(tupleDescriptor, newValues,
newNulls);
/* finally insert tuple, build index entries & register cache invalidation */
@ -1963,6 +1963,9 @@ InsertIntoPgDistPartition(Oid relationId, char distributionMethod,
CommandCounterIncrement();
table_close(pgDistPartition, NoLock);
pfree(newValues);
pfree(newNulls);
}
@ -2154,13 +2157,13 @@ UpdatePlacementGroupId(uint64 placementId, int groupId)
ScanKeyData scanKey[1];
int scanKeyCount = 1;
bool indexOK = true;
Datum values[Natts_pg_dist_placement];
bool isnull[Natts_pg_dist_placement];
bool replace[Natts_pg_dist_placement];
bool colIsNull = false;
Relation pgDistPlacement = table_open(DistPlacementRelationId(), RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPlacement);
Datum *values = (Datum *) palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
bool *replace = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
ScanKeyInit(&scanKey[0], Anum_pg_dist_placement_placementid,
BTEqualStrategyNumber, F_INT8EQ, Int64GetDatum(placementId));
@ -2177,8 +2180,6 @@ UpdatePlacementGroupId(uint64 placementId, int groupId)
placementId)));
}
memset(replace, 0, sizeof(replace));
values[Anum_pg_dist_placement_groupid - 1] = Int32GetDatum(groupId);
isnull[Anum_pg_dist_placement_groupid - 1] = false;
replace[Anum_pg_dist_placement_groupid - 1] = true;
@ -2197,6 +2198,10 @@ UpdatePlacementGroupId(uint64 placementId, int groupId)
systable_endscan(scanDescriptor);
table_close(pgDistPlacement, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
}
@ -2210,12 +2215,13 @@ UpdatePgDistPartitionAutoConverted(Oid citusTableId, bool autoConverted)
ScanKeyData scanKey[1];
int scanKeyCount = 1;
bool indexOK = true;
Datum values[Natts_pg_dist_partition];
bool isnull[Natts_pg_dist_partition];
bool replace[Natts_pg_dist_partition];
Relation pgDistPartition = table_open(DistPartitionRelationId(), RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *values = (Datum *) palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
bool *replace = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
ScanKeyInit(&scanKey[0], Anum_pg_dist_partition_logicalrelid,
BTEqualStrategyNumber, F_OIDEQ, ObjectIdGetDatum(citusTableId));
@ -2231,11 +2237,10 @@ UpdatePgDistPartitionAutoConverted(Oid citusTableId, bool autoConverted)
citusTableId)));
}
memset(replace, 0, sizeof(replace));
values[Anum_pg_dist_partition_autoconverted - 1] = BoolGetDatum(autoConverted);
isnull[Anum_pg_dist_partition_autoconverted - 1] = false;
replace[Anum_pg_dist_partition_autoconverted - 1] = true;
int autoconvertedindex = GetAutoConvertedAttrIndexInPgDistPartition(tupleDescriptor);
values[autoconvertedindex] = BoolGetDatum(autoConverted);
isnull[autoconvertedindex] = false;
replace[autoconvertedindex] = true;
heapTuple = heap_modify_tuple(heapTuple, tupleDescriptor, values, isnull, replace);
@ -2247,6 +2252,10 @@ UpdatePgDistPartitionAutoConverted(Oid citusTableId, bool autoConverted)
systable_endscan(scanDescriptor);
table_close(pgDistPartition, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
}
@ -2286,12 +2295,13 @@ UpdateDistributionColumn(Oid relationId, char distributionMethod, Var *distribut
ScanKeyData scanKey[1];
int scanKeyCount = 1;
bool indexOK = true;
Datum values[Natts_pg_dist_partition];
bool isnull[Natts_pg_dist_partition];
bool replace[Natts_pg_dist_partition];
Relation pgDistPartition = table_open(DistPartitionRelationId(), RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *values = (Datum *) palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
bool *replace = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
ScanKeyInit(&scanKey[0], Anum_pg_dist_partition_logicalrelid,
BTEqualStrategyNumber, F_OIDEQ, ObjectIdGetDatum(relationId));
@ -2307,8 +2317,6 @@ UpdateDistributionColumn(Oid relationId, char distributionMethod, Var *distribut
relationId)));
}
memset(replace, 0, sizeof(replace));
replace[Anum_pg_dist_partition_partmethod - 1] = true;
values[Anum_pg_dist_partition_partmethod - 1] = CharGetDatum(distributionMethod);
isnull[Anum_pg_dist_partition_partmethod - 1] = false;
@ -2317,9 +2325,10 @@ UpdateDistributionColumn(Oid relationId, char distributionMethod, Var *distribut
values[Anum_pg_dist_partition_colocationid - 1] = UInt32GetDatum(colocationId);
isnull[Anum_pg_dist_partition_colocationid - 1] = false;
replace[Anum_pg_dist_partition_autoconverted - 1] = true;
values[Anum_pg_dist_partition_autoconverted - 1] = BoolGetDatum(false);
isnull[Anum_pg_dist_partition_autoconverted - 1] = false;
int autoconvertedindex = GetAutoConvertedAttrIndexInPgDistPartition(tupleDescriptor);
replace[autoconvertedindex] = true;
values[autoconvertedindex] = BoolGetDatum(false);
isnull[autoconvertedindex] = false;
char *distributionColumnString = nodeToString((Node *) distributionColumn);
@ -2337,6 +2346,10 @@ UpdateDistributionColumn(Oid relationId, char distributionMethod, Var *distribut
systable_endscan(scanDescriptor);
table_close(pgDistPartition, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
}
@ -2380,12 +2393,13 @@ UpdateNoneDistTableMetadata(Oid relationId, char replicationModel, uint32 coloca
ScanKeyData scanKey[1];
int scanKeyCount = 1;
bool indexOK = true;
Datum values[Natts_pg_dist_partition];
bool isnull[Natts_pg_dist_partition];
bool replace[Natts_pg_dist_partition];
Relation pgDistPartition = table_open(DistPartitionRelationId(), RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistPartition);
Datum *values = (Datum *) palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
bool *replace = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
ScanKeyInit(&scanKey[0], Anum_pg_dist_partition_logicalrelid,
BTEqualStrategyNumber, F_OIDEQ, ObjectIdGetDatum(relationId));
@ -2401,8 +2415,6 @@ UpdateNoneDistTableMetadata(Oid relationId, char replicationModel, uint32 coloca
relationId)));
}
memset(replace, 0, sizeof(replace));
values[Anum_pg_dist_partition_colocationid - 1] = UInt32GetDatum(colocationId);
isnull[Anum_pg_dist_partition_colocationid - 1] = false;
replace[Anum_pg_dist_partition_colocationid - 1] = true;
@ -2411,9 +2423,10 @@ UpdateNoneDistTableMetadata(Oid relationId, char replicationModel, uint32 coloca
isnull[Anum_pg_dist_partition_repmodel - 1] = false;
replace[Anum_pg_dist_partition_repmodel - 1] = true;
values[Anum_pg_dist_partition_autoconverted - 1] = BoolGetDatum(autoConverted);
isnull[Anum_pg_dist_partition_autoconverted - 1] = false;
replace[Anum_pg_dist_partition_autoconverted - 1] = true;
int autoconvertedindex = GetAutoConvertedAttrIndexInPgDistPartition(tupleDescriptor);
values[autoconvertedindex] = BoolGetDatum(autoConverted);
isnull[autoconvertedindex] = false;
replace[autoconvertedindex] = true;
heapTuple = heap_modify_tuple(heapTuple, tupleDescriptor, values, isnull, replace);
@ -2424,6 +2437,10 @@ UpdateNoneDistTableMetadata(Oid relationId, char replicationModel, uint32 coloca
systable_endscan(scanDescriptor);
table_close(pgDistPartition, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
}
@ -3113,10 +3130,10 @@ ScheduleBackgroundTask(int64 jobId, Oid owner, char *command, int dependingTaskC
/* 2. insert new task */
{
Datum values[Natts_pg_dist_background_task] = { 0 };
bool nulls[Natts_pg_dist_background_task] = { 0 };
TupleDesc tupleDescriptor = RelationGetDescr(pgDistBackgroundTask);
memset(nulls, true, sizeof(nulls));
Datum *values = (Datum *) palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *nulls = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
int64 taskId = GetNextBackgroundTaskTaskId();
@ -3147,15 +3164,17 @@ ScheduleBackgroundTask(int64 jobId, Oid owner, char *command, int dependingTaskC
values[Anum_pg_dist_background_task_message - 1] = CStringGetTextDatum("");
nulls[Anum_pg_dist_background_task_message - 1] = false;
values[Anum_pg_dist_background_task_nodes_involved - 1] =
IntArrayToDatum(nodesInvolvedCount, nodesInvolved);
nulls[Anum_pg_dist_background_task_nodes_involved - 1] = (nodesInvolvedCount ==
0);
int nodesInvolvedIndex =
GetNodesInvolvedAttrIndexInPgDistBackgroundTask(tupleDescriptor);
values[nodesInvolvedIndex] = IntArrayToDatum(nodesInvolvedCount, nodesInvolved);
nulls[nodesInvolvedIndex] = (nodesInvolvedCount == 0);
HeapTuple newTuple = heap_form_tuple(RelationGetDescr(pgDistBackgroundTask),
values, nulls);
HeapTuple newTuple = heap_form_tuple(tupleDescriptor, values, nulls);
CatalogTupleInsert(pgDistBackgroundTask, newTuple);
pfree(values);
pfree(nulls);
task = palloc0(sizeof(BackgroundTask));
task->taskid = taskId;
task->status = BACKGROUND_TASK_STATUS_RUNNABLE;
@ -3268,11 +3287,12 @@ ResetRunningBackgroundTasks(void)
List *taskIdsToWait = NIL;
while (HeapTupleIsValid(taskTuple = systable_getnext(scanDescriptor)))
{
Datum values[Natts_pg_dist_background_task] = { 0 };
bool isnull[Natts_pg_dist_background_task] = { 0 };
bool replace[Natts_pg_dist_background_task] = { 0 };
TupleDesc tupleDescriptor = RelationGetDescr(pgDistBackgroundTasks);
Datum *values = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
bool *replace = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(taskTuple, tupleDescriptor, values, isnull);
values[Anum_pg_dist_background_task_status - 1] =
@ -3341,6 +3361,10 @@ ResetRunningBackgroundTasks(void)
replace);
CatalogTupleUpdate(pgDistBackgroundTasks, &taskTuple->t_self, taskTuple);
pfree(values);
pfree(isnull);
pfree(replace);
}
if (list_length(taskIdsToWait) > 0)
@ -3424,8 +3448,9 @@ DeformBackgroundJobHeapTuple(TupleDesc tupleDescriptor, HeapTuple jobTuple)
static BackgroundTask *
DeformBackgroundTaskHeapTuple(TupleDesc tupleDescriptor, HeapTuple taskTuple)
{
Datum values[Natts_pg_dist_background_task] = { 0 };
bool nulls[Natts_pg_dist_background_task] = { 0 };
Datum *values = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *nulls = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(taskTuple, tupleDescriptor, values, nulls);
BackgroundTask *task = palloc0(sizeof(BackgroundTask));
@ -3463,13 +3488,18 @@ DeformBackgroundTaskHeapTuple(TupleDesc tupleDescriptor, HeapTuple taskTuple)
TextDatumGetCString(values[Anum_pg_dist_background_task_message - 1]);
}
if (!nulls[Anum_pg_dist_background_task_nodes_involved - 1])
int nodesInvolvedIndex =
GetNodesInvolvedAttrIndexInPgDistBackgroundTask(tupleDescriptor);
if (!nulls[nodesInvolvedIndex])
{
ArrayType *nodesInvolvedArrayObject =
DatumGetArrayTypeP(values[Anum_pg_dist_background_task_nodes_involved - 1]);
DatumGetArrayTypeP(values[nodesInvolvedIndex]);
task->nodesInvolved = IntegerArrayTypeToList(nodesInvolvedArrayObject);
}
pfree(values);
pfree(nulls);
return task;
}
@ -3734,8 +3764,8 @@ JobTasksStatusCount(int64 jobId)
HeapTuple heapTuple = NULL;
while (HeapTupleIsValid(heapTuple = systable_getnext(scanDescriptor)))
{
Datum values[Natts_pg_dist_background_task] = { 0 };
bool isnull[Natts_pg_dist_background_task] = { 0 };
Datum *values = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(heapTuple, tupleDescriptor, values, isnull);
@ -3743,6 +3773,9 @@ JobTasksStatusCount(int64 jobId)
1]);
BackgroundTaskStatus status = BackgroundTaskStatusByOid(statusOid);
pfree(values);
pfree(isnull);
switch (status)
{
case BACKGROUND_TASK_STATUS_BLOCKED:
@ -3995,9 +4028,9 @@ UpdateBackgroundJob(int64 jobId)
UINT64_FORMAT, jobId)));
}
Datum values[Natts_pg_dist_background_task] = { 0 };
bool isnull[Natts_pg_dist_background_task] = { 0 };
bool replace[Natts_pg_dist_background_task] = { 0 };
Datum *values = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
bool *replace = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(heapTuple, tupleDescriptor, values, isnull);
@ -4041,6 +4074,10 @@ UpdateBackgroundJob(int64 jobId)
systable_endscan(scanDescriptor);
table_close(pgDistBackgroundJobs, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
}
@ -4076,9 +4113,9 @@ UpdateBackgroundTask(BackgroundTask *task)
task->jobid, task->taskid)));
}
Datum values[Natts_pg_dist_background_task] = { 0 };
bool isnull[Natts_pg_dist_background_task] = { 0 };
bool replace[Natts_pg_dist_background_task] = { 0 };
Datum *values = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
bool *replace = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(heapTuple, tupleDescriptor, values, isnull);
@ -4147,6 +4184,10 @@ UpdateBackgroundTask(BackgroundTask *task)
systable_endscan(scanDescriptor);
table_close(pgDistBackgroundTasks, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
}
@ -4234,9 +4275,10 @@ CancelTasksForJob(int64 jobid)
HeapTuple taskTuple = NULL;
while (HeapTupleIsValid(taskTuple = systable_getnext(scanDescriptor)))
{
Datum values[Natts_pg_dist_background_task] = { 0 };
bool nulls[Natts_pg_dist_background_task] = { 0 };
bool replace[Natts_pg_dist_background_task] = { 0 };
Datum *values = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *nulls = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
bool *replace = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
heap_deform_tuple(taskTuple, tupleDescriptor, values, nulls);
Oid statusOid =
@ -4285,6 +4327,10 @@ CancelTasksForJob(int64 jobid)
taskTuple = heap_modify_tuple(taskTuple, tupleDescriptor, values, nulls,
replace);
CatalogTupleUpdate(pgDistBackgroundTasks, &taskTuple->t_self, taskTuple);
pfree(values);
pfree(nulls);
pfree(replace);
}
systable_endscan(scanDescriptor);
@ -4341,9 +4387,9 @@ UnscheduleDependentTasks(BackgroundTask *task)
"task_id: " UINT64_FORMAT, cTaskId)));
}
Datum values[Natts_pg_dist_background_task] = { 0 };
bool isnull[Natts_pg_dist_background_task] = { 0 };
bool replace[Natts_pg_dist_background_task] = { 0 };
Datum *values = (Datum *) palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = (bool *) palloc(tupleDescriptor->natts * sizeof(bool));
bool *replace = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));
values[Anum_pg_dist_background_task_status - 1] =
ObjectIdGetDatum(CitusTaskStatusUnscheduledId());
@ -4355,6 +4401,10 @@ UnscheduleDependentTasks(BackgroundTask *task)
CatalogTupleUpdate(pgDistBackgroundTasks, &heapTuple->t_self, heapTuple);
systable_endscan(scanDescriptor);
pfree(values);
pfree(isnull);
pfree(replace);
}
}
@ -4420,3 +4470,43 @@ UnblockDependingBackgroundTasks(BackgroundTask *task)
table_close(pgDistBackgroundTasksDepend, NoLock);
}
/*
* GetAutoConvertedAttrIndexInPgDistPartition returns the zero-based attribute index
* of the autoconverted attr.
*
* The autoconverted attr was added to pg_dist_partition via an ALTER operation after
* the version where Citus started supporting downgrades, and it is the only column
* we have introduced to pg_dist_partition since then.
*
* In case of a downgrade + upgrade, tupleDesc->natts becomes greater than
* Natts_pg_dist_partition. When that happens, autoconverted is the last attribute of
* the descriptor, so its index is tupleDesc->natts - 1 rather than
* Anum_pg_dist_partition_autoconverted - 1.
*/
int
GetAutoConvertedAttrIndexInPgDistPartition(TupleDesc tupleDesc)
{
return tupleDesc->natts == Natts_pg_dist_partition
? (Anum_pg_dist_partition_autoconverted - 1)
: tupleDesc->natts - 1;
}
/*
* GetNodesInvolvedAttrIndexInPgDistBackgroundTask returns the zero-based attribute
* index of the nodes_involved attr.
*
* The nodes_involved attr was added to pg_dist_background_task via an ALTER operation
* after the version where Citus started supporting downgrades, and it is the only
* column we have introduced to pg_dist_background_task since then.
*
* In case of a downgrade + upgrade, tupleDesc->natts becomes greater than
* Natts_pg_dist_background_task. When that happens, nodes_involved is the last
* attribute of the descriptor, so its index is tupleDesc->natts - 1 rather than
* Anum_pg_dist_background_task_nodes_involved - 1.
*/
int
GetNodesInvolvedAttrIndexInPgDistBackgroundTask(TupleDesc tupleDesc)
{
return tupleDesc->natts == Natts_pg_dist_background_task
? (Anum_pg_dist_background_task_nodes_involved - 1)
: tupleDesc->natts - 1;
}
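
To make the intent of this helper concrete, here is a minimal sketch, not part of the patch, of the pattern the diff switches to: size the deform arrays from `tupleDescriptor->natts` instead of the compile-time `Natts_*` constant, and locate `nodes_involved` through the runtime helper. `ReadNodesInvolved` is a hypothetical caller assumed to live next to the helper above.

```c
#include "postgres.h"

#include "access/htup_details.h"
#include "utils/array.h"
#include "utils/rel.h"

/* hypothetical caller illustrating the runtime-attnum deform pattern */
static void
ReadNodesInvolved(Relation pgDistBackgroundTask, HeapTuple taskTuple)
{
	TupleDesc tupleDescriptor = RelationGetDescr(pgDistBackgroundTask);

	/* natts may exceed Natts_pg_dist_background_task after a downgrade + upgrade */
	Datum *values = (Datum *) palloc0(tupleDescriptor->natts * sizeof(Datum));
	bool *nulls = (bool *) palloc0(tupleDescriptor->natts * sizeof(bool));

	heap_deform_tuple(taskTuple, tupleDescriptor, values, nulls);

	int nodesInvolvedIndex =
		GetNodesInvolvedAttrIndexInPgDistBackgroundTask(tupleDescriptor);
	if (!nulls[nodesInvolvedIndex])
	{
		/* nodes_involved is an integer array listing the nodes the task touches */
		ArrayType *nodesInvolved = DatumGetArrayTypeP(values[nodesInvolvedIndex]);
		(void) nodesInvolved;	/* use as needed */
	}

	pfree(values);
	pfree(nulls);
}
```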

View File

@ -35,6 +35,7 @@
#include "distributed/citus_acquire_lock.h"
#include "distributed/citus_safe_lib.h"
#include "distributed/clonenode_utils.h"
#include "distributed/colocation_utils.h"
#include "distributed/commands.h"
#include "distributed/commands/utility_hook.h"
@ -84,6 +85,8 @@ typedef struct NodeMetadata
bool isActive;
Oid nodeRole;
bool shouldHaveShards;
uint32 nodeprimarynodeid;
bool nodeisclone;
char *nodeCluster;
} NodeMetadata;
@ -106,11 +109,14 @@ static void InsertNodeRow(int nodeid, char *nodename, int32 nodeport,
NodeMetadata *nodeMetadata);
static void DeleteNodeRow(char *nodename, int32 nodeport);
static void BlockDistributedQueriesOnMetadataNodes(void);
static WorkerNode * TupleToWorkerNode(TupleDesc tupleDescriptor, HeapTuple heapTuple);
static WorkerNode * TupleToWorkerNode(Relation pgDistNode, TupleDesc tupleDescriptor,
HeapTuple heapTuple);
static bool NodeIsLocal(WorkerNode *worker);
static void SetLockTimeoutLocally(int32 lock_cooldown);
static void UpdateNodeLocation(int32 nodeId, char *newNodeName, int32 newNodePort,
bool localOnly);
static int GetNodePrimaryNodeIdAttrIndexInPgDistNode(TupleDesc tupleDesc);
static int GetNodeIsCloneAttrIndexInPgDistNode(TupleDesc tupleDesc);
static bool UnsetMetadataSyncedForAllWorkers(void);
static char * GetMetadataSyncCommandToSetNodeColumn(WorkerNode *workerNode,
int columnIndex,
@ -120,11 +126,10 @@ static char * NodeMetadataSyncedUpdateCommand(uint32 nodeId, bool metadataSynced
static void ErrorIfCoordinatorMetadataSetFalse(WorkerNode *workerNode, Datum value,
char *field);
static WorkerNode * SetShouldHaveShards(WorkerNode *workerNode, bool shouldHaveShards);
static WorkerNode * FindNodeAnyClusterByNodeId(uint32 nodeId);
static void ErrorIfAnyNodeNotExist(List *nodeList);
static void UpdateLocalGroupIdsViaMetadataContext(MetadataSyncContext *context);
static void SendDeletionCommandsForReplicatedTablePlacements(
MetadataSyncContext *context);
static void SendDeletionCommandsForReplicatedTablePlacements(MetadataSyncContext *context);
static void SyncNodeMetadata(MetadataSyncContext *context);
static void SetNodeStateViaMetadataContext(MetadataSyncContext *context,
WorkerNode *workerNode,
@ -134,12 +139,15 @@ static void MarkNodesNotSyncedInLoopBackConnection(MetadataSyncContext *context,
static void EnsureParentSessionHasExclusiveLockOnPgDistNode(pid_t parentSessionPid);
static void SetNodeMetadata(MetadataSyncContext *context, bool localOnly);
static void EnsureTransactionalMetadataSyncMode(void);
static void LockShardsInWorkerPlacementList(WorkerNode *workerNode, LOCKMODE
lockMode);
static BackgroundWorkerHandle * CheckBackgroundWorkerToObtainLocks(int32 lock_cooldown);
static BackgroundWorkerHandle * LockPlacementsWithBackgroundWorkersInPrimaryNode(
WorkerNode *workerNode, bool force, int32 lock_cooldown);
static int32 CitusAddCloneNode(WorkerNode *primaryWorkerNode,
char *cloneHostname, int32 clonePort);
static void RemoveCloneNode(WorkerNode *cloneNode);
/* Function definitions go here */
/* declarations for dynamic loading */
@ -168,6 +176,10 @@ PG_FUNCTION_INFO_V1(citus_coordinator_nodeid);
PG_FUNCTION_INFO_V1(citus_is_coordinator);
PG_FUNCTION_INFO_V1(citus_internal_mark_node_not_synced);
PG_FUNCTION_INFO_V1(citus_is_primary_node);
PG_FUNCTION_INFO_V1(citus_add_clone_node);
PG_FUNCTION_INFO_V1(citus_add_clone_node_with_nodeid);
PG_FUNCTION_INFO_V1(citus_remove_clone_node);
PG_FUNCTION_INFO_V1(citus_remove_clone_node_with_nodeid);
/*
* DefaultNodeMetadata creates a NodeMetadata struct with the fields set to
@ -183,6 +195,8 @@ DefaultNodeMetadata()
nodeMetadata.nodeRack = WORKER_DEFAULT_RACK;
nodeMetadata.shouldHaveShards = true;
nodeMetadata.groupId = INVALID_GROUP_ID;
nodeMetadata.nodeisclone = false;
nodeMetadata.nodeprimarynodeid = 0; /* 0 typically means InvalidNodeId */
return nodeMetadata;
}
@ -1177,6 +1191,42 @@ ActivateNodeList(MetadataSyncContext *context)
}
/*
* ActivateCloneNodeAsPrimary sets the given worker node as primary and active
* in the pg_dist_node catalog, making the clone node a first-class citizen.
*/
void
ActivateCloneNodeAsPrimary(WorkerNode *workerNode)
{
Relation pgDistNode = table_open(DistNodeRelationId(), AccessShareLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistNode);
TupleDesc copiedTupleDescriptor = CreateTupleDescCopy(tupleDescriptor);
table_close(pgDistNode, AccessShareLock);
/*
* Set the node as primary and active.
*/
SetWorkerColumnLocalOnly(workerNode, Anum_pg_dist_node_noderole,
ObjectIdGetDatum(PrimaryNodeRoleId()));
SetWorkerColumnLocalOnly(workerNode, Anum_pg_dist_node_isactive,
BoolGetDatum(true));
SetWorkerColumnLocalOnly(workerNode,
GetNodeIsCloneAttrIndexInPgDistNode(copiedTupleDescriptor) +
1,
BoolGetDatum(false));
SetWorkerColumnLocalOnly(workerNode,
GetNodePrimaryNodeIdAttrIndexInPgDistNode(
copiedTupleDescriptor) + 1,
Int32GetDatum(0));
SetWorkerColumnLocalOnly(workerNode, Anum_pg_dist_node_hasmetadata,
BoolGetDatum(true));
SetWorkerColumnLocalOnly(workerNode, Anum_pg_dist_node_metadatasynced,
BoolGetDatum(true));
SetWorkerColumnLocalOnly(workerNode, Anum_pg_dist_node_shouldhaveshards,
BoolGetDatum(true));
}
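
One detail worth calling out: the Get*AttrIndexInPgDistNode helpers return zero-based array offsets (for indexing values[]/nulls[]), while SetWorkerColumnLocalOnly() expects a 1-based attribute number, which is why the calls above add `+ 1`. A hypothetical wrapper, not part of the patch and assumed to live in the same file, makes the conversion explicit:

```c
/*
 * Hypothetical helper (not in the patch): convert the zero-based attribute
 * index returned by GetNodeIsCloneAttrIndexInPgDistNode() into the 1-based
 * attribute number that SetWorkerColumnLocalOnly() expects.
 */
static WorkerNode *
SetNodeIsCloneLocalOnly(WorkerNode *workerNode, TupleDesc tupleDesc, bool isClone)
{
	int isCloneAttno = GetNodeIsCloneAttrIndexInPgDistNode(tupleDesc) + 1;

	return SetWorkerColumnLocalOnly(workerNode, isCloneAttno, BoolGetDatum(isClone));
}
```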
/*
* Acquires shard metadata locks on all shards residing in the given worker node
*
@ -1200,7 +1250,8 @@ BackgroundWorkerHandle *
CheckBackgroundWorkerToObtainLocks(int32 lock_cooldown)
{
BackgroundWorkerHandle *handle = StartLockAcquireHelperBackgroundWorker(MyProcPid,
lock_cooldown);
lock_cooldown)
;
if (!handle)
{
/*
@ -1422,6 +1473,305 @@ master_update_node(PG_FUNCTION_ARGS)
}
/*
* citus_add_clone_node adds a new node as a clone of an existing primary node.
*/
Datum
citus_add_clone_node(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
EnsureSuperUser();
EnsureCoordinator();
text *cloneHostnameText = PG_GETARG_TEXT_P(0);
int32 clonePort = PG_GETARG_INT32(1);
text *primaryHostnameText = PG_GETARG_TEXT_P(2);
int32 primaryPort = PG_GETARG_INT32(3);
char *cloneHostname = text_to_cstring(cloneHostnameText);
char *primaryHostname = text_to_cstring(primaryHostnameText);
WorkerNode *primaryWorker = FindWorkerNodeAnyCluster(primaryHostname, primaryPort);
if (primaryWorker == NULL)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("primary node %s:%d not found in pg_dist_node",
primaryHostname, primaryPort)));
}
int32 cloneNodeId = CitusAddCloneNode(primaryWorker, cloneHostname, clonePort);
PG_RETURN_INT32(cloneNodeId);
}
/*
* citus_add_clone_node_with_nodeid adds a new node as a clone of an existing primary node
* using the primary node's ID. It records the clone's hostname, port, and links it to the
* primary node's ID.
*
* This function is useful when you already know the primary node's ID and want to add a clone
* without needing to look it up by hostname and port.
*/
Datum
citus_add_clone_node_with_nodeid(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
EnsureSuperUser();
EnsureCoordinator();
text *cloneHostnameText = PG_GETARG_TEXT_P(0);
int32 clonePort = PG_GETARG_INT32(1);
int32 primaryNodeId = PG_GETARG_INT32(2);
char *cloneHostname = text_to_cstring(cloneHostnameText);
bool missingOk = false;
WorkerNode *primaryWorkerNode = FindNodeWithNodeId(primaryNodeId, missingOk);
if (primaryWorkerNode == NULL)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("primary node with ID %d does not exist", primaryNodeId)));
}
int32 cloneNodeId = CitusAddCloneNode(primaryWorkerNode, cloneHostname, clonePort);
PG_RETURN_INT32(cloneNodeId);
}
/*
* CitusAddCloneNode function adds a new node as a clone of an existing primary node.
* It records the clone's hostname, port, and links it to the primary node's ID.
* The clone is initially marked as inactive and not having shards.
*/
static int32
CitusAddCloneNode(WorkerNode *primaryWorkerNode,
char *cloneHostname, int32 clonePort)
{
Assert(primaryWorkerNode != NULL);
/* Future-proofing: Ideally, a primary node should not itself be a clone.
* This check might be more relevant once replica promotion logic exists.
* For now, pg_dist_node.nodeisclone defaults to false for existing nodes.
*/
if (primaryWorkerNode->nodeisclone)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"primary node %s:%d is itself a clone and cannot have clones",
primaryWorkerNode->workerName, primaryWorkerNode->
workerPort)));
}
if (!primaryWorkerNode->shouldHaveShards)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"primary node %s:%d does not have shards, node without shards cannot have clones",
primaryWorkerNode->workerName, primaryWorkerNode->
workerPort)));
}
WorkerNode *existingCloneNode = FindWorkerNodeAnyCluster(cloneHostname, clonePort);
if (existingCloneNode != NULL)
{
/*
* Idempotency check: If the node already exists, is it already correctly
* registered as a clone for THIS primary?
*/
if (existingCloneNode->nodeisclone &&
existingCloneNode->nodeprimarynodeid == primaryWorkerNode->nodeId)
{
ereport(NOTICE, (errmsg(
"node %s:%d is already registered as a clone for primary %s:%d (nodeid %d)",
cloneHostname, clonePort,
primaryWorkerNode->workerName, primaryWorkerNode->
workerPort, primaryWorkerNode->nodeId)));
PG_RETURN_INT32(existingCloneNode->nodeId);
}
else
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"a different node %s:%d (nodeid %d) already exists or is a clone for a different primary",
cloneHostname, clonePort, existingCloneNode->nodeId)));
}
}
EnsureValidStreamingReplica(primaryWorkerNode, cloneHostname, clonePort);
char *operation = "add";
EnsureValidCloneMode(primaryWorkerNode, cloneHostname, clonePort, operation);
NodeMetadata nodeMetadata = DefaultNodeMetadata();
nodeMetadata.nodeisclone = true;
nodeMetadata.nodeprimarynodeid = primaryWorkerNode->nodeId;
nodeMetadata.isActive = false; /* Replicas start as inactive */
nodeMetadata.shouldHaveShards = false; /* Replicas do not directly own primary shards */
nodeMetadata.groupId = INVALID_GROUP_ID; /* Replicas get a new group ID and do not belong to any existing group */
nodeMetadata.nodeRole = UnavailableNodeRoleId(); /* The node role is set to 'unavailable' */
nodeMetadata.nodeCluster = primaryWorkerNode->nodeCluster; /* Same cluster as primary */
/* Other fields such as hasMetadata and metadataSynced take their defaults from
* DefaultNodeMetadata, or are set later by AddNodeMetadata/ActivateNodeList as needed.
* Since this clone starts out inactive, its metadata sync status is not critical at
* this point.
*/
bool nodeAlreadyExists = false;
bool localOnly = false; /* Propagate change to other workers with metadata */
/*
* AddNodeMetadata will take an ExclusiveLock on pg_dist_node.
* It also checks again if the node already exists after acquiring the lock.
*/
int cloneNodeId = AddNodeMetadata(cloneHostname, clonePort, &nodeMetadata,
&nodeAlreadyExists, localOnly);
if (nodeAlreadyExists)
{
/* This case should ideally be caught by the FindWorkerNodeAnyCluster check above,
* but AddNodeMetadata does its own check after locking.
* If it already exists and is correctly configured, we might have returned NOTICE above.
* If it exists but is NOT correctly configured as our replica, an ERROR would be more appropriate.
* AddNodeMetadata returns the existing node's ID if it finds one.
* We need to ensure it is the *correct* replica.
*/
WorkerNode *fetchedExistingNode = FindNodeAnyClusterByNodeId(cloneNodeId);
if (fetchedExistingNode != NULL && fetchedExistingNode->nodeisclone &&
fetchedExistingNode->nodeprimarynodeid == primaryWorkerNode->nodeId)
{
ereport(NOTICE, (errmsg(
"node %s:%d was already correctly registered as a clone for primary %s:%d (nodeid %d)",
cloneHostname, clonePort,
primaryWorkerNode->workerName, primaryWorkerNode->
workerPort, primaryWorkerNode->nodeId)));
/* Intentional fall-through to return cloneNodeId */
}
else
{
/* This state is less expected if our initial check passed or errored. */
ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR),
errmsg(
"node %s:%d already exists but is not correctly configured as a clone for primary %s:%d",
cloneHostname, clonePort, primaryWorkerNode->workerName,
primaryWorkerNode->workerPort)));
}
}
TransactionModifiedNodeMetadata = true;
/*
* Note: Clones added this way are inactive.
* The separate UDF citus_promote_clone_and_rebalance
* is needed to activate (promote) them.
*/
return cloneNodeId;
}
/*
* citus_remove_clone_node removes an inactive streaming clone node from Citus metadata.
*/
Datum
citus_remove_clone_node(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
EnsureSuperUser();
EnsureCoordinator();
text *nodeNameText = PG_GETARG_TEXT_P(0);
int32 nodePort = PG_GETARG_INT32(1);
char *nodeName = text_to_cstring(nodeNameText);
WorkerNode *workerNode = FindWorkerNodeAnyCluster(nodeName, nodePort);
if (workerNode == NULL)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("node \"%s:%d\" does not exist", nodeName, nodePort)));
}
RemoveCloneNode(workerNode);
PG_RETURN_VOID();
}
/*
* citus_remove_clone_node_with_nodeid removes an inactive clone node from Citus metadata
* using the node's ID.
*/
Datum
citus_remove_clone_node_with_nodeid(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
EnsureSuperUser();
EnsureCoordinator();
uint32 replicaNodeId = PG_GETARG_INT32(0);
WorkerNode *replicaNode = FindNodeAnyClusterByNodeId(replicaNodeId);
if (replicaNode == NULL)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("Clone node with ID %d does not exist", replicaNodeId)));
}
RemoveCloneNode(replicaNode);
PG_RETURN_VOID();
}
static void
RemoveCloneNode(WorkerNode *cloneNode)
{
Assert(cloneNode != NULL);
if (!cloneNode->nodeisclone)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("Node %s:%d (ID %d) is not a clone node. "
"Use citus_remove_node() to remove primary or already promoted nodes.",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId)));
}
if (cloneNode->isActive)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"Clone node %s:%d (ID %d) is marked as active and cannot be removed with this function. "
"This might indicate a promoted clone. Consider using citus_remove_node() if you are sure, "
"or ensure it's properly deactivated if it's an unpromoted clone in an unexpected state.",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId)));
}
/*
* All checks passed, proceed with removal.
* RemoveNodeFromCluster handles locking, catalog changes, connection closing, and metadata sync.
*/
ereport(NOTICE, (errmsg("Removing inactive clone node %s:%d (ID %d)",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId)));
RemoveNodeFromCluster(cloneNode->workerName, cloneNode->workerPort);
/* RemoveNodeFromCluster might set this, but setting it here ensures it's marked for this UDF's transaction. */
TransactionModifiedNodeMetadata = true;
}
/*
* SetLockTimeoutLocally sets the lock_timeout to the given value.
* This setting is local.
@ -1440,14 +1790,14 @@ UpdateNodeLocation(int32 nodeId, char *newNodeName, int32 newNodePort, bool loca
{
const bool indexOK = true;
ScanKeyData scanKey[1];
Datum values[Natts_pg_dist_node];
bool isnull[Natts_pg_dist_node];
bool replace[Natts_pg_dist_node];
Relation pgDistNode = table_open(DistNodeRelationId(), RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistNode);
ScanKeyData scanKey[1];
Datum *values = palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = palloc0(tupleDescriptor->natts * sizeof(bool));
bool *replace = palloc0(tupleDescriptor->natts * sizeof(bool));
ScanKeyInit(&scanKey[0], Anum_pg_dist_node_nodeid,
BTEqualStrategyNumber, F_INT4EQ, Int32GetDatum(nodeId));
@ -1462,8 +1812,6 @@ UpdateNodeLocation(int32 nodeId, char *newNodeName, int32 newNodePort, bool loca
newNodeName, newNodePort)));
}
memset(replace, 0, sizeof(replace));
values[Anum_pg_dist_node_nodeport - 1] = Int32GetDatum(newNodePort);
isnull[Anum_pg_dist_node_nodeport - 1] = false;
replace[Anum_pg_dist_node_nodeport - 1] = true;
@ -1496,6 +1844,10 @@ UpdateNodeLocation(int32 nodeId, char *newNodeName, int32 newNodePort, bool loca
systable_endscan(scanDescriptor);
table_close(pgDistNode, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
}
@ -1766,11 +2118,10 @@ citus_internal_mark_node_not_synced(PG_FUNCTION_ARGS)
Relation pgDistNode = table_open(DistNodeRelationId(), AccessShareLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistNode);
Datum values[Natts_pg_dist_node];
bool isnull[Natts_pg_dist_node];
bool replace[Natts_pg_dist_node];
Datum *values = palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = palloc0(tupleDescriptor->natts * sizeof(bool));
bool *replace = palloc0(tupleDescriptor->natts * sizeof(bool));
memset(replace, 0, sizeof(replace));
values[Anum_pg_dist_node_metadatasynced - 1] = DatumGetBool(false);
isnull[Anum_pg_dist_node_metadatasynced - 1] = false;
replace[Anum_pg_dist_node_metadatasynced - 1] = true;
@ -1784,6 +2135,10 @@ citus_internal_mark_node_not_synced(PG_FUNCTION_ARGS)
table_close(pgDistNode, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
PG_RETURN_VOID();
}
@ -1859,7 +2214,7 @@ FindWorkerNodeAnyCluster(const char *nodeName, int32 nodePort)
HeapTuple heapTuple = GetNodeTuple(nodeName, nodePort);
if (heapTuple != NULL)
{
workerNode = TupleToWorkerNode(tupleDescriptor, heapTuple);
workerNode = TupleToWorkerNode(pgDistNode, tupleDescriptor, heapTuple);
}
table_close(pgDistNode, NoLock);
@ -1871,7 +2226,7 @@ FindWorkerNodeAnyCluster(const char *nodeName, int32 nodePort)
* FindNodeAnyClusterByNodeId searches pg_dist_node and returns the node with
* the nodeId. If the node can't be found returns NULL.
*/
static WorkerNode *
WorkerNode *
FindNodeAnyClusterByNodeId(uint32 nodeId)
{
bool includeNodesFromOtherClusters = true;
@ -1966,7 +2321,8 @@ ReadDistNode(bool includeNodesFromOtherClusters)
HeapTuple heapTuple = systable_getnext(scanDescriptor);
while (HeapTupleIsValid(heapTuple))
{
WorkerNode *workerNode = TupleToWorkerNode(tupleDescriptor, heapTuple);
WorkerNode *workerNode = TupleToWorkerNode(pgDistNode, tupleDescriptor, heapTuple);
if (includeNodesFromOtherClusters ||
strncmp(workerNode->nodeCluster, CurrentCluster, WORKER_LENGTH) == 0)
@ -2491,9 +2847,9 @@ SetWorkerColumnLocalOnly(WorkerNode *workerNode, int columnIndex, Datum value)
TupleDesc tupleDescriptor = RelationGetDescr(pgDistNode);
HeapTuple heapTuple = GetNodeTuple(workerNode->workerName, workerNode->workerPort);
Datum values[Natts_pg_dist_node];
bool isnull[Natts_pg_dist_node];
bool replace[Natts_pg_dist_node];
Datum *values = palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = palloc0(tupleDescriptor->natts * sizeof(bool));
bool *replace = palloc0(tupleDescriptor->natts * sizeof(bool));
if (heapTuple == NULL)
{
@ -2501,7 +2857,6 @@ SetWorkerColumnLocalOnly(WorkerNode *workerNode, int columnIndex, Datum value)
workerNode->workerName, workerNode->workerPort)));
}
memset(replace, 0, sizeof(replace));
values[columnIndex - 1] = value;
isnull[columnIndex - 1] = false;
replace[columnIndex - 1] = true;
@ -2513,10 +2868,14 @@ SetWorkerColumnLocalOnly(WorkerNode *workerNode, int columnIndex, Datum value)
CitusInvalidateRelcacheByRelid(DistNodeRelationId());
CommandCounterIncrement();
WorkerNode *newWorkerNode = TupleToWorkerNode(tupleDescriptor, heapTuple);
WorkerNode *newWorkerNode = TupleToWorkerNode(pgDistNode, tupleDescriptor, heapTuple);
table_close(pgDistNode, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
return newWorkerNode;
}
@ -2901,16 +3260,15 @@ InsertPlaceholderCoordinatorRecord(void)
static void
InsertNodeRow(int nodeid, char *nodeName, int32 nodePort, NodeMetadata *nodeMetadata)
{
Datum values[Natts_pg_dist_node];
bool isNulls[Natts_pg_dist_node];
Relation pgDistNode = table_open(DistNodeRelationId(), RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistNode);
Datum *values = palloc0(tupleDescriptor->natts * sizeof(Datum));
bool *isNulls = palloc0(tupleDescriptor->natts * sizeof(bool));
Datum nodeClusterStringDatum = CStringGetDatum(nodeMetadata->nodeCluster);
Datum nodeClusterNameDatum = DirectFunctionCall1(namein, nodeClusterStringDatum);
/* form new node tuple */
memset(values, 0, sizeof(values));
memset(isNulls, false, sizeof(isNulls));
values[Anum_pg_dist_node_nodeid - 1] = UInt32GetDatum(nodeid);
values[Anum_pg_dist_node_groupid - 1] = Int32GetDatum(nodeMetadata->groupId);
values[Anum_pg_dist_node_nodename - 1] = CStringGetTextDatum(nodeName);
@ -2924,13 +3282,15 @@ InsertNodeRow(int nodeid, char *nodeName, int32 nodePort, NodeMetadata *nodeMeta
values[Anum_pg_dist_node_nodecluster - 1] = nodeClusterNameDatum;
values[Anum_pg_dist_node_shouldhaveshards - 1] = BoolGetDatum(
nodeMetadata->shouldHaveShards);
Relation pgDistNode = table_open(DistNodeRelationId(), RowExclusiveLock);
TupleDesc tupleDescriptor = RelationGetDescr(pgDistNode);
values[GetNodeIsCloneAttrIndexInPgDistNode(tupleDescriptor)] =
BoolGetDatum(nodeMetadata->nodeisclone);
values[GetNodePrimaryNodeIdAttrIndexInPgDistNode(tupleDescriptor)] =
Int32GetDatum(nodeMetadata->nodeprimarynodeid);
HeapTuple heapTuple = heap_form_tuple(tupleDescriptor, values, isNulls);
PushActiveSnapshot(GetTransactionSnapshot());
CatalogTupleInsert(pgDistNode, heapTuple);
PopActiveSnapshot();
CitusInvalidateRelcacheByRelid(DistNodeRelationId());
@ -2939,6 +3299,9 @@ InsertNodeRow(int nodeid, char *nodeName, int32 nodePort, NodeMetadata *nodeMeta
/* close relation */
table_close(pgDistNode, NoLock);
pfree(values);
pfree(isNulls);
}
@ -2965,8 +3328,18 @@ DeleteNodeRow(char *nodeName, int32 nodePort)
* https://github.com/citusdata/citus/pull/2855#discussion_r313628554
* https://github.com/citusdata/citus/issues/1890
*/
Relation replicaIndex = index_open(RelationGetPrimaryKeyIndex(pgDistNode),
AccessShareLock);
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG 18+ adds a bool "deferrable_ok" parameter */
Relation replicaIndex =
index_open(RelationGetPrimaryKeyIndex(pgDistNode, false),
AccessShareLock);
#else
Relation replicaIndex =
index_open(RelationGetPrimaryKeyIndex(pgDistNode),
AccessShareLock);
#endif
ScanKeyInit(&scanKey[0], Anum_pg_dist_node_nodename,
BTEqualStrategyNumber, F_TEXTEQ, CStringGetTextDatum(nodeName));
@ -3005,19 +3378,18 @@ DeleteNodeRow(char *nodeName, int32 nodePort)
* the caller already has locks on the tuple, and doesn't perform any locking.
*/
static WorkerNode *
TupleToWorkerNode(TupleDesc tupleDescriptor, HeapTuple heapTuple)
TupleToWorkerNode(Relation pgDistNode, TupleDesc tupleDescriptor, HeapTuple heapTuple)
{
Datum datumArray[Natts_pg_dist_node];
bool isNullArray[Natts_pg_dist_node];
Assert(!HeapTupleHasNulls(heapTuple));
/*
* This function can be called before "ALTER TABLE ... ADD COLUMN nodecluster ...",
* therefore heap_deform_tuple() won't set the isNullArray for this column. We
* initialize it true to be safe in that case.
/* We add and remove columns from pg_dist_node during extension upgrades and
* downgrades, and PostgreSQL never reuses an old attnum: dropped columns leave
* holes (attributes with attisdropped = true), and a re-added column with the
* same name gets a new attnum at the end. So we cannot use the compile-time
* Natts_pg_dist_node to size the arrays, and we have to account for the holes
* when fetching the column values.
*/
memset(isNullArray, true, sizeof(isNullArray));
int nAtts = tupleDescriptor->natts;
Datum *datumArray = palloc0(sizeof(Datum) * nAtts);
bool *isNullArray = palloc0(sizeof(bool) * nAtts);
/*
* We use heap_deform_tuple() instead of heap_getattr() to expand tuple
@ -3044,10 +3416,11 @@ TupleToWorkerNode(TupleDesc tupleDescriptor, HeapTuple heapTuple)
1]);
/*
* nodecluster column can be missing. In the case of extension creation/upgrade,
* master_initialize_node_metadata function is called before the nodecluster
* column is added to pg_dist_node table.
* The nodecluster, nodeisclone and nodeprimarynodeid columns can be missing. During
* extension creation/upgrade, the master_initialize_node_metadata function is
* called before these columns are added to the pg_dist_node table.
*/
if (!isNullArray[Anum_pg_dist_node_nodecluster - 1])
{
Name nodeClusterName =
@ -3056,10 +3429,68 @@ TupleToWorkerNode(TupleDesc tupleDescriptor, HeapTuple heapTuple)
strlcpy(workerNode->nodeCluster, nodeClusterString, NAMEDATALEN);
}
int nodeIsCloneIdx = GetNodeIsCloneAttrIndexInPgDistNode(tupleDescriptor);
int nodePrimaryNodeIdIdx = GetNodePrimaryNodeIdAttrIndexInPgDistNode(tupleDescriptor);
if (!isNullArray[nodeIsCloneIdx])
{
workerNode->nodeisclone = DatumGetBool(datumArray[nodeIsCloneIdx]);
}
if (!isNullArray[nodePrimaryNodeIdIdx])
{
workerNode->nodeprimarynodeid = DatumGetInt32(datumArray[nodePrimaryNodeIdIdx]);
}
pfree(datumArray);
pfree(isNullArray);
return workerNode;
}
/*
* GetNodePrimaryNodeIdAttrIndexInPgDistNode returns the zero-based attribute index
* of the nodeprimarynodeid attr.
*
* The nodeprimarynodeid attr was added to pg_dist_node via an ALTER operation
* after the version where Citus started supporting downgrades, and it is one of
* the two columns we have introduced to pg_dist_node since then.
*
* In case of a downgrade + upgrade, tupleDesc->natts becomes greater than
* Natts_pg_dist_node. When that happens, nodeprimarynodeid is the last attribute of
* the descriptor, so its index is tupleDesc->natts - 1 rather than
* Anum_pg_dist_node_nodeprimarynodeid - 1.
*/
static int
GetNodePrimaryNodeIdAttrIndexInPgDistNode(TupleDesc tupleDesc)
{
return tupleDesc->natts == Natts_pg_dist_node
? (Anum_pg_dist_node_nodeprimarynodeid - 1)
: tupleDesc->natts - 1;
}
/*
* GetNodeIsCloneAttrIndexInPgDistNode returns the zero-based attribute index of the
* nodeisclone attr.
*
* Like GetNodePrimaryNodeIdAttrIndexInPgDistNode(), this performs a similar
* calculation for the nodeisclone attribute, because this column was also added to
* pg_dist_node after we started supporting downgrades.
*
* The only difference is that after a downgrade + upgrade the index of nodeisclone
* is tupleDesc->natts - 2 rather than Anum_pg_dist_node_nodeisclone - 1, because the
* two columns were added consecutively: nodeisclone first, then nodeprimarynodeid.
*/
static int
GetNodeIsCloneAttrIndexInPgDistNode(TupleDesc tupleDesc)
{
return tupleDesc->natts == Natts_pg_dist_node
? (Anum_pg_dist_node_nodeisclone - 1)
: tupleDesc->natts - 2;
}
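
As a sanity check, the two descriptor shapes the helpers distinguish can be summarized in a small hypothetical assertion helper (not part of the patch, assumed to sit next to the helpers above): on a fresh install the compile-time positions apply, while after a downgrade + upgrade the two re-added columns sit at the end of the descriptor.

```c
/* hypothetical illustration of the index mapping implemented above */
static void
AssertCloneAttrIndexes(TupleDesc tupleDesc)
{
	int isCloneIndex = GetNodeIsCloneAttrIndexInPgDistNode(tupleDesc);
	int primaryIdIndex = GetNodePrimaryNodeIdAttrIndexInPgDistNode(tupleDesc);

	if (tupleDesc->natts == Natts_pg_dist_node)
	{
		/* fresh install: compile-time attribute numbers are still valid */
		Assert(isCloneIndex == Anum_pg_dist_node_nodeisclone - 1);
		Assert(primaryIdIndex == Anum_pg_dist_node_nodeprimarynodeid - 1);
	}
	else
	{
		/* downgrade + upgrade: the re-added columns are the last two attributes */
		Assert(isCloneIndex == tupleDesc->natts - 2);
		Assert(primaryIdIndex == tupleDesc->natts - 1);
	}
}
```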
/*
* StringToDatum transforms a string representation into a Datum.
*/
@ -3136,15 +3567,15 @@ UnsetMetadataSyncedForAllWorkers(void)
updatedAtLeastOne = true;
}
Datum *values = palloc(tupleDescriptor->natts * sizeof(Datum));
bool *isnull = palloc(tupleDescriptor->natts * sizeof(bool));
bool *replace = palloc(tupleDescriptor->natts * sizeof(bool));
while (HeapTupleIsValid(heapTuple))
{
Datum values[Natts_pg_dist_node];
bool isnull[Natts_pg_dist_node];
bool replace[Natts_pg_dist_node];
memset(replace, false, sizeof(replace));
memset(isnull, false, sizeof(isnull));
memset(values, 0, sizeof(values));
memset(values, 0, tupleDescriptor->natts * sizeof(Datum));
memset(isnull, 0, tupleDescriptor->natts * sizeof(bool));
memset(replace, 0, tupleDescriptor->natts * sizeof(bool));
values[Anum_pg_dist_node_metadatasynced - 1] = BoolGetDatum(false);
replace[Anum_pg_dist_node_metadatasynced - 1] = true;
@ -3167,6 +3598,10 @@ UnsetMetadataSyncedForAllWorkers(void)
CatalogCloseIndexes(indstate);
table_close(relation, NoLock);
pfree(values);
pfree(isnull);
pfree(replace);
return updatedAtLeastOne;
}

View File

@ -0,0 +1,424 @@
#include "postgres.h"
#include "utils/fmgrprotos.h"
#include "utils/pg_lsn.h"
#include "distributed/argutils.h"
#include "distributed/clonenode_utils.h"
#include "distributed/listutils.h"
#include "distributed/metadata_cache.h"
#include "distributed/metadata_sync.h"
#include "distributed/remote_commands.h"
#include "distributed/shard_rebalancer.h"
static void BlockAllWritesToWorkerNode(WorkerNode *workerNode);
static bool GetNodeIsInRecoveryStatus(WorkerNode *workerNode);
static void PromoteCloneNode(WorkerNode *cloneWorkerNode);
static void EnsureSingleNodePromotion(WorkerNode *primaryNode);
PG_FUNCTION_INFO_V1(citus_promote_clone_and_rebalance);
/*
* citus_promote_clone_and_rebalance promotes an inactive clone node to become
* the new primary node, replacing its original primary node.
*
* This function performs the following steps:
* 1. Validates that the clone node exists and is properly configured
* 2. Ensures the clone is inactive and has a valid primary node reference
* 3. Blocks all writes to the primary node to prevent data divergence
* 4. Waits for the clone to catch up with the primary's WAL position
* 5. Promotes the clone node to become a standalone primary
* 6. Updates metadata to mark the clone as active and primary
* 7. Rebalances shards between the old primary and new primary
* 8. Returns information about the promotion and any shard movements
*
* Arguments:
* - clone_nodeid: The node ID of the clone to promote
* - catchUpTimeoutSeconds: Maximum time to wait for clone to catch up (default: 300)
*
* The function ensures data consistency by blocking writes during the promotion
* process and verifying replication lag before proceeding.
*/
Datum
citus_promote_clone_and_rebalance(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
/* Ensure superuser and coordinator */
EnsureSuperUser();
EnsureCoordinator();
/* Get clone_nodeid argument */
int32 cloneNodeIdArg = PG_GETARG_INT32(0);
/* Get catchUpTimeoutSeconds argument with default value of 300 */
int32 catchUpTimeoutSeconds = PG_ARGISNULL(2) ? 300 : PG_GETARG_INT32(2);
/* Lock pg_dist_node to prevent concurrent modifications during this operation */
LockRelationOid(DistNodeRelationId(), RowExclusiveLock);
WorkerNode *cloneNode = FindNodeAnyClusterByNodeId(cloneNodeIdArg);
if (cloneNode == NULL)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("Clone node with ID %d not found.", cloneNodeIdArg)));
}
if (!cloneNode->nodeisclone || cloneNode->nodeprimarynodeid == 0)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"Node %s:%d (ID %d) is not a valid clone or its primary node ID is not set.",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId)));
}
if (cloneNode->isActive)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"Clone node %s:%d (ID %d) is already active and cannot be promoted.",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId)));
}
WorkerNode *primaryNode = FindNodeAnyClusterByNodeId(cloneNode->nodeprimarynodeid);
if (primaryNode == NULL)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("Primary node with ID %d (for clone %s:%d) not found.",
cloneNode->nodeprimarynodeid, cloneNode->workerName,
cloneNode->workerPort)));
}
if (primaryNode->nodeisclone)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("Primary node %s:%d (ID %d) is itself a clone.",
primaryNode->workerName, primaryNode->workerPort,
primaryNode->nodeId)));
}
if (!primaryNode->isActive)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("Primary node %s:%d (ID %d) is not active.",
primaryNode->workerName, primaryNode->workerPort,
primaryNode->nodeId)));
}
/* Ensure the primary node is related to the clone node */
if (primaryNode->nodeId != cloneNode->nodeprimarynodeid)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"Clone node %s:%d (ID %d) is not a clone of the primary node %s:%d (ID %d).",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId,
primaryNode->workerName, primaryNode->workerPort,
primaryNode->nodeId)));
}
EnsureSingleNodePromotion(primaryNode);
ereport(NOTICE, (errmsg(
"Starting promotion process for clone node %s:%d (ID %d), original primary %s:%d (ID %d)",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId,
primaryNode->workerName, primaryNode->workerPort, primaryNode
->nodeId)));
/* Step 0: Check if clone is replica of provided primary node and is not synchronous */
char *operation = "promote";
EnsureValidCloneMode(primaryNode, cloneNode->workerName, cloneNode->workerPort,
operation);
/* Step 1: Block Writes on Original Primary's Shards */
ereport(NOTICE, (errmsg(
"Blocking writes on shards of original primary node %s:%d (group %d)",
primaryNode->workerName, primaryNode->workerPort, primaryNode
->groupId)));
BlockAllWritesToWorkerNode(primaryNode);
/* Step 2: Wait for Clone to Catch Up */
ereport(NOTICE, (errmsg(
"Waiting for clone %s:%d to catch up with primary %s:%d (timeout: %d seconds)",
cloneNode->workerName, cloneNode->workerPort,
primaryNode->workerName, primaryNode->workerPort,
catchUpTimeoutSeconds)));
bool caughtUp = false;
const int sleepIntervalSeconds = 5;
int elapsedTimeSeconds = 0;
while (elapsedTimeSeconds < catchUpTimeoutSeconds)
{
uint64 repLag = GetReplicationLag(primaryNode, cloneNode);
if (repLag <= 0)
{
caughtUp = true;
break;
}
pg_usleep(sleepIntervalSeconds * 1000000L);
elapsedTimeSeconds += sleepIntervalSeconds;
}
if (!caughtUp)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"Clone %s:%d failed to catch up with primary %s:%d within %d seconds.",
cloneNode->workerName, cloneNode->workerPort,
primaryNode->workerName, primaryNode->workerPort,
catchUpTimeoutSeconds)));
}
ereport(NOTICE, (errmsg("Clone %s:%d is now caught up with primary %s:%d.",
cloneNode->workerName, cloneNode->workerPort,
primaryNode->workerName, primaryNode->workerPort)));
/* Step 3: PostgreSQL Clone Promotion */
ereport(NOTICE, (errmsg("Attempting to promote clone %s:%d via pg_promote().",
cloneNode->workerName, cloneNode->workerPort)));
PromoteCloneNode(cloneNode);
/* Step 4: Update Clone Metadata in pg_dist_node on Coordinator */
ereport(NOTICE, (errmsg("Updating metadata for promoted clone %s:%d (ID %d)",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId)));
ActivateCloneNodeAsPrimary(cloneNode);
/* We need to sync metadata changes to all nodes before rebalancing shards
* since the rebalancing algorithm depends on the latest metadata.
*/
SyncNodeMetadataToNodes();
/* Step 5: Split Shards Between Primary and Clone */
SplitShardsBetweenPrimaryAndClone(primaryNode, cloneNode, PG_GETARG_NAME_OR_NULL(1));
TransactionModifiedNodeMetadata = true; /* Inform Citus about metadata change */
TriggerNodeMetadataSyncOnCommit(); /* Ensure changes are propagated */
ereport(NOTICE, (errmsg(
"Clone node %s:%d (ID %d) metadata updated. It is now a primary",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId)));
/* Step 6: Unblock Writes (should be handled by transaction commit) */
ereport(NOTICE, (errmsg(
"Clone node %s:%d (ID %d) successfully registered as a worker node",
cloneNode->workerName, cloneNode->workerPort, cloneNode->
nodeId)));
PG_RETURN_VOID();
}
/*
* PromoteCloneNode promotes a clone node to a primary node using PostgreSQL's
* pg_promote() function.
*
* This function performs the following steps:
* 1. Connects to the clone node
* 2. Executes pg_promote(wait := true) to promote the clone to primary
* 3. Reconnects to verify the promotion was successful
* 4. Checks if the node is still in recovery mode (which would indicate failure)
*
* The function throws an ERROR if:
* - Connection to the clone node fails
* - The pg_promote() command fails
* - The clone is still in recovery mode after promotion attempt
*
* On success, it logs a NOTICE message confirming the promotion.
*
* Note: This function assumes the clone has already been validated for promotion
* (e.g., replication lag is acceptable, clone is not synchronous, etc.)
*/
static void
PromoteCloneNode(WorkerNode *cloneWorkerNode)
{
/* Step 1: Connect to the clone node */
int connectionFlag = 0;
MultiConnection *cloneConnection = GetNodeConnection(connectionFlag,
cloneWorkerNode->workerName,
cloneWorkerNode->workerPort);
if (PQstatus(cloneConnection->pgConn) != CONNECTION_OK)
{
ReportConnectionError(cloneConnection, ERROR);
}
/* Step 2: Execute pg_promote() to promote the clone to primary */
const char *promoteQuery = "SELECT pg_promote(wait := true);";
int resultCode = SendRemoteCommand(cloneConnection, promoteQuery);
if (resultCode == 0)
{
ReportConnectionError(cloneConnection, ERROR);
}
ForgetResults(cloneConnection);
CloseConnection(cloneConnection);
/* Step 3: Reconnect and verify the promotion was successful */
if (GetNodeIsInRecoveryStatus(cloneWorkerNode))
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"Failed to promote clone %s:%d (ID %d). It is still in recovery.",
cloneWorkerNode->workerName, cloneWorkerNode->workerPort,
cloneWorkerNode->nodeId)));
}
else
{
ereport(NOTICE, (errmsg(
"Clone node %s:%d (ID %d) has been successfully promoted.",
cloneWorkerNode->workerName, cloneWorkerNode->workerPort,
cloneWorkerNode->nodeId)));
}
}
static void
BlockAllWritesToWorkerNode(WorkerNode *workerNode)
{
ereport(NOTICE, (errmsg("Blocking all writes to worker node %s:%d (ID %d)",
workerNode->workerName, workerNode->workerPort, workerNode->
nodeId)));
LockShardsInWorkerPlacementList(workerNode, AccessExclusiveLock);
}
/*
* GetNodeIsInRecoveryStatus checks if a PostgreSQL node is currently in recovery mode.
*
* This function connects to the specified worker node and executes pg_is_in_recovery()
* to determine if the node is still acting as a replica (in recovery) or has been
* promoted to a primary (not in recovery).
*
* Arguments:
* - workerNode: The WorkerNode to check recovery status for
*
* Returns:
* - true if the node is in recovery mode (acting as a replica)
* - false if the node is not in recovery mode (acting as a primary)
*
* The function will ERROR if:
* - Cannot establish connection to the node
* - The remote query fails
* - The query result cannot be parsed
*
* This is used after promoting a clone node to verify that the
* promotion was successful and the node is no longer in recovery mode.
*/
static bool
GetNodeIsInRecoveryStatus(WorkerNode *workerNode)
{
int connectionFlag = 0;
MultiConnection *nodeConnection = GetNodeConnection(connectionFlag,
workerNode->workerName,
workerNode->workerPort);
if (PQstatus(nodeConnection->pgConn) != CONNECTION_OK)
{
ReportConnectionError(nodeConnection, ERROR);
}
const char *recoveryQuery = "SELECT pg_is_in_recovery();";
int resultCode = SendRemoteCommand(nodeConnection, recoveryQuery);
if (resultCode == 0)
{
ReportConnectionError(nodeConnection, ERROR);
}
PGresult *result = GetRemoteCommandResult(nodeConnection, true);
if (!IsResponseOK(result))
{
ReportResultError(nodeConnection, result, ERROR);
}
List *recoveryStatusList = ReadFirstColumnAsText(result);
if (list_length(recoveryStatusList) != 1)
{
PQclear(result);
ClearResults(nodeConnection, true);
CloseConnection(nodeConnection);
ereport(ERROR, (errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("cannot parse recovery status result from %s:%d",
workerNode->workerName,
workerNode->workerPort)));
}
StringInfo recoveryStatusInfo = (StringInfo) linitial(recoveryStatusList);
bool isInRecovery = (strcmp(recoveryStatusInfo->data, "t") == 0 ||
strcmp(recoveryStatusInfo->data, "true") == 0);
PQclear(result);
ForgetResults(nodeConnection);
CloseConnection(nodeConnection);
return isInRecovery;
}
/*
* EnsureSingleNodePromotion ensures that only one node promotion operation
* can proceed at a time by acquiring necessary locks and checking for
* conflicting operations.
*
* This function performs the following safety checks:
* 1. Verifies no rebalance operations are currently running, as they would
* conflict with the shard redistribution that occurs during promotion
* 2. Acquires exclusive placement colocation locks on all shards residing
* on the primary node's group to prevent concurrent shard operations
*
* The locks are acquired in shard ID order to prevent deadlocks when
* multiple operations attempt to lock the same set of shards.
*
* Arguments:
* - primaryNode: The primary node whose shards need to be locked
*
* Throws ERROR if:
* - A rebalance operation is already running
* - Unable to acquire necessary locks
*/
static void
EnsureSingleNodePromotion(WorkerNode *primaryNode)
{
/* Error out if some rebalancer is running */
int64 jobId = 0;
if (HasNonTerminalJobOfType("rebalance", &jobId))
{
ereport(ERROR, (
errmsg("A rebalance operation is already running as job %ld", jobId),
errdetail("A rebalance was already scheduled as background job"),
errhint("To monitor progress, run: SELECT * FROM "
"citus_rebalance_status();")));
}
List *placementList = AllShardPlacementsOnNodeGroup(primaryNode->groupId);
/* lock shards in order of shard id to prevent deadlock */
placementList = SortList(placementList, CompareShardPlacementsByShardId);
GroupShardPlacement *placement = NULL;
foreach_declared_ptr(placement, placementList)
{
int64 shardId = placement->shardId;
ShardInterval *shardInterval = LoadShardInterval(shardId);
Oid distributedTableId = shardInterval->relationId;
AcquirePlacementColocationLock(distributedTableId, ExclusiveLock, "promote clone");
}
}

View File

@ -746,7 +746,12 @@ GetRelationIdentityOrPK(Relation rel)
if (!OidIsValid(idxoid))
{
/* Determine the index OID of the primary key (PG18 adds a second parameter) */
#if PG_VERSION_NUM >= PG_VERSION_18
idxoid = RelationGetPrimaryKeyIndex(rel, false /* deferred_ok */);
#else
idxoid = RelationGetPrimaryKeyIndex(rel);
#endif
}
return idxoid;
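
Since the same PG18 signature change is guarded in two places in this series (DeleteNodeRow earlier and GetRelationIdentityOrPK here), one hedged alternative would be to hide the #if behind a compat macro; the macro name below is hypothetical and not part of the patch. Call sites would then read `idxoid = RelationGetPrimaryKeyIndexCompat(rel);` regardless of the server version.

```c
#include "utils/relcache.h"	/* RelationGetPrimaryKeyIndex() */

/* hypothetical compat wrapper; not part of this patch */
#if PG_VERSION_NUM >= PG_VERSION_18
#define RelationGetPrimaryKeyIndexCompat(rel) \
	RelationGetPrimaryKeyIndex((rel), false)
#else
#define RelationGetPrimaryKeyIndexCompat(rel) \
	RelationGetPrimaryKeyIndex(rel)
#endif
```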

View File

@ -81,8 +81,29 @@ typedef struct RebalanceOptions
Form_pg_dist_rebalance_strategy rebalanceStrategy;
const char *operationName;
WorkerNode *workerNode;
List *involvedWorkerNodeList;
} RebalanceOptions;
typedef struct SplitPrimaryCloneShards
{
/*
* primaryShardIdList contains the IDs of the shards that
* should stay on the primary worker node.
*/
List *primaryShardIdList;
/*
* cloneShardIdList contains the IDs of the shards that should stay on the
* clone worker node.
*/
List *cloneShardIdList;
} SplitPrimaryCloneShards;
static SplitPrimaryCloneShards * GetPrimaryCloneSplitRebalanceSteps(
RebalanceOptions *options, WorkerNode *cloneNode);
/*
* RebalanceState is used to keep the internal state of the rebalance
@ -222,6 +243,7 @@ typedef struct ShardMoveDependencies
{
HTAB *colocationDependencies;
HTAB *nodeDependencies;
bool parallelTransferColocatedShards;
} ShardMoveDependencies;
char *VariablesToBePassedToNewConnections = NULL;
@ -270,7 +292,9 @@ static ShardCost GetShardCost(uint64 shardId, void *context);
static List * NonColocatedDistRelationIdList(void);
static void RebalanceTableShards(RebalanceOptions *options, Oid shardReplicationModeOid);
static int64 RebalanceTableShardsBackground(RebalanceOptions *options, Oid
shardReplicationModeOid);
shardReplicationModeOid,
bool ParallelTransferReferenceTables,
bool ParallelTransferColocatedShards);
static void AcquireRebalanceColocationLock(Oid relationId, const char *operationName);
static void ExecutePlacementUpdates(List *placementUpdateList, Oid
shardReplicationModeOid, char *noticeOperation);
@ -296,9 +320,12 @@ static HTAB * BuildShardSizesHash(ProgressMonitorData *monitor, HTAB *shardStati
static void ErrorOnConcurrentRebalance(RebalanceOptions *);
static List * GetSetCommandListForNewConnections(void);
static int64 GetColocationId(PlacementUpdateEvent *move);
static ShardMoveDependencies InitializeShardMoveDependencies();
static int64 * GenerateTaskMoveDependencyList(PlacementUpdateEvent *move, int64
colocationId,
static ShardMoveDependencies InitializeShardMoveDependencies(bool
ParallelTransferColocatedShards);
static int64 * GenerateTaskMoveDependencyList(PlacementUpdateEvent *move,
int64 colocationId,
int64 *refTablesDepTaskIds,
int refTablesDepTaskIdsCount,
ShardMoveDependencies shardMoveDependencies,
int *nDepends);
static void UpdateShardMoveDependencies(PlacementUpdateEvent *move, uint64 colocationId,
@ -318,6 +345,7 @@ PG_FUNCTION_INFO_V1(pg_dist_rebalance_strategy_enterprise_check);
PG_FUNCTION_INFO_V1(citus_rebalance_start);
PG_FUNCTION_INFO_V1(citus_rebalance_stop);
PG_FUNCTION_INFO_V1(citus_rebalance_wait);
PG_FUNCTION_INFO_V1(get_snapshot_based_node_split_plan);
bool RunningUnderCitusTestSuite = false;
int MaxRebalancerLoggedIgnoredMoves = 5;
@ -517,8 +545,17 @@ GetRebalanceSteps(RebalanceOptions *options)
.context = &context,
};
if (options->involvedWorkerNodeList == NULL)
{
/*
* If the user did not specify a list of worker nodes, we use all the
* active worker nodes.
*/
options->involvedWorkerNodeList = SortedActiveWorkers();
}
/* sort the lists to make the function more deterministic */
List *activeWorkerList = SortedActiveWorkers();
List *activeWorkerList = options->involvedWorkerNodeList;
int shardAllowedNodeCount = 0;
WorkerNode *workerNode = NULL;
foreach_declared_ptr(workerNode, activeWorkerList)
@ -981,6 +1018,7 @@ rebalance_table_shards(PG_FUNCTION_ARGS)
.excludedShardArray = PG_GETARG_ARRAYTYPE_P(3),
.drainOnly = PG_GETARG_BOOL(5),
.rebalanceStrategy = strategy,
.involvedWorkerNodeList = NULL,
.improvementThreshold = strategy->improvementThreshold,
};
Oid shardTransferModeOid = PG_GETARG_OID(4);
@ -1014,6 +1052,12 @@ citus_rebalance_start(PG_FUNCTION_ARGS)
PG_ENSURE_ARGNOTNULL(2, "shard_transfer_mode");
Oid shardTransferModeOid = PG_GETARG_OID(2);
PG_ENSURE_ARGNOTNULL(3, "parallel_transfer_reference_tables");
bool ParallelTransferReferenceTables = PG_GETARG_BOOL(3);
PG_ENSURE_ARGNOTNULL(4, "parallel_transfer_colocated_shards");
bool ParallelTransferColocatedShards = PG_GETARG_BOOL(4);
RebalanceOptions options = {
.relationIdList = relationIdList,
.threshold = strategy->defaultThreshold,
@ -1023,7 +1067,9 @@ citus_rebalance_start(PG_FUNCTION_ARGS)
.rebalanceStrategy = strategy,
.improvementThreshold = strategy->improvementThreshold,
};
int jobId = RebalanceTableShardsBackground(&options, shardTransferModeOid);
int jobId = RebalanceTableShardsBackground(&options, shardTransferModeOid,
ParallelTransferReferenceTables,
ParallelTransferColocatedShards);
if (jobId == 0)
{
@ -1988,17 +2034,20 @@ GetColocationId(PlacementUpdateEvent *move)
* given colocation group and the other one is for tracking source nodes of all moves.
*/
static ShardMoveDependencies
InitializeShardMoveDependencies()
InitializeShardMoveDependencies(bool ParallelTransferColocatedShards)
{
ShardMoveDependencies shardMoveDependencies;
shardMoveDependencies.colocationDependencies = CreateSimpleHashWithNameAndSize(int64,
ShardMoveDependencyInfo,
"colocationDependencyHashMap",
6);
shardMoveDependencies.nodeDependencies = CreateSimpleHashWithNameAndSize(int32,
ShardMoveSourceNodeHashEntry,
"nodeDependencyHashMap",
6);
shardMoveDependencies.parallelTransferColocatedShards =
ParallelTransferColocatedShards;
return shardMoveDependencies;
}
@ -2009,6 +2058,7 @@ InitializeShardMoveDependencies()
*/
static int64 *
GenerateTaskMoveDependencyList(PlacementUpdateEvent *move, int64 colocationId,
int64 *refTablesDepTaskIds, int refTablesDepTaskIdsCount,
ShardMoveDependencies shardMoveDependencies, int *nDepends)
{
HTAB *dependsList = CreateSimpleHashSetWithNameAndSize(int64,
@ -2016,13 +2066,17 @@ GenerateTaskMoveDependencyList(PlacementUpdateEvent *move, int64 colocationId,
bool found;
/* Check if there exists a move in the same colocation group scheduled earlier. */
ShardMoveDependencyInfo *shardMoveDependencyInfo = hash_search(
shardMoveDependencies.colocationDependencies, &colocationId, HASH_ENTER, &found);
if (found)
if (!shardMoveDependencies.parallelTransferColocatedShards)
{
hash_search(dependsList, &shardMoveDependencyInfo->taskId, HASH_ENTER, NULL);
/* Check if there exists a move in the same colocation group scheduled earlier. */
ShardMoveDependencyInfo *shardMoveDependencyInfo = hash_search(
shardMoveDependencies.colocationDependencies, &colocationId, HASH_ENTER,
&found);
if (found)
{
hash_search(dependsList, &shardMoveDependencyInfo->taskId, HASH_ENTER, NULL);
}
}
/*
@ -2045,6 +2099,23 @@ GenerateTaskMoveDependencyList(PlacementUpdateEvent *move, int64 colocationId,
}
}
*nDepends = hash_get_num_entries(dependsList);
if (*nDepends == 0)
{
/*
* A shard copy can only start after the reference table shards have been copied,
* so each shard move task without other dependencies depends on the task(s) that
* mark the reference table copy as complete.
*/
while (refTablesDepTaskIdsCount > 0)
{
int64 refTableTaskId = *refTablesDepTaskIds;
hash_search(dependsList, &refTableTaskId, HASH_ENTER, NULL);
refTablesDepTaskIds++;
refTablesDepTaskIdsCount--;
}
}
*nDepends = hash_get_num_entries(dependsList);
int64 *dependsArray = NULL;
@ -2076,9 +2147,13 @@ static void
UpdateShardMoveDependencies(PlacementUpdateEvent *move, uint64 colocationId, int64 taskId,
ShardMoveDependencies shardMoveDependencies)
{
ShardMoveDependencyInfo *shardMoveDependencyInfo = hash_search(
shardMoveDependencies.colocationDependencies, &colocationId, HASH_ENTER, NULL);
shardMoveDependencyInfo->taskId = taskId;
if (!shardMoveDependencies.parallelTransferColocatedShards)
{
ShardMoveDependencyInfo *shardMoveDependencyInfo = hash_search(
shardMoveDependencies.colocationDependencies, &colocationId,
HASH_ENTER, NULL);
shardMoveDependencyInfo->taskId = taskId;
}
bool found;
ShardMoveSourceNodeHashEntry *shardMoveSourceNodeHashEntry = hash_search(
@ -2103,7 +2178,9 @@ UpdateShardMoveDependencies(PlacementUpdateEvent *move, uint64 colocationId, int
* background job+task infrastructure.
*/
static int64
RebalanceTableShardsBackground(RebalanceOptions *options, Oid shardReplicationModeOid)
RebalanceTableShardsBackground(RebalanceOptions *options, Oid shardReplicationModeOid,
bool ParallelTransferReferenceTables,
bool ParallelTransferColocatedShards)
{
if (list_length(options->relationIdList) == 0)
{
@ -2174,7 +2251,8 @@ RebalanceTableShardsBackground(RebalanceOptions *options, Oid shardReplicationMo
initStringInfo(&buf);
List *referenceTableIdList = NIL;
int64 replicateRefTablesTaskId = 0;
int64 *refTablesDepTaskIds = NULL;
int refTablesDepTaskIdsCount = 0;
if (HasNodesWithMissingReferenceTables(&referenceTableIdList))
{
@ -2187,22 +2265,41 @@ RebalanceTableShardsBackground(RebalanceOptions *options, Oid shardReplicationMo
* Reference tables need to be copied to (newly-added) nodes, this needs to be the
* first task before we can move any other table.
*/
appendStringInfo(&buf,
"SELECT pg_catalog.replicate_reference_tables(%s)",
quote_literal_cstr(shardTranferModeLabel));
if (ParallelTransferReferenceTables)
{
refTablesDepTaskIds =
ScheduleTasksToParallelCopyReferenceTablesOnAllMissingNodes(
jobId, shardTransferMode, &refTablesDepTaskIdsCount);
ereport(DEBUG2,
(errmsg("%d dependent copy reference table tasks for job %ld",
refTablesDepTaskIdsCount, jobId),
errdetail("Rebalance scheduled as background job"),
errhint("To monitor progress, run: SELECT * FROM "
"citus_rebalance_status();")));
}
else
{
/* Replicate all reference tables as a single task (the classical way) */
appendStringInfo(&buf,
"SELECT pg_catalog.replicate_reference_tables(%s)",
quote_literal_cstr(shardTranferModeLabel));
int32 nodesInvolved[] = { 0 };
int32 nodesInvolved[] = { 0 };
/* replicate_reference_tables permissions require superuser */
Oid superUserId = CitusExtensionOwner();
BackgroundTask *task = ScheduleBackgroundTask(jobId, superUserId, buf.data, 0,
NULL, 0, nodesInvolved);
replicateRefTablesTaskId = task->taskid;
/* replicate_reference_tables permissions require superuser */
Oid superUserId = CitusExtensionOwner();
BackgroundTask *task = ScheduleBackgroundTask(jobId, superUserId, buf.data, 0,
NULL, 0, nodesInvolved);
refTablesDepTaskIds = palloc0(sizeof(int64));
refTablesDepTaskIds[0] = task->taskid;
refTablesDepTaskIdsCount = 1;
}
}
PlacementUpdateEvent *move = NULL;
ShardMoveDependencies shardMoveDependencies = InitializeShardMoveDependencies();
ShardMoveDependencies shardMoveDependencies =
InitializeShardMoveDependencies(ParallelTransferColocatedShards);
foreach_declared_ptr(move, placementUpdateList)
{
@ -2220,16 +2317,11 @@ RebalanceTableShardsBackground(RebalanceOptions *options, Oid shardReplicationMo
int nDepends = 0;
int64 *dependsArray = GenerateTaskMoveDependencyList(move, colocationId,
refTablesDepTaskIds,
refTablesDepTaskIdsCount,
shardMoveDependencies,
&nDepends);
if (nDepends == 0 && replicateRefTablesTaskId > 0)
{
nDepends = 1;
dependsArray = palloc(nDepends * sizeof(int64));
dependsArray[0] = replicateRefTablesTaskId;
}
int32 nodesInvolved[2] = { 0 };
nodesInvolved[0] = move->sourceNode->nodeId;
nodesInvolved[1] = move->targetNode->nodeId;
@ -3547,6 +3639,352 @@ EnsureShardCostUDF(Oid functionOid)
}
/*
* SplitShardsBetweenPrimaryAndClone splits the shards currently placed on the
* primary node between the primary and clone nodes, adding them to the
* respective lists and adjusting the shard placement metadata accordingly.
*/
void
SplitShardsBetweenPrimaryAndClone(WorkerNode *primaryNode,
WorkerNode *cloneNode,
Name strategyName)
{
CheckCitusVersion(ERROR);
List *relationIdList = NonColocatedDistRelationIdList();
Form_pg_dist_rebalance_strategy strategy = GetRebalanceStrategy(strategyName); /* We use default strategy for now */
RebalanceOptions options = {
.relationIdList = relationIdList,
.threshold = 0, /* Threshold is not strictly needed for two nodes */
.maxShardMoves = -1, /* No limit on moves between these two nodes */
.excludedShardArray = construct_empty_array(INT8OID),
.drainOnly = false, /* Not a drain operation */
.rebalanceStrategy = strategy,
.improvementThreshold = 0, /* Consider all beneficial moves */
.workerNode = primaryNode /* indicate Primary node as a source node */
};
SplitPrimaryCloneShards *splitShards = GetPrimaryCloneSplitRebalanceSteps(&options,
cloneNode);
AdjustShardsForPrimaryCloneNodeSplit(primaryNode, cloneNode,
splitShards->primaryShardIdList,
splitShards->cloneShardIdList);
}
/*
* GetPrimaryCloneSplitRebalanceSteps returns a SplitPrimaryCloneShards struct
* describing which shards should stay on the primary node and which should be
* served from the clone node.
*/
static SplitPrimaryCloneShards *
GetPrimaryCloneSplitRebalanceSteps(RebalanceOptions *options, WorkerNode *cloneNode)
{
WorkerNode *sourceNode = options->workerNode;
WorkerNode *targetNode = cloneNode;
/* Initialize rebalance plan functions and context */
EnsureShardCostUDF(options->rebalanceStrategy->shardCostFunction);
EnsureNodeCapacityUDF(options->rebalanceStrategy->nodeCapacityFunction);
EnsureShardAllowedOnNodeUDF(options->rebalanceStrategy->shardAllowedOnNodeFunction);
RebalanceContext context;
memset(&context, 0, sizeof(RebalanceContext));
fmgr_info(options->rebalanceStrategy->shardCostFunction, &context.shardCostUDF);
fmgr_info(options->rebalanceStrategy->nodeCapacityFunction, &context.nodeCapacityUDF);
fmgr_info(options->rebalanceStrategy->shardAllowedOnNodeFunction,
&context.shardAllowedOnNodeUDF);
RebalancePlanFunctions rebalancePlanFunctions = {
.shardAllowedOnNode = ShardAllowedOnNode,
.nodeCapacity = NodeCapacity,
.shardCost = GetShardCost,
.context = &context,
};
/*
* Collect all active shard placements on the source node for the given relations.
* Unlike the main rebalancer, we build a single list of all relevant source placements
* across all specified relations (or all relations if none specified).
*/
List *allSourcePlacements = NIL;
Oid relationIdItr = InvalidOid;
foreach_declared_oid(relationIdItr, options->relationIdList)
{
List *shardPlacementList = FullShardPlacementList(relationIdItr,
options->excludedShardArray);
List *activeShardPlacementsForRelation =
FilterShardPlacementList(shardPlacementList, IsActiveShardPlacement);
ShardPlacement *placement = NULL;
foreach_declared_ptr(placement, activeShardPlacementsForRelation)
{
if (placement->nodeId == sourceNode->nodeId)
{
/* Ensure we don't add duplicate shardId if it's somehow listed under multiple relations */
bool alreadyAdded = false;
ShardPlacement *existingPlacement = NULL;
foreach_declared_ptr(existingPlacement, allSourcePlacements)
{
if (existingPlacement->shardId == placement->shardId)
{
alreadyAdded = true;
break;
}
}
if (!alreadyAdded)
{
allSourcePlacements = lappend(allSourcePlacements, placement);
}
}
}
}
List *activeWorkerList = list_make2(options->workerNode, cloneNode);
SplitPrimaryCloneShards *splitShards = palloc0(sizeof(SplitPrimaryCloneShards));
splitShards->primaryShardIdList = NIL;
splitShards->cloneShardIdList = NIL;
if (list_length(allSourcePlacements) > 0)
{
/*
* Initialize RebalanceState considering only the source node's shards
* and the two active workers (source and target).
*/
RebalanceState *state = InitRebalanceState(activeWorkerList, allSourcePlacements,
&rebalancePlanFunctions);
NodeFillState *sourceFillState = NULL;
NodeFillState *targetFillState = NULL;
ListCell *fsc = NULL;
/* Identify the fill states for our specific source and target nodes */
foreach(fsc, state->fillStateListAsc) /* Could be fillStateListDesc too, order doesn't matter here */
{
NodeFillState *fs = (NodeFillState *) lfirst(fsc);
if (fs->node->nodeId == sourceNode->nodeId)
{
sourceFillState = fs;
}
else if (fs->node->nodeId == targetNode->nodeId)
{
targetFillState = fs;
}
}
if (sourceFillState != NULL && targetFillState != NULL)
{
/*
* The goal is to move roughly half the total cost from source to target.
* The target node is assumed to be empty or its existing load is not
* considered for this specific two-node balancing plan's shard distribution.
* We calculate costs based *only* on the shards currently on the source node.
*/
/*
* The core idea is to simulate the balancing process between these two nodes.
* We have all shards on sourceFillState. TargetFillState is initially empty (in terms of these specific shards).
* We want to move shards from source to target until their costs are as balanced as possible.
*/
float4 sourceCurrentCost = sourceFillState->totalCost;
float4 targetCurrentCost = 0; /* Representing cost on target from these source shards */
/* Sort shards on source node by cost (descending). This is a common heuristic. */
sourceFillState->shardCostListDesc = SortList(sourceFillState->shardCostListDesc,
CompareShardCostDesc);
List *potentialMoves = NIL;
ListCell *lc_shardcost = NULL;
/*
* Iterate through each shard on the source node. For each shard, decide whether
* moving it to the target node would improve the balance: a greedy approach that
* moves the shard whenever doing so reduces the cost difference between the nodes.
*/
foreach(lc_shardcost, sourceFillState->shardCostListDesc)
{
ShardCost *shardToConsider = (ShardCost *) lfirst(lc_shardcost);
/*
* Greedy choice: move the shard only if doing so reduces the cost difference
* between the two nodes.
* Current difference: abs(sourceCurrentCost - targetCurrentCost)
* Difference after the move:
* abs((sourceCurrentCost - cost) - (targetCurrentCost + cost))
* The shard is moved if the new difference is smaller.
*/
float4 costOfShard = shardToConsider->cost;
float4 diffBefore = fabsf(sourceCurrentCost - targetCurrentCost);
float4 diffAfter = fabsf((sourceCurrentCost - costOfShard) - (
targetCurrentCost + costOfShard));
if (diffAfter < diffBefore)
{
PlacementUpdateEvent *update = palloc0(sizeof(PlacementUpdateEvent));
update->shardId = shardToConsider->shardId;
update->sourceNode = sourceNode;
update->targetNode = targetNode;
update->updateType = PLACEMENT_UPDATE_MOVE;
potentialMoves = lappend(potentialMoves, update);
splitShards->cloneShardIdList = lappend_int(splitShards->cloneShardIdList,
shardToConsider->shardId);
/* Update simulated costs for the next iteration */
sourceCurrentCost -= costOfShard;
targetCurrentCost += costOfShard;
}
else
{
splitShards->primaryShardIdList = lappend_int(splitShards->primaryShardIdList,
shardToConsider->shardId);
}
}
}
/* RebalanceState is in memory context, will be cleaned up */
}
return splitShards;
}
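
The greedy split above can be exercised in isolation. Below is a minimal, self-contained sketch (not Citus code; the shard costs are invented for illustration) of the same diff-minimizing heuristic used to divide shards between the primary and the clone:

```c
#include <stdio.h>
#include <math.h>

/*
 * Illustrative sketch of the greedy two-node split heuristic in
 * GetPrimaryCloneSplitRebalanceSteps: keep a shard on the source unless
 * moving it to the target reduces the cost difference between the nodes.
 * The shard costs below are made up for the example.
 */
int
main(void)
{
	float shardCosts[] = { 40.0f, 30.0f, 20.0f, 10.0f, 5.0f }; /* sorted descending */
	int shardCount = sizeof(shardCosts) / sizeof(shardCosts[0]);
	float sourceCost = 0.0f;
	float targetCost = 0.0f;
	int i;

	for (i = 0; i < shardCount; i++)
	{
		sourceCost += shardCosts[i];
	}

	for (i = 0; i < shardCount; i++)
	{
		float cost = shardCosts[i];
		float diffBefore = fabsf(sourceCost - targetCost);
		float diffAfter = fabsf((sourceCost - cost) - (targetCost + cost));

		if (diffAfter < diffBefore)
		{
			/* moving this shard improves the balance: assign it to the clone */
			sourceCost -= cost;
			targetCost += cost;
			printf("shard %d (cost %.0f) -> clone\n", i, cost);
		}
		else
		{
			printf("shard %d (cost %.0f) -> primary\n", i, cost);
		}
	}

	printf("final costs: primary %.0f, clone %.0f\n", sourceCost, targetCost);
	return 0;
}
```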
/*
* get_snapshot_based_node_split_plan outputs the shard placement plan for a
* snapshot-based primary/replica node split.
*
* SQL signature:
* get_snapshot_based_node_split_plan(
* primary_node_name text,
* primary_node_port integer,
* replica_node_name text,
* replica_node_port integer,
* rebalance_strategy name DEFAULT NULL)
*
*/
Datum
get_snapshot_based_node_split_plan(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
text *primaryNodeNameText = PG_GETARG_TEXT_P(0);
int32 primaryNodePort = PG_GETARG_INT32(1);
text *cloneNodeNameText = PG_GETARG_TEXT_P(2);
int32 cloneNodePort = PG_GETARG_INT32(3);
char *primaryNodeName = text_to_cstring(primaryNodeNameText);
char *cloneNodeName = text_to_cstring(cloneNodeNameText);
WorkerNode *primaryNode = FindWorkerNodeOrError(primaryNodeName, primaryNodePort);
WorkerNode *cloneNode = FindWorkerNodeOrError(cloneNodeName, cloneNodePort);
if (!cloneNode->nodeisclone || cloneNode->nodeprimarynodeid == 0)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"Node %s:%d (ID %d) is not a valid clone or its primary node ID is not set.",
cloneNode->workerName, cloneNode->workerPort,
cloneNode->nodeId)));
}
if (primaryNode->nodeisclone)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("Primary node %s:%d (ID %d) is itself a replica.",
primaryNode->workerName, primaryNode->workerPort,
primaryNode->nodeId)));
}
/* Ensure the primary node is related to the replica node */
if (primaryNode->nodeId != cloneNode->nodeprimarynodeid)
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"Clone node %s:%d (ID %d) is not a clone of the primary node %s:%d (ID %d).",
cloneNode->workerName, cloneNode->workerPort,
cloneNode->nodeId,
primaryNode->workerName, primaryNode->workerPort,
primaryNode->nodeId)));
}
List *relationIdList = NonColocatedDistRelationIdList();
Form_pg_dist_rebalance_strategy strategy = GetRebalanceStrategy(
PG_GETARG_NAME_OR_NULL(4));
RebalanceOptions options = {
.relationIdList = relationIdList,
.threshold = 0, /* Threshold is not strictly needed for two nodes */
.maxShardMoves = -1, /* No limit on moves between these two nodes */
.excludedShardArray = construct_empty_array(INT8OID),
.drainOnly = false, /* Not a drain operation */
.rebalanceStrategy = strategy,
.improvementThreshold = 0, /* Consider all beneficial moves */
.workerNode = primaryNode /* indicate Primary node as a source node */
};
SplitPrimaryCloneShards *splitShards = GetPrimaryCloneSplitRebalanceSteps(
&options,
cloneNode);
int shardId = 0;
TupleDesc tupdesc;
Tuplestorestate *tupstore = SetupTuplestore(fcinfo, &tupdesc);
Datum values[4];
bool nulls[4];
foreach_declared_int(shardId, splitShards->primaryShardIdList)
{
ShardInterval *shardInterval = LoadShardInterval(shardId);
List *colocatedShardList = ColocatedShardIntervalList(shardInterval);
ListCell *colocatedShardCell = NULL;
foreach(colocatedShardCell, colocatedShardList)
{
ShardInterval *colocatedShard = lfirst(colocatedShardCell);
int colocatedShardId = colocatedShard->shardId;
memset(values, 0, sizeof(values));
memset(nulls, 0, sizeof(nulls));
values[0] = ObjectIdGetDatum(RelationIdForShard(colocatedShardId));
values[1] = UInt64GetDatum(colocatedShardId);
values[2] = UInt64GetDatum(ShardLength(colocatedShardId));
values[3] = PointerGetDatum(cstring_to_text("Primary Node"));
tuplestore_putvalues(tupstore, tupdesc, values, nulls);
}
}
foreach_declared_int(shardId, splitShards->cloneShardIdList)
{
ShardInterval *shardInterval = LoadShardInterval(shardId);
List *colocatedShardList = ColocatedShardIntervalList(shardInterval);
ListCell *colocatedShardCell = NULL;
foreach(colocatedShardCell, colocatedShardList)
{
ShardInterval *colocatedShard = lfirst(colocatedShardCell);
int colocatedShardId = colocatedShard->shardId;
memset(values, 0, sizeof(values));
memset(nulls, 0, sizeof(nulls));
values[0] = ObjectIdGetDatum(RelationIdForShard(colocatedShardId));
values[1] = UInt64GetDatum(colocatedShardId);
values[2] = UInt64GetDatum(ShardLength(colocatedShardId));
values[3] = PointerGetDatum(cstring_to_text("Clone Node"));
tuplestore_putvalues(tupstore, tupdesc, values, nulls);
}
}
return (Datum) 0;
}
/*
* EnsureNodeCapacityUDF checks that the UDF matching the oid has the correct
* signature to be used as a NodeCapacity function. The expected signature is:

View File

@ -1546,12 +1546,15 @@ NonBlockingShardSplit(SplitOperation splitOperation,
* 9) Logically replicate all the changes and do most of the table DDL,
* like index and foreign key creation.
*/
bool skipInterShardRelationshipCreation = false;
CompleteNonBlockingShardTransfer(sourceColocatedShardIntervalList,
sourceConnection,
publicationInfoHash,
logicalRepTargetList,
groupedLogicalRepTargetsHash,
SHARD_SPLIT);
SHARD_SPLIT,
skipInterShardRelationshipCreation);
/*
* 10) Delete old shards metadata and mark the shards as to be deferred drop.

View File

@ -107,16 +107,18 @@ static void ErrorIfSameNode(char *sourceNodeName, int sourceNodePort,
static void CopyShardTables(List *shardIntervalList, char *sourceNodeName,
int32 sourceNodePort, char *targetNodeName,
int32 targetNodePort, bool useLogicalReplication,
const char *operationName);
const char *operationName, uint32 optionFlags);
static void CopyShardTablesViaLogicalReplication(List *shardIntervalList,
char *sourceNodeName,
int32 sourceNodePort,
char *targetNodeName,
int32 targetNodePort);
int32 targetNodePort,
uint32 optionFlags);
static void CopyShardTablesViaBlockWrites(List *shardIntervalList, char *sourceNodeName,
int32 sourceNodePort,
char *targetNodeName, int32 targetNodePort);
char *targetNodeName, int32 targetNodePort,
uint32 optionFlags);
static void EnsureShardCanBeCopied(int64 shardId, const char *sourceNodeName,
int32 sourceNodePort, const char *targetNodeName,
int32 targetNodePort);
@ -165,7 +167,8 @@ static List * PostLoadShardCreationCommandList(ShardInterval *shardInterval,
static ShardCommandList * CreateShardCommandList(ShardInterval *shardInterval,
List *ddlCommandList);
static char * CreateShardCopyCommand(ShardInterval *shard, WorkerNode *targetNode);
static void AcquireShardPlacementLock(uint64_t shardId, int lockMode, Oid relationId,
const char *operationName);
/* declarations for dynamic loading */
PG_FUNCTION_INFO_V1(citus_copy_shard_placement);
@ -174,7 +177,7 @@ PG_FUNCTION_INFO_V1(master_copy_shard_placement);
PG_FUNCTION_INFO_V1(citus_move_shard_placement);
PG_FUNCTION_INFO_V1(citus_move_shard_placement_with_nodeid);
PG_FUNCTION_INFO_V1(master_move_shard_placement);
PG_FUNCTION_INFO_V1(citus_internal_copy_single_shard_placement);
double DesiredPercentFreeAfterMove = 10;
bool CheckAvailableSpaceBeforeMove = true;
@ -203,7 +206,7 @@ citus_copy_shard_placement(PG_FUNCTION_ARGS)
TransferShards(shardId, sourceNodeName, sourceNodePort,
targetNodeName, targetNodePort,
shardReplicationMode, SHARD_TRANSFER_COPY);
shardReplicationMode, SHARD_TRANSFER_COPY, 0);
PG_RETURN_VOID();
}
@ -232,7 +235,7 @@ citus_copy_shard_placement_with_nodeid(PG_FUNCTION_ARGS)
TransferShards(shardId, sourceNode->workerName, sourceNode->workerPort,
targetNode->workerName, targetNode->workerPort,
shardReplicationMode, SHARD_TRANSFER_COPY);
shardReplicationMode, SHARD_TRANSFER_COPY, 0);
PG_RETURN_VOID();
}
@ -267,13 +270,69 @@ master_copy_shard_placement(PG_FUNCTION_ARGS)
TransferShards(shardId, sourceNodeName, sourceNodePort,
targetNodeName, targetNodePort,
shardReplicationMode, SHARD_TRANSFER_COPY);
shardReplicationMode, SHARD_TRANSFER_COPY, 0);
PG_RETURN_VOID();
}
/*
* citus_internal_copy_single_shard_placement is an internal function that
* copies a single shard placement from a source node to a target node.
* It has two main differences from citus_copy_shard_placement:
* 1. it copies only a single shard placement, not all colocated shards
* 2. It allows deferring constraint creation; the same function can then
* be called again later to create the constraints.
*
* The primary use case for this function is to transfer the shards of
* reference tables. Since all reference tables are colocated together,
* and each reference table has only one shard, this function can be used
* to transfer the shards of reference tables in parallel.
* Furthermore, reference tables can have foreign key relationships with
* other reference tables, so their constraints must be created after the
* shards have been copied to the target node. For this reason, the caller
* is allowed to defer constraint creation.
*
* This function is not supposed to be called by the user directly.
*/
Datum
citus_internal_copy_single_shard_placement(PG_FUNCTION_ARGS)
{
CheckCitusVersion(ERROR);
EnsureCoordinator();
int64 shardId = PG_GETARG_INT64(0);
uint32 sourceNodeId = PG_GETARG_INT32(1);
uint32 targetNodeId = PG_GETARG_INT32(2);
uint32 flags = PG_GETARG_INT32(3);
Oid shardReplicationModeOid = PG_GETARG_OID(4);
bool missingOk = false;
WorkerNode *sourceNode = FindNodeWithNodeId(sourceNodeId, missingOk);
WorkerNode *targetNode = FindNodeWithNodeId(targetNodeId, missingOk);
char shardReplicationMode = LookupShardTransferMode(shardReplicationModeOid);
/*
* This is an internal function that is used by the rebalancer.
* It is not supposed to be called by the user directly.
*/
if (!IsRebalancerInternalBackend())
{
ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg("This is an internal Citus function that can only"
" be used by a rebalancer task")));
}
TransferShards(shardId, sourceNode->workerName, sourceNode->workerPort,
targetNode->workerName, targetNode->workerPort,
shardReplicationMode, SHARD_TRANSFER_COPY, flags);
PG_RETURN_VOID();
}
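
The comment above describes a two-phase flow: copy each reference table shard independently with relationship creation deferred, then run a relationships-only pass once every shard is in place. A minimal sketch of that flag composition is shown below; the flag names come from this diff, but their numeric values, the stand-in transfer function, and the shard ids are hypothetical, invented only to illustrate the sequencing.

```c
#include <stdio.h>
#include <stdint.h>

/* hypothetical values; the real definitions live in the Citus headers */
#define SHARD_TRANSFER_SINGLE_SHARD_ONLY          (1 << 0)
#define SHARD_TRANSFER_SKIP_CREATE_RELATIONSHIPS  (1 << 1)
#define SHARD_TRANSFER_CREATE_RELATIONSHIPS_ONLY  (1 << 2)

/* stand-in for citus_internal_copy_single_shard_placement / TransferShards */
static void
TransferReferenceShard(int64_t shardId, uint32_t optionFlags)
{
	printf("transfer shard %lld with flags 0x%x\n",
		   (long long) shardId, (unsigned int) optionFlags);
}

int
main(void)
{
	int64_t referenceShardIds[] = { 102008, 102009, 102010 }; /* made-up shard ids */
	int shardCount = sizeof(referenceShardIds) / sizeof(referenceShardIds[0]);
	int i;

	/* phase 1: copy each reference table shard on its own, deferring foreign keys */
	for (i = 0; i < shardCount; i++)
	{
		TransferReferenceShard(referenceShardIds[i],
							   SHARD_TRANSFER_SINGLE_SHARD_ONLY |
							   SHARD_TRANSFER_SKIP_CREATE_RELATIONSHIPS);
	}

	/* phase 2: once all shards are present, create the deferred relationships */
	for (i = 0; i < shardCount; i++)
	{
		TransferReferenceShard(referenceShardIds[i],
							   SHARD_TRANSFER_SINGLE_SHARD_ONLY |
							   SHARD_TRANSFER_CREATE_RELATIONSHIPS_ONLY);
	}

	return 0;
}
```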
/*
* citus_move_shard_placement moves given shard (and its co-located shards) from one
* node to the other node. To accomplish this it entirely recreates the table structure
@ -315,7 +374,7 @@ citus_move_shard_placement(PG_FUNCTION_ARGS)
char shardReplicationMode = LookupShardTransferMode(shardReplicationModeOid);
TransferShards(shardId, sourceNodeName, sourceNodePort,
targetNodeName, targetNodePort,
shardReplicationMode, SHARD_TRANSFER_MOVE);
shardReplicationMode, SHARD_TRANSFER_MOVE, 0);
PG_RETURN_VOID();
}
@ -343,20 +402,77 @@ citus_move_shard_placement_with_nodeid(PG_FUNCTION_ARGS)
char shardReplicationMode = LookupShardTransferMode(shardReplicationModeOid);
TransferShards(shardId, sourceNode->workerName,
sourceNode->workerPort, targetNode->workerName,
targetNode->workerPort, shardReplicationMode, SHARD_TRANSFER_MOVE);
targetNode->workerPort, shardReplicationMode, SHARD_TRANSFER_MOVE, 0);
PG_RETURN_VOID();
}
/*
* TransferShards is the function for shard transfers.
* AcquireShardPlacementLock tries to acquire a lock on the given shardId
* while moving/copying the shard placement. If the lock cannot be
* acquired, it fails immediately, because that means another move/copy
* of the same shard is currently in progress. */
static void
AcquireShardPlacementLock(uint64_t shardId, int lockMode, Oid relationId,
const char *operationName)
{
LOCKTAG tag;
const bool sessionLock = false;
const bool dontWait = true;
SET_LOCKTAG_SHARD_MOVE(tag, shardId);
LockAcquireResult lockAcquired = LockAcquire(&tag, lockMode, sessionLock, dontWait);
if (!lockAcquired)
{
ereport(ERROR, (errmsg("could not acquire the lock required to %s %s",
operationName,
generate_qualified_relation_name(relationId)),
errdetail("It means that either a concurrent shard move "
"or colocated distributed table creation is "
"happening."),
errhint("Make sure that the concurrent operation has "
"finished and re-run the command")));
}
}
/*
* TransferShards is responsible for handling shard transfers.
*
* The optionFlags parameter controls the transfer behavior:
*
* - By default, shard colocation groups are treated as a single unit. This works
* well for distributed tables, since they can contain multiple colocated shards
* on the same node, and shard transfers can still be parallelized at the group level.
*
* - Reference tables are different: every reference table belongs to the same
* colocation group but has only a single shard. To parallelize reference table
* transfers, we must bypass the colocation group. The
* SHARD_TRANSFER_SINGLE_SHARD_ONLY flag enables this behavior by transferring
* only the specific shardId passed into the function, ignoring colocated shards.
*
* - Reference tables may also define foreign key relationships with each other.
* Since we cannot create those relationships until all shards have been moved,
* the SHARD_TRANSFER_SKIP_CREATE_RELATIONSHIPS flag is used to defer their
* creation until shard transfer completes.
*
* - After shards are transferred, the SHARD_TRANSFER_CREATE_RELATIONSHIPS_ONLY
* flag is used to create the foreign key relationships for already-transferred
* reference tables.
*
* Currently, optionFlags is only used to customize reference table transfers.
* For distributed tables, optionFlags should always be set to 0. Passing 0
* selects the default behavior for all aspects of the shard transfer: all
* colocated shards are treated as a single unit, and the necessary
* relationships are created before returning.
*/
void
TransferShards(int64 shardId, char *sourceNodeName,
int32 sourceNodePort, char *targetNodeName,
int32 targetNodePort, char shardReplicationMode,
ShardTransferType transferType)
ShardTransferType transferType, uint32 optionFlags)
{
/* strings to be used in log messages */
const char *operationName = ShardTransferTypeNames[transferType];
@ -385,20 +501,36 @@ TransferShards(int64 shardId, char *sourceNodeName,
ErrorIfTargetNodeIsNotSafeForTransfer(targetNodeName, targetNodePort, transferType);
AcquirePlacementColocationLock(distributedTableId, ExclusiveLock, operationName);
AcquirePlacementColocationLock(distributedTableId, RowExclusiveLock, operationName);
List *colocatedTableList = ColocatedTableList(distributedTableId);
List *colocatedShardList = ColocatedShardIntervalList(shardInterval);
List *colocatedTableList;
List *colocatedShardList;
/*
* If SHARD_TRANSFER_SINGLE_SHARD_ONLY is set, we only transfer a single shard
* specified by shardId. Otherwise, we transfer all colocated shards.
*/
bool isSingleShardOnly = optionFlags & SHARD_TRANSFER_SINGLE_SHARD_ONLY;
if (isSingleShardOnly)
{
colocatedTableList = list_make1_oid(distributedTableId);
colocatedShardList = list_make1(shardInterval);
}
else
{
colocatedTableList = ColocatedTableList(distributedTableId);
colocatedShardList = ColocatedShardIntervalList(shardInterval);
}
EnsureTableListOwner(colocatedTableList);
if (transferType == SHARD_TRANSFER_MOVE)
{
/*
* Block concurrent DDL / TRUNCATE commands on the relation. Similarly,
* block concurrent citus_move_shard_placement() on any shard of
* the same relation. This is OK for now since we're executing shard
* moves sequentially anyway.
* Block concurrent DDL / TRUNCATE commands on the relation, while
* allowing concurrent citus_move_shard_placement() on the shards of
* the same relation.
*/
LockColocatedRelationsForMove(colocatedTableList);
}
@ -412,14 +544,66 @@ TransferShards(int64 shardId, char *sourceNodeName,
/*
* We sort shardIntervalList so that lock operations will not cause any
* deadlocks.
* deadlocks. We can skip the sort when the list contains only one shard.
*/
colocatedShardList = SortList(colocatedShardList, CompareShardIntervalsById);
if (!isSingleShardOnly)
{
colocatedShardList = SortList(colocatedShardList, CompareShardIntervalsById);
}
if (TransferAlreadyCompleted(colocatedShardList,
sourceNodeName, sourceNodePort,
targetNodeName, targetNodePort,
transferType))
/* We allow concurrent moves within the same colocation group, but we must
* block concurrent moves of the same shard placement. So we take per-shard
* locks before starting the transfer.
*/
foreach_declared_ptr(shardInterval, colocatedShardList)
{
int64 shardIdToLock = shardInterval->shardId;
AcquireShardPlacementLock(shardIdToLock, ExclusiveLock, distributedTableId,
operationName);
}
bool transferAlreadyCompleted = TransferAlreadyCompleted(colocatedShardList,
sourceNodeName,
sourceNodePort,
targetNodeName,
targetNodePort,
transferType);
/*
* If we only need to create the shard relationships, nothing else is required
* beyond calling CopyShardTables with the
* SHARD_TRANSFER_CREATE_RELATIONSHIPS_ONLY flag.
*/
bool createRelationshipsOnly = optionFlags & SHARD_TRANSFER_CREATE_RELATIONSHIPS_ONLY;
if (createRelationshipsOnly)
{
if (!transferAlreadyCompleted)
{
/*
* if the transfer is not completed, and we are here just to create
* the relationships, we can return right away
*/
ereport(WARNING, (errmsg("shard is not present on node %s:%d",
targetNodeName, targetNodePort),
errdetail("%s may have not completed.",
operationNameCapitalized)));
return;
}
CopyShardTables(colocatedShardList, sourceNodeName, sourceNodePort,
targetNodeName, targetNodePort,
(shardReplicationMode == TRANSFER_MODE_FORCE_LOGICAL),
operationFunctionName, optionFlags);
/* We don't need to do anything else, just return */
return;
}
if (transferAlreadyCompleted)
{
/* if the transfer is already completed, we can return right away */
ereport(WARNING, (errmsg("shard is already present on node %s:%d",
@ -515,7 +699,8 @@ TransferShards(int64 shardId, char *sourceNodeName,
}
CopyShardTables(colocatedShardList, sourceNodeName, sourceNodePort, targetNodeName,
targetNodePort, useLogicalReplication, operationFunctionName);
targetNodePort, useLogicalReplication, operationFunctionName,
optionFlags);
if (transferType == SHARD_TRANSFER_MOVE)
{
@ -574,6 +759,205 @@ TransferShards(int64 shardId, char *sourceNodeName,
}
/*
* AdjustShardsForPrimaryCloneNodeSplit is called when a primary-clone node split
* occurs. It adjusts the shard placements between the primary and clone nodes based
* on the provided shard lists. Since the clone is an exact replica of the primary
* but the metadata is not aware of this replication, this function updates the
* metadata to reflect the new shard distribution.
*
* The function handles three types of shards:
*
* 1. Shards moving to clone node (cloneShardList):
* - Updates shard placement metadata to move placements from primary to clone
* - No data movement is needed since the clone already has the data
* - Adds cleanup records to remove the shard data from primary at transaction commit
*
* 2. Shards staying on primary node (primaryShardList):
* - Metadata already correctly reflects these shards on primary
* - Adds cleanup records to remove the shard data from clone node
*
* 3. Reference tables:
* - Inserts new placement records on the clone node
* - Data is already present on clone, so only metadata update is needed
*
* This function does not perform any actual data movement; it only updates the
* shard placement metadata and schedules cleanup operations for later execution.
*/
void
AdjustShardsForPrimaryCloneNodeSplit(WorkerNode *primaryNode,
WorkerNode *cloneNode,
List *primaryShardList,
List *cloneShardList)
{
/* Input validation */
if (primaryNode == NULL || cloneNode == NULL)
{
ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("primary or clone worker node is NULL")));
}
if (primaryNode->nodeId == cloneNode->nodeId)
{
ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("primary and clone nodes must be different")));
}
ereport(NOTICE, (errmsg(
"adjusting shard placements for primary %s:%d and clone %s:%d",
primaryNode->workerName, primaryNode->workerPort,
cloneNode->workerName, cloneNode->workerPort)));
RegisterOperationNeedingCleanup();
/*
* Process shards that will stay on the primary node.
* For these shards, we need to remove their data from the clone node
* since the metadata already correctly reflects them on primary.
*/
uint64 shardId = 0;
uint32 primaryGroupId = GroupForNode(primaryNode->workerName, primaryNode->workerPort);
uint32 cloneGroupId = GroupForNode(cloneNode->workerName, cloneNode->workerPort);
ereport(NOTICE, (errmsg("processing %d shards for primary node GroupID %d",
list_length(primaryShardList), primaryGroupId)));
/*
* For each shard staying on primary, insert cleanup records to remove
* the shard data from the clone node. The metadata already correctly
* reflects these shards on primary, so no metadata changes are needed.
*/
foreach_declared_int(shardId, primaryShardList)
{
ShardInterval *shardInterval = LoadShardInterval(shardId);
List *colocatedShardList = ColocatedShardIntervalList(shardInterval);
char *qualifiedShardName = ConstructQualifiedShardName(shardInterval);
ereport(LOG, (errmsg(
"inserting DELETE shard record for shard %s from clone node GroupID %d",
qualifiedShardName, cloneGroupId)));
InsertCleanupRecordsForShardPlacementsOnNode(colocatedShardList,
cloneGroupId);
}
/*
* Process shards that will move to the clone node.
* For these shards, we need to:
* 1. Update metadata to move placements from primary to clone
* 2. Remove the shard data from primary (via cleanup records)
* 3. No data movement needed since clone already has the data
*/
ereport(NOTICE, (errmsg("processing %d shards for clone node GroupID %d", list_length(
cloneShardList), cloneGroupId)));
foreach_declared_int(shardId, cloneShardList)
{
ShardInterval *shardInterval = LoadShardInterval(shardId);
List *colocatedShardList = ColocatedShardIntervalList(shardInterval);
/*
* Create new shard placement records on the clone node for all
* colocated shards. This moves the shard placements from primary
* to clone in the metadata.
*/
foreach_declared_ptr(shardInterval, colocatedShardList)
{
uint64 colocatedShardId = shardInterval->shardId;
uint64 placementId = GetNextPlacementId();
InsertShardPlacementRow(colocatedShardId, placementId,
ShardLength(colocatedShardId),
cloneGroupId);
}
/*
* Update the metadata on worker nodes to reflect the new shard
* placement distribution between primary and clone nodes.
*/
UpdateColocatedShardPlacementMetadataOnWorkers(shardId,
primaryNode->workerName,
primaryNode->workerPort,
cloneNode->workerName,
cloneNode->workerPort);
/*
* Remove the shard placement records from primary node metadata
* since these shards are now served from the clone node.
*/
DropShardPlacementsFromMetadata(colocatedShardList,
primaryNode->workerName, primaryNode->workerPort);
char *qualifiedShardName = ConstructQualifiedShardName(shardInterval);
ereport(LOG, (errmsg(
"inserting DELETE shard record for shard %s from primary node GroupID %d",
qualifiedShardName, primaryGroupId)));
/*
* Insert cleanup records to remove the shard data from primary node
* at transaction commit. This frees up space on the primary node
* since the data is now served from the clone node.
*/
InsertCleanupRecordsForShardPlacementsOnNode(colocatedShardList,
primaryGroupId);
}
/*
* Handle reference tables - these need to be available on both
* primary and clone nodes. Since the clone already has the data,
* we just need to insert placement records for the clone node.
*/
int colocationId = GetReferenceTableColocationId();
if (colocationId == INVALID_COLOCATION_ID)
{
/* we have no reference table yet. */
return;
}
ShardInterval *shardInterval = NULL;
List *referenceTableIdList = CitusTableTypeIdList(REFERENCE_TABLE);
Oid referenceTableId = linitial_oid(referenceTableIdList);
List *shardIntervalList = LoadShardIntervalList(referenceTableId);
foreach_declared_ptr(shardInterval, shardIntervalList)
{
List *colocatedShardList = ColocatedShardIntervalList(shardInterval);
ShardInterval *colocatedShardInterval = NULL;
/*
* For each reference table shard, create placement records on the
* clone node. The data is already present on the clone, so we only
* need to update the metadata to make the clone aware of these shards.
*/
foreach_declared_ptr(colocatedShardInterval, colocatedShardList)
{
uint64 colocatedShardId = colocatedShardInterval->shardId;
/*
* Insert shard placement record for the clone node and
* propagate the metadata change to worker nodes.
*/
uint64 placementId = GetNextPlacementId();
InsertShardPlacementRow(colocatedShardId, placementId,
ShardLength(colocatedShardId),
cloneGroupId);
char *placementCommand = PlacementUpsertCommand(colocatedShardId, placementId,
0, cloneGroupId);
SendCommandToWorkersWithMetadata(placementCommand);
}
}
ereport(NOTICE, (errmsg(
"shard placement adjustment complete for primary %s:%d and clone %s:%d",
primaryNode->workerName, primaryNode->workerPort,
cloneNode->workerName, cloneNode->workerPort)));
}
/*
* Insert deferred cleanup records.
* The shards will be dropped by background cleaner later.
@ -676,7 +1060,7 @@ IsShardListOnNode(List *colocatedShardList, char *targetNodeName, uint32 targetN
/*
* LockColocatedRelationsForMove takes a list of relations, locks all of them
* using ShareUpdateExclusiveLock
* using RowExclusiveLock
*/
static void
LockColocatedRelationsForMove(List *colocatedTableList)
@ -684,7 +1068,7 @@ LockColocatedRelationsForMove(List *colocatedTableList)
Oid colocatedTableId = InvalidOid;
foreach_declared_oid(colocatedTableId, colocatedTableList)
{
LockRelationOid(colocatedTableId, ShareUpdateExclusiveLock);
LockRelationOid(colocatedTableId, RowExclusiveLock);
}
}
@ -1333,7 +1717,7 @@ ErrorIfReplicatingDistributedTableWithFKeys(List *tableIdList)
static void
CopyShardTables(List *shardIntervalList, char *sourceNodeName, int32 sourceNodePort,
char *targetNodeName, int32 targetNodePort, bool useLogicalReplication,
const char *operationName)
const char *operationName, uint32 optionFlags)
{
if (list_length(shardIntervalList) < 1)
{
@ -1343,16 +1727,22 @@ CopyShardTables(List *shardIntervalList, char *sourceNodeName, int32 sourceNodeP
/* Start operation to prepare for generating cleanup records */
RegisterOperationNeedingCleanup();
if (useLogicalReplication)
bool createRelationshipsOnly = optionFlags & SHARD_TRANSFER_CREATE_RELATIONSHIPS_ONLY;
/*
* If we are only creating the relationships, always use
* CopyShardTablesViaBlockWrites.
*/
if (useLogicalReplication && !createRelationshipsOnly)
{
CopyShardTablesViaLogicalReplication(shardIntervalList, sourceNodeName,
sourceNodePort, targetNodeName,
targetNodePort);
targetNodePort, optionFlags);
}
else
{
CopyShardTablesViaBlockWrites(shardIntervalList, sourceNodeName, sourceNodePort,
targetNodeName, targetNodePort);
targetNodeName, targetNodePort, optionFlags);
}
/*
@ -1369,7 +1759,7 @@ CopyShardTables(List *shardIntervalList, char *sourceNodeName, int32 sourceNodeP
static void
CopyShardTablesViaLogicalReplication(List *shardIntervalList, char *sourceNodeName,
int32 sourceNodePort, char *targetNodeName,
int32 targetNodePort)
int32 targetNodePort, uint32 optionFlags)
{
MemoryContext localContext = AllocSetContextCreate(CurrentMemoryContext,
"CopyShardTablesViaLogicalReplication",
@ -1407,9 +1797,13 @@ CopyShardTablesViaLogicalReplication(List *shardIntervalList, char *sourceNodeNa
MemoryContextSwitchTo(oldContext);
bool skipRelationshipCreation = (optionFlags &
SHARD_TRANSFER_SKIP_CREATE_RELATIONSHIPS);
/* data copy is done separately when logical replication is used */
LogicallyReplicateShards(shardIntervalList, sourceNodeName,
sourceNodePort, targetNodeName, targetNodePort);
sourceNodePort, targetNodeName, targetNodePort,
skipRelationshipCreation);
}
@ -1437,7 +1831,7 @@ CreateShardCommandList(ShardInterval *shardInterval, List *ddlCommandList)
static void
CopyShardTablesViaBlockWrites(List *shardIntervalList, char *sourceNodeName,
int32 sourceNodePort, char *targetNodeName,
int32 targetNodePort)
int32 targetNodePort, uint32 optionFlags)
{
MemoryContext localContext = AllocSetContextCreate(CurrentMemoryContext,
"CopyShardTablesViaBlockWrites",
@ -1446,127 +1840,150 @@ CopyShardTablesViaBlockWrites(List *shardIntervalList, char *sourceNodeName,
WorkerNode *sourceNode = FindWorkerNode(sourceNodeName, sourceNodePort);
WorkerNode *targetNode = FindWorkerNode(targetNodeName, targetNodePort);
/* iterate through the colocated shards and copy each */
ShardInterval *shardInterval = NULL;
foreach_declared_ptr(shardInterval, shardIntervalList)
{
/*
* For each shard we first create the shard table in a separate
* transaction and then we copy the data and create the indexes in a
* second separate transaction. The reason we don't do both in a single
* transaction is so we can see the size of the new shard growing
* during the copy when we run get_rebalance_progress in another
* session. If we wouldn't split these two phases up, then the table
* wouldn't be visible in the session that get_rebalance_progress uses.
* So get_rebalance_progress would always report its size as 0.
*/
List *ddlCommandList = RecreateShardDDLCommandList(shardInterval, sourceNodeName,
sourceNodePort);
char *tableOwner = TableOwner(shardInterval->relationId);
/* drop the shard we created on the target, in case of failure */
InsertCleanupRecordOutsideTransaction(CLEANUP_OBJECT_SHARD_PLACEMENT,
ConstructQualifiedShardName(shardInterval),
GroupForNode(targetNodeName,
targetNodePort),
CLEANUP_ON_FAILURE);
SendCommandListToWorkerOutsideTransaction(targetNodeName, targetNodePort,
tableOwner, ddlCommandList);
}
UpdatePlacementUpdateStatusForShardIntervalList(
shardIntervalList,
sourceNodeName,
sourceNodePort,
PLACEMENT_UPDATE_STATUS_COPYING_DATA);
ConflictWithIsolationTestingBeforeCopy();
CopyShardsToNode(sourceNode, targetNode, shardIntervalList, NULL);
ConflictWithIsolationTestingAfterCopy();
UpdatePlacementUpdateStatusForShardIntervalList(
shardIntervalList,
sourceNodeName,
sourceNodePort,
PLACEMENT_UPDATE_STATUS_CREATING_CONSTRAINTS);
foreach_declared_ptr(shardInterval, shardIntervalList)
{
List *ddlCommandList =
PostLoadShardCreationCommandList(shardInterval, sourceNodeName,
sourceNodePort);
char *tableOwner = TableOwner(shardInterval->relationId);
SendCommandListToWorkerOutsideTransaction(targetNodeName, targetNodePort,
tableOwner, ddlCommandList);
MemoryContextReset(localContext);
}
bool createRelationshipsOnly = optionFlags & SHARD_TRANSFER_CREATE_RELATIONSHIPS_ONLY;
/*
* Once all shards are copied, we can recreate relationships between shards.
* Create DDL commands to Attach child tables to their parents in a partitioning hierarchy.
* If we're only asked to create the relationships, the shards are already
* present and populated on the node. Skip the table setup and data loading
* steps and proceed straight to creating the relationships.
*/
List *shardIntervalWithDDCommandsList = NIL;
foreach_declared_ptr(shardInterval, shardIntervalList)
if (!createRelationshipsOnly)
{
if (PartitionTable(shardInterval->relationId))
/* iterate through the colocated shards and copy each */
foreach_declared_ptr(shardInterval, shardIntervalList)
{
char *attachPartitionCommand =
GenerateAttachShardPartitionCommand(shardInterval);
/*
* For each shard we first create the shard table in a separate
* transaction and then we copy the data and create the indexes in a
* second separate transaction. The reason we don't do both in a single
* transaction is so we can see the size of the new shard growing
* during the copy when we run get_rebalance_progress in another
* session. If we wouldn't split these two phases up, then the table
* wouldn't be visible in the session that get_rebalance_progress uses.
* So get_rebalance_progress would always report its size as 0.
*/
List *ddlCommandList = RecreateShardDDLCommandList(shardInterval,
sourceNodeName,
sourceNodePort);
char *tableOwner = TableOwner(shardInterval->relationId);
ShardCommandList *shardCommandList = CreateShardCommandList(
shardInterval,
list_make1(attachPartitionCommand));
shardIntervalWithDDCommandsList = lappend(shardIntervalWithDDCommandsList,
shardCommandList);
/* drop the shard we created on the target, in case of failure */
InsertCleanupRecordOutsideTransaction(CLEANUP_OBJECT_SHARD_PLACEMENT,
ConstructQualifiedShardName(
shardInterval),
GroupForNode(targetNodeName,
targetNodePort),
CLEANUP_ON_FAILURE);
SendCommandListToWorkerOutsideTransaction(targetNodeName, targetNodePort,
tableOwner, ddlCommandList);
}
UpdatePlacementUpdateStatusForShardIntervalList(
shardIntervalList,
sourceNodeName,
sourceNodePort,
PLACEMENT_UPDATE_STATUS_COPYING_DATA);
ConflictWithIsolationTestingBeforeCopy();
CopyShardsToNode(sourceNode, targetNode, shardIntervalList, NULL);
ConflictWithIsolationTestingAfterCopy();
UpdatePlacementUpdateStatusForShardIntervalList(
shardIntervalList,
sourceNodeName,
sourceNodePort,
PLACEMENT_UPDATE_STATUS_CREATING_CONSTRAINTS);
foreach_declared_ptr(shardInterval, shardIntervalList)
{
List *ddlCommandList =
PostLoadShardCreationCommandList(shardInterval, sourceNodeName,
sourceNodePort);
char *tableOwner = TableOwner(shardInterval->relationId);
SendCommandListToWorkerOutsideTransaction(targetNodeName, targetNodePort,
tableOwner, ddlCommandList);
MemoryContextReset(localContext);
}
}
UpdatePlacementUpdateStatusForShardIntervalList(
shardIntervalList,
sourceNodeName,
sourceNodePort,
PLACEMENT_UPDATE_STATUS_CREATING_FOREIGN_KEYS);
/*
* Iterate through the colocated shards and create DDL commamnds
* to create the foreign constraints.
* Skip creating shard relationships if the caller has requested that they
* not be created.
*/
foreach_declared_ptr(shardInterval, shardIntervalList)
bool skipRelationshipCreation = (optionFlags &
SHARD_TRANSFER_SKIP_CREATE_RELATIONSHIPS);
if (!skipRelationshipCreation)
{
List *shardForeignConstraintCommandList = NIL;
List *referenceTableForeignConstraintList = NIL;
/*
* Once all shards are copied, we can recreate relationships between shards.
* Create DDL commands to Attach child tables to their parents in a partitioning hierarchy.
*/
List *shardIntervalWithDDCommandsList = NIL;
foreach_declared_ptr(shardInterval, shardIntervalList)
{
if (PartitionTable(shardInterval->relationId))
{
char *attachPartitionCommand =
GenerateAttachShardPartitionCommand(shardInterval);
CopyShardForeignConstraintCommandListGrouped(shardInterval,
&shardForeignConstraintCommandList,
&referenceTableForeignConstraintList);
ShardCommandList *shardCommandList = CreateShardCommandList(
shardInterval,
list_make1(attachPartitionCommand));
shardIntervalWithDDCommandsList = lappend(shardIntervalWithDDCommandsList,
shardCommandList);
}
}
ShardCommandList *shardCommandList = CreateShardCommandList(
shardInterval,
list_concat(shardForeignConstraintCommandList,
referenceTableForeignConstraintList));
shardIntervalWithDDCommandsList = lappend(shardIntervalWithDDCommandsList,
shardCommandList);
UpdatePlacementUpdateStatusForShardIntervalList(
shardIntervalList,
sourceNodeName,
sourceNodePort,
PLACEMENT_UPDATE_STATUS_CREATING_FOREIGN_KEYS);
/*
* Iterate through the colocated shards and create DDL commands
* to create the foreign constraints.
*/
foreach_declared_ptr(shardInterval, shardIntervalList)
{
List *shardForeignConstraintCommandList = NIL;
List *referenceTableForeignConstraintList = NIL;
CopyShardForeignConstraintCommandListGrouped(shardInterval,
&shardForeignConstraintCommandList,
&referenceTableForeignConstraintList);
ShardCommandList *shardCommandList = CreateShardCommandList(
shardInterval,
list_concat(shardForeignConstraintCommandList,
referenceTableForeignConstraintList));
shardIntervalWithDDCommandsList = lappend(shardIntervalWithDDCommandsList,
shardCommandList);
}
/* Now execute the Partitioning & Foreign constraints creation commands. */
ShardCommandList *shardCommandList = NULL;
foreach_declared_ptr(shardCommandList, shardIntervalWithDDCommandsList)
{
char *tableOwner = TableOwner(shardCommandList->shardInterval->relationId);
SendCommandListToWorkerOutsideTransaction(targetNodeName, targetNodePort,
tableOwner,
shardCommandList->ddlCommandList);
}
UpdatePlacementUpdateStatusForShardIntervalList(
shardIntervalList,
sourceNodeName,
sourceNodePort,
PLACEMENT_UPDATE_STATUS_COMPLETING);
}
/* Now execute the Partitioning & Foreign constraints creation commads. */
ShardCommandList *shardCommandList = NULL;
foreach_declared_ptr(shardCommandList, shardIntervalWithDDCommandsList)
{
char *tableOwner = TableOwner(shardCommandList->shardInterval->relationId);
SendCommandListToWorkerOutsideTransaction(targetNodeName, targetNodePort,
tableOwner,
shardCommandList->ddlCommandList);
}
UpdatePlacementUpdateStatusForShardIntervalList(
shardIntervalList,
sourceNodeName,
sourceNodePort,
PLACEMENT_UPDATE_STATUS_COMPLETING);
MemoryContextReset(localContext);
MemoryContextSwitchTo(oldContext);
}
@ -1647,7 +2064,8 @@ CopyShardsToNode(WorkerNode *sourceNode, WorkerNode *targetNode, List *shardInte
ExecuteTaskListOutsideTransaction(ROW_MODIFY_NONE, copyTaskList,
MaxAdaptiveExecutorPoolSize,
NULL /* jobIdList (ignored by API implementation) */);
NULL /* jobIdList (ignored by API implementation) */
);
}
@ -2050,6 +2468,7 @@ UpdateColocatedShardPlacementMetadataOnWorkers(int64 shardId,
"SELECT citus_internal.update_placement_metadata(%ld, %d, %d)",
colocatedShard->shardId,
sourceGroupId, targetGroupId);
SendCommandToWorkersWithMetadata(updateCommand->data);
}
}

View File

@ -13,6 +13,7 @@
#include "postgres.h"
#include "executor/executor.h" /* for CreateExecutorState(), FreeExecutorState(), CreateExprContext(), etc. */
#include "utils/builtins.h"
#include "utils/lsyscache.h"

View File

@ -16,6 +16,8 @@
#include "access/heapam.h"
#include "access/htup_details.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_namespace.h"
#include "catalog/pg_operator.h"
#include "lib/stringinfo.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
@ -38,6 +40,8 @@
#include "distributed/metadata_cache.h"
#include "distributed/multi_physical_planner.h"
#include "distributed/multi_router_planner.h"
#include "distributed/query_utils.h"
#include "distributed/recursive_planning.h"
#include "distributed/shard_utils.h"
#include "distributed/stats/stat_tenants.h"
#include "distributed/version_compat.h"
@ -204,6 +208,252 @@ UpdateTaskQueryString(Query *query, Task *task)
}
/*
* CreateQualsForShardInterval creates the necessary qual conditions over the
* given attnum and rtindex for the given shard interval.
*/
Node *
CreateQualsForShardInterval(RelationShard *relationShard, int attnum, int rtindex)
{
uint64 shardId = relationShard->shardId;
Oid relationId = relationShard->relationId;
CitusTableCacheEntry *cacheEntry = GetCitusTableCacheEntry(relationId);
Var *partitionColumnVar = cacheEntry->partitionColumn;
/*
* Add constraints for the relation identified by rtindex, specifically on its column at attnum.
* Create a Var node representing this column, which will be used to compare against the bounds
* from the partition column of shard interval.
*/
Var *outerTablePartitionColumnVar = makeVar(
rtindex, attnum, partitionColumnVar->vartype,
partitionColumnVar->vartypmod,
partitionColumnVar->varcollid,
0);
bool isFirstShard = IsFirstShard(cacheEntry, shardId);
/* load the interval for the shard and create constant nodes for the upper/lower bounds */
ShardInterval *shardInterval = LoadShardInterval(shardId);
Const *constNodeLowerBound = makeConst(INT4OID, -1, InvalidOid, sizeof(int32),
shardInterval->minValue, false, true);
Const *constNodeUpperBound = makeConst(INT4OID, -1, InvalidOid, sizeof(int32),
shardInterval->maxValue, false, true);
Const *constNodeZero = makeConst(INT4OID, -1, InvalidOid, sizeof(int32),
Int32GetDatum(0), false, true);
/* create a function expression node for the hash partition column */
FuncExpr *hashFunction = makeNode(FuncExpr);
hashFunction->funcid = cacheEntry->hashFunction->fn_oid;
hashFunction->args = list_make1(outerTablePartitionColumnVar);
hashFunction->funcresulttype = get_func_rettype(cacheEntry->hashFunction->fn_oid);
hashFunction->funcretset = false;
/* create a function expression for the lower bound of the shard interval */
Oid resultTypeOid = get_func_rettype(
cacheEntry->shardIntervalCompareFunction->fn_oid);
FuncExpr *lowerBoundFuncExpr = makeNode(FuncExpr);
lowerBoundFuncExpr->funcid = cacheEntry->shardIntervalCompareFunction->fn_oid;
lowerBoundFuncExpr->args = list_make2((Node *) constNodeLowerBound,
(Node *) hashFunction);
lowerBoundFuncExpr->funcresulttype = resultTypeOid;
lowerBoundFuncExpr->funcretset = false;
Oid lessThan = GetSysCacheOid(OPERNAMENSP, Anum_pg_operator_oid, CStringGetDatum("<"),
resultTypeOid, resultTypeOid, ObjectIdGetDatum(
PG_CATALOG_NAMESPACE));
/*
* Finally, check if the comparison result is less than 0, i.e.,
* shardInterval->minValue < hash(partitionColumn)
* See SearchCachedShardInterval for the behavior at the boundaries.
*/
Expr *lowerBoundExpr = make_opclause(lessThan, BOOLOID, false,
(Expr *) lowerBoundFuncExpr,
(Expr *) constNodeZero, InvalidOid, InvalidOid);
/* create a function expression for the upper bound of the shard interval */
FuncExpr *upperBoundFuncExpr = makeNode(FuncExpr);
upperBoundFuncExpr->funcid = cacheEntry->shardIntervalCompareFunction->fn_oid;
upperBoundFuncExpr->args = list_make2((Node *) hashFunction,
(Expr *) constNodeUpperBound);
upperBoundFuncExpr->funcresulttype = resultTypeOid;
upperBoundFuncExpr->funcretset = false;
Oid lessThanOrEqualTo = GetSysCacheOid(OPERNAMENSP, Anum_pg_operator_oid,
CStringGetDatum("<="),
resultTypeOid, resultTypeOid,
ObjectIdGetDatum(PG_CATALOG_NAMESPACE));
/*
* Finally, check if the comparison result is less than or equal to 0, i.e.,
* hash(partitionColumn) <= shardInterval->maxValue
* See SearchCachedShardInterval for the behavior at the boundaries.
*/
Expr *upperBoundExpr = make_opclause(lessThanOrEqualTo, BOOLOID, false,
(Expr *) upperBoundFuncExpr,
(Expr *) constNodeZero, InvalidOid, InvalidOid);
/* create a node for both upper and lower bound */
Node *shardIntervalBoundQuals = make_and_qual((Node *) lowerBoundExpr,
(Node *) upperBoundExpr);
/*
* Add a null test for the partition column for the first shard.
* This is because we need to include the null values in exactly one of the shard queries.
* The null test is added as an OR clause to the existing AND clause.
*/
if (isFirstShard)
{
/* null test for the first shard */
NullTest *nullTest = makeNode(NullTest);
nullTest->nulltesttype = IS_NULL; /* Check for IS NULL */
nullTest->arg = (Expr *) outerTablePartitionColumnVar; /* The variable to check */
nullTest->argisrow = false;
shardIntervalBoundQuals = (Node *) make_orclause(list_make2(nullTest,
shardIntervalBoundQuals));
}
return shardIntervalBoundQuals;
}
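
The predicate built above restricts the preserved outer side to the hash range of a single shard, with NULL partition keys claimed by the first shard only. A small self-contained sketch of that row-level check is given below; the hash value and shard bounds are made up, and the strict/inclusive boundary handling simply mirrors the comments in this diff (see SearchCachedShardInterval for the authoritative boundary behavior).

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: evaluates the shape of the quals produced by
 * CreateQualsForShardInterval for one row. Citus compares the hash of the
 * distribution column against the shard interval's minValue/maxValue and
 * routes NULL partition keys to exactly one (the first) shard.
 */
static bool
RowBelongsToShard(bool valueIsNull, int32_t hashedValue,
				  int32_t minValue, int32_t maxValue, bool isFirstShard)
{
	if (valueIsNull)
	{
		/* the IS NULL disjunct is added only for the first shard */
		return isFirstShard;
	}

	/* minValue < hash(partcol) AND hash(partcol) <= maxValue */
	return (minValue < hashedValue) && (hashedValue <= maxValue);
}

int
main(void)
{
	/* example shard covering (INT32_MIN, -1073741825] */
	printf("%d\n", RowBelongsToShard(false, -2000000000, INT32_MIN, -1073741825, true));
	printf("%d\n", RowBelongsToShard(true, 0, INT32_MIN, -1073741825, true));
	return 0;
}
```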
/*
* UpdateWhereClauseToPushdownRecurringOuterJoinWalker walks over the query tree and
* updates the WHERE clause for outer joins satisfying feasibility conditions.
*/
bool
UpdateWhereClauseToPushdownRecurringOuterJoinWalker(Node *node, List *relationShardList)
{
if (node == NULL)
{
return false;
}
if (IsA(node, Query))
{
UpdateWhereClauseToPushdownRecurringOuterJoin((Query *) node, relationShardList);
return query_tree_walker((Query *) node,
UpdateWhereClauseToPushdownRecurringOuterJoinWalker,
relationShardList, QTW_EXAMINE_RTES_BEFORE);
}
if (!IsA(node, RangeTblEntry))
{
return expression_tree_walker(node,
UpdateWhereClauseToPushdownRecurringOuterJoinWalker,
relationShardList);
}
return false;
}
/*
* UpdateWhereClauseToPushdownRecurringOuterJoin
*
* Inject shard interval predicates into the query WHERE clause for certain
* outer joins to make the join semantically correct when distributed.
*
* Why this is needed:
* When an inner side of an OUTER JOIN is a distributed table that has been
* routed to a single shard, we cannot simply replace the RTE with the shard
* name and rely on implicit pruning: the preserved (outer) side could still
* produce rows whose join keys would hash to other shards. To keep results
* consistent with the global execution semantics we restrict the preserved
* (outer) side to only those partition key values that would route to the
* chosen shard (plus NULLs, which are assigned to exactly one shard).
*
* What the function does:
* 1. Iterate over the top-level jointree->fromlist.
* 2. For each JoinExpr call CanPushdownRecurringOuterJoinExtended() which:
* - Verifies shape / join type is eligible.
* - Returns:
* outerRtIndex : RT index whose column we will constrain,
* outerRte / innerRte,
* attnum : attribute number (partition column) on outer side.
* This is compared to partition column of innerRte.
* 3. Find the RelationShard for the inner distributed table (innerRte->relid)
* in relationShardList; skip if absent (no fixed shard chosen).
* 4. Build the shard qualification with CreateQualsForShardInterval():
* (minValue < hash(partcol) AND hash(partcol) <= maxValue)
* and, for the first shard only, OR (partcol IS NULL).
* The Var refers to (outerRtIndex, attnum) so the restriction applies to
* the preserved outer input.
* 5. AND the new quals into jointree->quals (creating it if NULL).
*
* The function does not return anything, it modifies the query in place.
*/
void
UpdateWhereClauseToPushdownRecurringOuterJoin(Query *query, List *relationShardList)
{
if (query == NULL)
{
return;
}
FromExpr *fromExpr = query->jointree;
if (fromExpr == NULL || fromExpr->fromlist == NIL)
{
return;
}
ListCell *fromExprCell;
foreach(fromExprCell, fromExpr->fromlist)
{
Node *fromItem = (Node *) lfirst(fromExprCell);
if (!IsA(fromItem, JoinExpr))
{
continue;
}
JoinExpr *joinExpr = (JoinExpr *) fromItem;
/*
* We will check if we need to add constraints to the WHERE clause.
*/
RangeTblEntry *innerRte = NULL;
RangeTblEntry *outerRte = NULL;
int outerRtIndex = -1;
int attnum;
if (!CanPushdownRecurringOuterJoinExtended(joinExpr, query, &outerRtIndex,
&outerRte, &innerRte, &attnum))
{
continue;
}
if (attnum == InvalidAttrNumber)
{
continue;
}
ereport(DEBUG5, (errmsg(
"Distributed table from the inner part of the outer join: %s.",
innerRte->eref->aliasname)));
RelationShard *relationShard = FindRelationShard(innerRte->relid,
relationShardList);
if (relationShard == NULL || relationShard->shardId == INVALID_SHARD_ID)
{
continue;
}
Node *shardIntervalBoundQuals = CreateQualsForShardInterval(relationShard, attnum,
outerRtIndex);
if (fromExpr->quals == NULL)
{
fromExpr->quals = (Node *) shardIntervalBoundQuals;
}
else
{
fromExpr->quals = make_and_qual(fromExpr->quals, shardIntervalBoundQuals);
}
}
}
/*
* UpdateRelationToShardNames walks over the query tree and appends shard ids to
* relations. It uses unique identity value to establish connection between a
@ -439,6 +689,27 @@ SetTaskQueryStringList(Task *task, List *queryStringList)
}
void
SetTaskQueryPlan(Task *task, Query *query, PlannedStmt *localPlan)
{
Assert(localPlan != NULL);
task->taskQuery.queryType = TASK_QUERY_LOCAL_PLAN;
task->taskQuery.data.localCompiled = (LocalCompilation *) palloc0(
sizeof(LocalCompilation));
task->taskQuery.data.localCompiled->query = query;
task->taskQuery.data.localCompiled->plan = localPlan;
task->queryCount = 1;
}
PlannedStmt *
TaskQueryLocalPlan(Task *task)
{
Assert(task->taskQuery.queryType == TASK_QUERY_LOCAL_PLAN);
return task->taskQuery.data.localCompiled->plan;
}
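
For callers inside the planner, the intended pairing of these two helpers looks roughly like the sketch below. This is not code from this change: `task`, `jobQuery`, and `localPlan` are assumed to be produced elsewhere (for example by the distributed planner and standard_planner), and the wrappers exist only to show the set/get flow.

```c
/*
 * Sketch only: attach a locally compiled plan to a task and later fetch it
 * back for execution. Valid only while the task's query type is
 * TASK_QUERY_LOCAL_PLAN.
 */
static void
AttachLocalPlanToTask(Task *task, Query *jobQuery, PlannedStmt *localPlan)
{
	/* store the plan (for execution) and the query (for deparsing/EXPLAIN) */
	SetTaskQueryPlan(task, jobQuery, localPlan);
}

static PlannedStmt *
PlanForLocalExecution(Task *task)
{
	return TaskQueryLocalPlan(task);
}
```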
/*
* DeparseTaskQuery is a general way of deparsing a query based on a task.
*/
@ -524,6 +795,26 @@ TaskQueryString(Task *task)
{
return task->taskQuery.data.queryStringLazy;
}
else if (taskQueryType == TASK_QUERY_LOCAL_PLAN)
{
Query *query = task->taskQuery.data.localCompiled->query;
Assert(query != NULL);
/*
* Use the query of the local compilation to generate the query string.
* For locally compiled tasks, the query is retained for exactly this
* purpose, e.g. when the task is run under EXPLAIN ANALYZE or when
* commands are logged. Generating the query string on the fly is
* acceptable because the plan of the local compilation is what is
* actually used for execution.
*/
MemoryContext previousContext = MemoryContextSwitchTo(GetMemoryChunkContext(
query));
UpdateRelationToShardNames((Node *) query, task->relationShardList);
MemoryContextSwitchTo(previousContext);
return AnnotateQuery(DeparseTaskQuery(task, query),
task->partitionKeyValue, task->colocationId);
}
Query *jobQueryReferenceForLazyDeparsing =
task->taskQuery.data.jobQueryReferenceForLazyDeparsing;

View File

@ -13,6 +13,7 @@
#include "postgres.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "access/htup_details.h"
#include "access/xact.h"
@ -75,17 +76,6 @@
#endif
/* RouterPlanType is used to determine the router plan to invoke */
typedef enum RouterPlanType
{
INSERT_SELECT_INTO_CITUS_TABLE,
INSERT_SELECT_INTO_LOCAL_TABLE,
DML_QUERY,
SELECT_QUERY,
MERGE_QUERY,
REPLAN_WITH_BOUND_PARAMETERS
} RouterPlanType;
static List *plannerRestrictionContextList = NIL;
int MultiTaskQueryLogLevel = CITUS_LOG_LEVEL_OFF; /* multi-task query log level */
static uint64 NextPlanId = 1;
@ -135,13 +125,15 @@ static void AdjustReadIntermediateResultsCostInternal(RelOptInfo *relOptInfo,
Const *resultFormatConst);
static List * OuterPlanParamsList(PlannerInfo *root);
static List * CopyPlanParamList(List *originalPlanParamList);
static PlannerRestrictionContext * CreateAndPushPlannerRestrictionContext(void);
static void CreateAndPushPlannerRestrictionContext(
DistributedPlanningContext *planContext,
FastPathRestrictionContext *
fastPathContext);
static PlannerRestrictionContext * CurrentPlannerRestrictionContext(void);
static void PopPlannerRestrictionContext(void);
static void ResetPlannerRestrictionContext(
PlannerRestrictionContext *plannerRestrictionContext);
static PlannedStmt * PlanFastPathDistributedStmt(DistributedPlanningContext *planContext,
Node *distributionKeyValue);
static PlannedStmt * PlanFastPathDistributedStmt(DistributedPlanningContext *planContext);
static PlannedStmt * PlanDistributedStmt(DistributedPlanningContext *planContext,
int rteIdCounter);
static RTEListProperties * GetRTEListProperties(List *rangeTableList);
@ -152,10 +144,12 @@ static RouterPlanType GetRouterPlanType(Query *query,
bool hasUnresolvedParams);
static void ConcatenateRTablesAndPerminfos(PlannedStmt *mainPlan,
PlannedStmt *concatPlan);
static bool CheckPostPlanDistribution(bool isDistributedQuery,
Query *origQuery,
List *rangeTableList,
Query *plannedQuery);
static bool CheckPostPlanDistribution(DistributedPlanningContext *planContext,
bool isDistributedQuery,
List *rangeTableList);
#if PG_VERSION_NUM >= PG_VERSION_18
static int DisableSelfJoinElimination(void);
#endif
/* Distributed planner hook */
PlannedStmt *
@ -166,7 +160,10 @@ distributed_planner(Query *parse,
{
bool needsDistributedPlanning = false;
bool fastPathRouterQuery = false;
Node *distributionKeyValue = NULL;
FastPathRestrictionContext fastPathContext = { 0 };
#if PG_VERSION_NUM >= PG_VERSION_18
int saveNestLevel = -1;
#endif
List *rangeTableList = ExtractRangeTableEntryList(parse);
@ -191,8 +188,7 @@ distributed_planner(Query *parse,
&maybeHasForeignDistributedTable);
if (needsDistributedPlanning)
{
fastPathRouterQuery = FastPathRouterQuery(parse, &distributionKeyValue);
fastPathRouterQuery = FastPathRouterQuery(parse, &fastPathContext);
if (maybeHasForeignDistributedTable)
{
WarnIfListHasForeignDistributedTable(rangeTableList);
@ -231,6 +227,10 @@ distributed_planner(Query *parse,
bool setPartitionedTablesInherited = false;
AdjustPartitioningForDistributedPlanning(rangeTableList,
setPartitionedTablesInherited);
#if PG_VERSION_NUM >= PG_VERSION_18
saveNestLevel = DisableSelfJoinElimination();
#endif
}
}
@ -247,8 +247,9 @@ distributed_planner(Query *parse,
*/
HideCitusDependentObjectsOnQueriesOfPgMetaTables((Node *) parse, NULL);
/* create a restriction context and put it at the end if context list */
planContext.plannerRestrictionContext = CreateAndPushPlannerRestrictionContext();
/* create a restriction context and put it at the end of our plan context's context list */
CreateAndPushPlannerRestrictionContext(&planContext,
&fastPathContext);
/*
* We keep track of how many times we've recursed into the planner, primarily
@ -264,7 +265,7 @@ distributed_planner(Query *parse,
{
if (fastPathRouterQuery)
{
result = PlanFastPathDistributedStmt(&planContext, distributionKeyValue);
result = PlanFastPathDistributedStmt(&planContext);
}
else
{
@ -276,10 +277,19 @@ distributed_planner(Query *parse,
planContext.plan = standard_planner(planContext.query, NULL,
planContext.cursorOptions,
planContext.boundParams);
needsDistributedPlanning = CheckPostPlanDistribution(needsDistributedPlanning,
planContext.originalQuery,
rangeTableList,
planContext.query);
#if PG_VERSION_NUM >= PG_VERSION_18
if (needsDistributedPlanning)
{
Assert(saveNestLevel > 0);
AtEOXact_GUC(true, saveNestLevel);
}
/* Pop the plan context from the current restriction context */
planContext.plannerRestrictionContext->planContext = NULL;
#endif
needsDistributedPlanning = CheckPostPlanDistribution(&planContext,
needsDistributedPlanning,
rangeTableList);
if (needsDistributedPlanning)
{
@ -649,30 +659,21 @@ IsMultiTaskPlan(DistributedPlan *distributedPlan)
* the FastPathPlanner.
*/
static PlannedStmt *
PlanFastPathDistributedStmt(DistributedPlanningContext *planContext,
Node *distributionKeyValue)
PlanFastPathDistributedStmt(DistributedPlanningContext *planContext)
{
FastPathRestrictionContext *fastPathContext =
planContext->plannerRestrictionContext->fastPathRestrictionContext;
Assert(fastPathContext != NULL);
Assert(fastPathContext->fastPathRouterQuery);
planContext->plannerRestrictionContext->fastPathRestrictionContext->
fastPathRouterQuery = true;
FastPathPreprocessParseTree(planContext->query);
if (distributionKeyValue == NULL)
if (!fastPathContext->delayFastPathPlanning)
{
/* nothing to record */
planContext->plan = FastPathPlanner(planContext->originalQuery,
planContext->query,
planContext->boundParams);
}
else if (IsA(distributionKeyValue, Const))
{
fastPathContext->distributionKeyValue = (Const *) distributionKeyValue;
}
else if (IsA(distributionKeyValue, Param))
{
fastPathContext->distributionKeyHasParam = true;
}
planContext->plan = FastPathPlanner(planContext->originalQuery, planContext->query,
planContext->boundParams);
return CreateDistributedPlannedStmt(planContext);
}
@ -803,6 +804,8 @@ CreateDistributedPlannedStmt(DistributedPlanningContext *planContext)
RaiseDeferredError(distributedPlan->planningError, ERROR);
}
CheckAndBuildDelayedFastPathPlan(planContext, distributedPlan);
/* remember the plan's identifier for identifying subplans */
distributedPlan->planId = planId;
@ -1104,7 +1107,8 @@ CreateDistributedPlan(uint64 planId, bool allowRecursivePlanning, Query *origina
* set_plan_references>add_rtes_to_flat_rtable>add_rte_to_flat_rtable.
*/
List *subPlanList = GenerateSubplansForSubqueriesAndCTEs(planId, originalQuery,
plannerRestrictionContext);
plannerRestrictionContext,
routerPlan);
/*
* If subqueries were recursively planned then we need to replan the query
@ -2034,6 +2038,32 @@ multi_relation_restriction_hook(PlannerInfo *root, RelOptInfo *relOptInfo,
lappend(relationRestrictionContext->relationRestrictionList, relationRestriction);
MemoryContextSwitchTo(oldMemoryContext);
#if PG_VERSION_NUM >= PG_VERSION_18
if (root->query_level == 1 && plannerRestrictionContext->planContext != NULL)
{
/* We're at the top query with a distributed context; see if Postgres
* has changed the query tree we passed to it in distributed_planner().
* This check was necessitated by PG commit 1e4351a, because in it the
* planner modifies a copy of the passed-in query tree, with the consequence
* that changes are not reflected back to the caller of standard_planner().
*/
Query *query = plannerRestrictionContext->planContext->query;
if (root->parse != query)
{
/*
* The Postgres planner has reconstructed the query tree, so the query
* tree our distributed context passed in (to standard_planner()) is
* updated to track the new query tree.
*/
ereport(DEBUG4, (errmsg(
"Detected query reconstruction by Postgres planner, updating "
"planContext to track it")));
plannerRestrictionContext->planContext->query = root->parse;
}
}
#endif
}
@ -2407,13 +2437,17 @@ CopyPlanParamList(List *originalPlanParamList)
/*
* CreateAndPushPlannerRestrictionContext creates a new relation restriction context
* and a new join context, inserts it to the beginning of the
* plannerRestrictionContextList. Finally, the planner restriction context is
* inserted to the beginning of the plannerRestrictionContextList and it is returned.
* CreateAndPushPlannerRestrictionContext creates a new planner restriction
* context with an empty relation restriction context, an empty join
* restriction context, and a copy of the given fast path restriction context
* (if present). Finally, the planner restriction context is inserted at the
* beginning of the global plannerRestrictionContextList and, in PG18+, given
* a reference to its distributed plan context.
*/
static PlannerRestrictionContext *
CreateAndPushPlannerRestrictionContext(void)
static void
CreateAndPushPlannerRestrictionContext(DistributedPlanningContext *planContext,
FastPathRestrictionContext *
fastPathRestrictionContext)
{
PlannerRestrictionContext *plannerRestrictionContext =
palloc0(sizeof(PlannerRestrictionContext));
@ -2427,6 +2461,21 @@ CreateAndPushPlannerRestrictionContext(void)
plannerRestrictionContext->fastPathRestrictionContext =
palloc0(sizeof(FastPathRestrictionContext));
if (fastPathRestrictionContext != NULL)
{
/* copy the given fast path restriction context */
FastPathRestrictionContext *plannersFastPathCtx =
plannerRestrictionContext->fastPathRestrictionContext;
plannersFastPathCtx->fastPathRouterQuery =
fastPathRestrictionContext->fastPathRouterQuery;
plannersFastPathCtx->distributionKeyValue =
fastPathRestrictionContext->distributionKeyValue;
plannersFastPathCtx->distributionKeyHasParam =
fastPathRestrictionContext->distributionKeyHasParam;
plannersFastPathCtx->delayFastPathPlanning =
fastPathRestrictionContext->delayFastPathPlanning;
}
plannerRestrictionContext->memoryContext = CurrentMemoryContext;
/* we'll apply logical AND as we add tables */
@ -2435,7 +2484,11 @@ CreateAndPushPlannerRestrictionContext(void)
plannerRestrictionContextList = lcons(plannerRestrictionContext,
plannerRestrictionContextList);
return plannerRestrictionContext;
planContext->plannerRestrictionContext = plannerRestrictionContext;
#if PG_VERSION_NUM >= PG_VERSION_18
plannerRestrictionContext->planContext = planContext;
#endif
}
@ -2496,6 +2549,18 @@ CurrentPlannerRestrictionContext(void)
static void
PopPlannerRestrictionContext(void)
{
#if PG_VERSION_NUM >= PG_VERSION_18
/*
* PG18+: Clear the restriction context's planContext pointer; this is done
* by distributed_planner() when popping the context, but in case of error
* during standard_planner() we want to clean up here also.
*/
PlannerRestrictionContext *plannerRestrictionContext =
(PlannerRestrictionContext *) linitial(plannerRestrictionContextList);
plannerRestrictionContext->planContext = NULL;
#endif
plannerRestrictionContextList = list_delete_first(plannerRestrictionContextList);
}
@ -2740,12 +2805,13 @@ WarnIfListHasForeignDistributedTable(List *rangeTableList)
static bool
CheckPostPlanDistribution(bool isDistributedQuery,
Query *origQuery, List *rangeTableList,
Query *plannedQuery)
CheckPostPlanDistribution(DistributedPlanningContext *planContext, bool
isDistributedQuery, List *rangeTableList)
{
if (isDistributedQuery)
{
Query *origQuery = planContext->originalQuery;
Query *plannedQuery = planContext->query;
Node *origQuals = origQuery->jointree->quals;
Node *plannedQuals = plannedQuery->jointree->quals;
@ -2764,6 +2830,23 @@ CheckPostPlanDistribution(bool isDistributedQuery,
*/
if (origQuals != NULL && plannedQuals == NULL)
{
bool planHasDistTable = ListContainsDistributedTableRTE(
planContext->plan->rtable, NULL);
/*
* If the Postgres plan has a distributed table, we know for sure that
* the query requires distributed planning.
*/
if (planHasDistTable)
{
return true;
}
/*
* Otherwise, if the query has fewer range table entries after Postgres
* planning, we should re-evaluate the distribution of the query. Postgres
* may have optimized away all Citus tables, per issues 7782 and 7783.
*/
List *rtesPostPlan = ExtractRangeTableEntryList(plannedQuery);
if (list_length(rtesPostPlan) < list_length(rangeTableList))
{
@ -2775,3 +2858,27 @@ CheckPostPlanDistribution(bool isDistributedQuery,
return isDistributedQuery;
}
#if PG_VERSION_NUM >= PG_VERSION_18
/*
* DisableSelfJoinElimination is used to prevent self join elimination
* during distributed query planning to ensure shard queries are correctly
* generated. PG18's self join elimination (fc069a3a6) changes the Query
* in a way that can cause problems for queries with a mix of Citus and
* Postgres tables. Self join elimination is allowed on Postgres tables
* only, so queries involving shards get the benefit of it.
*/
static int
DisableSelfJoinElimination(void)
{
int NestLevel = NewGUCNestLevel();
set_config_option("enable_self_join_elimination", "off",
(superuser() ? PGC_SUSET : PGC_USERSET), PGC_S_SESSION,
GUC_ACTION_LOCAL, true, 0, false);
return NestLevel;
}
#endif
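As a usage sketch (mirroring the distributed_planner() call sites earlier in this diff), the returned nest level is intended to be rolled back with AtEOXact_GUC() once distributed planning is finished:

```c
#if PG_VERSION_NUM >= PG_VERSION_18
	/* sketch of the intended pairing, as done in distributed_planner() above */
	int saveNestLevel = DisableSelfJoinElimination();

	/* ... standard_planner() / distributed planning runs here ... */

	if (saveNestLevel > 0)
	{
		/* roll back the GUC change that was made locally at this nest level */
		AtEOXact_GUC(true, saveNestLevel);
	}
#endif
```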

View File

@ -43,8 +43,10 @@
#include "pg_version_constants.h"
#include "distributed/citus_clauses.h"
#include "distributed/distributed_planner.h"
#include "distributed/insert_select_planner.h"
#include "distributed/local_executor.h"
#include "distributed/metadata_cache.h"
#include "distributed/multi_physical_planner.h" /* only to use some utility functions */
#include "distributed/multi_router_planner.h"
@ -53,6 +55,7 @@
#include "distributed/shardinterval_utils.h"
bool EnableFastPathRouterPlanner = true;
bool EnableLocalFastPathQueryOptimization = true;
static bool ColumnAppearsMultipleTimes(Node *quals, Var *distributionKey);
static bool DistKeyInSimpleOpExpression(Expr *clause, Var *distColumn,
@ -61,6 +64,24 @@ static bool ConjunctionContainsColumnFilter(Node *node,
Var *column,
Node **distributionKeyValue);
/*
* FastPathPreprocessParseTree is used to apply transformations on the parse tree
* that are expected by the Postgres planner. This is called on both delayed FastPath
* and non-delayed FastPath queries.
*/
void
FastPathPreprocessParseTree(Query *parse)
{
/*
* The Citus planner relies on some of the transformations performed by
* constant evaluation on the parse tree.
*/
parse->targetList =
(List *) eval_const_expressions(NULL, (Node *) parse->targetList);
parse->jointree->quals =
(Node *) eval_const_expressions(NULL, (Node *) parse->jointree->quals);
}
/*
* FastPathPlanner is intended to be used instead of standard_planner() for trivial
@ -73,15 +94,6 @@ static bool ConjunctionContainsColumnFilter(Node *node,
PlannedStmt *
FastPathPlanner(Query *originalQuery, Query *parse, ParamListInfo boundParams)
{
/*
* Citus planner relies on some of the transformations on constant
* evaluation on the parse tree.
*/
parse->targetList =
(List *) eval_const_expressions(NULL, (Node *) parse->targetList);
parse->jointree->quals =
(Node *) eval_const_expressions(NULL, (Node *) parse->jointree->quals);
PlannedStmt *result = GeneratePlaceHolderPlannedStmt(originalQuery);
return result;
@ -112,9 +124,9 @@ GeneratePlaceHolderPlannedStmt(Query *parse)
Plan *plan = &scanNode->plan;
#endif
Node *distKey PG_USED_FOR_ASSERTS_ONLY = NULL;
FastPathRestrictionContext fprCtxt PG_USED_FOR_ASSERTS_ONLY = { 0 };
Assert(FastPathRouterQuery(parse, &distKey));
Assert(FastPathRouterQuery(parse, &fprCtxt));
/* there is only a single relation rte */
#if PG_VERSION_NUM >= PG_VERSION_16
@ -150,27 +162,83 @@ GeneratePlaceHolderPlannedStmt(Query *parse)
}
/*
* InitializeFastPathContext - helper function to initialize a FastPath
* restriction context with the details that the FastPath code path needs.
*/
static void
InitializeFastPathContext(FastPathRestrictionContext *fastPathContext,
Node *distributionKeyValue,
bool canAvoidDeparse,
Query *query)
{
Assert(fastPathContext != NULL);
Assert(!fastPathContext->fastPathRouterQuery);
Assert(!fastPathContext->delayFastPathPlanning);
/*
* We're looking at a fast path query, so we can fill the
* fastPathContext with relevant details.
*/
fastPathContext->fastPathRouterQuery = true;
if (distributionKeyValue == NULL)
{
/* nothing to record */
}
else if (IsA(distributionKeyValue, Const))
{
fastPathContext->distributionKeyValue = (Const *) distributionKeyValue;
}
else if (IsA(distributionKeyValue, Param))
{
fastPathContext->distributionKeyHasParam = true;
}
/*
* If local execution and the fast path optimization to
* avoid deparse are enabled, and it is safe to do local
* execution..
*/
if (EnableLocalFastPathQueryOptimization &&
EnableLocalExecution &&
GetCurrentLocalExecutionStatus() != LOCAL_EXECUTION_DISABLED)
{
/*
* .. we can delay fast path planning until we know whether
* or not the shard is local. Make a final check for volatile
* functions in the query tree to determine if we should delay
* the fast path planning.
*/
fastPathContext->delayFastPathPlanning = canAvoidDeparse &&
!FindNodeMatchingCheckFunction(
(Node *) query,
CitusIsVolatileFunction);
}
}
/*
* FastPathRouterQuery gets a query and returns true if the query is eligible for
* being a fast path router query.
* being a fast path router query. It also fills the given fastPathContext with
* details about the query such as the distribution key value (if available),
* whether the distribution key is a parameter, and the range table entry for the
* table being queried.
* The requirements for the fast path query can be listed below:
*
* - SELECT/UPDATE/DELETE query without CTEs, sublinks-subqueries, set operations
* - The query should touch only a single hash distributed or reference table
* - The distribution key with the equality operator should be in the WHERE clause
* and it should be ANDed with any other filters. Also, the distribution
* key should only exists once in the WHERE clause. So basically,
* key should only exist once in the WHERE clause. So basically,
* SELECT ... FROM dist_table WHERE dist_key = X
* If the filter is a const, distributionKeyValue is set
* - All INSERT statements (including multi-row INSERTs) as long as the commands
* don't have any sublinks/CTEs etc
* -
*/
bool
FastPathRouterQuery(Query *query, Node **distributionKeyValue)
FastPathRouterQuery(Query *query, FastPathRestrictionContext *fastPathContext)
{
FromExpr *joinTree = query->jointree;
Node *quals = NULL;
if (!EnableFastPathRouterPlanner)
{
return false;
@ -201,11 +269,20 @@ FastPathRouterQuery(Query *query, Node **distributionKeyValue)
else if (query->commandType == CMD_INSERT)
{
/* we don't need to do any further checks, all INSERTs are fast-path */
InitializeFastPathContext(fastPathContext, NULL, true, query);
return true;
}
/* make sure that the only range table in FROM clause */
if (list_length(query->rtable) != 1)
int numFromRels = list_length(query->rtable);
/* make sure that there is only one range table in FROM clause */
if ((numFromRels != 1)
#if PG_VERSION_NUM >= PG_VERSION_18
/* with a PG18+ twist for the GROUP RTE - if present, make sure there are two range tables */
&& (!query->hasGroupRTE || numFromRels != 2)
#endif
)
{
return false;
}
@ -225,6 +302,10 @@ FastPathRouterQuery(Query *query, Node **distributionKeyValue)
return false;
}
bool isFastPath = false;
bool canAvoidDeparse = false;
Node *distributionKeyValue = NULL;
/*
* If the table doesn't have a distribution column, we don't need to
* check anything further.
@ -232,45 +313,62 @@ FastPathRouterQuery(Query *query, Node **distributionKeyValue)
Var *distributionKey = PartitionColumn(distributedTableId, 1);
if (!distributionKey)
{
return true;
/*
* Local execution may avoid a deparse on single shard distributed tables or
* citus local tables. We don't yet support reference tables in this code-path
* because modifications on reference tables are complicated to support here.
*/
canAvoidDeparse = IsCitusTableTypeCacheEntry(cacheEntry,
SINGLE_SHARD_DISTRIBUTED) ||
IsCitusTableTypeCacheEntry(cacheEntry, CITUS_LOCAL_TABLE);
isFastPath = true;
}
/* WHERE clause should not be empty for distributed tables */
if (joinTree == NULL ||
(IsCitusTableTypeCacheEntry(cacheEntry, DISTRIBUTED_TABLE) && joinTree->quals ==
NULL))
else
{
return false;
FromExpr *joinTree = query->jointree;
Node *quals = NULL;
canAvoidDeparse = IsCitusTableTypeCacheEntry(cacheEntry, DISTRIBUTED_TABLE);
if (joinTree == NULL ||
(joinTree->quals == NULL && canAvoidDeparse))
{
/* no quals, not a fast path query */
return false;
}
quals = joinTree->quals;
if (quals != NULL && IsA(quals, List))
{
quals = (Node *) make_ands_explicit((List *) quals);
}
/*
* Distribution column must be used in a simple equality match check and it must be
* placed at the top-level conjunction operator. In simple words, we should have
* WHERE dist_key = VALUE [AND ....];
*
* We're also not allowing any other appearances of the distribution key in the quals.
*
* Overall the logic might sound fuzzy since it involves two individual checks:
* (a) Check for top level AND operator with one side being "dist_key = const"
* (b) Only allow single appearance of "dist_key" in the quals
*
* This is to simplify both of the individual checks and omit various edge cases
* that might arise with multiple distribution keys in the quals.
*/
isFastPath = (ConjunctionContainsColumnFilter(quals, distributionKey,
&distributionKeyValue) &&
!ColumnAppearsMultipleTimes(quals, distributionKey));
}
/* convert list of expressions into expression tree for further processing */
quals = joinTree->quals;
if (quals != NULL && IsA(quals, List))
if (isFastPath)
{
quals = (Node *) make_ands_explicit((List *) quals);
InitializeFastPathContext(fastPathContext, distributionKeyValue, canAvoidDeparse,
query);
}
/*
* Distribution column must be used in a simple equality match check and it must be
* place at top level conjunction operator. In simple words, we should have
* WHERE dist_key = VALUE [AND ....];
*
* We're also not allowing any other appearances of the distribution key in the quals.
*
* Overall the logic might sound fuzzy since it involves two individual checks:
* (a) Check for top level AND operator with one side being "dist_key = const"
* (b) Only allow single appearance of "dist_key" in the quals
*
* This is to simplify both of the individual checks and omit various edge cases
* that might arise with multiple distribution keys in the quals.
*/
if (ConjunctionContainsColumnFilter(quals, distributionKey, distributionKeyValue) &&
!ColumnAppearsMultipleTimes(quals, distributionKey))
{
return true;
}
return false;
return isFastPath;
}
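For reference, a caller-side sketch of the new contract, mirroring the distributed_planner() change earlier in this diff (the stack-allocated context is zero-initialized before the call):

```c
	/* sketch of the caller-side contract, as used in distributed_planner() */
	FastPathRestrictionContext fastPathContext = { 0 };

	if (FastPathRouterQuery(parse, &fastPathContext))
	{
		/*
		 * fastPathContext now records that this is a fast path query, the
		 * distribution key value (when it is a Const), whether the key was a
		 * Param, and whether fast path planning should be delayed for the
		 * local-execution deparse optimization.
		 */
	}
```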

View File

@ -428,11 +428,10 @@ CreateInsertSelectIntoLocalTablePlan(uint64 planId, Query *insertSelectQuery,
ParamListInfo boundParams, bool hasUnresolvedParams,
PlannerRestrictionContext *plannerRestrictionContext)
{
RangeTblEntry *selectRte = ExtractSelectRangeTableEntry(insertSelectQuery);
PrepareInsertSelectForCitusPlanner(insertSelectQuery);
/* get the SELECT query (may have changed after PrepareInsertSelectForCitusPlanner) */
RangeTblEntry *selectRte = ExtractSelectRangeTableEntry(insertSelectQuery);
Query *selectQuery = selectRte->subquery;
bool allowRecursivePlanning = true;
@ -513,6 +512,13 @@ PrepareInsertSelectForCitusPlanner(Query *insertSelectQuery)
bool isWrapped = false;
/*
* PG18 is stricter about GroupRTE/GroupVar. For INSERT SELECT with a GROUP BY,
* flatten the SELECT's targetList and havingQual so Vars point to base RTEs and
* avoid "Unrecognized range table id" errors.
*/
FlattenGroupExprs(selectRte->subquery);
if (selectRte->subquery->setOperations != NULL)
{
/*
@ -766,7 +772,8 @@ DistributedInsertSelectSupported(Query *queryTree, RangeTblEntry *insertRte,
{
/* first apply toplevel pushdown checks to SELECT query */
error =
DeferErrorIfUnsupportedSubqueryPushdown(subquery, plannerRestrictionContext);
DeferErrorIfUnsupportedSubqueryPushdown(subquery, plannerRestrictionContext,
true);
if (error)
{
return error;
@ -1430,11 +1437,6 @@ static DistributedPlan *
CreateNonPushableInsertSelectPlan(uint64 planId, Query *parse, ParamListInfo boundParams)
{
Query *insertSelectQuery = copyObject(parse);
RangeTblEntry *selectRte = ExtractSelectRangeTableEntry(insertSelectQuery);
RangeTblEntry *insertRte = ExtractResultRelationRTEOrError(insertSelectQuery);
Oid targetRelationId = insertRte->relid;
DistributedPlan *distributedPlan = CitusMakeNode(DistributedPlan);
distributedPlan->modLevel = RowModifyLevelForQuery(insertSelectQuery);
@ -1449,6 +1451,7 @@ CreateNonPushableInsertSelectPlan(uint64 planId, Query *parse, ParamListInfo bou
PrepareInsertSelectForCitusPlanner(insertSelectQuery);
/* get the SELECT query (may have changed after PrepareInsertSelectForCitusPlanner) */
RangeTblEntry *selectRte = ExtractSelectRangeTableEntry(insertSelectQuery);
Query *selectQuery = selectRte->subquery;
/*
@ -1471,6 +1474,9 @@ CreateNonPushableInsertSelectPlan(uint64 planId, Query *parse, ParamListInfo bou
PlannedStmt *selectPlan = pg_plan_query(selectQueryCopy, NULL, cursorOptions,
boundParams);
/* decide whether we can repartition the results */
RangeTblEntry *insertRte = ExtractResultRelationRTEOrError(insertSelectQuery);
Oid targetRelationId = insertRte->relid;
bool repartitioned = IsRedistributablePlan(selectPlan->planTree) &&
IsSupportedRedistributionTarget(targetRelationId);

View File

@ -41,6 +41,7 @@
static int SourceResultPartitionColumnIndex(Query *mergeQuery,
List *sourceTargetList,
CitusTableCacheEntry *targetRelation);
static int FindTargetListEntryWithVarExprAttno(List *targetList, AttrNumber varattno);
static Var * ValidateAndReturnVarIfSupported(Node *entryExpr);
static DeferredErrorMessage * DeferErrorIfTargetHasFalseClause(Oid targetRelationId,
PlannerRestrictionContext *
@ -422,10 +423,13 @@ ErrorIfMergeHasUnsupportedTables(Oid targetRelationId, List *rangeTableList)
case RTE_VALUES:
case RTE_JOIN:
case RTE_CTE:
{
/* Skip them as base table(s) will be checked */
continue;
}
#if PG_VERSION_NUM >= PG_VERSION_18
case RTE_GROUP:
#endif
{
/* Skip them as base table(s) will be checked */
continue;
}
/*
* RTE_NAMEDTUPLESTORE is typically used in ephemeral named relations,
@ -628,6 +632,22 @@ MergeQualAndTargetListFunctionsSupported(Oid resultRelationId, Query *query,
}
}
/*
* joinTree->quals, retrieved by GetMergeJoinTree() - either from
* mergeJoinCondition (PG >= 17) or jointree->quals (PG < 17) -
* only contains the quals present in the "ON (..)" clause. Action
* quals that can be specified for each specific action, as in
* "WHEN <match condition> AND <action quals> THEN <action>", are
* saved into the "qual" field of the corresponding action's entry in
* mergeActionList, see
* https://github.com/postgres/postgres/blob/e6da68a6e1d60a037b63a9c9ed36e5ef0a996769/src/backend/parser/parse_merge.c#L285-L293.
*
* For this reason, even though TargetEntryChangesValue() could prove that
* an action's quals ensure the action cannot change the distribution
* key, we cannot rely on that here: we don't provide action quals to
* TargetEntryChangesValue(), only joinTree, which contains just
* the "ON (..)" clause quals.
*/
if (targetEntryDistributionColumn &&
TargetEntryChangesValue(targetEntry, distributionColumn, joinTree))
{
@ -1149,7 +1169,8 @@ DeferErrorIfRoutableMergeNotSupported(Query *query, List *rangeTableList,
{
deferredError =
DeferErrorIfUnsupportedSubqueryPushdown(query,
plannerRestrictionContext);
plannerRestrictionContext,
true);
if (deferredError)
{
ereport(DEBUG1, (errmsg("Sub-query is not pushable, try repartitioning")));
@ -1410,7 +1431,8 @@ SourceResultPartitionColumnIndex(Query *mergeQuery, List *sourceTargetList,
Assert(sourceRepartitionVar);
int sourceResultRepartitionColumnIndex =
DistributionColumnIndex(sourceTargetList, sourceRepartitionVar);
FindTargetListEntryWithVarExprAttno(sourceTargetList,
sourceRepartitionVar->varattno);
if (sourceResultRepartitionColumnIndex == -1)
{
@ -1561,6 +1583,33 @@ FetchAndValidateInsertVarIfExists(Oid targetRelationId, Query *query)
}
/*
* FindTargetListEntryWithVarExprAttno finds the index of the target
* entry whose expr is a Var that points to input varattno.
*
* If no such target entry is found, it returns -1.
*/
static int
FindTargetListEntryWithVarExprAttno(List *targetList, AttrNumber varattno)
{
int targetEntryIndex = 0;
TargetEntry *targetEntry = NULL;
foreach_declared_ptr(targetEntry, targetList)
{
if (IsA(targetEntry->expr, Var) &&
((Var *) targetEntry->expr)->varattno == varattno)
{
return targetEntryIndex;
}
targetEntryIndex++;
}
return -1;
}
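A short usage sketch mirroring the SourceResultPartitionColumnIndex() call site above (variable names follow that call site):

```c
	/* find the source target entry whose Var carries the repartition column */
	int sourceResultRepartitionColumnIndex =
		FindTargetListEntryWithVarExprAttno(sourceTargetList,
											sourceRepartitionVar->varattno);
	if (sourceResultRepartitionColumnIndex == -1)
	{
		/* the source query does not project the repartition column as a bare Var */
	}
```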
/*
* IsLocalTableModification returns true if the table modified is a Postgres table.
* We do not support recursive planning for MERGE yet, so we could have a join

View File

@ -26,6 +26,7 @@
#include "commands/tablecmds.h"
#include "executor/tstoreReceiver.h"
#include "lib/stringinfo.h"
#include "nodes/nodeFuncs.h"
#include "nodes/plannodes.h"
#include "nodes/primnodes.h"
#include "nodes/print.h"
@ -44,6 +45,11 @@
#include "utils/snapmgr.h"
#include "pg_version_constants.h"
#if PG_VERSION_NUM >= PG_VERSION_18
#include "commands/explain_dr.h" /* CreateExplainSerializeDestReceiver() */
#include "commands/explain_format.h"
#endif
#include "distributed/citus_depended_object.h"
#include "distributed/citus_nodefuncs.h"
@ -68,6 +74,7 @@
#include "distributed/placement_connection.h"
#include "distributed/recursive_planning.h"
#include "distributed/remote_commands.h"
#include "distributed/subplan_execution.h"
#include "distributed/tuple_destination.h"
#include "distributed/tuplestore.h"
#include "distributed/version_compat.h"
@ -78,6 +85,7 @@
bool ExplainDistributedQueries = true;
bool ExplainAllTasks = false;
int ExplainAnalyzeSortMethod = EXPLAIN_ANALYZE_SORT_BY_TIME;
extern MemoryContext SubPlanExplainAnalyzeContext;
/*
* If enabled, EXPLAIN ANALYZE output & other statistics of last worker task
@ -85,6 +93,11 @@ int ExplainAnalyzeSortMethod = EXPLAIN_ANALYZE_SORT_BY_TIME;
*/
static char *SavedExplainPlan = NULL;
static double SavedExecutionDurationMillisec = 0.0;
static double SavedExplainPlanNtuples = 0;
static double SavedExplainPlanNloops = 0;
extern SubPlanExplainOutputData *SubPlanExplainOutput;
uint8 TotalExplainOutputCapacity = 0;
uint8 NumTasksOutput = 0;
/* struct to save explain flags */
typedef struct
@ -134,14 +147,7 @@ typedef struct ExplainAnalyzeDestination
TupleDesc lastSavedExplainAnalyzeTupDesc;
} ExplainAnalyzeDestination;
#if PG_VERSION_NUM >= PG_VERSION_17
/*
* Various places within need to convert bytes to kilobytes. Round these up
* to the next whole kilobyte.
* copied from explain.c
*/
#define BYTES_TO_KILOBYTES(b) (((b) + 1023) / 1024)
#if PG_VERSION_NUM >= PG_VERSION_17 && PG_VERSION_NUM < PG_VERSION_18
/* copied from explain.c */
/* Instrumentation data for SERIALIZE option */
@ -153,13 +159,7 @@ typedef struct SerializeMetrics
} SerializeMetrics;
/* copied from explain.c */
static bool peek_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_memory_counters(ExplainState *es,
const MemoryContextCounters *mem_counters);
static void ExplainIndentText(ExplainState *es);
static void ExplainPrintSerialize(ExplainState *es,
SerializeMetrics *metrics);
static SerializeMetrics GetSerializationMetrics(DestReceiver *dest);
/*
@ -187,6 +187,23 @@ typedef struct SerializeDestReceiver
} SerializeDestReceiver;
#endif
#if PG_VERSION_NUM >= PG_VERSION_17
/*
* Various places within need to convert bytes to kilobytes. Round these up
* to the next whole kilobyte.
* copied from explain.c
*/
#define BYTES_TO_KILOBYTES(b) (((b) + 1023) / 1024)
/* copied from explain.c */
static bool peek_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_memory_counters(ExplainState *es,
const MemoryContextCounters *mem_counters);
static void ExplainPrintSerialize(ExplainState *es,
SerializeMetrics *metrics);
#endif
/* Explain functions for distributed queries */
static void ExplainSubPlans(DistributedPlan *distributedPlan, ExplainState *es);
@ -210,7 +227,8 @@ static const char * ExplainFormatStr(ExplainFormat format);
#if PG_VERSION_NUM >= PG_VERSION_17
static const char * ExplainSerializeStr(ExplainSerializeOption serializeOption);
#endif
static void ExplainWorkerPlan(PlannedStmt *plannedStmt, DestReceiver *dest,
static void ExplainWorkerPlan(PlannedStmt *plannedStmt, DistributedSubPlan *subPlan,
DestReceiver *dest,
ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv,
@ -219,7 +237,9 @@ static void ExplainWorkerPlan(PlannedStmt *plannedStmt, DestReceiver *dest,
const BufferUsage *bufusage,
const MemoryContextCounters *mem_counters,
#endif
double *executionDurationMillisec);
double *executionDurationMillisec,
double *executionTuples,
double *executionLoops);
static ExplainFormat ExtractFieldExplainFormat(Datum jsonbDoc, const char *fieldName,
ExplainFormat defaultValue);
#if PG_VERSION_NUM >= PG_VERSION_17
@ -251,7 +271,8 @@ static double elapsed_time(instr_time *starttime);
static void ExplainPropertyBytes(const char *qlabel, int64 bytes, ExplainState *es);
static uint64 TaskReceivedTupleData(Task *task);
static bool ShowReceivedTupleData(CitusScanState *scanState, ExplainState *es);
static bool PlanStateAnalyzeWalker(PlanState *planState, void *ctx);
static void ExtractAnalyzeStats(DistributedSubPlan *subPlan, PlanState *planState);
/* exports for SQL callable functions */
PG_FUNCTION_INFO_V1(worker_last_saved_explain_analyze);
@ -427,6 +448,84 @@ NonPushableMergeCommandExplainScan(CustomScanState *node, List *ancestors,
}
/*
* ExtractAnalyzeStats parses the EXPLAIN ANALYZE output of the pre-executed
* subplans and injects the parsed statistics into queryDesc->planstate->instrument.
*/
static void
ExtractAnalyzeStats(DistributedSubPlan *subPlan, PlanState *planState)
{
if (!planState)
{
return;
}
Instrumentation *instr = planState->instrument;
if (!IsA(planState, CustomScanState))
{
instr->ntuples = subPlan->ntuples;
instr->nloops = 1; /* subplan nodes are executed only once */
return;
}
Assert(IsA(planState, CustomScanState));
if (subPlan->numTasksOutput <= 0)
{
return;
}
ListCell *lc;
int tasksOutput = 0;
double tasksNtuples = 0;
double tasksNloops = 0;
memset(instr, 0, sizeof(Instrumentation));
DistributedPlan *newdistributedPlan =
((CitusScanState *) planState)->distributedPlan;
/*
* Inject the results of the earlier execution, extracted from the workers'
* EXPLAIN output, into the newly created tasks.
*/
foreach(lc, newdistributedPlan->workerJob->taskList)
{
Task *task = (Task *) lfirst(lc);
uint32 taskId = task->taskId;
if (tasksOutput > subPlan->numTasksOutput)
{
break;
}
if (!subPlan->totalExplainOutput[taskId].explainOutput)
{
continue;
}
/*
* Now feed the earlier saved output, which will be used
* by RemoteExplain() when printing tasks
*/
MemoryContext taskContext = GetMemoryChunkContext(task);
task->totalReceivedTupleData =
subPlan->totalExplainOutput[taskId].totalReceivedTupleData;
task->fetchedExplainAnalyzeExecutionDuration =
subPlan->totalExplainOutput[taskId].executionDuration;
task->fetchedExplainAnalyzePlan =
MemoryContextStrdup(taskContext,
subPlan->totalExplainOutput[taskId].explainOutput);
tasksNtuples += subPlan->totalExplainOutput[taskId].executionNtuples;
tasksNloops = subPlan->totalExplainOutput[taskId].executionNloops;
subPlan->totalExplainOutput[taskId].explainOutput = NULL;
tasksOutput++;
}
instr->ntuples = tasksNtuples;
instr->nloops = tasksNloops;
}
/*
* ExplainSubPlans generates EXPLAIN output for subplans for CTEs
* and complex subqueries. Because the planning for these queries
@ -445,7 +544,6 @@ ExplainSubPlans(DistributedPlan *distributedPlan, ExplainState *es)
{
DistributedSubPlan *subPlan = (DistributedSubPlan *) lfirst(subPlanCell);
PlannedStmt *plan = subPlan->plan;
IntoClause *into = NULL;
ParamListInfo params = NULL;
/*
@ -529,6 +627,12 @@ ExplainSubPlans(DistributedPlan *distributedPlan, ExplainState *es)
ExplainOpenGroup("PlannedStmt", "PlannedStmt", false, es);
DestReceiver *dest = None_Receiver; /* No query execution */
double executionDurationMillisec = 0.0;
double executionTuples = 0;
double executionLoops = 0;
/* Capture memory stats on PG17+ */
#if PG_VERSION_NUM >= PG_VERSION_17
if (es->memory)
{
@ -536,12 +640,20 @@ ExplainSubPlans(DistributedPlan *distributedPlan, ExplainState *es)
MemoryContextMemConsumed(planner_ctx, &mem_counters);
}
ExplainOnePlan(plan, into, es, queryString, params, NULL, &planduration,
(es->buffers ? &bufusage : NULL),
(es->memory ? &mem_counters : NULL));
/* Execute EXPLAIN without ANALYZE */
ExplainWorkerPlan(plan, subPlan, dest, es, queryString, params, NULL,
&planduration,
(es->buffers ? &bufusage : NULL),
(es->memory ? &mem_counters : NULL),
&executionDurationMillisec,
&executionTuples,
&executionLoops);
#else
ExplainOnePlan(plan, into, es, queryString, params, NULL, &planduration,
(es->buffers ? &bufusage : NULL));
/* Execute EXPLAIN without ANALYZE */
ExplainWorkerPlan(plan, subPlan, dest, es, queryString, params, NULL,
&planduration, &executionDurationMillisec,
&executionTuples, &executionLoops);
#endif
ExplainCloseGroup("PlannedStmt", "PlannedStmt", false, es);
@ -1212,17 +1324,19 @@ worker_last_saved_explain_analyze(PG_FUNCTION_ARGS)
if (SavedExplainPlan != NULL)
{
int columnCount = tupleDescriptor->natts;
if (columnCount != 2)
if (columnCount != 4)
{
ereport(ERROR, (errmsg("expected 3 output columns in definition of "
ereport(ERROR, (errmsg("expected 4 output columns in definition of "
"worker_last_saved_explain_analyze, but got %d",
columnCount)));
}
bool columnNulls[2] = { false };
Datum columnValues[2] = {
bool columnNulls[4] = { false };
Datum columnValues[4] = {
CStringGetTextDatum(SavedExplainPlan),
Float8GetDatum(SavedExecutionDurationMillisec)
Float8GetDatum(SavedExecutionDurationMillisec),
Float8GetDatum(SavedExplainPlanNtuples),
Float8GetDatum(SavedExplainPlanNloops)
};
tuplestore_putvalues(tupleStore, tupleDescriptor, columnValues, columnNulls);
@ -1243,6 +1357,8 @@ worker_save_query_explain_analyze(PG_FUNCTION_ARGS)
text *queryText = PG_GETARG_TEXT_P(0);
char *queryString = text_to_cstring(queryText);
double executionDurationMillisec = 0.0;
double executionTuples = 0;
double executionLoops = 0;
Datum explainOptions = PG_GETARG_DATUM(1);
ExplainState *es = NewExplainState();
@ -1359,16 +1475,19 @@ worker_save_query_explain_analyze(PG_FUNCTION_ARGS)
}
/* do the actual EXPLAIN ANALYZE */
ExplainWorkerPlan(plan, tupleStoreDest, es, queryString, boundParams, NULL,
ExplainWorkerPlan(plan, NULL, tupleStoreDest, es, queryString, boundParams, NULL,
&planDuration,
(es->buffers ? &bufusage : NULL),
(es->memory ? &mem_counters : NULL),
&executionDurationMillisec);
&executionDurationMillisec,
&executionTuples,
&executionLoops);
#else
/* do the actual EXPLAIN ANALYZE */
ExplainWorkerPlan(plan, tupleStoreDest, es, queryString, boundParams, NULL,
&planDuration, &executionDurationMillisec);
ExplainWorkerPlan(plan, NULL, tupleStoreDest, es, queryString, boundParams, NULL,
&planDuration, &executionDurationMillisec,
&executionTuples, &executionLoops);
#endif
ExplainEndOutput(es);
@ -1379,6 +1498,8 @@ worker_save_query_explain_analyze(PG_FUNCTION_ARGS)
SavedExplainPlan = pstrdup(es->str->data);
SavedExecutionDurationMillisec = executionDurationMillisec;
SavedExplainPlanNtuples = executionTuples;
SavedExplainPlanNloops = executionLoops;
MemoryContextSwitchTo(oldContext);
@ -1558,22 +1679,40 @@ CitusExplainOneQuery(Query *query, int cursorOptions, IntoClause *into,
BufferUsageAccumDiff(&bufusage, &pgBufferUsage, &bufusage_start);
}
/* capture memory stats on PG17+ */
#if PG_VERSION_NUM >= PG_VERSION_17
if (es->memory)
{
MemoryContextSwitchTo(saved_ctx);
MemoryContextMemConsumed(planner_ctx, &mem_counters);
}
#endif
/* run it (if needed) and produce output */
ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
(es->memory ? &mem_counters : NULL));
#if PG_VERSION_NUM >= PG_VERSION_17
/* PostgreSQL 17 signature (9 args: includes mem_counters) */
ExplainOnePlan(
plan,
into,
es,
queryString,
params,
queryEnv,
&planduration,
(es->buffers ? &bufusage : NULL),
(es->memory ? &mem_counters : NULL)
);
#else
/* run it (if needed) and produce output */
ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
ExplainOnePlan(
plan,
into,
es,
queryString,
params,
queryEnv,
&planduration,
(es->buffers ? &bufusage : NULL)
);
#endif
}
@ -1590,11 +1729,13 @@ CreateExplainAnlyzeDestination(Task *task, TupleDestination *taskDest)
tupleDestination->originalTask = task;
tupleDestination->originalTaskDestination = taskDest;
TupleDesc lastSavedExplainAnalyzeTupDesc = CreateTemplateTupleDesc(2);
TupleDesc lastSavedExplainAnalyzeTupDesc = CreateTemplateTupleDesc(4);
TupleDescInitEntry(lastSavedExplainAnalyzeTupDesc, 1, "explain analyze", TEXTOID, 0,
0);
TupleDescInitEntry(lastSavedExplainAnalyzeTupDesc, 2, "duration", FLOAT8OID, 0, 0);
TupleDescInitEntry(lastSavedExplainAnalyzeTupDesc, 3, "ntuples", FLOAT8OID, 0, 0);
TupleDescInitEntry(lastSavedExplainAnalyzeTupDesc, 4, "nloops", FLOAT8OID, 0, 0);
tupleDestination->lastSavedExplainAnalyzeTupDesc = lastSavedExplainAnalyzeTupDesc;
@ -1605,6 +1746,51 @@ CreateExplainAnlyzeDestination(Task *task, TupleDestination *taskDest)
}
/*
* EnsureExplainOutputCapacity ensures there is capacity for new entries. The
* input parameter requiredSize is the minimum number of elements needed.
*/
static void
EnsureExplainOutputCapacity(int requiredSize)
{
if (requiredSize < TotalExplainOutputCapacity)
{
return;
}
int newCapacity =
(TotalExplainOutputCapacity == 0) ? 32 : TotalExplainOutputCapacity * 2;
while (newCapacity <= requiredSize)
{
newCapacity *= 2;
}
if (SubPlanExplainOutput == NULL)
{
SubPlanExplainOutput =
(SubPlanExplainOutputData *) MemoryContextAllocZero(
SubPlanExplainAnalyzeContext,
newCapacity *
sizeof(SubPlanExplainOutputData));
}
else
{
/* Use repalloc and manually zero the new memory */
int oldSize = TotalExplainOutputCapacity * sizeof(SubPlanExplainOutputData);
int newSize = newCapacity * sizeof(SubPlanExplainOutputData);
SubPlanExplainOutput =
(SubPlanExplainOutputData *) repalloc(SubPlanExplainOutput, newSize);
/* Zero out the newly allocated memory */
MemSet((char *) SubPlanExplainOutput + oldSize, 0, newSize - oldSize);
}
TotalExplainOutputCapacity = newCapacity;
}
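A brief usage sketch, matching the destination callback later in this diff: callers grow the array before writing a task's slot, so task ids beyond the current capacity are safe:

```c
	/* sketch: ensure the slot for taskId exists, then fill it */
	EnsureExplainOutputCapacity(taskId + 1);
	SubPlanExplainOutput[taskId].executionDuration = fetchedExplainAnalyzeExecutionDuration;
	SubPlanExplainOutput[taskId].explainOutput =
		MemoryContextStrdup(SubPlanExplainAnalyzeContext, fetchedExplainAnalyzePlan);
```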
/*
* ExplainAnalyzeDestPutTuple implements TupleDestination->putTuple
* for ExplainAnalyzeDestination.
@ -1614,6 +1800,8 @@ ExplainAnalyzeDestPutTuple(TupleDestination *self, Task *task,
int placementIndex, int queryNumber,
HeapTuple heapTuple, uint64 tupleLibpqSize)
{
uint32 taskId = task->taskId;
ExplainAnalyzeDestination *tupleDestination = (ExplainAnalyzeDestination *) self;
if (queryNumber == 0)
{
@ -1621,6 +1809,13 @@ ExplainAnalyzeDestPutTuple(TupleDestination *self, Task *task,
originalTupDest->putTuple(originalTupDest, task, placementIndex, 0, heapTuple,
tupleLibpqSize);
tupleDestination->originalTask->totalReceivedTupleData += tupleLibpqSize;
if (SubPlanExplainAnalyzeContext)
{
EnsureExplainOutputCapacity(taskId + 1);
SubPlanExplainOutput[taskId].totalReceivedTupleData =
tupleDestination->originalTask->totalReceivedTupleData;
}
}
else if (queryNumber == 1)
{
@ -1636,6 +1831,8 @@ ExplainAnalyzeDestPutTuple(TupleDestination *self, Task *task,
}
Datum executionDuration = heap_getattr(heapTuple, 2, tupDesc, &isNull);
Datum executionTuples = heap_getattr(heapTuple, 3, tupDesc, &isNull);
Datum executionLoops = heap_getattr(heapTuple, 4, tupDesc, &isNull);
if (isNull)
{
@ -1645,6 +1842,8 @@ ExplainAnalyzeDestPutTuple(TupleDestination *self, Task *task,
char *fetchedExplainAnalyzePlan = TextDatumGetCString(explainAnalyze);
double fetchedExplainAnalyzeExecutionDuration = DatumGetFloat8(executionDuration);
double fetchedExplainAnalyzeTuples = DatumGetFloat8(executionTuples);
double fetchedExplainAnalyzeLoops = DatumGetFloat8(executionLoops);
/*
* Allocate fetchedExplainAnalyzePlan in the same context as the Task, since we are
@ -1670,6 +1869,20 @@ ExplainAnalyzeDestPutTuple(TupleDestination *self, Task *task,
placementIndex;
tupleDestination->originalTask->fetchedExplainAnalyzeExecutionDuration =
fetchedExplainAnalyzeExecutionDuration;
/* We should build tupleDestination in subPlan similar to the above */
if (SubPlanExplainAnalyzeContext)
{
EnsureExplainOutputCapacity(taskId + 1);
SubPlanExplainOutput[taskId].explainOutput =
MemoryContextStrdup(SubPlanExplainAnalyzeContext,
fetchedExplainAnalyzePlan);
SubPlanExplainOutput[taskId].executionDuration =
fetchedExplainAnalyzeExecutionDuration;
SubPlanExplainOutput[taskId].executionNtuples = fetchedExplainAnalyzeTuples;
SubPlanExplainOutput[taskId].executionNloops = fetchedExplainAnalyzeLoops;
NumTasksOutput++;
}
}
else
{
@ -1732,7 +1945,14 @@ ExplainAnalyzeDestTupleDescForQuery(TupleDestination *self, int queryNumber)
bool
RequestedForExplainAnalyze(CitusScanState *node)
{
return (node->customScanState.ss.ps.state->es_instrument != 0);
/*
* When running a distributed plan (either the root plan or a subplan's
* distributed fragment) we need to know if we're under EXPLAIN ANALYZE.
* Subplans can't receive the EXPLAIN ANALYZE flag directly, so we use
* SubPlanExplainAnalyzeContext as a flag to indicate that context.
*/
return (node->customScanState.ss.ps.state->es_instrument != 0) ||
(SubPlanLevel > 0 && SubPlanExplainAnalyzeContext);
}
@ -1805,7 +2025,7 @@ WrapQueryForExplainAnalyze(const char *queryString, TupleDesc tupleDesc,
appendStringInfoString(columnDef, ", ");
}
Form_pg_attribute attr = &tupleDesc->attrs[columnIndex];
Form_pg_attribute attr = TupleDescAttr(tupleDesc, columnIndex);
char *attrType = format_type_extended(attr->atttypid, attr->atttypmod,
FORMAT_TYPE_TYPEMOD_GIVEN |
FORMAT_TYPE_FORCE_QUALIFY);
@ -1891,7 +2111,8 @@ FetchPlanQueryForExplainAnalyze(const char *queryString, ParamListInfo params)
}
appendStringInfoString(fetchQuery,
"SELECT explain_analyze_output, execution_duration "
"SELECT explain_analyze_output, execution_duration, "
"execution_ntuples, execution_nloops "
"FROM worker_last_saved_explain_analyze()");
return fetchQuery->data;
@ -2026,25 +2247,57 @@ ExplainOneQuery(Query *query, int cursorOptions,
BufferUsageAccumDiff(&bufusage, &pgBufferUsage, &bufusage_start);
}
/* 1) Capture memory counters on PG17+ only once: */
#if PG_VERSION_NUM >= PG_VERSION_17
if (es->memory)
{
MemoryContextSwitchTo(saved_ctx);
MemoryContextMemConsumed(planner_ctx, &mem_counters);
}
/* run it (if needed) and produce output */
ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
(es->memory ? &mem_counters : NULL));
#endif
#if PG_VERSION_NUM >= PG_VERSION_17
ExplainOnePlan(
plan,
into,
es,
queryString,
params,
queryEnv,
&planduration,
(es->buffers ? &bufusage : NULL),
(es->memory ? &mem_counters: NULL)
);
#else
/* run it (if needed) and produce output */
ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL));
ExplainOnePlan(
plan,
into,
es,
queryString,
params,
queryEnv,
&planduration,
(es->buffers ? &bufusage : NULL)
);
#endif
}
}
/*
* PlanStateAnalyzeWalker is a tree walker callback that visits each PlanState node in the
* plan tree and extracts analyze statistics from CustomScanState tasks using
* ExtractAnalyzeStats. Always returns false to recurse into all children.
*/
static bool
PlanStateAnalyzeWalker(PlanState *planState, void *ctx)
{
DistributedSubPlan *subplan = (DistributedSubPlan *) ctx;
ExtractAnalyzeStats(subplan, planState);
return false;
}
/*
* ExplainWorkerPlan produces explain output into es. If es->analyze, it also executes
* the given plannedStmt and sends the results to dest. It puts total time to execute in
@ -2059,20 +2312,25 @@ ExplainOneQuery(Query *query, int cursorOptions,
* destination.
*/
static void
ExplainWorkerPlan(PlannedStmt *plannedstmt, DestReceiver *dest, ExplainState *es,
ExplainWorkerPlan(PlannedStmt *plannedstmt, DistributedSubPlan *subPlan, DestReceiver *dest, ExplainState *es,
const char *queryString, ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
#if PG_VERSION_NUM >= PG_VERSION_17
const BufferUsage *bufusage,
const MemoryContextCounters *mem_counters,
#endif
double *executionDurationMillisec)
double *executionDurationMillisec,
double *executionTuples,
double *executionLoops)
{
QueryDesc *queryDesc;
instr_time starttime;
double totaltime = 0;
int eflags;
int instrument_option = 0;
/* Sub-plan already executed; skipping execution */
bool executeQuery = (es->analyze && !subPlan);
bool executeSubplan = (es->analyze && subPlan);
Assert(plannedstmt->commandType != CMD_UTILITY);
@ -2102,12 +2360,19 @@ ExplainWorkerPlan(PlannedStmt *plannedstmt, DestReceiver *dest, ExplainState *es
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc for the query */
queryDesc = CreateQueryDesc(plannedstmt, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
queryDesc = CreateQueryDesc(
plannedstmt, /* PlannedStmt *plannedstmt */
queryString, /* const char *sourceText */
GetActiveSnapshot(), /* Snapshot snapshot */
InvalidSnapshot, /* Snapshot crosscheck_snapshot */
dest, /* DestReceiver *dest */
params, /* ParamListInfo params */
queryEnv, /* QueryEnvironment *queryEnv */
instrument_option /* int instrument_options */
);
/* Select execution options */
if (es->analyze)
if (executeQuery)
eflags = 0; /* default run-to-completion flags */
else
eflags = EXEC_FLAG_EXPLAIN_ONLY;
@ -2116,12 +2381,19 @@ ExplainWorkerPlan(PlannedStmt *plannedstmt, DestReceiver *dest, ExplainState *es
ExecutorStart(queryDesc, eflags);
/* Execute the plan for statistics if asked for */
if (es->analyze)
if (executeQuery)
{
ScanDirection dir = ForwardScanDirection;
/* run the plan */
ExecutorRun(queryDesc, dir, 0L, true);
/* run the plan: count = 0 (all rows) */
#if PG_VERSION_NUM >= PG_VERSION_18
/* PG 18+ dropped the "execute_once" boolean */
ExecutorRun(queryDesc, dir, 0L);
#else
/* PG 17 and below still expect the 4th "once" argument */
ExecutorRun(queryDesc, dir, 0L, true);
#endif
/* run cleanup too */
ExecutorFinish(queryDesc);
@ -2132,6 +2404,12 @@ ExplainWorkerPlan(PlannedStmt *plannedstmt, DestReceiver *dest, ExplainState *es
ExplainOpenGroup("Query", NULL, true, es);
if (executeSubplan)
{
ExtractAnalyzeStats(subPlan, queryDesc->planstate);
planstate_tree_walker(queryDesc->planstate, PlanStateAnalyzeWalker, (void *) subPlan);
}
/* Create textual dump of plan tree */
ExplainPrintPlan(es, queryDesc);
@ -2204,6 +2482,13 @@ ExplainWorkerPlan(PlannedStmt *plannedstmt, DestReceiver *dest, ExplainState *es
*/
INSTR_TIME_SET_CURRENT(starttime);
if (executeQuery)
{
Instrumentation *instr = queryDesc->planstate->instrument;
*executionTuples = instr->ntuples;
*executionLoops = instr->nloops;
}
ExecutorEnd(queryDesc);
FreeQueryDesc(queryDesc);
@ -2211,7 +2496,7 @@ ExplainWorkerPlan(PlannedStmt *plannedstmt, DestReceiver *dest, ExplainState *es
PopActiveSnapshot();
/* We need a CCI just in case query expanded to multiple plans */
if (es->analyze)
if (executeQuery)
CommandCounterIncrement();
totaltime += elapsed_time(&starttime);
@ -2248,6 +2533,50 @@ elapsed_time(instr_time *starttime)
}
#if PG_VERSION_NUM >= PG_VERSION_17 && PG_VERSION_NUM < PG_VERSION_18
/*
* Indent a text-format line.
*
* We indent by two spaces per indentation level. However, when emitting
* data for a parallel worker there might already be data on the current line
* (cf. ExplainOpenWorker); in that case, don't indent any more.
*
* Copied from explain.c.
*/
static void
ExplainIndentText(ExplainState *es)
{
Assert(es->format == EXPLAIN_FORMAT_TEXT);
if (es->str->len == 0 || es->str->data[es->str->len - 1] == '\n')
appendStringInfoSpaces(es->str, es->indent * 2);
}
/*
* GetSerializationMetrics - collect metrics
*
* We have to be careful here since the receiver could be an IntoRel
* receiver if the subject statement is CREATE TABLE AS. In that
* case, return all-zeroes stats.
*
* Copied from explain.c.
*/
static SerializeMetrics
GetSerializationMetrics(DestReceiver *dest)
{
SerializeMetrics empty;
if (dest->mydest == DestExplainSerialize)
return ((SerializeDestReceiver *) dest)->metrics;
memset(&empty, 0, sizeof(SerializeMetrics));
INSTR_TIME_SET_ZERO(empty.timeSpent);
return empty;
}
#endif
#if PG_VERSION_NUM >= PG_VERSION_17
/*
* Return whether show_buffer_usage would have anything to print, if given
@ -2466,24 +2795,6 @@ show_buffer_usage(ExplainState *es, const BufferUsage *usage)
}
/*
* Indent a text-format line.
*
* We indent by two spaces per indentation level. However, when emitting
* data for a parallel worker there might already be data on the current line
* (cf. ExplainOpenWorker); in that case, don't indent any more.
*
* Copied from explain.c.
*/
static void
ExplainIndentText(ExplainState *es)
{
Assert(es->format == EXPLAIN_FORMAT_TEXT);
if (es->str->len == 0 || es->str->data[es->str->len - 1] == '\n')
appendStringInfoSpaces(es->str, es->indent * 2);
}
/*
* Show memory usage details.
*
@ -2560,7 +2871,7 @@ ExplainPrintSerialize(ExplainState *es, SerializeMetrics *metrics)
ExplainPropertyFloat("Time", "ms",
1000.0 * INSTR_TIME_GET_DOUBLE(metrics->timeSpent),
3, es);
ExplainPropertyUInteger("Output Volume", "kB",
ExplainPropertyInteger("Output Volume", "kB",
BYTES_TO_KILOBYTES(metrics->bytesSent), es);
ExplainPropertyText("Format", format, es);
if (es->buffers)
@ -2569,28 +2880,4 @@ ExplainPrintSerialize(ExplainState *es, SerializeMetrics *metrics)
ExplainCloseGroup("Serialization", "Serialization", true, es);
}
/*
* GetSerializationMetrics - collect metrics
*
* We have to be careful here since the receiver could be an IntoRel
* receiver if the subject statement is CREATE TABLE AS. In that
* case, return all-zeroes stats.
*
* Copied from explain.c.
*/
static SerializeMetrics
GetSerializationMetrics(DestReceiver *dest)
{
SerializeMetrics empty;
if (dest->mydest == DestExplainSerialize)
return ((SerializeDestReceiver *) dest)->metrics;
memset(&empty, 0, sizeof(SerializeMetrics));
INSTR_TIME_SET_ZERO(empty.timeSpent);
return empty;
}
#endif

View File

@ -4557,14 +4557,36 @@ FindReferencedTableColumn(Expr *columnExpression, List *parentQueryList, Query *
FindReferencedTableColumn(joinColumn, parentQueryList, query, column,
rteContainingReferencedColumn, skipOuterVars);
}
#if PG_VERSION_NUM >= PG_VERSION_18
else if (rangeTableEntry->rtekind == RTE_GROUP)
{
/*
* PG 18: synthetic GROUP RTE. Each groupexprs item corresponds to the
* columns produced by the grouping step, in the *same ordinal order* as
* the Vars that reference them.
*/
List *groupexprs = rangeTableEntry->groupexprs;
AttrNumber groupIndex = candidateColumn->varattno - 1;
/* this must always hold unless upstream Postgres mis-constructed the RTE_GROUP */
Assert(groupIndex >= 0 && groupIndex < list_length(groupexprs));
Expr *groupExpr = (Expr *) list_nth(groupexprs, groupIndex);
/* Recurse on the underlying expression (stay in the same query) */
FindReferencedTableColumn(groupExpr, parentQueryList, query,
column, rteContainingReferencedColumn,
skipOuterVars);
}
#endif /* PG_VERSION_NUM >= 180000 */
else if (rangeTableEntry->rtekind == RTE_CTE)
{
/*
* When outerVars are considered, we modify parentQueryList, so this
* logic might need to change when we support outervars in CTEs.
* Resolve through a CTE even when skipOuterVars == false.
* Maintain the invariant that each recursion level owns a private,
* correctly-bounded copy of parentQueryList.
*/
Assert(skipOuterVars);
int cteParentListIndex = list_length(parentQueryList) -
rangeTableEntry->ctelevelsup - 1;
Query *cteParentQuery = NULL;
@ -4595,14 +4617,34 @@ FindReferencedTableColumn(Expr *columnExpression, List *parentQueryList, Query *
if (cte != NULL)
{
Query *cteQuery = (Query *) cte->ctequery;
List *targetEntryList = cteQuery->targetList;
AttrNumber targetEntryIndex = candidateColumn->varattno - 1;
TargetEntry *targetEntry = list_nth(targetEntryList, targetEntryIndex);
parentQueryList = lappend(parentQueryList, query);
FindReferencedTableColumn(targetEntry->expr, parentQueryList,
cteQuery, column, rteContainingReferencedColumn,
skipOuterVars);
if (targetEntryIndex >= 0 &&
targetEntryIndex < list_length(cteQuery->targetList))
{
TargetEntry *targetEntry =
list_nth(cteQuery->targetList, targetEntryIndex);
/* Build a private, bounded parentQueryList before recursing into the CTE.
 * Invariant: the list is [top .. current], owned by this call (no aliasing).
 * For RTE_CTE:
 * owner_idx = list_length(parentQueryList) - rangeTableEntry->ctelevelsup - 1;
 * newParent = lappend(list_truncate(list_copy(parentQueryList), owner_idx + 1), query);
 * Example (Q0 owns the CTE; we're in Q2 via a nested subquery):
 * parent=[Q0,Q1,Q2], ctelevelsup=2 => owner_idx=0 => newParent=[Q0,Q2].
 * Keeps outer-Var level math correct without mutating the caller's list.
 */
List *newParent = list_copy(parentQueryList);
newParent = list_truncate(newParent, cteParentListIndex + 1);
newParent = lappend(newParent, query);
FindReferencedTableColumn(targetEntry->expr,
newParent,
cteQuery,
column,
rteContainingReferencedColumn,
skipOuterVars);
}
}
}
}

View File

@ -35,6 +35,10 @@
#include "utils/syscache.h"
#include "pg_version_constants.h"
#if PG_VERSION_NUM >= PG_VERSION_18
typedef OpIndexInterpretation OpBtreeInterpretation;
#endif
#include "distributed/citus_clauses.h"
#include "distributed/colocation_utils.h"
@ -293,8 +297,19 @@ TargetListOnPartitionColumn(Query *query, List *targetEntryList)
bool
FindNodeMatchingCheckFunctionInRangeTableList(List *rtable, CheckNodeFunc checker)
{
int rtWalkFlags = QTW_EXAMINE_RTES_BEFORE;
#if PG_VERSION_NUM >= PG_VERSION_18
/*
* PG18+: Do not descend into GROUP BY expressions' subqueries; they
* have already been visited, as recursive planning is depth-first.
*/
rtWalkFlags |= QTW_IGNORE_GROUPEXPRS;
#endif
return range_table_walker(rtable, FindNodeMatchingCheckFunction, checker,
QTW_EXAMINE_RTES_BEFORE);
rtWalkFlags);
}
@ -2293,7 +2308,12 @@ OperatorImplementsEquality(Oid opno)
{
OpBtreeInterpretation *btreeIntepretation = (OpBtreeInterpretation *)
lfirst(btreeInterpretationCell);
#if PG_VERSION_NUM >= PG_VERSION_18
if (btreeIntepretation->cmptype == BTEqualStrategyNumber)
#else
if (btreeIntepretation->strategy == BTEqualStrategyNumber)
#endif
{
equalityOperator = true;
break;

View File

@ -167,13 +167,16 @@ static uint32 HashPartitionCount(void);
/* Local functions forward declarations for task list creation and helper functions */
static Job * BuildJobTreeTaskList(Job *jobTree,
PlannerRestrictionContext *plannerRestrictionContext);
static bool IsInnerTableOfOuterJoin(RelationRestriction *relationRestriction);
static bool IsInnerTableOfOuterJoin(RelationRestriction *relationRestriction,
Bitmapset *distributedTables,
bool *outerPartHasDistributedTable);
static void ErrorIfUnsupportedShardDistribution(Query *query);
static Task * QueryPushdownTaskCreate(Query *originalQuery, int shardIndex,
RelationRestrictionContext *restrictionContext,
uint32 taskId,
TaskType taskType,
bool modifyRequiresCoordinatorEvaluation,
bool updateQualsForOuterJoin,
DeferredErrorMessage **planningError);
static List * SqlTaskList(Job *job);
static bool DependsOnHashPartitionJob(Job *job);
@ -1418,8 +1421,24 @@ ExtractColumns(RangeTblEntry *callingRTE, int rangeTableId,
int subLevelsUp = 0;
int location = -1;
bool includeDroppedColumns = false;
expandRTE(callingRTE, rangeTableId, subLevelsUp, location, includeDroppedColumns,
columnNames, columnVars);
#if PG_VERSION_NUM >= PG_VERSION_18
expandRTE(callingRTE,
rangeTableId,
subLevelsUp,
VAR_RETURNING_DEFAULT, /* new argument on PG18+ */
location,
includeDroppedColumns,
columnNames,
columnVars);
#else
expandRTE(callingRTE,
rangeTableId,
subLevelsUp,
location,
includeDroppedColumns,
columnNames,
columnVars);
#endif
}
@ -2183,6 +2202,7 @@ QueryPushdownSqlTaskList(Query *query, uint64 jobId,
int minShardOffset = INT_MAX;
int prevShardCount = 0;
Bitmapset *taskRequiredForShardIndex = NULL;
Bitmapset *distributedTableIndex = NULL;
/* error if shards are not co-partitioned */
ErrorIfUnsupportedShardDistribution(query);
@ -2199,8 +2219,12 @@ QueryPushdownSqlTaskList(Query *query, uint64 jobId,
RelationRestriction *relationRestriction = NULL;
List *prunedShardList = NULL;
forboth_ptr(prunedShardList, prunedRelationShardList,
relationRestriction, relationRestrictionContext->relationRestrictionList)
/* In the first loop, gather the indexes of distributed tables;
* this is required to decide whether we can skip shards
* from inner tables of outer joins.
*/
foreach_declared_ptr(relationRestriction,
relationRestrictionContext->relationRestrictionList)
{
Oid relationId = relationRestriction->relationId;
@ -2221,6 +2245,24 @@ QueryPushdownSqlTaskList(Query *query, uint64 jobId,
}
prevShardCount = cacheEntry->shardIntervalArrayLength;
distributedTableIndex = bms_add_member(distributedTableIndex,
relationRestriction->index);
}
/* In the second loop, populate taskRequiredForShardIndex */
bool updateQualsForOuterJoin = false;
bool outerPartHasDistributedTable = false;
forboth_ptr(prunedShardList, prunedRelationShardList,
relationRestriction, relationRestrictionContext->relationRestrictionList)
{
Oid relationId = relationRestriction->relationId;
CitusTableCacheEntry *cacheEntry = GetCitusTableCacheEntry(relationId);
if (!HasDistributionKeyCacheEntry(cacheEntry))
{
continue;
}
/*
* For left joins we don't care about the shards pruned for the right hand side.
* If the right hand side would prune to a smaller set we should still send it to
@ -2228,12 +2270,25 @@ QueryPushdownSqlTaskList(Query *query, uint64 jobId,
* the left hand side we don't have to send the query to any shard that is not
* matching anything on the left hand side.
*
* Instead we will simply skip any RelationRestriction if it is an OUTER join and
* the table is part of the non-outer side of the join.
* Instead we will simply skip any RelationRestriction if it is an OUTER join,
* the table is part of the non-outer side of the join and the outer side has a
* distributed table.
*/
if (IsInnerTableOfOuterJoin(relationRestriction))
if (IsInnerTableOfOuterJoin(relationRestriction, distributedTableIndex,
&outerPartHasDistributedTable))
{
continue;
if (outerPartHasDistributedTable)
{
/* we can skip the shards from this relation restriction */
continue;
}
else
{
/* The outer part does not include distributed tables, so we cannot skip shards.
* Also, we may update the quals of the outer relation for recurring join
* pushdown, so mark that here.
*/
updateQualsForOuterJoin = true;
}
}
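/*
* Illustrative example (table names hypothetical): in
* "dist_a LEFT JOIN dist_b ON ...", shards pruned away on the inner side
* (dist_b) can safely be skipped because dist_a rows still appear with NULLs.
* In "ref_table LEFT JOIN dist_b ON ...", the outer side has no distributed
* table, so we keep dist_b's shards and only remember to adjust the quals in
* QueryPushdownTaskCreate().
*/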
ShardInterval *shardInterval = NULL;
@ -2247,6 +2302,22 @@ QueryPushdownSqlTaskList(Query *query, uint64 jobId,
}
}
/*
* We might fail to find outer joins from the relationRestrictionContext
* when the original query has CTEs. In order to ensure that we always mark
* the outer joins correctly and compute additional quals when necessary,
* check the task query as well.
*/
if (!updateQualsForOuterJoin && FindNodeMatchingCheckFunction((Node *) query,
IsOuterJoinExpr))
{
/*
* We have an outer join, so assume we might need to update the quals.
* See the usage of this flag in QueryPushdownTaskCreate().
*/
updateQualsForOuterJoin = true;
}
/*
* We keep track of minShardOffset to skip over a potentially big amount of pruned
* shards. However, we need to start at minShardOffset - 1 to make sure we don't
@ -2266,6 +2337,7 @@ QueryPushdownSqlTaskList(Query *query, uint64 jobId,
taskIdIndex,
taskType,
modifyRequiresCoordinatorEvaluation,
updateQualsForOuterJoin,
planningError);
if (*planningError != NULL)
{
@ -2299,10 +2371,13 @@ QueryPushdownSqlTaskList(Query *query, uint64 jobId,
* a) in an outer join
* b) on the inner part of said join
*
* The function returns true only if both conditions above hold true
* The function also sets outerPartHasDistributedTable if the outer part
* of the corresponding join has a distributed table.
*/
static bool
IsInnerTableOfOuterJoin(RelationRestriction *relationRestriction)
IsInnerTableOfOuterJoin(RelationRestriction *relationRestriction,
Bitmapset *distributedTables,
bool *outerPartHasDistributedTable)
{
RestrictInfo *joinInfo = NULL;
foreach_declared_ptr(joinInfo, relationRestriction->relOptInfo->joininfo)
@ -2323,6 +2398,11 @@ IsInnerTableOfOuterJoin(RelationRestriction *relationRestriction)
if (!isInOuter)
{
/* this table is joined in the inner part of an outer join */
/* set if the outer part has a distributed relation */
*outerPartHasDistributedTable = bms_overlap(joinInfo->outer_relids,
distributedTables);
/* this is an inner table of an outer join */
return true;
}
}
@ -2421,11 +2501,16 @@ ErrorIfUnsupportedShardDistribution(Query *query)
currentRelationId);
if (!coPartitionedTables)
{
char *firstRelName = get_rel_name(firstTableRelationId);
char *currentRelName = get_rel_name(currentRelationId);
int compareResult = strcmp(firstRelName, currentRelName);
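/* report the relation names in lexicographic order so the error detail is deterministic */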
ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot push down this subquery"),
errdetail("%s and %s are not colocated",
get_rel_name(firstTableRelationId),
get_rel_name(currentRelationId))));
(compareResult > 0 ? currentRelName : firstRelName),
(compareResult > 0 ? firstRelName :
currentRelName))));
}
}
}
@ -2439,6 +2524,7 @@ static Task *
QueryPushdownTaskCreate(Query *originalQuery, int shardIndex,
RelationRestrictionContext *restrictionContext, uint32 taskId,
TaskType taskType, bool modifyRequiresCoordinatorEvaluation,
bool updateQualsForOuterJoin,
DeferredErrorMessage **planningError)
{
Query *taskQuery = copyObject(originalQuery);
@ -2543,6 +2629,21 @@ QueryPushdownTaskCreate(Query *originalQuery, int shardIndex,
(List *) taskQuery->jointree->quals);
}
if (updateQualsForOuterJoin)
{
/*
* QueryPushdownSqlTaskList() might set this when it detects an outer join,
* even if the outer join is not surely known to be happening between a
* recurring and a distributed rel. However, it's still safe to call
* UpdateWhereClauseToPushdownRecurringOuterJoinWalker() here as it only
* acts on the where clause if the join is happening between a
* recurring and a distributed rel.
*/
UpdateWhereClauseToPushdownRecurringOuterJoinWalker((Node *) taskQuery,
relationShardList);
}
Task *subqueryTask = CreateBasicTask(jobId, taskId, taskType, NULL);
if ((taskType == MODIFY_TASK && !modifyRequiresCoordinatorEvaluation) ||
@ -3077,16 +3178,25 @@ BuildBaseConstraint(Var *column)
/*
* MakeOpExpression builds an operator expression node. This operator expression
* implements the operator clause as defined by the variable and the strategy
* number.
* MakeOpExpressionExtended builds an operator expression node that's of
* the form "Var <op> Expr", where, Expr must either be a Const or a Var
* (*1).
*
* This operator expression implements the operator clause as defined by
* the variable and the strategy number.
*/
OpExpr *
MakeOpExpression(Var *variable, int16 strategyNumber)
MakeOpExpressionExtended(Var *leftVar, Expr *rightArg, int16 strategyNumber)
{
Oid typeId = variable->vartype;
Oid typeModId = variable->vartypmod;
Oid collationId = variable->varcollid;
/*
* Other types of expressions are probably also fine to be used, but
* none of the callers need support for them for now, so we haven't
* tested them (*1).
*/
Assert(IsA(rightArg, Const) || IsA(rightArg, Var));
Oid typeId = leftVar->vartype;
Oid collationId = leftVar->varcollid;
Oid accessMethodId = BTREE_AM_OID;
@ -3104,18 +3214,16 @@ MakeOpExpression(Var *variable, int16 strategyNumber)
*/
if (operatorClassInputType != typeId && typeType != TYPTYPE_PSEUDO)
{
variable = (Var *) makeRelabelType((Expr *) variable, operatorClassInputType,
-1, collationId, COERCE_IMPLICIT_CAST);
leftVar = (Var *) makeRelabelType((Expr *) leftVar, operatorClassInputType,
-1, collationId, COERCE_IMPLICIT_CAST);
}
Const *constantValue = makeNullConst(operatorClassInputType, typeModId, collationId);
/* Now make the expression with the given variable and a null constant */
OpExpr *expression = (OpExpr *) make_opclause(operatorId,
InvalidOid, /* no result type yet */
false, /* no return set */
(Expr *) variable,
(Expr *) constantValue,
(Expr *) leftVar,
rightArg,
InvalidOid, collationId);
/* Set implementing function id and result type */
@ -3126,6 +3234,31 @@ MakeOpExpression(Var *variable, int16 strategyNumber)
}
/*
* MakeOpExpression is a wrapper around MakeOpExpressionExtended
* that creates, for the right-hand side, a null constant of the appropriate
* operator class input type. As a result, it builds an
* operator expression node that's of the form "Var <op> NULL".
*/
OpExpr *
MakeOpExpression(Var *leftVar, int16 strategyNumber)
{
Oid typeId = leftVar->vartype;
Oid typeModId = leftVar->vartypmod;
Oid collationId = leftVar->varcollid;
Oid accessMethodId = BTREE_AM_OID;
OperatorCacheEntry *operatorCacheEntry = LookupOperatorByType(typeId, accessMethodId,
strategyNumber);
Oid operatorClassInputType = operatorCacheEntry->operatorClassInputType;
Const *constantValue = makeNullConst(operatorClassInputType, typeModId, collationId);
return MakeOpExpressionExtended(leftVar, (Expr *) constantValue, strategyNumber);
}
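/*
* Usage sketch (mirrors the caller TargetEntryChangesValue()): build
* "column = newValue" and test whether the WHERE clause already implies it:
*
*   OpExpr *equalityExpr =
*       MakeOpExpressionExtended(column, (Expr *) newValue, BTEqualStrategyNumber);
*   bool implied = predicate_implied_by(list_make1(equalityExpr),
*                                       restrictClauseList, false);
*/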
/*
* LookupOperatorByType is a wrapper around GetOperatorByType(),
* operatorClassInputType() and get_typtype() functions that uses a cache to avoid

View File

@ -16,6 +16,8 @@
#include "postgres.h"
#include "access/stratnum.h"
#include "access/tupdesc.h"
#include "access/tupdesc_details.h"
#include "access/xact.h"
#include "catalog/pg_opfamily.h"
#include "catalog/pg_proc.h"
@ -34,6 +36,7 @@
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
#include "optimizer/planner.h"
#include "optimizer/restrictinfo.h"
#include "parser/parse_oper.h"
#include "parser/parsetree.h"
@ -81,6 +84,7 @@
#include "distributed/relay_utility.h"
#include "distributed/resource_lock.h"
#include "distributed/shard_pruning.h"
#include "distributed/shard_utils.h"
#include "distributed/shardinterval_utils.h"
/* intermediate value for INSERT processing */
@ -164,7 +168,7 @@ static List * SingleShardTaskList(Query *query, uint64 jobId,
List *relationShardList, List *placementList,
uint64 shardId, bool parametersInQueryResolved,
bool isLocalTableModification, Const *partitionKeyValue,
int colocationId);
int colocationId, bool delayedFastPath);
static bool RowLocksOnRelations(Node *node, List **rtiLockList);
static void ReorderTaskPlacementsByTaskAssignmentPolicy(Job *job,
TaskAssignmentPolicyType
@ -173,7 +177,7 @@ static void ReorderTaskPlacementsByTaskAssignmentPolicy(Job *job,
static bool ModifiesLocalTableWithRemoteCitusLocalTable(List *rangeTableList);
static DeferredErrorMessage * DeferErrorIfUnsupportedLocalTableJoin(List *rangeTableList);
static bool IsLocallyAccessibleCitusLocalTable(Oid relationId);
static bool ConvertToQueryOnShard(Query *query, Oid relationID, Oid shardRelationId);
/*
* CreateRouterPlan attempts to create a router executor plan for the given
@ -368,6 +372,25 @@ AddPartitionKeyNotNullFilterToSelect(Query *subqery)
/* we should have found target partition column */
Assert(targetPartitionColumnVar != NULL);
#if PG_VERSION_NUM >= PG_VERSION_18
if (subqery->hasGroupRTE)
{
/* if the partition column is a grouped column, we need to flatten it
* to ensure query deparsing works correctly. We choose to do this here
* instead of in ruletils.c because we want to keep the flattening logic
* close to the NOT NULL filter injection.
*/
RangeTblEntry *partitionRTE = rt_fetch(targetPartitionColumnVar->varno,
subqery->rtable);
if (partitionRTE->rtekind == RTE_GROUP)
{
targetPartitionColumnVar = (Var *) flatten_group_exprs(NULL, subqery,
(Node *)
targetPartitionColumnVar);
}
}
#endif
/* create expression for partition_column IS NOT NULL */
NullTest *nullTest = makeNode(NullTest);
nullTest->nulltesttype = IS_NOT_NULL;
@ -1317,7 +1340,8 @@ MultiShardUpdateDeleteSupported(Query *originalQuery,
{
errorMessage = DeferErrorIfUnsupportedSubqueryPushdown(
originalQuery,
plannerRestrictionContext);
plannerRestrictionContext,
true);
}
return errorMessage;
@ -1604,10 +1628,19 @@ MasterIrreducibleExpressionFunctionChecker(Oid func_id, void *context)
/*
* TargetEntryChangesValue determines whether the given target entry may
* change the value in a given column, given a join tree. The result is
* true unless the expression refers directly to the column, or the
* expression is a value that is implied by the qualifiers of the join
* tree, or the target entry sets a different column.
* change the value given a column and a join tree.
*
* The function assumes that the "targetEntry" references the given "column"
* Var via its "resname" and is used as part of a modify query. This means
* that, for example, for an update query, the input "targetEntry" constructs
* the following assignment operation as part of the SET clause:
* "col_a = expr_a", where "col_a" refers to the input "column" Var (via
* "resname") as per the assumption above. We want to understand
* if "expr_a" (which is pointed to by targetEntry->expr) refers directly to
* the "column" Var, or "expr_a" is a value that is implied to be equal
* to "column" Var by the qualifiers of the join tree. If so, we know that
* the value of "col_a" effectively cannot be changed by this assignment
* operation.
*/
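/*
* For illustration (hypothetical table names):
*
*   UPDATE dist SET a = src.a FROM src WHERE dist.a = src.a;
*
* Here the join qualifier implies that the assigned value equals the current
* value of "a", so this function reports that the column value cannot change.
*/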
bool
TargetEntryChangesValue(TargetEntry *targetEntry, Var *column, FromExpr *joinTree)
@ -1618,11 +1651,36 @@ TargetEntryChangesValue(TargetEntry *targetEntry, Var *column, FromExpr *joinTre
if (IsA(setExpr, Var))
{
Var *newValue = (Var *) setExpr;
if (newValue->varattno == column->varattno)
if (column->varno == newValue->varno &&
column->varattno == newValue->varattno)
{
/* target entry of the form SET col = table.col */
/*
* Target entry is of the form "SET col_a = foo.col_b",
* where foo also points to the same range table entry
* and col_a and col_b are the same. So, effectively
* they're literally referring to the same column.
*/
isColumnValueChanged = false;
}
else
{
List *restrictClauseList = WhereClauseList(joinTree);
OpExpr *equalityExpr = MakeOpExpressionExtended(column, (Expr *) newValue,
BTEqualStrategyNumber);
bool predicateIsImplied = predicate_implied_by(list_make1(equalityExpr),
restrictClauseList, false);
if (predicateIsImplied)
{
/*
* Target entry is of the form
* "SET col_a = foo.col_b WHERE col_a = foo.col_b (AND (...))",
* where foo points to a different relation or it points
* to the same relation but col_a is not the same column as col_b.
*/
isColumnValueChanged = false;
}
}
}
else if (IsA(setExpr, Const))
{
@ -1643,7 +1701,10 @@ TargetEntryChangesValue(TargetEntry *targetEntry, Var *column, FromExpr *joinTre
restrictClauseList, false);
if (predicateIsImplied)
{
/* target entry of the form SET col = <x> WHERE col = <x> AND ... */
/*
* Target entry is of the form
* "SET col_a = const_a WHERE col_a = const_a (AND (...))".
*/
isColumnValueChanged = false;
}
}
@ -1940,7 +2001,9 @@ RouterJob(Query *originalQuery, PlannerRestrictionContext *plannerRestrictionCon
{
GenerateSingleShardRouterTaskList(job, relationShardList,
placementList, shardId,
isLocalTableModification);
isLocalTableModification,
fastPathRestrictionContext->
delayFastPathPlanning);
}
job->requiresCoordinatorEvaluation = requiresCoordinatorEvaluation;
@ -1948,6 +2011,266 @@ RouterJob(Query *originalQuery, PlannerRestrictionContext *plannerRestrictionCon
}
/*
* CheckAttributesMatch checks if the attributes of the Citus table and the shard
* table match.
*
* It is used to ensure that the shard table has the same schema as the Citus
* table before replacing the Citus table OID with the shard table OID in the
* parse tree we (the Citus planner) received from Postgres.
*/
static
bool
CheckAttributesMatch(Oid citusTableId, Oid shardTableId)
{
bool same_schema = false;
Relation citusRelation = RelationIdGetRelation(citusTableId);
Relation shardRelation = RelationIdGetRelation(shardTableId);
if (RelationIsValid(citusRelation) && RelationIsValid(shardRelation))
{
TupleDesc citusTupDesc = citusRelation->rd_att;
TupleDesc shardTupDesc = shardRelation->rd_att;
if (citusTupDesc->natts == shardTupDesc->natts)
{
/*
* Do an attribute-by-attribute comparison. This is borrowed from
* the Postgres function equalTupleDescs(), which we cannot use
* because the citus table and shard table have different composite
* types.
*/
same_schema = true;
for (int i = 0; i < citusTupDesc->natts && same_schema; i++)
{
Form_pg_attribute attr1 = TupleDescAttr(citusTupDesc, i);
Form_pg_attribute attr2 = TupleDescAttr(shardTupDesc, i);
if (strcmp(NameStr(attr1->attname), NameStr(attr2->attname)) != 0)
{
same_schema = false;
}
if (attr1->atttypid != attr2->atttypid)
{
same_schema = false;
}
if (attr1->atttypmod != attr2->atttypmod)
{
same_schema = false;
}
if (attr1->attcollation != attr2->attcollation)
{
same_schema = false;
}
/* Record types derived from tables could have dropped fields. */
if (attr1->attisdropped != attr2->attisdropped)
{
same_schema = false;
}
}
}
}
RelationClose(citusRelation);
RelationClose(shardRelation);
return same_schema;
}
/*
* CheckAndBuildDelayedFastPathPlan() - if the query being planned is a fast
* path query, is not marked for deferred pruning, the task's placement is not
* a dummy placement, and the placement is local to this node, then we can take
* a shortcut: replace the OID of the citus table with the OID of the shard in
* the query tree and plan that directly, instead of deparsing the parse tree
* to a SQL query on the shard and then parsing and planning it in the local
* executor. The local executor can then use the plan created here.
*/
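/*
* Sketch of the intended scenario (hypothetical names): for
*
*   SELECT * FROM dist WHERE dist_col = 5;
*
* executed on the node that holds the matching shard (say dist_102008), the
* citus table OID in the parse tree is swapped for the shard's OID and the
* query is planned directly, skipping the deparse and re-parse of a shard query.
*/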
void
CheckAndBuildDelayedFastPathPlan(DistributedPlanningContext *planContext,
DistributedPlan *plan)
{
FastPathRestrictionContext *fastPathContext =
planContext->plannerRestrictionContext->fastPathRestrictionContext;
if (!fastPathContext->delayFastPathPlanning)
{
return;
}
Job *job = plan->workerJob;
Assert(job != NULL);
if (job->deferredPruning)
{
/* Execution time pruning => don't know which shard at this point */
planContext->plan = FastPathPlanner(planContext->originalQuery,
planContext->query,
planContext->boundParams);
return;
}
List *tasks = job->taskList;
Assert(list_length(tasks) == 1);
Task *task = (Task *) linitial(tasks);
List *placements = task->taskPlacementList;
int32 localGroupId = GetLocalGroupId();
ShardPlacement *primaryPlacement = (ShardPlacement *) linitial(placements);
bool isLocalExecution = !IsDummyPlacement(primaryPlacement) &&
(primaryPlacement->groupId == localGroupId);
bool canBuildLocalPlan = true;
if (isLocalExecution)
{
List *relationShards = task->relationShardList;
Assert(list_length(relationShards) == 1);
RelationShard *relationShard = (RelationShard *) linitial(relationShards);
Assert(relationShard->shardId == primaryPlacement->shardId);
/*
* Today FastPathRouterQuery() doesn't set delayFastPathPlanning to true for
* reference tables. We should be looking at exactly 1 placement, or as many
* placements as the shard's replication factor.
*/
Assert(list_length(placements) == 1 || list_length(placements) ==
TableShardReplicationFactor(relationShard->relationId));
canBuildLocalPlan = ConvertToQueryOnShard(planContext->query,
relationShard->relationId,
relationShard->shardId);
if (canBuildLocalPlan)
{
/* Plan the query with the new shard relation id */
planContext->plan = standard_planner(planContext->query, NULL,
planContext->cursorOptions,
planContext->boundParams);
SetTaskQueryPlan(task, job->jobQuery, planContext->plan);
ereport(DEBUG2, (errmsg(
"Fast-path router query: created local execution plan "
"to avoid deparse and compile of shard query")));
return;
}
}
/*
* Either the shard is not local to this node, or it was not safe to replace
* the OIDs in the parse tree; in any case we fall back to generating the shard
* query and compiling that.
*/
Assert(!isLocalExecution || (isLocalExecution && !canBuildLocalPlan));
/* Fall back to fast path planner and generating SQL query on the shard */
planContext->plan = FastPathPlanner(planContext->originalQuery,
planContext->query,
planContext->boundParams);
UpdateRelationToShardNames((Node *) job->jobQuery, task->relationShardList);
SetTaskQueryIfShouldLazyDeparse(task, job->jobQuery);
}
/*
* ConvertToQueryOnShard() converts the given query on a citus table (identified by
* citusTableOid) to a query on a shard (identified by shardId).
*
* The function assumes that the query is a "fast path" query - it has only one
* RangeTblEntry and one RTEPermissionInfo.
*
* It acquires the same lock on the shard that was acquired on the citus table
* by the Postgres parser. It checks that the attribute numbers and metadata of
* the shard table and citus table are identical - otherwise it is not safe
* to proceed with this shortcut. Assuming the attributes do match, the actual
* conversion involves changing the target list entries that reference the
* citus table's oid to reference the shard's relation id instead. Finally,
* it changes the RangeTblEntry's relid to the shard's relation id and (PG16+)
* changes the RTEPermissionInfo's relid to the shard's relation id also.
* At this point the Query is ready for the postgres planner.
*/
static bool
ConvertToQueryOnShard(Query *query, Oid citusTableOid, Oid shardId)
{
Assert(list_length(query->rtable) == 1
#if PG_VERSION_NUM >= PG_VERSION_18
|| (list_length(query->rtable) == 2 && query->hasGroupRTE)
#endif
);
RangeTblEntry *citusTableRte = (RangeTblEntry *) linitial(query->rtable);
Assert(citusTableRte->relid == citusTableOid);
const char *citusTableName = get_rel_name(citusTableOid);
Assert(citusTableName != NULL);
/* construct shard relation name */
char *shardRelationName = pstrdup(citusTableName);
AppendShardIdToName(&shardRelationName, shardId);
/* construct the schema name */
char *schemaName = get_namespace_name(get_rel_namespace(citusTableOid));
/* now construct a range variable for the shard */
RangeVar shardRangeVar = {
.relname = shardRelationName,
.schemaname = schemaName,
.inh = citusTableRte->inh,
.relpersistence = RELPERSISTENCE_PERMANENT,
};
/* Must apply the same lock to the shard that was applied to the citus table */
Oid shardRelationId = RangeVarGetRelidExtended(&shardRangeVar,
citusTableRte->rellockmode,
0, NULL, NULL);
/* Verify that the attributes of citus table and shard table match */
if (!CheckAttributesMatch(citusTableOid, shardRelationId))
{
/* There is a difference between the attributes of the citus
* table and the shard table. This can happen if there is a DROP
* COLUMN on the citus table. In this case, we cannot
* convert the query to a shard query, so clean up and return.
*/
UnlockRelationOid(shardRelationId, citusTableRte->rellockmode);
ereport(DEBUG2, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
errmsg(
"Router planner fast path cannot modify parse tree for local execution: shard table \"%s.%s\" does not match the "
"distributed table \"%s.%s\"",
schemaName, shardRelationName, schemaName,
citusTableName)));
pfree(shardRelationName);
pfree(schemaName);
return false;
}
/* Change the target list entries that reference the original citus table's relation id */
ListCell *lc = NULL;
foreach(lc, query->targetList)
{
TargetEntry *targetEntry = (TargetEntry *) lfirst(lc);
if (targetEntry->resorigtbl == citusTableOid)
{
targetEntry->resorigtbl = shardRelationId;
}
}
/* Change the range table entry's oid to that of the shard's */
Assert(shardRelationId != InvalidOid);
citusTableRte->relid = shardRelationId;
#if PG_VERSION_NUM >= PG_VERSION_16
/* Change the range table permission oid to that of the shard's (PG16+) */
Assert(list_length(query->rteperminfos) == 1);
RTEPermissionInfo *rtePermInfo = (RTEPermissionInfo *) linitial(query->rteperminfos);
rtePermInfo->relid = shardRelationId;
#endif
return true;
}
/*
* GenerateSingleShardRouterTaskList is a wrapper around other corresponding task
* list generation functions specific to single shard selects and modifications.
@ -1957,7 +2280,7 @@ RouterJob(Query *originalQuery, PlannerRestrictionContext *plannerRestrictionCon
void
GenerateSingleShardRouterTaskList(Job *job, List *relationShardList,
List *placementList, uint64 shardId, bool
isLocalTableModification)
isLocalTableModification, bool delayedFastPath)
{
Query *originalQuery = job->jobQuery;
@ -1970,7 +2293,8 @@ GenerateSingleShardRouterTaskList(Job *job, List *relationShardList,
shardId,
job->parametersInJobQueryResolved,
isLocalTableModification,
job->partitionKeyValue, job->colocationId);
job->partitionKeyValue, job->colocationId,
delayedFastPath);
/*
* Queries to reference tables, or distributed tables with multiple replica's have
@ -2001,7 +2325,8 @@ GenerateSingleShardRouterTaskList(Job *job, List *relationShardList,
shardId,
job->parametersInJobQueryResolved,
isLocalTableModification,
job->partitionKeyValue, job->colocationId);
job->partitionKeyValue, job->colocationId,
delayedFastPath);
}
}
@ -2096,7 +2421,7 @@ SingleShardTaskList(Query *query, uint64 jobId, List *relationShardList,
List *placementList, uint64 shardId,
bool parametersInQueryResolved,
bool isLocalTableModification, Const *partitionKeyValue,
int colocationId)
int colocationId, bool delayedFastPath)
{
TaskType taskType = READ_TASK;
char replicationModel = 0;
@ -2168,7 +2493,10 @@ SingleShardTaskList(Query *query, uint64 jobId, List *relationShardList,
task->taskPlacementList = placementList;
task->partitionKeyValue = partitionKeyValue;
task->colocationId = colocationId;
SetTaskQueryIfShouldLazyDeparse(task, query);
if (!delayedFastPath)
{
SetTaskQueryIfShouldLazyDeparse(task, query);
}
task->anchorShardId = shardId;
task->jobId = jobId;
task->relationShardList = relationShardList;
@ -2245,6 +2573,18 @@ SelectsFromDistributedTable(List *rangeTableList, Query *query)
continue;
}
#if PG_VERSION_NUM >= 150013 && PG_VERSION_NUM < PG_VERSION_16
if (rangeTableEntry->rtekind == RTE_SUBQUERY && rangeTableEntry->relkind == 0)
{
/*
* In PG15.13 commit https://github.com/postgres/postgres/commit/317aba70e
* relid is retained when converting views to subqueries,
* so we need an extra check identifying those views
*/
continue;
}
#endif
if (rangeTableEntry->relkind == RELKIND_VIEW ||
rangeTableEntry->relkind == RELKIND_MATVIEW)
{
@ -2437,10 +2777,15 @@ PlanRouterQuery(Query *originalQuery,
/*
* If this is an UPDATE or DELETE query which requires coordinator evaluation,
* don't try update shard names, and postpone that to execution phase.
* don't try to update shard names, and postpone that to the execution phase. Also, if
* this is a delayed fast path query, we don't update the shard names
* either, as the shard names will be updated in the fast path query planner.
*/
bool isUpdateOrDelete = UpdateOrDeleteOrMergeQuery(originalQuery);
if (!(isUpdateOrDelete && RequiresCoordinatorEvaluation(originalQuery)))
bool delayedFastPath =
plannerRestrictionContext->fastPathRestrictionContext->delayFastPathPlanning;
if (!(isUpdateOrDelete && RequiresCoordinatorEvaluation(originalQuery)) &&
!delayedFastPath)
{
UpdateRelationToShardNames((Node *) originalQuery, *relationShardList);
}

View File

@ -88,7 +88,7 @@ static bool WindowPartitionOnDistributionColumn(Query *query);
static DeferredErrorMessage * DeferErrorIfFromClauseRecurs(Query *queryTree);
static RecurringTuplesType FromClauseRecurringTupleType(Query *queryTree);
static DeferredErrorMessage * DeferredErrorIfUnsupportedRecurringTuplesJoin(
PlannerRestrictionContext *plannerRestrictionContext);
PlannerRestrictionContext *plannerRestrictionContext, bool plannerPhase);
static DeferredErrorMessage * DeferErrorIfUnsupportedTableCombination(Query *queryTree);
static DeferredErrorMessage * DeferErrorIfSubqueryRequiresMerge(Query *subqueryTree, bool
lateral,
@ -109,6 +109,7 @@ static bool RelationInfoContainsOnlyRecurringTuples(PlannerInfo *plannerInfo,
static char * RecurringTypeDescription(RecurringTuplesType recurType);
static DeferredErrorMessage * DeferredErrorIfUnsupportedLateralSubquery(
PlannerInfo *plannerInfo, Relids recurringRelIds, Relids nonRecurringRelIds);
static bool ContainsLateralSubquery(PlannerInfo *plannerInfo);
static Var * PartitionColumnForPushedDownSubquery(Query *query);
static bool ContainsReferencesToRelids(Query *query, Relids relids, int *foundRelid);
static bool ContainsReferencesToRelidsWalker(Node *node,
@ -332,7 +333,9 @@ WhereOrHavingClauseContainsSubquery(Query *query)
bool
TargetListContainsSubquery(List *targetList)
{
return FindNodeMatchingCheckFunction((Node *) targetList, IsNodeSubquery);
bool hasSubquery = FindNodeMatchingCheckFunction((Node *) targetList, IsNodeSubquery);
return hasSubquery;
}
@ -535,9 +538,16 @@ SubqueryMultiNodeTree(Query *originalQuery, Query *queryTree,
RaiseDeferredError(unsupportedQueryError, ERROR);
}
/*
* We reach here at the third step of planning, so we have already checked the
* pushdown feasibility of recurring outer joins; at this step, the unsupported
* outer join check should only generate an error when there is a lateral subquery.
*/
DeferredErrorMessage *subqueryPushdownError = DeferErrorIfUnsupportedSubqueryPushdown(
originalQuery,
plannerRestrictionContext);
plannerRestrictionContext,
false);
if (subqueryPushdownError != NULL)
{
RaiseDeferredError(subqueryPushdownError, ERROR);
@ -560,7 +570,8 @@ SubqueryMultiNodeTree(Query *originalQuery, Query *queryTree,
DeferredErrorMessage *
DeferErrorIfUnsupportedSubqueryPushdown(Query *originalQuery,
PlannerRestrictionContext *
plannerRestrictionContext)
plannerRestrictionContext,
bool plannerPhase)
{
bool outerMostQueryHasLimit = false;
ListCell *subqueryCell = NULL;
@ -612,7 +623,8 @@ DeferErrorIfUnsupportedSubqueryPushdown(Query *originalQuery,
return error;
}
error = DeferredErrorIfUnsupportedRecurringTuplesJoin(plannerRestrictionContext);
error = DeferredErrorIfUnsupportedRecurringTuplesJoin(plannerRestrictionContext,
plannerPhase);
if (error)
{
return error;
@ -770,12 +782,17 @@ FromClauseRecurringTupleType(Query *queryTree)
* DeferredErrorIfUnsupportedRecurringTuplesJoin returns a DeferredError if
* there exists a join between a recurring rel (such as reference tables
* and intermediate_results) and a non-recurring rel (such as distributed tables
* and subqueries that we can push-down to worker nodes) that can return an
* incorrect result set due to recurring tuples coming from the recurring rel.
* and subqueries that we can push-down to worker nodes) when plannerPhase is
* true, so that we try to recursively plan these joins.
* During the recursive planning phase, we either replace those with recursive plans
* or leave them if it is safe to push down.
* During the logical planning phase (plannerPhase is false), we only check if
* such queries have lateral subqueries.
*/
static DeferredErrorMessage *
DeferredErrorIfUnsupportedRecurringTuplesJoin(
PlannerRestrictionContext *plannerRestrictionContext)
PlannerRestrictionContext *plannerRestrictionContext,
bool plannerPhase)
{
List *joinRestrictionList =
plannerRestrictionContext->joinRestrictionContext->joinRestrictionList;
@ -828,14 +845,29 @@ DeferredErrorIfUnsupportedRecurringTuplesJoin(
if (RelationInfoContainsOnlyRecurringTuples(plannerInfo, outerrelRelids))
{
if (plannerPhase)
{
/*
* We have not yet tried to recursively plan this join, we should
* defer an error.
*/
recurType = FetchFirstRecurType(plannerInfo, outerrelRelids);
break;
}
/*
* Inner side contains distributed rels but the outer side only
* contains recurring rels, must be an unsupported lateral outer
* contains recurring rels, might be an unsupported lateral outer
* join.
* Note that plannerInfo->hasLateralRTEs is not always set to
* true, so here we check the RTEs; see ContainsLateralSubquery for details.
*/
recurType = FetchFirstRecurType(plannerInfo, outerrelRelids);
break;
if (ContainsLateralSubquery(plannerInfo))
{
recurType = FetchFirstRecurType(plannerInfo, outerrelRelids);
break;
}
}
}
else if (joinType == JOIN_FULL)
@ -1063,6 +1095,28 @@ DeferErrorIfCannotPushdownSubquery(Query *subqueryTree, bool outerMostQueryHasLi
}
/*
* FlattenGroupExprs flattens the GROUP BY expressions in the query tree
* by replacing VAR nodes referencing the GROUP range table with the actual
* GROUP BY expression. This is used by Citus planning to ensure correctness
* when analysing and building the distributed plan.
*/
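/*
* For illustration: on PG18, in "SELECT a + 1 FROM t GROUP BY a + 1" the
* grouped target entry is a Var pointing at the query's RTE_GROUP entry;
* flattening replaces it with the underlying "a + 1" expression so that later
* analysis (e.g. partition column detection) sees the real expression.
*/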
void
FlattenGroupExprs(Query *queryTree)
{
#if PG_VERSION_NUM >= PG_VERSION_18
if (queryTree->hasGroupRTE)
{
queryTree->targetList = (List *)
flatten_group_exprs(NULL, queryTree,
(Node *) queryTree->targetList);
queryTree->havingQual =
flatten_group_exprs(NULL, queryTree, queryTree->havingQual);
}
#endif
}
/*
* DeferErrorIfSubqueryRequiresMerge returns a deferred error if the subquery
* requires a merge step on the coordinator (e.g. limit, group by non-distribution
@ -1717,6 +1771,30 @@ DeferredErrorIfUnsupportedLateralSubquery(PlannerInfo *plannerInfo,
}
/*
* ContainsLateralSubquery checks if the given plannerInfo contains any
* lateral subqueries in its rtable. If it does, it returns true, otherwise false.
*/
static bool
ContainsLateralSubquery(PlannerInfo *plannerInfo)
{
ListCell *lc;
foreach(lc, plannerInfo->parse->rtable)
{
RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
/* We are only interested in subqueries that are lateral */
if (rte->lateral && rte->rtekind == RTE_SUBQUERY)
{
return true;
}
}
return false;
}
/*
* FetchFirstRecurType checks whether the relationInfo
* contains any recurring table expression, namely a reference table,
@ -1899,6 +1977,13 @@ static MultiNode *
SubqueryPushdownMultiNodeTree(Query *originalQuery)
{
Query *queryTree = copyObject(originalQuery);
/*
* On PG18+ we need to flatten GROUP BY expressions to ensure correct processing
* later on, such as identification of partition columns in GROUP BY.
*/
FlattenGroupExprs(queryTree);
List *targetEntryList = queryTree->targetList;
MultiCollect *subqueryCollectNode = CitusMakeNode(MultiCollect);
@ -1975,7 +2060,9 @@ SubqueryPushdownMultiNodeTree(Query *originalQuery)
pushedDownQuery->setOperations = copyObject(queryTree->setOperations);
pushedDownQuery->querySource = queryTree->querySource;
pushedDownQuery->hasSubLinks = queryTree->hasSubLinks;
#if PG_VERSION_NUM >= PG_VERSION_18
pushedDownQuery->hasGroupRTE = queryTree->hasGroupRTE;
#endif
MultiTable *subqueryNode = MultiSubqueryPushdownTable(pushedDownQuery);
SetChild((MultiUnaryNode *) subqueryCollectNode, (MultiNode *) subqueryNode);

View File

@ -49,6 +49,7 @@
#include "postgres.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "catalog/pg_class.h"
#include "catalog/pg_type.h"
@ -73,8 +74,10 @@
#include "distributed/citus_nodes.h"
#include "distributed/citus_ruleutils.h"
#include "distributed/combine_query_planner.h"
#include "distributed/commands/multi_copy.h"
#include "distributed/distributed_planner.h"
#include "distributed/distribution_column.h"
#include "distributed/errormessage.h"
#include "distributed/listutils.h"
#include "distributed/local_distributed_join_planner.h"
@ -87,11 +90,15 @@
#include "distributed/multi_server_executor.h"
#include "distributed/query_colocation_checker.h"
#include "distributed/query_pushdown_planning.h"
#include "distributed/query_utils.h"
#include "distributed/recursive_planning.h"
#include "distributed/relation_restriction_equivalence.h"
#include "distributed/shard_pruning.h"
#include "distributed/version_compat.h"
bool EnableRecurringOuterJoinPushdown = true;
bool EnableOuterJoinsWithPseudoconstantQualsPrePG17 = false;
/*
* RecursivePlanningContext is used to recursively plan subqueries
* and CTEs, pull results to the coordinator, and push it back into
@ -104,6 +111,8 @@ struct RecursivePlanningContextInternal
bool allDistributionKeysInQueryAreEqual; /* used for some optimizations */
List *subPlanList;
PlannerRestrictionContext *plannerRestrictionContext;
bool restrictionEquivalenceCheck;
bool forceRecursivelyPlanRecurringOuterJoins;
};
/* track depth of current recursive planner query */
@ -152,7 +161,8 @@ static void RecursivelyPlanNonColocatedSubqueriesInWhere(Query *query,
RecursivePlanningContext *
recursivePlanningContext);
static bool RecursivelyPlanRecurringTupleOuterJoinWalker(Node *node, Query *query,
RecursivePlanningContext *context);
RecursivePlanningContext *context,
bool chainedJoin);
static void RecursivelyPlanDistributedJoinNode(Node *node, Query *query,
RecursivePlanningContext *context);
static bool IsRTERefRecurring(RangeTblRef *rangeTableRef, Query *query);
@ -193,6 +203,9 @@ static Query * CreateOuterSubquery(RangeTblEntry *rangeTableEntry,
List *outerSubqueryTargetList);
static List * GenerateRequiredColNamesFromTargetList(List *targetList);
static char * GetRelationNameAndAliasName(RangeTblEntry *rangeTablentry);
static bool CanPushdownRecurringOuterJoinOnOuterRTE(RangeTblEntry *rte);
static bool CanPushdownRecurringOuterJoinOnInnerVar(Var *innervar, RangeTblEntry *rte);
static bool CanPushdownRecurringOuterJoin(JoinExpr *joinExpr, Query *query);
#if PG_VERSION_NUM < PG_VERSION_17
static bool hasPseudoconstantQuals(
RelationRestrictionContext *relationRestrictionContext);
@ -207,7 +220,8 @@ static bool hasPseudoconstantQuals(
*/
List *
GenerateSubplansForSubqueriesAndCTEs(uint64 planId, Query *originalQuery,
PlannerRestrictionContext *plannerRestrictionContext)
PlannerRestrictionContext *plannerRestrictionContext,
RouterPlanType routerPlan)
{
RecursivePlanningContext context;
@ -221,6 +235,17 @@ GenerateSubplansForSubqueriesAndCTEs(uint64 planId, Query *originalQuery,
context.planId = planId;
context.subPlanList = NIL;
context.plannerRestrictionContext = plannerRestrictionContext;
context.forceRecursivelyPlanRecurringOuterJoins = false;
/*
* Force recursive planning of recurring outer joins for these queries
* since the planning error from the previous step is generated prior to
* the actual planning attempt.
*/
if (routerPlan == DML_QUERY)
{
context.forceRecursivelyPlanRecurringOuterJoins = true;
}
/*
* Calculating the distribution key equality upfront is a trade-off for us.
@ -236,7 +261,6 @@ GenerateSubplansForSubqueriesAndCTEs(uint64 planId, Query *originalQuery,
*/
context.allDistributionKeysInQueryAreEqual =
AllDistributionKeysInQueryAreEqual(originalQuery, plannerRestrictionContext);
DeferredErrorMessage *error = RecursivelyPlanSubqueriesAndCTEs(originalQuery,
&context);
if (error != NULL)
@ -363,7 +387,7 @@ RecursivelyPlanSubqueriesAndCTEs(Query *query, RecursivePlanningContext *context
if (ShouldRecursivelyPlanOuterJoins(query, context))
{
RecursivelyPlanRecurringTupleOuterJoinWalker((Node *) query->jointree,
query, context);
query, context, false);
}
/*
@ -485,7 +509,7 @@ ShouldRecursivelyPlanOuterJoins(Query *query, RecursivePlanningContext *context)
bool hasOuterJoin =
context->plannerRestrictionContext->joinRestrictionContext->hasOuterJoin;
#if PG_VERSION_NUM < PG_VERSION_17
if (!hasOuterJoin)
if (!EnableOuterJoinsWithPseudoconstantQualsPrePG17 && !hasOuterJoin)
{
/*
* PG15 commit d1ef5631e620f9a5b6480a32bb70124c857af4f1
@ -691,7 +715,8 @@ RecursivelyPlanNonColocatedSubqueriesInWhere(Query *query,
static bool
RecursivelyPlanRecurringTupleOuterJoinWalker(Node *node, Query *query,
RecursivePlanningContext *
recursivePlanningContext)
recursivePlanningContext,
bool chainedJoin)
{
if (node == NULL)
{
@ -708,7 +733,8 @@ RecursivelyPlanRecurringTupleOuterJoinWalker(Node *node, Query *query,
Node *fromElement = (Node *) lfirst(fromExprCell);
RecursivelyPlanRecurringTupleOuterJoinWalker(fromElement, query,
recursivePlanningContext);
recursivePlanningContext,
false);
}
/*
@ -734,10 +760,12 @@ RecursivelyPlanRecurringTupleOuterJoinWalker(Node *node, Query *query,
*/
bool leftNodeRecurs =
RecursivelyPlanRecurringTupleOuterJoinWalker(leftNode, query,
recursivePlanningContext);
recursivePlanningContext,
true);
bool rightNodeRecurs =
RecursivelyPlanRecurringTupleOuterJoinWalker(rightNode, query,
recursivePlanningContext);
recursivePlanningContext,
true);
switch (joinExpr->jointype)
{
case JOIN_LEFT:
@ -745,11 +773,23 @@ RecursivelyPlanRecurringTupleOuterJoinWalker(Node *node, Query *query,
/* <recurring> left join <distributed> */
if (leftNodeRecurs && !rightNodeRecurs)
{
ereport(DEBUG1, (errmsg("recursively planning right side of "
"the left join since the outer side "
"is a recurring rel")));
RecursivelyPlanDistributedJoinNode(rightNode, query,
recursivePlanningContext);
if (recursivePlanningContext->forceRecursivelyPlanRecurringOuterJoins
||
chainedJoin || !CanPushdownRecurringOuterJoin(joinExpr,
query))
{
ereport(DEBUG1, (errmsg("recursively planning right side of "
"the left join since the outer side "
"is a recurring rel")));
RecursivelyPlanDistributedJoinNode(rightNode, query,
recursivePlanningContext);
}
else
{
ereport(DEBUG3, (errmsg(
"a push down safe left join with recurring left side")));
leftNodeRecurs = false; /* left node will be pushed down */
}
}
/*
@ -766,11 +806,23 @@ RecursivelyPlanRecurringTupleOuterJoinWalker(Node *node, Query *query,
/* <distributed> right join <recurring> */
if (!leftNodeRecurs && rightNodeRecurs)
{
ereport(DEBUG1, (errmsg("recursively planning left side of "
"the right join since the outer side "
"is a recurring rel")));
RecursivelyPlanDistributedJoinNode(leftNode, query,
recursivePlanningContext);
if (recursivePlanningContext->forceRecursivelyPlanRecurringOuterJoins
||
chainedJoin || !CanPushdownRecurringOuterJoin(joinExpr,
query))
{
ereport(DEBUG1, (errmsg("recursively planning left side of "
"the right join since the outer side "
"is a recurring rel")));
RecursivelyPlanDistributedJoinNode(leftNode, query,
recursivePlanningContext);
}
else
{
ereport(DEBUG3, (errmsg(
"a push down safe right join with recurring left side")));
rightNodeRecurs = false; /* right node will be pushed down */
}
}
/*
@ -1070,14 +1122,10 @@ ExtractSublinkWalker(Node *node, List **sublinkList)
static bool
ShouldRecursivelyPlanSublinks(Query *query)
{
if (FindNodeMatchingCheckFunctionInRangeTableList(query->rtable,
IsDistributedTableRTE))
{
/* there is a distributed table in the FROM clause */
return false;
}
return true;
bool hasDistributedTable = (FindNodeMatchingCheckFunctionInRangeTableList(
query->rtable,
IsDistributedTableRTE));
return !hasDistributedTable;
}
@ -2642,3 +2690,335 @@ hasPseudoconstantQuals(RelationRestrictionContext *relationRestrictionContext)
#endif
/*
* CanPushdownRecurringOuterJoinOnOuterRTE returns true if the given range table entry
* is safe for pushdown when it is the outer relation of an outer join whose
* inner relation is not recurring.
* Currently, we only allow reference tables.
*/
static bool
CanPushdownRecurringOuterJoinOnOuterRTE(RangeTblEntry *rte)
{
if (IsCitusTable(rte->relid) && IsCitusTableType(rte->relid, REFERENCE_TABLE))
{
return true;
}
else
{
ereport(DEBUG5, (errmsg("RTE type %d is not safe for pushdown",
rte->rtekind)));
return false;
}
}
/*
* ResolveBaseVarFromSubquery recursively resolves a Var from a subquery target list to
* the base Var and RTE
*/
bool
ResolveBaseVarFromSubquery(Var *var, Query *query,
Var **baseVar, RangeTblEntry **baseRte)
{
TargetEntry *tle = get_tle_by_resno(query->targetList, var->varattno);
if (!tle || !IsA(tle->expr, Var))
{
return false;
}
Var *tleVar = (Var *) tle->expr;
RangeTblEntry *rte = rt_fetch(tleVar->varno, query->rtable);
if (rte == NULL)
{
return false;
}
if (rte->rtekind == RTE_RELATION || rte->rtekind == RTE_FUNCTION)
{
*baseVar = tleVar;
*baseRte = rte;
return true;
}
else if (rte->rtekind == RTE_SUBQUERY)
{
/* Prevent overflow, and allow query cancellation */
check_stack_depth();
CHECK_FOR_INTERRUPTS();
return ResolveBaseVarFromSubquery(tleVar, rte->subquery, baseVar, baseRte);
}
return false;
}
/*
* CanPushdownRecurringOuterJoinOnInnerVar checks whether the inner variable
* from a join qual is valid for a join pushdown. It returns true if the
* variable is the partition column of a hash-distributed table, otherwise false.
*/
static bool
CanPushdownRecurringOuterJoinOnInnerVar(Var *innerVar, RangeTblEntry *rte)
{
if (!innerVar || !rte)
{
return false;
}
if (innerVar->varattno == InvalidAttrNumber)
{
return false;
}
CitusTableCacheEntry *cacheEntry = GetCitusTableCacheEntry(rte->relid);
if (!cacheEntry || GetCitusTableType(cacheEntry) != HASH_DISTRIBUTED)
{
return false;
}
/* Check if the inner variable is part of the distribution column */
if (cacheEntry->partitionColumn && innerVar->varattno ==
cacheEntry->partitionColumn->varattno)
{
return true;
}
return false;
}
/*
* JoinTreeContainsLateral checks if the given node contains a lateral
* join. It returns true if it does, otherwise false.
*
* It recursively traverses the join tree and checks each RangeTblRef and JoinExpr
* for lateral joins.
*/
static bool
JoinTreeContainsLateral(Node *node, List *rtable)
{
if (node == NULL)
{
return false;
}
/* Prevent overflow, and allow query cancellation */
check_stack_depth();
CHECK_FOR_INTERRUPTS();
if (IsA(node, RangeTblRef))
{
RangeTblEntry *rte = rt_fetch(((RangeTblRef *) node)->rtindex, rtable);
if (rte == NULL)
{
return false;
}
if (rte->lateral)
{
return true;
}
if (rte->rtekind == RTE_SUBQUERY)
{
if (rte->subquery)
{
return JoinTreeContainsLateral((Node *) rte->subquery->jointree,
rte->subquery->rtable);
}
}
return false;
}
else if (IsA(node, JoinExpr))
{
JoinExpr *join = (JoinExpr *) node;
return JoinTreeContainsLateral(join->larg, rtable) ||
JoinTreeContainsLateral(join->rarg, rtable);
}
else if (IsA(node, FromExpr))
{
FromExpr *fromExpr = (FromExpr *) node;
ListCell *lc = NULL;
foreach(lc, fromExpr->fromlist)
{
if (JoinTreeContainsLateral((Node *) lfirst(lc), rtable))
{
return true;
}
}
}
return false;
}
/*
* CanPushdownRecurringOuterJoinExtended checks if the given join expression
* is an outer join between recurring rel -on outer part- and a distributed
* rel -on the inner side- and if it is feasible to push down the join. If feasible,
* it computes the outer relation's range table index, the outer relation's
* range table entry, the inner (distributed) relation's range table entry, and the
* attribute number of the partition column in the outer relation.
*/
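/*
* Illustrative example of a join this accepts (hypothetical names):
*
*   SELECT ... FROM ref_table r LEFT JOIN dist_table d ON r.key = d.dist_col;
*
* r is a reference table on the outer side and the join qual compares against
* d's hash distribution column, so the join can be pushed down instead of
* recursively planning d.
*/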
bool
CanPushdownRecurringOuterJoinExtended(JoinExpr *joinExpr, Query *query,
int *outerRtIndex, RangeTblEntry **outerRte,
RangeTblEntry **distRte, int *attnum)
{
if (!EnableRecurringOuterJoinPushdown)
{
return false;
}
if (!IS_OUTER_JOIN(joinExpr->jointype))
{
return false;
}
if (joinExpr->jointype != JOIN_LEFT && joinExpr->jointype != JOIN_RIGHT)
{
return false;
}
/* Push down for chained joins is not supported in this path. */
if (IsA(joinExpr->rarg, JoinExpr) || IsA(joinExpr->larg, JoinExpr))
{
ereport(DEBUG5, (errmsg(
"One side is a join expression, pushdown is not supported in this path.")));
return false;
}
/* Push down for joins with fromExpr on one side is not supported in this path. */
if (!IsA(joinExpr->larg, RangeTblRef) || !IsA(joinExpr->rarg, RangeTblRef))
{
ereport(DEBUG5, (errmsg(
"One side is not a RangeTblRef, pushdown is not supported in this path.")));
return false;
}
if (joinExpr->jointype == JOIN_LEFT)
{
*outerRtIndex = (((RangeTblRef *) joinExpr->larg)->rtindex);
}
else /* JOIN_RIGHT */
{
*outerRtIndex = (((RangeTblRef *) joinExpr->rarg)->rtindex);
}
*outerRte = rt_fetch(*outerRtIndex, query->rtable);
if (!CanPushdownRecurringOuterJoinOnOuterRTE(*outerRte))
{
return false;
}
/* For now if we see any lateral join in the join tree, we return false.
* This check can be improved to support the cases where the lateral reference
* does not cause an error in the final planner checks.
*/
if (JoinTreeContainsLateral(joinExpr->rarg, query->rtable) || JoinTreeContainsLateral(
joinExpr->larg, query->rtable))
{
ereport(DEBUG5, (errmsg(
"Lateral join is not supported for pushdown in this path.")));
return false;
}
/* Check if the join is performed on the distribution column */
List *joinClauseList = make_ands_implicit((Expr *) joinExpr->quals);
if (joinClauseList == NIL)
{
return false;
}
Node *joinClause = NULL;
foreach_declared_ptr(joinClause, joinClauseList)
{
if (!NodeIsEqualsOpExpr(joinClause))
{
continue;
}
OpExpr *joinClauseExpr = castNode(OpExpr, joinClause);
Var *leftColumn = LeftColumnOrNULL(joinClauseExpr);
Var *rightColumn = RightColumnOrNULL(joinClauseExpr);
if (leftColumn == NULL || rightColumn == NULL)
{
continue;
}
RangeTblEntry *rte;
Var *innerVar;
if (leftColumn->varno == *outerRtIndex)
{
/* the left column is from the outer table of the comparison, so take the right one */
rte = rt_fetch(rightColumn->varno, query->rtable);
innerVar = rightColumn;
/* additional constraints will be introduced on outer relation variable */
*attnum = leftColumn->varattno;
}
else if (rightColumn->varno == *outerRtIndex)
{
/* the right column is from the outer table of the comparison, so take the left one */
rte = rt_fetch(leftColumn->varno, query->rtable);
innerVar = leftColumn;
/* additional constraints will be introduced on outer relation variable */
*attnum = rightColumn->varattno;
}
else
{
continue;
}
/* the simple case: the inner table is itself a Citus table */
if (rte && IsCitusTable(rte->relid))
{
if (CanPushdownRecurringOuterJoinOnInnerVar(innerVar, rte))
{
*distRte = rte;
return true;
}
}
/* the inner table is a subquery, extract the base relation referred in the qual */
else if (rte && rte->rtekind == RTE_SUBQUERY)
{
Var *baseVar = NULL;
RangeTblEntry *baseRte = NULL;
if (ResolveBaseVarFromSubquery(innerVar, rte->subquery, &baseVar, &baseRte))
{
if (baseRte && IsCitusTable(baseRte->relid))
{
if (CanPushdownRecurringOuterJoinOnInnerVar(baseVar, baseRte))
{
*distRte = baseRte;
return true;
}
}
}
}
}
return false;
}
/*
* CanPushdownRecurringOuterJoin initializes input variables to call
* CanPushdownRecurringOuterJoinExtended.
* See CanPushdownRecurringOuterJoinExtended for more details.
*/
bool
CanPushdownRecurringOuterJoin(JoinExpr *joinExpr, Query *query)
{
int outerRtIndex;
RangeTblEntry *outerRte = NULL;
RangeTblEntry *innerRte = NULL;
int attnum;
return CanPushdownRecurringOuterJoinExtended(joinExpr, query, &outerRtIndex,
&outerRte, &innerRte, &attnum);
}

View File

@ -171,6 +171,14 @@ static bool FindQueryContainingRTEIdentityInternal(Node *node,
static int ParentCountPriorToAppendRel(List *appendRelList, AppendRelInfo *appendRelInfo);
static bool PartitionColumnSelectedForOuterJoin(Query *query,
RelationRestrictionContext *
restrictionContext,
JoinRestrictionContext *
joinRestrictionContext);
static bool PartitionColumnIsInTargetList(Query *query, JoinRestriction *joinRestriction,
RelationRestrictionContext *restrictionContext);
/*
* AllDistributionKeysInQueryAreEqual returns true if either
@ -391,6 +399,80 @@ SafeToPushdownUnionSubquery(Query *originalQuery,
return false;
}
if (!PartitionColumnSelectedForOuterJoin(originalQuery,
restrictionContext,
joinRestrictionContext))
{
/* outer join does not select partition column of outer relation */
return false;
}
return true;
}
/*
* PartitionColumnSelectedForOuterJoin checks whether the partition column of
* the outer relation is selected in the target list of the query.
*
* If there is no outer join, it returns true.
*/
static bool
PartitionColumnSelectedForOuterJoin(Query *query,
RelationRestrictionContext *restrictionContext,
JoinRestrictionContext *joinRestrictionContext)
{
ListCell *joinRestrictionCell;
foreach(joinRestrictionCell, joinRestrictionContext->joinRestrictionList)
{
JoinRestriction *joinRestriction = (JoinRestriction *) lfirst(
joinRestrictionCell);
/* The restriction context includes alternative plans, so it is sufficient to check for left joins. */
if (joinRestriction->joinType != JOIN_LEFT)
{
continue;
}
if (!PartitionColumnIsInTargetList(query, joinRestriction, restrictionContext))
{
/* outer join does not select partition column of outer relation */
return false;
}
}
return true;
}
/*
* PartitionColumnIsInTargetList checks whether the partition column of
* the given relation is included in the target list of the query.
*/
static bool
PartitionColumnIsInTargetList(Query *query, JoinRestriction *joinRestriction,
RelationRestrictionContext *restrictionContext)
{
Relids relids = joinRestriction->outerrelRelids;
int relationId = -1;
Index partitionKeyIndex = InvalidAttrNumber;
while ((relationId = bms_next_member(relids, relationId)) >= 0)
{
RangeTblEntry *rte = joinRestriction->plannerInfo->simple_rte_array[relationId];
if (rte->rtekind != RTE_RELATION)
{
/* skip if it is not a relation */
continue;
}
int targetRTEIndex = GetRTEIdentity(rte);
PartitionKeyForRTEIdentityInQuery(query, targetRTEIndex,
&partitionKeyIndex);
if (partitionKeyIndex == 0)
{
/* partition key is not in the target list */
return false;
}
}
return true;
}
@ -889,6 +971,40 @@ GetVarFromAssignedParam(List *outerPlanParamsList, Param *plannerParam,
}
}
#if PG_VERSION_NUM >= PG_VERSION_18
/*
* In PG18+, the dereferenced PARAM node could be a GroupVar if the
* query has a GROUP BY. In that case, we need to make an extra
* hop to get the underlying Var from the grouping expressions.
*/
if (assignedVar != NULL)
{
Query *parse = (*rootContainingVar)->parse;
if (parse->hasGroupRTE)
{
RangeTblEntry *rte = rt_fetch(assignedVar->varno, parse->rtable);
if (rte->rtekind == RTE_GROUP)
{
Assert(assignedVar->varattno >= 1 &&
assignedVar->varattno <= list_length(rte->groupexprs));
Node *groupVar = list_nth(rte->groupexprs, assignedVar->varattno - 1);
if (IsA(groupVar, Var))
{
assignedVar = (Var *) groupVar;
}
else
{
/* todo: handle PlaceHolderVar case if needed */
ereport(DEBUG2, (errmsg(
"GroupVar maps to non-Var group expr; bailing out")));
assignedVar = NULL;
}
}
}
}
#endif
return assignedVar;
}
@ -2349,7 +2465,7 @@ FilterJoinRestrictionContext(JoinRestrictionContext *joinRestrictionContext, Rel
/*
* RangeTableArrayContainsAnyRTEIdentities returns true if any of the range table entries
* int rangeTableEntries array is an range table relation specified in queryRteIdentities.
* in rangeTableEntries array is a range table relation specified in queryRteIdentities.
*/
static bool
RangeTableArrayContainsAnyRTEIdentities(RangeTblEntry **rangeTableEntries, int
@ -2362,6 +2478,18 @@ RangeTableArrayContainsAnyRTEIdentities(RangeTblEntry **rangeTableEntries, int
List *rangeTableRelationList = NULL;
ListCell *rteRelationCell = NULL;
#if PG_VERSION_NUM >= PG_VERSION_18
/*
* In PG18+, the planner's simple_rte_array may contain NULL entries
* for "dead relations". See PG commits 5f6f951 and e9a20e4 for details.
*/
if (rangeTableEntry == NULL)
{
continue;
}
#endif
/*
* Get list of all RTE_RELATIONs in the given range table entry
* (i.e.,rangeTableEntry could be a subquery where we're interested

View File

@ -85,6 +85,10 @@
#include "utils/ruleutils.h"
#include "pg_version_constants.h"
#if PG_VERSION_NUM >= PG_VERSION_18
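/*
* Same PG18 compatibility shim as in the planner: OpIndexInterpretation with
* "cmptype" replaces OpBtreeInterpretation with "strategy"; the field access
* below stays version-guarded.
*/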
typedef OpIndexInterpretation OpBtreeInterpretation;
#endif
#include "distributed/distributed_planner.h"
#include "distributed/listutils.h"
@ -1078,7 +1082,11 @@ IsValidPartitionKeyRestriction(OpExpr *opClause)
OpBtreeInterpretation *btreeInterpretation =
(OpBtreeInterpretation *) lfirst(btreeInterpretationCell);
#if PG_VERSION_NUM >= PG_VERSION_18
if (btreeInterpretation->cmptype == ROWCOMPARE_NE)
#else
if (btreeInterpretation->strategy == ROWCOMPARE_NE)
#endif
{
/* TODO: could add support for this, if we feel like it */
return false;
@ -1130,7 +1138,11 @@ AddPartitionKeyRestrictionToInstance(ClauseWalkerContext *context, OpExpr *opCla
OpBtreeInterpretation *btreeInterpretation =
(OpBtreeInterpretation *) lfirst(btreeInterpretationCell);
#if PG_VERSION_NUM >= PG_VERSION_18
switch (btreeInterpretation->cmptype)
#else
switch (btreeInterpretation->strategy)
#endif
{
case BTLessStrategyNumber:
{
@ -1299,7 +1311,11 @@ IsValidHashRestriction(OpExpr *opClause)
OpBtreeInterpretation *btreeInterpretation =
(OpBtreeInterpretation *) lfirst(btreeInterpretationCell);
#if PG_VERSION_NUM >= PG_VERSION_18
if (btreeInterpretation->cmptype == BTGreaterEqualStrategyNumber)
#else
if (btreeInterpretation->strategy == BTGreaterEqualStrategyNumber)
#endif
{
return true;
}

View File

@ -962,6 +962,7 @@ shard_name(PG_FUNCTION_ARGS)
Oid relationId = PG_GETARG_OID(0);
int64 shardId = PG_GETARG_INT64(1);
bool skipQualifyPublic = PG_GETARG_BOOL(2);
char *qualifiedName = NULL;
@ -991,7 +992,7 @@ shard_name(PG_FUNCTION_ARGS)
Oid schemaId = get_rel_namespace(relationId);
char *schemaName = get_namespace_name(schemaId);
if (strncmp(schemaName, "public", NAMEDATALEN) == 0)
if (skipQualifyPublic && strncmp(schemaName, "public", NAMEDATALEN) == 0)
{
qualifiedName = (char *) quote_identifier(relationName);
}
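
The hunk above only shows the branch that honors the new flag; at the SQL level a third `skip_qualify_public` argument is added to `shard_name()` (see the downgrade script later in this diff). A minimal usage sketch, with a hypothetical table and shard ID and the expected output shape rather than verified results:

```sql
-- Hypothetical distributed table public.my_table with shard 102008.
-- skip_qualify_public => true: shards of tables in "public" keep no schema prefix.
SELECT shard_name('my_table'::regclass, 102008, true);
--  my_table_102008

-- skip_qualify_public => false: the "public" schema is now kept as a prefix.
SELECT shard_name('my_table'::regclass, 102008, false);
--  public.my_table_102008
```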

View File

@ -118,7 +118,8 @@ static List * GetReplicaIdentityCommandListForShard(Oid relationId, uint64 shard
static List * GetIndexCommandListForShardBackingReplicaIdentity(Oid relationId,
uint64 shardId);
static void CreatePostLogicalReplicationDataLoadObjects(List *logicalRepTargetList,
LogicalRepType type);
LogicalRepType type,
bool skipInterShardRelationships);
static void ExecuteCreateIndexCommands(List *logicalRepTargetList);
static void ExecuteCreateConstraintsBackedByIndexCommands(List *logicalRepTargetList);
static List * ConvertNonExistingPlacementDDLCommandsToTasks(List *shardCommandList,
@ -132,7 +133,6 @@ static XLogRecPtr GetRemoteLSN(MultiConnection *connection, char *command);
static void WaitForMiliseconds(long timeout);
static XLogRecPtr GetSubscriptionPosition(
GroupedLogicalRepTargets *groupedLogicalRepTargets);
static void AcquireLogicalReplicationLock(void);
static HTAB * CreateShardMovePublicationInfoHash(WorkerNode *targetNode,
List *shardIntervals);
@ -154,9 +154,9 @@ static void WaitForGroupedLogicalRepTargetsToCatchUp(XLogRecPtr sourcePosition,
*/
void
LogicallyReplicateShards(List *shardList, char *sourceNodeName, int sourceNodePort,
char *targetNodeName, int targetNodePort)
char *targetNodeName, int targetNodePort,
bool skipInterShardRelationshipCreation)
{
AcquireLogicalReplicationLock();
char *superUser = CitusExtensionOwnerName();
char *databaseName = get_database_name(MyDatabaseId);
int connectionFlags = FORCE_NEW_CONNECTION;
@ -258,7 +258,8 @@ LogicallyReplicateShards(List *shardList, char *sourceNodeName, int sourceNodePo
publicationInfoHash,
logicalRepTargetList,
groupedLogicalRepTargetsHash,
SHARD_MOVE);
SHARD_MOVE,
skipInterShardRelationshipCreation);
/*
* We use these connections exclusively for subscription management,
@ -317,7 +318,8 @@ CompleteNonBlockingShardTransfer(List *shardList,
HTAB *publicationInfoHash,
List *logicalRepTargetList,
HTAB *groupedLogicalRepTargetsHash,
LogicalRepType type)
LogicalRepType type,
bool skipInterShardRelationshipCreation)
{
/* Start applying the changes from the replication slots to catch up. */
EnableSubscriptions(logicalRepTargetList);
@ -345,7 +347,8 @@ CompleteNonBlockingShardTransfer(List *shardList,
* and partitioning hierarchy. Once they are done, wait until the replication
* catches up again. So we don't block writes too long.
*/
CreatePostLogicalReplicationDataLoadObjects(logicalRepTargetList, type);
CreatePostLogicalReplicationDataLoadObjects(logicalRepTargetList, type,
skipInterShardRelationshipCreation);
UpdatePlacementUpdateStatusForShardIntervalList(
shardList,
@ -372,7 +375,7 @@ CompleteNonBlockingShardTransfer(List *shardList,
WaitForAllSubscriptionsToCatchUp(sourceConnection, groupedLogicalRepTargetsHash);
if (type != SHARD_SPLIT)
if (type != SHARD_SPLIT && !skipInterShardRelationshipCreation)
{
UpdatePlacementUpdateStatusForShardIntervalList(
shardList,
@ -497,25 +500,6 @@ CreateShardMoveLogicalRepTargetList(HTAB *publicationInfoHash, List *shardList)
}
/*
* AcquireLogicalReplicationLock tries to acquire a lock for logical
* replication. We need this lock, because at the start of logical replication
* we clean up old subscriptions and publications. Because of this cleanup it's
* not safe to run multiple logical replication based shard moves at the same
* time. If multiple logical replication moves would run at the same time, the
* second move might clean up subscriptions and publications that are in use by
* another move.
*/
static void
AcquireLogicalReplicationLock(void)
{
LOCKTAG tag;
SET_LOCKTAG_LOGICAL_REPLICATION(tag);
LockAcquire(&tag, ExclusiveLock, false, false);
}
/*
* PrepareReplicationSubscriptionList returns list of shards to be logically
* replicated from given shard list. This is needed because Postgres does not
@ -675,7 +659,8 @@ GetReplicaIdentityCommandListForShard(Oid relationId, uint64 shardId)
*/
static void
CreatePostLogicalReplicationDataLoadObjects(List *logicalRepTargetList,
LogicalRepType type)
LogicalRepType type,
bool skipInterShardRelationships)
{
/*
* We create indexes in 4 steps.
@ -705,7 +690,7 @@ CreatePostLogicalReplicationDataLoadObjects(List *logicalRepTargetList,
/*
* Creating the partitioning hierarchy errors out in shard splits when
*/
if (type != SHARD_SPLIT)
if (type != SHARD_SPLIT && !skipInterShardRelationships)
{
/* create partitioning hierarchy, if any */
CreatePartitioningHierarchy(logicalRepTargetList);

View File

@ -212,9 +212,11 @@ static const char * MaxSharedPoolSizeGucShowHook(void);
static const char * LocalPoolSizeGucShowHook(void);
static bool StatisticsCollectionGucCheckHook(bool *newval, void **extra, GucSource
source);
static bool WarnIfLocalExecutionDisabled(bool *newval, void **extra, GucSource source);
static void CitusAuthHook(Port *port, int status);
static bool IsSuperuser(char *userName);
static void AdjustDynamicLibraryPathForCdcDecoders(void);
static void EnableChangeDataCaptureAssignHook(bool newval, void *extra);
static ClientAuthentication_hook_type original_client_auth_hook = NULL;
static emit_log_hook_type original_emit_log_hook = NULL;
@ -383,6 +385,32 @@ static const struct config_enum_entry metadata_sync_mode_options[] = {
/* *INDENT-ON* */
/*----------------------------------------------------------------------*
* On PG 18+ the hook signature changed; we wrap the old Citus handler
* in a fresh function that matches the new typedef exactly.
*----------------------------------------------------------------------*/
static void
citus_executor_run_adapter(QueryDesc *queryDesc,
ScanDirection direction,
uint64 count
#if PG_VERSION_NUM < PG_VERSION_18
, bool run_once
#endif
)
{
/* PG18+ has no run_once flag */
CitusExecutorRun(queryDesc,
direction,
count,
#if PG_VERSION_NUM >= PG_VERSION_18
true
#else
run_once
#endif
);
}
/* shared library initialization function */
void
_PG_init(void)
@ -458,7 +486,7 @@ _PG_init(void)
get_relation_info_hook = multi_get_relation_info_hook;
set_join_pathlist_hook = multi_join_restriction_hook;
ExecutorStart_hook = CitusExecutorStart;
ExecutorRun_hook = CitusExecutorRun;
ExecutorRun_hook = citus_executor_run_adapter;
ExplainOneQuery_hook = CitusExplainOneQuery;
prev_ExecutorEnd = ExecutorEnd_hook;
ExecutorEnd_hook = CitusAttributeToEnd;
@ -1272,7 +1300,7 @@ RegisterCitusConfigVariables(void)
false,
PGC_USERSET,
GUC_STANDARD,
NULL, NULL, NULL);
NULL, EnableChangeDataCaptureAssignHook, NULL);
DefineCustomBoolVariable(
"citus.enable_cluster_clock",
@ -1377,6 +1405,17 @@ RegisterCitusConfigVariables(void)
GUC_STANDARD,
NULL, NULL, NULL);
DefineCustomBoolVariable(
"citus.enable_local_fast_path_query_optimization",
gettext_noop("Enables the planner to avoid a query deparse and planning if "
"the shard is local to the current node."),
NULL,
&EnableLocalFastPathQueryOptimization,
true,
PGC_USERSET,
GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
WarnIfLocalExecutionDisabled, NULL, NULL);
DefineCustomBoolVariable(
"citus.enable_local_reference_table_foreign_keys",
gettext_noop("Enables foreign keys from/to local tables"),
@ -1441,6 +1480,38 @@ RegisterCitusConfigVariables(void)
GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
NULL, NULL, NULL);
DefineCustomBoolVariable(
"citus.enable_outer_joins_with_pseudoconstant_quals_pre_pg17",
gettext_noop("Enables running distributed queries with outer joins "
"and pseudoconstant quals pre PG17."),
gettext_noop("Set to false by default. If set to true, enables "
"running distributed queries with outer joins and "
"pseudoconstant quals, at user's own risk, because "
"pre PG17, Citus doesn't have access to "
"set_join_pathlist_hook, which doesn't guarantee correct"
"query results. Note that in PG17+, this GUC has no effect"
"and the user can run such queries"),
&EnableOuterJoinsWithPseudoconstantQualsPrePG17,
false,
PGC_USERSET,
GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
NULL, NULL, NULL);
DefineCustomBoolVariable(
"citus.enable_recurring_outer_join_pushdown",
gettext_noop("Enables outer join pushdown for recurring relations."),
gettext_noop("When enabled, Citus will try to push down outer joins "
"between recurring and non-recurring relations to workers "
"whenever feasible by introducing correctness constraints "
"to the where clause of the query. Note that if this is "
"disabled, or push down is not feasible, the result will "
"be computed via recursive planning."),
&EnableRecurringOuterJoinPushdown,
true,
PGC_USERSET,
GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
NULL, NULL, NULL);
DefineCustomBoolVariable(
"citus.enable_repartition_joins",
gettext_noop("Allows Citus to repartition data between nodes."),
@ -1913,7 +1984,7 @@ RegisterCitusConfigVariables(void)
"because total background worker count is shared by all background workers. The value "
"represents the possible maximum number of task executors."),
&MaxBackgroundTaskExecutors,
4, 1, MAX_BG_TASK_EXECUTORS,
1, 1, MAX_BG_TASK_EXECUTORS,
PGC_SIGHUP,
GUC_STANDARD,
NULL, NULL, NULL);
@ -2439,8 +2510,8 @@ RegisterCitusConfigVariables(void)
NULL,
&SkipAdvisoryLockPermissionChecks,
false,
GUC_SUPERUSER_ONLY,
GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
PGC_SUSET,
GUC_SUPERUSER_ONLY | GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE,
NULL, NULL, NULL);
DefineCustomBoolVariable(
@ -2802,6 +2873,26 @@ WarnIfDeprecatedExecutorUsed(int *newval, void **extra, GucSource source)
}
/*
* WarnIfLocalExecutionDisabled is used to emit a warning message when
* enabling citus.enable_local_fast_path_query_optimization if
* citus.enable_local_execution was disabled.
*/
static bool
WarnIfLocalExecutionDisabled(bool *newval, void **extra, GucSource source)
{
if (*newval == true && EnableLocalExecution == false)
{
ereport(WARNING, (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg(
"citus.enable_local_execution must be set in order for "
"citus.enable_local_fast_path_query_optimization to be effective.")));
}
return true;
}
/*
* NoticeIfSubqueryPushdownEnabled prints a notice when a user sets
* citus.subquery_pushdown to ON. It doesn't print the notice if the
@ -3272,3 +3363,19 @@ CitusObjectAccessHook(ObjectAccessType access, Oid classId, Oid objectId, int su
SetCreateCitusTransactionLevel(GetCurrentTransactionNestLevel());
}
}
/*
* EnableChangeDataCaptureAssignHook is called whenever the
* citus.enable_change_data_capture setting is changed to dynamically
* adjust the dynamic_library_path based on the new value.
*/
static void
EnableChangeDataCaptureAssignHook(bool newval, void *extra)
{
if (newval)
{
/* CDC enabled: add citus_decoders to the path */
AdjustDynamicLibraryPathForCdcDecoders();
}
}
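
An earlier hunk in this file wires EnableChangeDataCaptureAssignHook into the citus.enable_change_data_capture GUC definition (a PGC_USERSET GUC per that hunk's context), so toggling the setting at runtime should adjust dynamic_library_path. A rough illustration; the resulting path value is installation-specific and shown only as a shape, not verified output:

```sql
SET citus.enable_change_data_capture TO on;
SHOW dynamic_library_path;  -- expected to now include the citus_decoders directory
```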

View File

@ -0,0 +1,7 @@
-- Add replica information columns to pg_dist_node
ALTER TABLE pg_catalog.pg_dist_node ADD COLUMN nodeisclone BOOLEAN NOT NULL DEFAULT FALSE;
ALTER TABLE pg_catalog.pg_dist_node ADD COLUMN nodeprimarynodeid INT4 NOT NULL DEFAULT 0;
-- Add a comment to the table and columns for clarity in \d output
COMMENT ON COLUMN pg_catalog.pg_dist_node.nodeisclone IS 'Indicates if this node is a replica of another node.';
COMMENT ON COLUMN pg_catalog.pg_dist_node.nodeprimarynodeid IS 'If nodeisclone is true, this stores the nodeid of its primary node.';
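
A small illustrative query over the new columns (purely a sketch; the pre-existing pg_dist_node columns referenced here are unchanged):

```sql
-- List nodes registered as clones together with the nodeid of their primary.
SELECT nodeid, nodename, nodeport, nodeprimarynodeid
FROM pg_catalog.pg_dist_node
WHERE nodeisclone;
```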

View File

@ -0,0 +1,3 @@
-- Remove clone information columns from pg_dist_node
ALTER TABLE pg_catalog.pg_dist_node DROP COLUMN IF EXISTS nodeisclone;
ALTER TABLE pg_catalog.pg_dist_node DROP COLUMN IF EXISTS nodeprimarynodeid;

View File

@ -50,3 +50,11 @@ DROP VIEW IF EXISTS pg_catalog.citus_lock_waits;
#include "udfs/citus_is_primary_node/13.1-1.sql"
#include "udfs/citus_stat_counters/13.1-1.sql"
#include "udfs/citus_stat_counters_reset/13.1-1.sql"
#include "udfs/citus_nodes/13.1-1.sql"
-- Since shard_name/13.1-1.sql first drops the function and then creates it, we first
-- need to drop citus_shards view since that view depends on this function. And immediately
-- after creating the function, we recreate citus_shards view again.
DROP VIEW pg_catalog.citus_shards;
#include "udfs/shard_name/13.1-1.sql"
#include "udfs/citus_shards/12.0-1.sql"

View File

@ -0,0 +1,84 @@
-- citus--13.1-1--13.2-1
-- bump version to 13.2-1
#include "udfs/worker_last_saved_explain_analyze/13.2-1.sql"
#include "cat_upgrades/add_clone_info_to_pg_dist_node.sql"
#include "udfs/citus_add_clone_node/13.2-1.sql"
#include "udfs/citus_remove_clone_node/13.2-1.sql"
#include "udfs/citus_promote_clone_and_rebalance/13.2-1.sql"
#include "udfs/get_snapshot_based_node_split_plan/13.2-1.sql"
#include "udfs/citus_rebalance_start/13.2-1.sql"
#include "udfs/citus_internal_copy_single_shard_placement/13.2-1.sql"
#include "udfs/citus_finish_pg_upgrade/13.2-1.sql"
#include "udfs/citus_stats/13.2-1.sql"
DO $drop_leftover_old_columnar_objects$
BEGIN
-- If old columnar exists, i.e., the columnar access method that we had before Citus 11.1,
-- and we don't have any relations using the old columnar, then we want to drop the columnar
-- objects. This is because we don't want to automatically create the "citus_columnar"
-- extension together with the "citus" extension anymore. And for the cases where we don't
-- want to automatically create the "citus_columnar" extension, there is no point in keeping
-- the columnar objects that we had before Citus 11.1 around.
IF (
SELECT EXISTS (
SELECT 1 FROM pg_am
WHERE
-- looking for an access method whose name is "columnar" ..
pg_am.amname = 'columnar' AND
-- .. and there should *NOT* be such a dependency edge in pg_depend, where ..
NOT EXISTS (
SELECT 1 FROM pg_depend
WHERE
-- .. the depender is columnar access method (2601 = access method class) ..
pg_depend.classid = 2601 AND pg_depend.objid = pg_am.oid AND pg_depend.objsubid = 0 AND
-- .. and the dependee is an extension (3079 = extension class)
pg_depend.refclassid = 3079 AND pg_depend.refobjsubid = 0
LIMIT 1
) AND
-- .. and there should *NOT* be any relations using it
NOT EXISTS (
SELECT 1
FROM pg_class
WHERE pg_class.relam = pg_am.oid
LIMIT 1
)
)
)
THEN
-- Below we drop the columnar objects in such an order that the objects that depend on
-- other objects are dropped first.
DROP VIEW IF EXISTS columnar.options;
DROP VIEW IF EXISTS columnar.stripe;
DROP VIEW IF EXISTS columnar.chunk_group;
DROP VIEW IF EXISTS columnar.chunk;
DROP VIEW IF EXISTS columnar.storage;
DROP ACCESS METHOD IF EXISTS columnar;
DROP SEQUENCE IF EXISTS columnar_internal.storageid_seq;
DROP TABLE IF EXISTS columnar_internal.options;
DROP TABLE IF EXISTS columnar_internal.stripe;
DROP TABLE IF EXISTS columnar_internal.chunk_group;
DROP TABLE IF EXISTS columnar_internal.chunk;
DROP FUNCTION IF EXISTS columnar_internal.columnar_handler;
DROP FUNCTION IF EXISTS pg_catalog.alter_columnar_table_set;
DROP FUNCTION IF EXISTS pg_catalog.alter_columnar_table_reset;
DROP FUNCTION IF EXISTS columnar.get_storage_id;
DROP FUNCTION IF EXISTS citus_internal.upgrade_columnar_storage;
DROP FUNCTION IF EXISTS citus_internal.downgrade_columnar_storage;
DROP FUNCTION IF EXISTS citus_internal.columnar_ensure_am_depends_catalog;
DROP SCHEMA IF EXISTS columnar;
DROP SCHEMA IF EXISTS columnar_internal;
END IF;
END $drop_leftover_old_columnar_objects$;

View File

@ -0,0 +1,5 @@
-- citus--13.2-1--14.0-1
-- bump version to 14.0-1
#include "udfs/citus_prepare_pg_upgrade/14.0-1.sql"
#include "udfs/citus_finish_pg_upgrade/14.0-1.sql"

View File

@ -45,3 +45,25 @@ DROP FUNCTION citus_internal.is_replication_origin_tracking_active();
DROP VIEW pg_catalog.citus_stat_counters;
DROP FUNCTION pg_catalog.citus_stat_counters(oid);
DROP FUNCTION pg_catalog.citus_stat_counters_reset(oid);
DROP VIEW IF EXISTS pg_catalog.citus_nodes;
-- Definition of shard_name() prior to this release doesn't have a separate SQL file
-- because it's quite an old UDF whose prior definition(s) was (were) squashed into
-- citus--8.0-1.sql. For this reason, to downgrade it, here we directly execute its old
-- definition instead of including it from such a separate file.
--
-- And before dropping and creating the function, we also need to drop citus_shards view
-- since it depends on it. And immediately after creating the function, we recreate
-- citus_shards view again.
DROP VIEW pg_catalog.citus_shards;
DROP FUNCTION pg_catalog.shard_name(object_name regclass, shard_id bigint, skip_qualify_public boolean);
CREATE FUNCTION pg_catalog.shard_name(object_name regclass, shard_id bigint)
RETURNS text
LANGUAGE C STABLE STRICT
AS 'MODULE_PATHNAME', $$shard_name$$;
COMMENT ON FUNCTION pg_catalog.shard_name(object_name regclass, shard_id bigint)
IS 'returns schema-qualified, shard-extended identifier of object name';
#include "../udfs/citus_shards/12.0-1.sql"

View File

@ -0,0 +1,29 @@
-- citus--13.2-1--13.1-1
-- downgrade version to 13.1-1
DROP FUNCTION IF EXISTS citus_internal.citus_internal_copy_single_shard_placement(bigint, integer, integer, integer, citus.shard_transfer_mode);
DROP FUNCTION IF EXISTS pg_catalog.citus_rebalance_start(name, boolean, citus.shard_transfer_mode, boolean, boolean);
#include "../udfs/citus_rebalance_start/11.1-1.sql"
DROP FUNCTION IF EXISTS pg_catalog.worker_last_saved_explain_analyze();
#include "../udfs/worker_last_saved_explain_analyze/9.4-1.sql"
DROP FUNCTION IF EXISTS pg_catalog.citus_add_clone_node(text, integer, text, integer);
DROP FUNCTION IF EXISTS pg_catalog.citus_add_clone_node_with_nodeid(text, integer, integer);
DROP FUNCTION IF EXISTS pg_catalog.citus_remove_clone_node(text, integer);
DROP FUNCTION IF EXISTS pg_catalog.citus_remove_clone_node_with_nodeid(integer);
DROP FUNCTION IF EXISTS pg_catalog.citus_promote_clone_and_rebalance(integer, name, integer);
DROP FUNCTION IF EXISTS pg_catalog.get_snapshot_based_node_split_plan(text, integer, text, integer, name);
#include "../cat_upgrades/remove_clone_info_to_pg_dist_node.sql"
#include "../udfs/citus_finish_pg_upgrade/13.1-1.sql"
-- Note that we intentionally don't add the old columnar objects back to the "citus"
-- extension in this downgrade script, even if they were present in the older version.
--
-- If the user wants to create "citus_columnar" extension later, "citus_columnar"
-- will anyway properly create them at the scope of that extension.
DROP VIEW IF EXISTS pg_catalog.citus_stats;

View File

@ -0,0 +1,5 @@
-- citus--14.0-1--13.2-1
-- downgrade version to 13.2-1
#include "../udfs/citus_prepare_pg_upgrade/13.0-1.sql"
#include "../udfs/citus_finish_pg_upgrade/13.2-1.sql"

View File

@ -0,0 +1,26 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_add_clone_node(
replica_hostname text,
replica_port integer,
primary_hostname text,
primary_port integer)
RETURNS INTEGER
LANGUAGE C VOLATILE STRICT
AS 'MODULE_PATHNAME', $$citus_add_clone_node$$;
COMMENT ON FUNCTION pg_catalog.citus_add_clone_node(text, integer, text, integer) IS
'Adds a new node as a clone of an existing primary node. The clone is initially inactive. Returns the nodeid of the new clone node.';
REVOKE ALL ON FUNCTION pg_catalog.citus_add_clone_node(text, int, text, int) FROM PUBLIC;
CREATE OR REPLACE FUNCTION pg_catalog.citus_add_clone_node_with_nodeid(
replica_hostname text,
replica_port integer,
primary_nodeid integer)
RETURNS INTEGER
LANGUAGE C VOLATILE STRICT
AS 'MODULE_PATHNAME', $$citus_add_clone_node_with_nodeid$$;
COMMENT ON FUNCTION pg_catalog.citus_add_clone_node_with_nodeid(text, integer, integer) IS
'Adds a new node as a clone of an existing primary node using the primary node''s ID. The clone is initially inactive. Returns the nodeid of the new clone node.';
REVOKE ALL ON FUNCTION pg_catalog.citus_add_clone_node_with_nodeid(text, int, int) FROM PUBLIC;

View File

@ -0,0 +1,26 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_add_clone_node(
replica_hostname text,
replica_port integer,
primary_hostname text,
primary_port integer)
RETURNS INTEGER
LANGUAGE C VOLATILE STRICT
AS 'MODULE_PATHNAME', $$citus_add_clone_node$$;
COMMENT ON FUNCTION pg_catalog.citus_add_clone_node(text, integer, text, integer) IS
'Adds a new node as a clone of an existing primary node. The clone is initially inactive. Returns the nodeid of the new clone node.';
REVOKE ALL ON FUNCTION pg_catalog.citus_add_clone_node(text, int, text, int) FROM PUBLIC;
CREATE OR REPLACE FUNCTION pg_catalog.citus_add_clone_node_with_nodeid(
replica_hostname text,
replica_port integer,
primary_nodeid integer)
RETURNS INTEGER
LANGUAGE C VOLATILE STRICT
AS 'MODULE_PATHNAME', $$citus_add_clone_node_with_nodeid$$;
COMMENT ON FUNCTION pg_catalog.citus_add_clone_node_with_nodeid(text, integer, integer) IS
'Adds a new node as a clone of an existing primary node using the primary node''s ID. The clone is initially inactive. Returns the nodeid of the new clone node.';
REVOKE ALL ON FUNCTION pg_catalog.citus_add_clone_node_with_nodeid(text, int, int) FROM PUBLIC;
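
A hedged usage sketch for the two UDFs above (host names, ports, and the primary's node id are placeholders; operational prerequisites for the clone are not covered here):

```sql
-- Register a clone of the primary at primary-host:5432, addressed by host/port;
-- returns the nodeid of the new clone node.
SELECT citus_add_clone_node('replica-host', 5432, 'primary-host', 5432);

-- Same registration, addressing the primary by its node id (e.g. 2).
SELECT citus_add_clone_node_with_nodeid('replica-host', 5432, 2);
```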

View File

@ -0,0 +1,260 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_finish_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
DECLARE
table_name regclass;
command text;
trigger_name text;
BEGIN
IF substring(current_Setting('server_version'), '\d+')::int >= 14 THEN
EXECUTE $cmd$
-- disable propagation to prevent EnsureCoordinator errors
-- the aggregate created here does not depend on Citus extension (yet)
-- since we add the dependency with the next command
SET citus.enable_ddl_propagation TO OFF;
CREATE AGGREGATE array_cat_agg(anycompatiblearray) (SFUNC = array_cat, STYPE = anycompatiblearray);
COMMENT ON AGGREGATE array_cat_agg(anycompatiblearray)
IS 'concatenate input arrays into a single array';
RESET citus.enable_ddl_propagation;
$cmd$;
ELSE
EXECUTE $cmd$
SET citus.enable_ddl_propagation TO OFF;
CREATE AGGREGATE array_cat_agg(anyarray) (SFUNC = array_cat, STYPE = anyarray);
COMMENT ON AGGREGATE array_cat_agg(anyarray)
IS 'concatenate input arrays into a single array';
RESET citus.enable_ddl_propagation;
$cmd$;
END IF;
--
-- Citus creates the array_cat_agg but because of a compatibility
-- issue between pg13-pg14, we drop and create it during upgrade.
-- And as Citus creates it, there needs to be a dependency to the
-- Citus extension, so we create that dependency here.
-- We are not using:
-- ALTER EXTENSION citus DROP/CREATE AGGREGATE array_cat_agg
-- because we don't have an easy way to check if the aggregate
-- exists with anyarray type or anycompatiblearray type.
INSERT INTO pg_depend
SELECT
'pg_proc'::regclass::oid as classid,
(SELECT oid FROM pg_proc WHERE proname = 'array_cat_agg') as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'e' as deptype;
-- PG16 has its own any_value, so only create it pre PG16.
-- We can remove this part when we drop support for PG16
IF substring(current_Setting('server_version'), '\d+')::int < 16 THEN
EXECUTE $cmd$
-- disable propagation to prevent EnsureCoordinator errors
-- the aggregate created here does not depend on Citus extension (yet)
-- since we add the dependency with the next command
SET citus.enable_ddl_propagation TO OFF;
CREATE OR REPLACE FUNCTION pg_catalog.any_value_agg ( anyelement, anyelement )
RETURNS anyelement AS $$
SELECT CASE WHEN $1 IS NULL THEN $2 ELSE $1 END;
$$ LANGUAGE SQL STABLE;
CREATE AGGREGATE pg_catalog.any_value (
sfunc = pg_catalog.any_value_agg,
combinefunc = pg_catalog.any_value_agg,
basetype = anyelement,
stype = anyelement
);
COMMENT ON AGGREGATE pg_catalog.any_value(anyelement) IS
'Returns the value of any row in the group. It is mostly useful when you know there will be only 1 element.';
RESET citus.enable_ddl_propagation;
--
-- Citus creates the any_value aggregate but because of a compatibility
-- issue between pg15-pg16 -- any_value is created in PG16, we drop
-- and create it during upgrade IF upgraded version is less than 16.
-- And as Citus creates it, there needs to be a dependency to the
-- Citus extension, so we create that dependency here.
INSERT INTO pg_depend
SELECT
'pg_proc'::regclass::oid as classid,
(SELECT oid FROM pg_proc WHERE proname = 'any_value_agg') as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'e' as deptype;
INSERT INTO pg_depend
SELECT
'pg_proc'::regclass::oid as classid,
(SELECT oid FROM pg_proc WHERE proname = 'any_value') as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'e' as deptype;
$cmd$;
END IF;
--
-- restore citus catalog tables
--
INSERT INTO pg_catalog.pg_dist_partition SELECT * FROM public.pg_dist_partition;
-- if we are upgrading from PG14/PG15 to PG16+,
-- we need to regenerate the partkeys because they will include varnullingrels as well.
UPDATE pg_catalog.pg_dist_partition
SET partkey = column_name_to_column(pg_dist_partkeys_pre_16_upgrade.logicalrelid, col_name)
FROM public.pg_dist_partkeys_pre_16_upgrade
WHERE pg_dist_partkeys_pre_16_upgrade.logicalrelid = pg_dist_partition.logicalrelid;
DROP TABLE public.pg_dist_partkeys_pre_16_upgrade;
INSERT INTO pg_catalog.pg_dist_shard SELECT * FROM public.pg_dist_shard;
INSERT INTO pg_catalog.pg_dist_placement SELECT * FROM public.pg_dist_placement;
INSERT INTO pg_catalog.pg_dist_node_metadata SELECT * FROM public.pg_dist_node_metadata;
INSERT INTO pg_catalog.pg_dist_node SELECT * FROM public.pg_dist_node;
INSERT INTO pg_catalog.pg_dist_local_group SELECT * FROM public.pg_dist_local_group;
INSERT INTO pg_catalog.pg_dist_transaction SELECT * FROM public.pg_dist_transaction;
INSERT INTO pg_catalog.pg_dist_colocation SELECT * FROM public.pg_dist_colocation;
INSERT INTO pg_catalog.pg_dist_cleanup SELECT * FROM public.pg_dist_cleanup;
INSERT INTO pg_catalog.pg_dist_schema SELECT schemaname::regnamespace, colocationid FROM public.pg_dist_schema;
-- enterprise catalog tables
INSERT INTO pg_catalog.pg_dist_authinfo SELECT * FROM public.pg_dist_authinfo;
INSERT INTO pg_catalog.pg_dist_poolinfo SELECT * FROM public.pg_dist_poolinfo;
-- Temporarily disable trigger to check for validity of functions while
-- inserting. The current contents of the table might be invalid if one of
-- the functions was removed by the user without also removing the
-- rebalance strategy. Obviously that's not great, but it should be no
-- reason to fail the upgrade.
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy DISABLE TRIGGER pg_dist_rebalance_strategy_validation_trigger;
INSERT INTO pg_catalog.pg_dist_rebalance_strategy SELECT
name,
default_strategy,
shard_cost_function::regprocedure::regproc,
node_capacity_function::regprocedure::regproc,
shard_allowed_on_node_function::regprocedure::regproc,
default_threshold,
minimum_threshold,
improvement_threshold
FROM public.pg_dist_rebalance_strategy;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy ENABLE TRIGGER pg_dist_rebalance_strategy_validation_trigger;
--
-- drop backup tables
--
DROP TABLE public.pg_dist_authinfo;
DROP TABLE public.pg_dist_colocation;
DROP TABLE public.pg_dist_local_group;
DROP TABLE public.pg_dist_node;
DROP TABLE public.pg_dist_node_metadata;
DROP TABLE public.pg_dist_partition;
DROP TABLE public.pg_dist_placement;
DROP TABLE public.pg_dist_poolinfo;
DROP TABLE public.pg_dist_shard;
DROP TABLE public.pg_dist_transaction;
DROP TABLE public.pg_dist_rebalance_strategy;
DROP TABLE public.pg_dist_cleanup;
DROP TABLE public.pg_dist_schema;
--
-- reset sequences
--
PERFORM setval('pg_catalog.pg_dist_shardid_seq', (SELECT MAX(shardid)+1 AS max_shard_id FROM pg_dist_shard), false);
PERFORM setval('pg_catalog.pg_dist_placement_placementid_seq', (SELECT MAX(placementid)+1 AS max_placement_id FROM pg_dist_placement), false);
PERFORM setval('pg_catalog.pg_dist_groupid_seq', (SELECT MAX(groupid)+1 AS max_group_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_node_nodeid_seq', (SELECT MAX(nodeid)+1 AS max_node_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_colocationid_seq', (SELECT MAX(colocationid)+1 AS max_colocation_id FROM pg_dist_colocation), false);
PERFORM setval('pg_catalog.pg_dist_operationid_seq', (SELECT MAX(operation_id)+1 AS max_operation_id FROM pg_dist_cleanup), false);
PERFORM setval('pg_catalog.pg_dist_cleanup_recordid_seq', (SELECT MAX(record_id)+1 AS max_record_id FROM pg_dist_cleanup), false);
PERFORM setval('pg_catalog.pg_dist_clock_logical_seq', (SELECT last_value FROM public.pg_dist_clock_logical_seq), false);
DROP TABLE public.pg_dist_clock_logical_seq;
--
-- register triggers
--
FOR table_name IN SELECT logicalrelid FROM pg_catalog.pg_dist_partition JOIN pg_class ON (logicalrelid = oid) WHERE relkind <> 'f'
LOOP
trigger_name := 'truncate_trigger_' || table_name::oid;
command := 'create trigger ' || trigger_name || ' after truncate on ' || table_name || ' execute procedure pg_catalog.citus_truncate_trigger()';
EXECUTE command;
command := 'update pg_trigger set tgisinternal = true where tgname = ' || quote_literal(trigger_name);
EXECUTE command;
END LOOP;
--
-- set dependencies
--
INSERT INTO pg_depend
SELECT
'pg_class'::regclass::oid as classid,
p.logicalrelid::regclass::oid as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'n' as deptype
FROM pg_catalog.pg_dist_partition p;
-- If citus_columnar extension exists, then perform the post PG-upgrade work for columnar as well.
--
-- First look if pg_catalog.columnar_finish_pg_upgrade function exists as part of the citus_columnar
-- extension. (We check whether it's part of the extension just for security reasons). If it does, then
-- call it. If not, then look for the columnar_internal.columnar_ensure_am_depends_catalog function, again as
-- part of the citus_columnar extension. If so, then call it. We alternatively check for the latter UDF
-- just because pg_catalog.columnar_finish_pg_upgrade function is introduced in citus_columnar 13.2-1
-- and as of today all it does is to call columnar_internal.columnar_ensure_am_depends_catalog function.
IF EXISTS (
SELECT 1 FROM pg_depend
JOIN pg_proc ON (pg_depend.objid = pg_proc.oid)
JOIN pg_namespace ON (pg_proc.pronamespace = pg_namespace.oid)
JOIN pg_extension ON (pg_depend.refobjid = pg_extension.oid)
WHERE
-- Looking if pg_catalog.columnar_finish_pg_upgrade function exists and
-- if there is a dependency record from it (proc class = 1255) ..
pg_depend.classid = 1255 AND pg_namespace.nspname = 'pg_catalog' AND pg_proc.proname = 'columnar_finish_pg_upgrade' AND
-- .. to citus_columnar extension (3079 = extension class), if it exists.
pg_depend.refclassid = 3079 AND pg_extension.extname = 'citus_columnar'
)
THEN PERFORM pg_catalog.columnar_finish_pg_upgrade();
ELSIF EXISTS (
SELECT 1 FROM pg_depend
JOIN pg_proc ON (pg_depend.objid = pg_proc.oid)
JOIN pg_namespace ON (pg_proc.pronamespace = pg_namespace.oid)
JOIN pg_extension ON (pg_depend.refobjid = pg_extension.oid)
WHERE
-- Looking if columnar_internal.columnar_ensure_am_depends_catalog function exists and
-- if there is a dependency record from it (proc class = 1255) ..
pg_depend.classid = 1255 AND pg_namespace.nspname = 'columnar_internal' AND pg_proc.proname = 'columnar_ensure_am_depends_catalog' AND
-- .. to citus_columnar extension (3079 = extension class), if it exists.
pg_depend.refclassid = 3079 AND pg_extension.extname = 'citus_columnar'
)
THEN PERFORM columnar_internal.columnar_ensure_am_depends_catalog();
END IF;
-- restore pg_dist_object from the stable identifiers
TRUNCATE pg_catalog.pg_dist_object;
INSERT INTO pg_catalog.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
address.objid,
address.objsubid,
naming.distribution_argument_index,
naming.colocationid
FROM
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_finish_pg_upgrade()
IS 'perform tasks to restore citus settings from a location that has been prepared before pg_upgrade';

View File

@ -0,0 +1,268 @@
CREATE OR REPLACE FUNCTION pg_catalog.citus_finish_pg_upgrade()
RETURNS void
LANGUAGE plpgsql
SET search_path = pg_catalog
AS $cppu$
DECLARE
table_name regclass;
command text;
trigger_name text;
BEGIN
IF substring(current_Setting('server_version'), '\d+')::int >= 14 THEN
EXECUTE $cmd$
-- disable propagation to prevent EnsureCoordinator errors
-- the aggregate created here does not depend on Citus extension (yet)
-- since we add the dependency with the next command
SET citus.enable_ddl_propagation TO OFF;
CREATE AGGREGATE array_cat_agg(anycompatiblearray) (SFUNC = array_cat, STYPE = anycompatiblearray);
COMMENT ON AGGREGATE array_cat_agg(anycompatiblearray)
IS 'concatenate input arrays into a single array';
RESET citus.enable_ddl_propagation;
$cmd$;
ELSE
EXECUTE $cmd$
SET citus.enable_ddl_propagation TO OFF;
CREATE AGGREGATE array_cat_agg(anyarray) (SFUNC = array_cat, STYPE = anyarray);
COMMENT ON AGGREGATE array_cat_agg(anyarray)
IS 'concatenate input arrays into a single array';
RESET citus.enable_ddl_propagation;
$cmd$;
END IF;
--
-- Citus creates the array_cat_agg but because of a compatibility
-- issue between pg13-pg14, we drop and create it during upgrade.
-- And as Citus creates it, there needs to be a dependency to the
-- Citus extension, so we create that dependency here.
-- We are not using:
-- ALTER EXTENSION citus DROP/CREATE AGGREGATE array_cat_agg
-- because we don't have an easy way to check if the aggregate
-- exists with anyarray type or anycompatiblearray type.
INSERT INTO pg_depend
SELECT
'pg_proc'::regclass::oid as classid,
(SELECT oid FROM pg_proc WHERE proname = 'array_cat_agg') as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'e' as deptype;
-- PG16 has its own any_value, so only create it pre PG16.
-- We can remove this part when we drop support for PG16
IF substring(current_Setting('server_version'), '\d+')::int < 16 THEN
EXECUTE $cmd$
-- disable propagation to prevent EnsureCoordinator errors
-- the aggregate created here does not depend on Citus extension (yet)
-- since we add the dependency with the next command
SET citus.enable_ddl_propagation TO OFF;
CREATE OR REPLACE FUNCTION pg_catalog.any_value_agg ( anyelement, anyelement )
RETURNS anyelement AS $$
SELECT CASE WHEN $1 IS NULL THEN $2 ELSE $1 END;
$$ LANGUAGE SQL STABLE;
CREATE AGGREGATE pg_catalog.any_value (
sfunc = pg_catalog.any_value_agg,
combinefunc = pg_catalog.any_value_agg,
basetype = anyelement,
stype = anyelement
);
COMMENT ON AGGREGATE pg_catalog.any_value(anyelement) IS
'Returns the value of any row in the group. It is mostly useful when you know there will be only 1 element.';
RESET citus.enable_ddl_propagation;
--
-- Citus creates the any_value aggregate but because of a compatibility
-- issue between pg15-pg16 -- any_value is created in PG16, we drop
-- and create it during upgrade IF upgraded version is less than 16.
-- And as Citus creates it, there needs to be a dependency to the
-- Citus extension, so we create that dependency here.
INSERT INTO pg_depend
SELECT
'pg_proc'::regclass::oid as classid,
(SELECT oid FROM pg_proc WHERE proname = 'any_value_agg') as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'e' as deptype;
INSERT INTO pg_depend
SELECT
'pg_proc'::regclass::oid as classid,
(SELECT oid FROM pg_proc WHERE proname = 'any_value') as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'e' as deptype;
$cmd$;
END IF;
--
-- restore citus catalog tables
--
INSERT INTO pg_catalog.pg_dist_partition SELECT * FROM public.pg_dist_partition;
-- if we are upgrading from PG14/PG15 to PG16+,
-- we need to regenerate the partkeys because they will include varnullingrels as well.
UPDATE pg_catalog.pg_dist_partition
SET partkey = column_name_to_column(pg_dist_partkeys_pre_16_upgrade.logicalrelid, col_name)
FROM public.pg_dist_partkeys_pre_16_upgrade
WHERE pg_dist_partkeys_pre_16_upgrade.logicalrelid = pg_dist_partition.logicalrelid;
DROP TABLE public.pg_dist_partkeys_pre_16_upgrade;
-- if we are upgrading to PG18+,
-- we need to regenerate the partkeys because they will include varreturningtype as well.
UPDATE pg_catalog.pg_dist_partition
SET partkey = column_name_to_column(pg_dist_partkeys_pre_18_upgrade.logicalrelid, col_name)
FROM public.pg_dist_partkeys_pre_18_upgrade
WHERE pg_dist_partkeys_pre_18_upgrade.logicalrelid = pg_dist_partition.logicalrelid;
DROP TABLE public.pg_dist_partkeys_pre_18_upgrade;
INSERT INTO pg_catalog.pg_dist_shard SELECT * FROM public.pg_dist_shard;
INSERT INTO pg_catalog.pg_dist_placement SELECT * FROM public.pg_dist_placement;
INSERT INTO pg_catalog.pg_dist_node_metadata SELECT * FROM public.pg_dist_node_metadata;
INSERT INTO pg_catalog.pg_dist_node SELECT * FROM public.pg_dist_node;
INSERT INTO pg_catalog.pg_dist_local_group SELECT * FROM public.pg_dist_local_group;
INSERT INTO pg_catalog.pg_dist_transaction SELECT * FROM public.pg_dist_transaction;
INSERT INTO pg_catalog.pg_dist_colocation SELECT * FROM public.pg_dist_colocation;
INSERT INTO pg_catalog.pg_dist_cleanup SELECT * FROM public.pg_dist_cleanup;
INSERT INTO pg_catalog.pg_dist_schema SELECT schemaname::regnamespace, colocationid FROM public.pg_dist_schema;
-- enterprise catalog tables
INSERT INTO pg_catalog.pg_dist_authinfo SELECT * FROM public.pg_dist_authinfo;
INSERT INTO pg_catalog.pg_dist_poolinfo SELECT * FROM public.pg_dist_poolinfo;
-- Temporarily disable trigger to check for validity of functions while
-- inserting. The current contents of the table might be invalid if one of
-- the functions was removed by the user without also removing the
-- rebalance strategy. Obviously that's not great, but it should be no
-- reason to fail the upgrade.
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy DISABLE TRIGGER pg_dist_rebalance_strategy_validation_trigger;
INSERT INTO pg_catalog.pg_dist_rebalance_strategy SELECT
name,
default_strategy,
shard_cost_function::regprocedure::regproc,
node_capacity_function::regprocedure::regproc,
shard_allowed_on_node_function::regprocedure::regproc,
default_threshold,
minimum_threshold,
improvement_threshold
FROM public.pg_dist_rebalance_strategy;
ALTER TABLE pg_catalog.pg_dist_rebalance_strategy ENABLE TRIGGER pg_dist_rebalance_strategy_validation_trigger;
--
-- drop backup tables
--
DROP TABLE public.pg_dist_authinfo;
DROP TABLE public.pg_dist_colocation;
DROP TABLE public.pg_dist_local_group;
DROP TABLE public.pg_dist_node;
DROP TABLE public.pg_dist_node_metadata;
DROP TABLE public.pg_dist_partition;
DROP TABLE public.pg_dist_placement;
DROP TABLE public.pg_dist_poolinfo;
DROP TABLE public.pg_dist_shard;
DROP TABLE public.pg_dist_transaction;
DROP TABLE public.pg_dist_rebalance_strategy;
DROP TABLE public.pg_dist_cleanup;
DROP TABLE public.pg_dist_schema;
--
-- reset sequences
--
PERFORM setval('pg_catalog.pg_dist_shardid_seq', (SELECT MAX(shardid)+1 AS max_shard_id FROM pg_dist_shard), false);
PERFORM setval('pg_catalog.pg_dist_placement_placementid_seq', (SELECT MAX(placementid)+1 AS max_placement_id FROM pg_dist_placement), false);
PERFORM setval('pg_catalog.pg_dist_groupid_seq', (SELECT MAX(groupid)+1 AS max_group_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_node_nodeid_seq', (SELECT MAX(nodeid)+1 AS max_node_id FROM pg_dist_node), false);
PERFORM setval('pg_catalog.pg_dist_colocationid_seq', (SELECT MAX(colocationid)+1 AS max_colocation_id FROM pg_dist_colocation), false);
PERFORM setval('pg_catalog.pg_dist_operationid_seq', (SELECT MAX(operation_id)+1 AS max_operation_id FROM pg_dist_cleanup), false);
PERFORM setval('pg_catalog.pg_dist_cleanup_recordid_seq', (SELECT MAX(record_id)+1 AS max_record_id FROM pg_dist_cleanup), false);
PERFORM setval('pg_catalog.pg_dist_clock_logical_seq', (SELECT last_value FROM public.pg_dist_clock_logical_seq), false);
DROP TABLE public.pg_dist_clock_logical_seq;
--
-- register triggers
--
FOR table_name IN SELECT logicalrelid FROM pg_catalog.pg_dist_partition JOIN pg_class ON (logicalrelid = oid) WHERE relkind <> 'f'
LOOP
trigger_name := 'truncate_trigger_' || table_name::oid;
command := 'create trigger ' || trigger_name || ' after truncate on ' || table_name || ' execute procedure pg_catalog.citus_truncate_trigger()';
EXECUTE command;
command := 'update pg_trigger set tgisinternal = true where tgname = ' || quote_literal(trigger_name);
EXECUTE command;
END LOOP;
--
-- set dependencies
--
INSERT INTO pg_depend
SELECT
'pg_class'::regclass::oid as classid,
p.logicalrelid::regclass::oid as objid,
0 as objsubid,
'pg_extension'::regclass::oid as refclassid,
(select oid from pg_extension where extname = 'citus') as refobjid,
0 as refobjsubid ,
'n' as deptype
FROM pg_catalog.pg_dist_partition p;
-- If citus_columnar extension exists, then perform the post PG-upgrade work for columnar as well.
--
-- First look if pg_catalog.columnar_finish_pg_upgrade function exists as part of the citus_columnar
-- extension. (We check whether it's part of the extension just for security reasons). If it does, then
-- call it. If not, then look for the columnar_internal.columnar_ensure_am_depends_catalog function, again as
-- part of the citus_columnar extension. If so, then call it. We alternatively check for the latter UDF
-- just because pg_catalog.columnar_finish_pg_upgrade function is introduced in citus_columnar 13.2-1
-- and as of today all it does is to call columnar_internal.columnar_ensure_am_depends_catalog function.
IF EXISTS (
SELECT 1 FROM pg_depend
JOIN pg_proc ON (pg_depend.objid = pg_proc.oid)
JOIN pg_namespace ON (pg_proc.pronamespace = pg_namespace.oid)
JOIN pg_extension ON (pg_depend.refobjid = pg_extension.oid)
WHERE
-- Looking if pg_catalog.columnar_finish_pg_upgrade function exists and
-- if there is a dependency record from it (proc class = 1255) ..
pg_depend.classid = 1255 AND pg_namespace.nspname = 'pg_catalog' AND pg_proc.proname = 'columnar_finish_pg_upgrade' AND
-- .. to citus_columnar extension (3079 = extension class), if it exists.
pg_depend.refclassid = 3079 AND pg_extension.extname = 'citus_columnar'
)
THEN PERFORM pg_catalog.columnar_finish_pg_upgrade();
ELSIF EXISTS (
SELECT 1 FROM pg_depend
JOIN pg_proc ON (pg_depend.objid = pg_proc.oid)
JOIN pg_namespace ON (pg_proc.pronamespace = pg_namespace.oid)
JOIN pg_extension ON (pg_depend.refobjid = pg_extension.oid)
WHERE
-- Looking if columnar_internal.columnar_ensure_am_depends_catalog function exists and
-- if there is a dependency record from it (proc class = 1255) ..
pg_depend.classid = 1255 AND pg_namespace.nspname = 'columnar_internal' AND pg_proc.proname = 'columnar_ensure_am_depends_catalog' AND
-- .. to citus_columnar extension (3079 = extension class), if it exists.
pg_depend.refclassid = 3079 AND pg_extension.extname = 'citus_columnar'
)
THEN PERFORM columnar_internal.columnar_ensure_am_depends_catalog();
END IF;
-- restore pg_dist_object from the stable identifiers
TRUNCATE pg_catalog.pg_dist_object;
INSERT INTO pg_catalog.pg_dist_object (classid, objid, objsubid, distribution_argument_index, colocationid)
SELECT
address.classid,
address.objid,
address.objsubid,
naming.distribution_argument_index,
naming.colocationid
FROM
public.pg_dist_object naming,
pg_catalog.pg_get_object_address(naming.type, naming.object_names, naming.object_args) address;
DROP TABLE public.pg_dist_object;
END;
$cppu$;
COMMENT ON FUNCTION pg_catalog.citus_finish_pg_upgrade()
IS 'perform tasks to restore citus settings from a location that has been prepared before pg_upgrade';

View File

@ -115,6 +115,14 @@ BEGIN
WHERE pg_dist_partkeys_pre_16_upgrade.logicalrelid = pg_dist_partition.logicalrelid;
DROP TABLE public.pg_dist_partkeys_pre_16_upgrade;
-- if we are upgrading to PG18+,
-- we need to regenerate the partkeys because they will include varreturningtype as well.
UPDATE pg_catalog.pg_dist_partition
SET partkey = column_name_to_column(pg_dist_partkeys_pre_18_upgrade.logicalrelid, col_name)
FROM public.pg_dist_partkeys_pre_18_upgrade
WHERE pg_dist_partkeys_pre_18_upgrade.logicalrelid = pg_dist_partition.logicalrelid;
DROP TABLE public.pg_dist_partkeys_pre_18_upgrade;
INSERT INTO pg_catalog.pg_dist_shard SELECT * FROM public.pg_dist_shard;
INSERT INTO pg_catalog.pg_dist_placement SELECT * FROM public.pg_dist_placement;
INSERT INTO pg_catalog.pg_dist_node_metadata SELECT * FROM public.pg_dist_node_metadata;
@ -203,8 +211,41 @@ BEGIN
'n' as deptype
FROM pg_catalog.pg_dist_partition p;
-- set dependencies for columnar table access method
PERFORM columnar_internal.columnar_ensure_am_depends_catalog();
-- If citus_columnar extension exists, then perform the post PG-upgrade work for columnar as well.
--
-- First look if pg_catalog.columnar_finish_pg_upgrade function exists as part of the citus_columnar
-- extension. (We check whether it's part of the extension just for security reasons). If it does, then
-- call it. If not, then look for the columnar_internal.columnar_ensure_am_depends_catalog function, again as
-- part of the citus_columnar extension. If so, then call it. We alternatively check for the latter UDF
-- just because pg_catalog.columnar_finish_pg_upgrade function is introduced in citus_columnar 13.2-1
-- and as of today all it does is to call columnar_internal.columnar_ensure_am_depends_catalog function.
IF EXISTS (
SELECT 1 FROM pg_depend
JOIN pg_proc ON (pg_depend.objid = pg_proc.oid)
JOIN pg_namespace ON (pg_proc.pronamespace = pg_namespace.oid)
JOIN pg_extension ON (pg_depend.refobjid = pg_extension.oid)
WHERE
-- Looking if pg_catalog.columnar_finish_pg_upgrade function exists and
-- if there is a dependency record from it (proc class = 1255) ..
pg_depend.classid = 1255 AND pg_namespace.nspname = 'pg_catalog' AND pg_proc.proname = 'columnar_finish_pg_upgrade' AND
-- .. to citus_columnar extension (3079 = extension class), if it exists.
pg_depend.refclassid = 3079 AND pg_extension.extname = 'citus_columnar'
)
THEN PERFORM pg_catalog.columnar_finish_pg_upgrade();
ELSIF EXISTS (
SELECT 1 FROM pg_depend
JOIN pg_proc ON (pg_depend.objid = pg_proc.oid)
JOIN pg_namespace ON (pg_proc.pronamespace = pg_namespace.oid)
JOIN pg_extension ON (pg_depend.refobjid = pg_extension.oid)
WHERE
-- Looking if columnar_internal.columnar_ensure_am_depends_catalog function exists and
-- if there is a dependency record from it (proc class = 1255) ..
pg_depend.classid = 1255 AND pg_namespace.nspname = 'columnar_internal' AND pg_proc.proname = 'columnar_ensure_am_depends_catalog' AND
-- .. to citus_columnar extension (3079 = extension class), if it exists.
pg_depend.refclassid = 3079 AND pg_extension.extname = 'citus_columnar'
)
THEN PERFORM columnar_internal.columnar_ensure_am_depends_catalog();
END IF;
-- restore pg_dist_object from the stable identifiers
TRUNCATE pg_catalog.pg_dist_object;

View File

@ -0,0 +1,9 @@
CREATE OR REPLACE FUNCTION citus_internal.citus_internal_copy_single_shard_placement(
shard_id bigint,
source_node_id integer,
target_node_id integer,
flags integer,
transfer_mode citus.shard_transfer_mode default 'auto')
RETURNS void
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_internal_copy_single_shard_placement$$;

Some files were not shown because too many files have changed in this diff.