mirror of https://github.com/citusdata/citus.git
3154 Commits (c843cb20604c3b4742653562e040612ddd836120)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
f743b35fc2
|
Parallelize Shard Rebalancing & Unlock Concurrent Logical Shard Moves (#7983)
DESCRIPTION: Parallelizes shard rebalancing and removes the bottlenecks
that previously blocked concurrent logical-replication moves.
These improvements reduce rebalance windows, particularly for clusters
with large reference tables, and enable multiple shard transfers to run in parallel.
Motivation:
Citus’ shard rebalancer has some key performance bottlenecks:
**Sequential Movement of Reference Tables:**
Reference tables are often assumed to be small, but in real-world
deployments, they can grow significantly large. Previously, reference
table shards were transferred as a single unit, making the process
monolithic and time-consuming.
**No Parallelism Within a Colocation Group:**
Although Citus distributes data using colocated shards, shard
movements within the same colocation group were serialized. In
environments with hundreds of distributed tables colocated
together, this serialization significantly slowed down rebalance
operations.
**Excessive Locking:**
The rebalancer used restrictive locks and redundant logical replication
guards, further limiting concurrency.
The goal of this commit is to eliminate these inefficiencies and enable
maximum parallelism during rebalance, without compromising correctness
or compatibility.
Feature Summary:
**1. Parallel Reference Table Rebalancing**
Each reference-table shard is now copied in its own background task.
Foreign key and other constraints are deferred until all shards are
copied.
To move a single shard without considering colocation, a new
internal-only UDF, `citus_internal_copy_single_shard_placement`, is
introduced to allow single-shard copy/move operations.
Since this function is internal, users are not allowed to call it
directly.
**Temporary Hack to Set Background Task Context**
Background tasks cannot currently set custom GUCs like application_name
before executing internal-only functions. As a workaround, a statement
that sets application_name to 'citus_rebalancer ...' is added as a prefix
in the task command. This is a temporary hack to label internal tasks until
proper GUC injection support is added to the background task executor.
**2. Changes in Locking Strategy**
- Drop the leftover replication lock that previously serialized shard
moves performed via logical replication. This lock was only needed when
we used to drop and recreate the subscriptions/publications before each
move. Since Citus now removes those objects later as part of the “unused
distributed objects” cleanup, shard moves via logical replication can
safely run in parallel without additional locking.
- Introduce a per-shard advisory lock to prevent concurrent operations
on the same shard while allowing maximum parallelism elsewhere.
- Change the lock mode in AcquirePlacementColocationLock from
ExclusiveLock to RowExclusiveLock to allow concurrent updates within the
same colocation group, while still preventing concurrent DDL operations.
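The effect of the lock-mode change can be illustrated with PostgreSQL's standard conflict table for table-level lock modes. The following is a minimal standalone sketch, not Citus code: the enum, the `LockConflicts` table, and `LockModesConflictSketch()` are hypothetical names introduced here, and the claim about DDL assumes that DDL paths take the same colocation lock in `ExclusiveLock` mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Lock modes numbered 1..8, in the order PostgreSQL documents them. */
typedef enum
{
    AccessShareLock = 1,
    RowShareLock,
    RowExclusiveLock,
    ShareUpdateExclusiveLock,
    ShareLock,
    ShareRowExclusiveLock,
    ExclusiveLock,
    AccessExclusiveLock
} LockModeSketch;

#define LOCKBIT(mode) (1u << (mode))

/* For each held mode, the set of requested modes it conflicts with. */
static const unsigned int LockConflicts[] = {
    0, /* index 0 is unused */
    /* AccessShareLock */
    LOCKBIT(AccessExclusiveLock),
    /* RowShareLock */
    LOCKBIT(ExclusiveLock) | LOCKBIT(AccessExclusiveLock),
    /* RowExclusiveLock */
    LOCKBIT(ShareLock) | LOCKBIT(ShareRowExclusiveLock) |
    LOCKBIT(ExclusiveLock) | LOCKBIT(AccessExclusiveLock),
    /* ShareUpdateExclusiveLock */
    LOCKBIT(ShareUpdateExclusiveLock) | LOCKBIT(ShareLock) |
    LOCKBIT(ShareRowExclusiveLock) | LOCKBIT(ExclusiveLock) |
    LOCKBIT(AccessExclusiveLock),
    /* ShareLock */
    LOCKBIT(RowExclusiveLock) | LOCKBIT(ShareUpdateExclusiveLock) |
    LOCKBIT(ShareRowExclusiveLock) | LOCKBIT(ExclusiveLock) |
    LOCKBIT(AccessExclusiveLock),
    /* ShareRowExclusiveLock */
    LOCKBIT(RowExclusiveLock) | LOCKBIT(ShareUpdateExclusiveLock) |
    LOCKBIT(ShareLock) | LOCKBIT(ShareRowExclusiveLock) |
    LOCKBIT(ExclusiveLock) | LOCKBIT(AccessExclusiveLock),
    /* ExclusiveLock */
    LOCKBIT(RowShareLock) | LOCKBIT(RowExclusiveLock) |
    LOCKBIT(ShareUpdateExclusiveLock) | LOCKBIT(ShareLock) |
    LOCKBIT(ShareRowExclusiveLock) | LOCKBIT(ExclusiveLock) |
    LOCKBIT(AccessExclusiveLock),
    /* AccessExclusiveLock */
    LOCKBIT(AccessShareLock) | LOCKBIT(RowShareLock) |
    LOCKBIT(RowExclusiveLock) | LOCKBIT(ShareUpdateExclusiveLock) |
    LOCKBIT(ShareLock) | LOCKBIT(ShareRowExclusiveLock) |
    LOCKBIT(ExclusiveLock) | LOCKBIT(AccessExclusiveLock)
};

/* Returns true if a request for `requested` must wait for a holder of `held`. */
static bool
LockModesConflictSketch(LockModeSketch held, LockModeSketch requested)
{
    assert(held >= AccessShareLock && held <= AccessExclusiveLock);
    assert(requested >= AccessShareLock && requested <= AccessExclusiveLock);

    return (LockConflicts[held] & LOCKBIT(requested)) != 0;
}
```

Under this table, two `RowExclusiveLock` holders do not block each other, which is what lets shard moves in the same colocation group proceed concurrently, while `ExclusiveLock` conflicts both with itself and with `RowExclusiveLock`.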
**3. citus_rebalance_start() enhancements**
The citus_rebalance_start() function now accepts two new optional
parameters:
```
- parallel_transfer_colocated_shards BOOLEAN DEFAULT false,
- parallel_transfer_reference_tables BOOLEAN DEFAULT false
```
Both parameters default to false, which preserves the existing behavior
and ensures backward compatibility. When both are set to true, the
rebalancer operates with full parallelism.
**Previous Rebalancer Behavior:**
`SELECT citus_rebalance_start(shard_transfer_mode := 'force_logical');`
This would:
- Start a single background task for replicating all reference tables.
- Then move all shards serially, one at a time.
```
Task 1: replicate_reference_tables()
↓
Task 2: move_shard_1()
↓
Task 3: move_shard_2()
↓
Task 4: move_shard_3()
```
Slow and sequential. Reference table copy is a bottleneck. Colocated
shards must wait for each other.
**New Parallel Rebalancer:**
```
SELECT citus_rebalance_start(
shard_transfer_mode := 'force_logical',
parallel_transfer_colocated_shards := true,
parallel_transfer_reference_tables := true
);
```
This would:
- Schedule independent background tasks for each reference-table shard.
- Move colocated shards in parallel, while still maintaining dependency
order.
- Defer constraint application until all reference shards are in place.
```
Task 1: copy_ref_shard_1()
Task 2: copy_ref_shard_2()
Task 3: copy_ref_shard_3()
→ Task 4: apply_constraints()
↓
Task 5: copy_shard_1()
Task 6: copy_shard_2()
Task 7: copy_shard_3()
↓
Task 8-10: move_shard_1..3()
```
Each operation is scheduled independently and can run as soon as
dependencies are satisfied.
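The idea that a task can run as soon as its dependencies are satisfied can be sketched as a small dependency computation. This is an illustrative model only: `TaskSketch`, `ExampleTasks`, and `ComputeWaves()` are hypothetical names, and the dependency edges are an assumption read off the diagram above, not the background task executor's actual data structures.

```c
#include <assert.h>

#define MAX_DEPS 4

typedef struct
{
    const char *name;
    int depCount;
    int deps[MAX_DEPS]; /* indexes of tasks that must finish first */
} TaskSketch;

/* Assumed dependency graph, mirroring the diagram (0-based indexes). */
static const TaskSketch ExampleTasks[] = {
    { "copy_ref_shard_1", 0, { 0 } },
    { "copy_ref_shard_2", 0, { 0 } },
    { "copy_ref_shard_3", 0, { 0 } },
    { "apply_constraints", 3, { 0, 1, 2 } },
    { "copy_shard_1", 1, { 3 } },
    { "copy_shard_2", 1, { 3 } },
    { "copy_shard_3", 1, { 3 } },
    { "move_shard_1", 1, { 4 } },
    { "move_shard_2", 1, { 5 } },
    { "move_shard_3", 1, { 6 } }
};

/*
 * Computes, for each task, the earliest "wave" in which it can start:
 * 0 when it has no dependencies, otherwise one more than the latest wave
 * among its dependencies. Tasks in the same wave can run in parallel.
 * Tasks must be listed so that dependencies precede their dependents.
 */
static void
ComputeWaves(const TaskSketch *tasks, int taskCount, int *waves)
{
    for (int i = 0; i < taskCount; i++)
    {
        int wave = 0;

        for (int d = 0; d < tasks[i].depCount; d++)
        {
            int dep = tasks[i].deps[d];

            assert(dep >= 0 && dep < i);
            if (waves[dep] + 1 > wave)
            {
                wave = waves[dep] + 1;
            }
        }
        waves[i] = wave;
    }
}
```

With these assumed edges, the three reference-shard copies share wave 0, the constraint step lands in wave 1, and the per-shard copy and move tasks follow in waves 2 and 3.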
|
|
|
|
8d929d3bf8
|
Push down recurring outer joins when possible (#7973)
DESCRIPTION: Adds support for pushing down LEFT/RIGHT outer joins that have a reference table on the outer side and a distributed table on the inner side (e.g., <reference table> LEFT JOIN <distributed table>).
Partially addresses #6546
1) `<outer:reference>` LEFT JOIN `<inner:distributed>`
2) `<inner:distributed>` RIGHT JOIN `<outer:reference>`
Previously, for outer joins of types (1) and (2), the distributed side was computed recursively. This was necessary because, when the inner side of a recurring outer join is a distributed table, it is not possible to directly distribute the join; the preserved (outer and recurring) side may generate rows with join keys that hash to different shards.
To implement distributed planning while maintaining consistency with global execution semantics, this PR restricts the outer side only to those partition key values that route to the selected shard during distributed shard query computation.
This method is employed when the following criteria are met (recursive planning is applied otherwise):
- The join type is (1) or (2) (lateral joins are not supported).
- The outer side is a reference table.
- The outer join qualifications include an equality condition between the partition column of a distributed table and the recurring table.
- The join is not part of a chained join.
- The “enable_recurring_outer_join_pushdown” GUC is enabled (default is on).
---------
Co-authored-by: ebruaydingol <ebruaydingol@microsoft.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com> |
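The restriction described above, keeping only outer rows whose key routes to the shard being queried, can be sketched as a hash-range check. This is a simplified model under stated assumptions: `ToyHash()` is a stand-in for the real hash function of the partition column's type, and `ShardRangeSketch`, `KeyRoutesToShard()`, and `CountMatchingShards()` are hypothetical names. NULL keys and the actual planner changes are out of scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct
{
    int32_t minHash; /* inclusive */
    int32_t maxHash; /* inclusive */
} ShardRangeSketch;

/* Toy hash (assumption): maps a key onto the full int32 range. */
static int32_t
ToyHash(int32_t key)
{
    uint32_t mixed = (uint32_t) key * 2654435761u;

    /* shift [0, 2^32) down to [-2^31, 2^31) without overflow */
    return (int32_t) ((int64_t) mixed - 2147483648LL);
}

/* True if a row with this join key belongs in the query for `shard`. */
static bool
KeyRoutesToShard(int32_t key, ShardRangeSketch shard)
{
    int32_t hash = ToyHash(key);

    return hash >= shard.minHash && hash <= shard.maxHash;
}

/* Counts how many shard queries would keep an outer row with this key. */
static int
CountMatchingShards(int32_t key, const ShardRangeSketch *shards, int shardCount)
{
    int matches = 0;

    assert(shards != 0 && shardCount > 0);
    for (int i = 0; i < shardCount; i++)
    {
        if (KeyRoutesToShard(key, shards[i]))
        {
            matches++;
        }
    }
    return matches;
}
```

When the shard ranges are disjoint and cover the whole hash space, every non-NULL key matches exactly one shard, so an outer row is not duplicated across shard queries.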
|
|
|
87a1b631e8
|
Not automatically create citus_columnar when creating citus extension (#8081)
DESCRIPTION: Do not automatically create citus_columnar when there are no relations using it.
Previously, we always created citus_columnar when creating citus with version >= 11.1. We did that as follows:
* Detach SQL objects owned by old columnar, i.e., "drop" them from citus, but not actually drop them from the database.
* "old columnar" is the one that we had before Citus 11.1 as part of citus, i.e., before splitting the access method and its catalog to citus_columnar.
* Create citus_columnar and attach the SQL objects left over from old columnar to it so that we can continue supporting the columnar tables that the user had before Citus 11.1 with citus_columnar.
The first part is unchanged. However, we no longer create citus_columnar automatically if the user doesn't have any relations using columnar. For this reason, as of Citus 13.2, when these SQL objects are not owned by an extension and there are no relations using the columnar access method, we drop these SQL objects when updating Citus to 13.2.
The net effect is still the same as if we automatically created citus_columnar and the user dropped citus_columnar later, so we should not have any issues with dropping them.
(**Update:** It seems we've made some assumptions in citus, e.g., citus_finish_pg_upgrade() still assumes columnar metadata exists and tries to apply some fixes for it, so this PR fixes them as well. See the last section of this PR description.)
Also, ideally I was hoping to just remove some lines of code from extension.c, where we decide whether to automatically create citus_columnar when creating citus. However, this turned out not to be possible for two reasons:
* We still need to automatically create it for the servers using the columnar access method.
* We need to clean up the leftover SQL objects from old columnar when the above is not the case. Otherwise, we would have leftover SQL objects from old columnar for no reason, and that would confuse users too.
* Old columnar cannot be used to create columnar tables properly, so we should clean them up and let the user decide whether they want to create citus_columnar when they really need it later.
---
Also made several changes in the test suite because, similarly, we don't always want to have citus_columnar created in citus tests anymore:
* Now, columnar specific test targets, which cover **41** test sql files, always install columnar by default, by using "--load-extension=citus_columnar".
* "--load-extension=citus_columnar" is not added to citus specific test targets because by default we don't want to have citus_columnar created during citus tests.
* Excluding citus_columnar specific tests, we have **601** sql files as citus tests, and in **27** of them we manually create citus_columnar at the very beginning of the test because these tests exercise some functionality of citus together with columnar tables.
Also, before and after schedules for PG upgrade tests are now duplicated so we have two versions of each: one with columnar tests and one without. To choose between them, check-pg-upgrade now supports a "test-with-columnar" option, which can be set to "true" or anything else to logically indicate "false". In CI, we run the check-pg-upgrade test target with both options. The purpose is to ensure we can test PG upgrades where citus_columnar is not created in the cluster before the upgrade as well.
Finally, added more tests to multi_extension.sql to test Citus upgrade scenarios with / without columnar tables / citus_columnar extension.
---
Also, it seems citus_finish_pg_upgrade was assuming that citus_columnar is always created, but we should never have made such an assumption. To fix that, moved columnar specific post-PG-upgrade work from citus to a new columnar UDF, columnar_finish_pg_upgrade.
But to avoid breaking existing customer / managed service scripts, we continue to automatically perform post PG-upgrade work for columnar within citus_finish_pg_upgrade, but only if the columnar access method exists this time. |
|
|
|
41883cea38
|
PG18 - unify psql headings to ‘List of relations’ (#8119)
fixes #8110 This patch updates the `normalize.sed` script used in pg18 psql regression tests: - Replaces the headings “List of tables”, “List of indexes”, and “List of sequences” with a single, uniform heading: “List of relations”. |
|
|
|
bfc6d1f440
|
PG18 - Adjust EXPLAIN's output for disabled nodes (#8108)
fixes #8097 |
|
|
|
a6161f5a21
|
Fix CTE traversal for outer Vars in FindReferencedTableColumn (remove assert; correct parentQueryList handling) (#8106)
fixes #8105
This change lets `FindReferencedTableColumn()` correctly resolve columns through a CTE even when the expression comes from an outer query level (`varlevelsup > 0`, `skipOuterVars = false`). Before, we hit an `Assert(skipOuterVars)` in this path.
**Problem**
* Hitting a CTE after walking outer Vars triggered `Assert(skipOuterVars)`.
* Cause: we modified `parentQueryList` in place and didn’t rebuild the correct parent chain before recursing into the CTE, so the path was considered unsafe.
**Fix**
* Remove the `Assert(skipOuterVars)` in the `RTE_CTE` branch.
* Find the CTE’s owning level via `ctelevelsup` and compute `cteParentListIndex`.
* Rebuild a private parent list for recursion: `list_copy` → `list_truncate` → `lappend(current query)`.
* Add a bounds check before indexing the CTE’s `targetList`.
**Why it works**
```diff
-parentQueryList = lappend(parentQueryList, query);
-FindReferencedTableColumn(targetEntry->expr, parentQueryList,
-                          cteQuery, column, rteContainingReferencedColumn,
-                          skipOuterVars);
+    /* hand a private, bounded parent list to the recursion */
+    List *newParent = list_copy(parentQueryList);
+    newParent = list_truncate(newParent, cteParentListIndex + 1);
+    newParent = lappend(newParent, query);
+
+    FindReferencedTableColumn(targetEntry->expr,
+                              newParent,
+                              cteQuery,
+                              column,
+                              rteContainingReferencedColumn,
+                              skipOuterVars);
+}
```
**Before:** We changed `parentQueryList` in place (`parentQueryList = lappend(...)`) and didn’t trim it to the CTE’s owner level.
**After:** We copy the list, trim it to the CTE’s owner level, then append the current query. This keeps the parent list accurate for the current recursion and safe when following outer Vars.
**Example: Nested subquery referencing the CTE (two levels down)**
```
WITH c AS MATERIALIZED (SELECT user_id FROM raw_events_first)
SELECT 1 FROM raw_events_first t
WHERE EXISTS (
  SELECT 1 FROM (SELECT user_id FROM c) c2
  WHERE c2.user_id = t.user_id
);
```
Levels:
Q0 = top SELECT
Q1 = EXISTS subquery
Q2 = inner (SELECT user_id FROM c)
When resolving c2.user_id inside Q2:
- parentQueryList is [Q0, Q1, Q2].
- `ctelevelsup`: 2
- `cteParentListIndex = length(parentQueryList) - ctelevelsup - 1`, which here is 3 - 2 - 1 = 0.
- Recurse into the CTE’s query with [Q0, Q2].
**Tests (added in `multi_insert_select`)**
* **T1:** Correlated subquery that references a CTE (one level down). Verifies that resolving through `RTE_CTE` after following an outer `Var` succeeds and that the row count matches the source table.
* **T2:** Nested subquery that references a CTE (two levels down). Exercises deeper recursion and confirms results identical to T1.
* **T3:** Scalar subquery in a target list that reads from the outer CTE. Checks the expected row count and that no NULLs are inserted.
These tests cover the cases that previously hit `Assert(skipOuterVars)` and confirm that CTE references resolve while following outer Vars. |
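The copy, truncate, append sequence from the worked example can be reproduced with a fixed-size array standing in for PostgreSQL's `List`. This is a minimal sketch, not the patched function: `QueryListSketch` and `BuildCteParentList()` are hypothetical names, and queries are represented by plain strings.

```c
#include <assert.h>

#define MAX_LEVELS 8

typedef struct
{
    const char *queries[MAX_LEVELS];
    int length;
} QueryListSketch;

/*
 * Builds the private parent list handed to the recursion into a CTE:
 * copy the current list, truncate it to the CTE's owning level, then
 * append the current query. The caller's list is left untouched.
 */
static QueryListSketch
BuildCteParentList(const QueryListSketch *parentQueryList,
                   int ctelevelsup,
                   const char *currentQuery)
{
    int cteParentListIndex = parentQueryList->length - ctelevelsup - 1;

    /* bounds check: the owning level must exist in the parent list */
    assert(cteParentListIndex >= 0);

    QueryListSketch newParent = *parentQueryList;         /* list_copy */
    newParent.length = cteParentListIndex + 1;            /* list_truncate */

    assert(newParent.length < MAX_LEVELS);
    newParent.queries[newParent.length++] = currentQuery; /* lappend */

    return newParent;
}
```

For a parent list of [Q0, Q1, Q2] with `ctelevelsup = 2` and Q2 as the current query, the function returns [Q0, Q2] and leaves the original three-element list unchanged.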
|
|
|
6b6d959fac
|
PG18 - pg17.sql Simplify step 10 verification to use COUNT(*) instead of SELECT * (#8111)
fixes #8096
PostgreSQL 18 adds a `conenforced` flag allowing `CHECK` constraints to
be declared `NOT ENFORCED`.
|
|
|
|
3d8fd337e5
|
Check outer table partition column (#8092)
DESCRIPTION: Introduce a new check to push down a query including union and outer join to fix #8091. In "SafeToPushdownUnionSubquery", we check if the distribution column of the outer relation is in the target list. |
|
|
|
889aa92ac0
|
EXPLAIN ANALYZE - Prevent execution of the plan during the plan-print (#8017)
DESCRIPTION: Fixed a bug in EXPLAIN ANALYZE to prevent unintended (duplicate) execution of the (sub)plans during the explain phase.
Fixes #4212
### 🐞 Bug #4212 : Redundant (Subplan) Execution in `EXPLAIN ANALYZE` codepath
#### 🔍 Background
In the standard PostgreSQL execution path, `ExplainOnePlan()` is responsible for two distinct operations depending on whether `EXPLAIN ANALYZE` is requested:
1. **Execute the plan**
```c
if (es->analyze)
    ExecutorRun(queryDesc, direction, 0L, true);
```
2. **Print the plan tree**
```c
ExplainPrintPlan(es, queryDesc);
```
When printing the plan, the executor should **not run the plan again**. Execution is only expected to happen once, at the top level when `es->analyze = true`.
---
#### ⚠️ Issue in Citus
The Citus implementation of `CustomScanMethods.ExplainCustomScan = CitusExplainScan`, a custom scan explain callback used to print explain information for a Citus plan, incorrectly performs **redundant execution** inside the explain path of `ExplainPrintPlan()`:
```c
ExplainOnePlan()
  ExplainPrintPlan()
    ExplainNode()
      CitusExplainScan()
        if (distributedPlan->subPlanList != NIL)
        {
            ExplainSubPlans(distributedPlan, es);
            {
                PlannedStmt *plan = subPlan->plan;
                ExplainOnePlan(plan, ...);  // ⚠️ May re-execute subplan if es->analyze is true
            }
        }
```
This causes the subplans to be **executed again**, even though they have already been executed during the top-level plan execution. This behavior violates the expectation in PostgreSQL where `EXPLAIN ANALYZE` should **execute each node exactly once** for analysis.
---
#### ✅ Fix (proposed)
Save the output of subplans during `ExecuteSubPlans()`, and later use it in `ExplainSubPlans()`. |
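The proposed fix, capturing subplan output at execution time and only reading it back at explain time, can be modeled in a few lines. This is a hedged sketch of the idea rather than the Citus implementation: `SubPlanSketch`, `RunSubPlan()`, `ExecuteSubPlanSketch()`, and `ExplainSubPlanSketch()` are hypothetical names, and a single row count stands in for the saved explain output.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct
{
    int executionCount;  /* how many times the subplan actually ran */
    bool hasSavedOutput; /* set once output has been captured */
    long savedRowCount;  /* stand-in for the saved explain output */
} SubPlanSketch;

/* Pretends to run the subplan and returns the number of rows produced. */
static long
RunSubPlan(SubPlanSketch *subPlan)
{
    subPlan->executionCount++;
    return 42; /* arbitrary value for the sketch */
}

/* Execution phase: run the subplan once and, under ANALYZE, keep its output. */
static void
ExecuteSubPlanSketch(SubPlanSketch *subPlan, bool analyze)
{
    long rows = RunSubPlan(subPlan);

    if (analyze)
    {
        subPlan->savedRowCount = rows;
        subPlan->hasSavedOutput = true;
    }
}

/*
 * Explain phase: only reads what the execution phase saved. Taking a
 * const pointer makes it impossible to run the subplan from here.
 */
static long
ExplainSubPlanSketch(const SubPlanSketch *subPlan)
{
    assert(subPlan->hasSavedOutput);
    return subPlan->savedRowCount;
}
```

However many times the explain step is invoked, `executionCount` stays at 1, which is the property the fix is after.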
|
|
|
3e2b6f61fa
|
Bump certifi from 2024.2.2 to 2024.7.4 in /src/test/regress (#8076)
Bumps [certifi](https://github.com/certifi/python-certifi) from 2024.2.2 to 2024.7.4. |
|
|
|
0c1b31cdb5
|
Fix UPDATE stmts with indirection & array/jsonb subscripting with more than 1 field (#7675)
DESCRIPTION: Fixes problematic UPDATE statements with indirection and array/jsonb subscripting with more than one field. Fixes #4092, #7674 and #5621. Issues #7674 and #4092 involve an UPDATE with out of order columns and a sublink (SELECT) in the source, e.g. `UPDATE T SET (col3, col1, col4) = (SELECT 3, 1, 4)` where an incorrect value could get written to a column because query deparsing generated an incorrect SQL statement. To address this the fix adds an additional check to `ruleutils` to ensure that the target list of an UPDATE statement is in an order so that deparsing can be done safely. It is needed when the source of the UPDATE has a sublink, because Postgres `rewrite` will have put the target list in attribute order, but for deparsing to produce a correct SQL text the target list needs to be in order of the references (or `paramids`) to the target list of the sublink(s). Issue #5621 involves an UPDATE with array/jsonb subscripting that can behave incorrectly with more than one field, again because Citus query deparsing is receiving a post-`rewrite` query tree. The fix also adds a check to `ruleutils` to enable correct query deparsing of the UPDATE. --------- Co-authored-by: Ibrahim Halatci <ihalatci@gmail.com> Co-authored-by: Colm McHugh <colm.mchugh@gmail.com> |
|
|
|
245a62df3e
|
Avoid query deparse and planning of shard query in local execution. (#8035)
DESCRIPTION: Avoid query deparse and planning of shard query in local execution. Adds citus.enable_local_execution_local_plan GUC to allow avoiding unnecessary query deparsing to improve performance of fast-path queries targeting local shards. If a fast path query resolves to a shard that is local to the node planning the query, a shortcut can be taken so that the OID of the shard is plugged into the parse tree, which is then planned by Postgres. In `local_executor.c` the task uses that plan instead of parsing and planning a shard query. How this is done: The fast path planner identifies if the shortcut is possible, and then the distributed planner checks, using `CheckAndBuildDelayedFastPathPlan()`, if a local plan can be generated or if the shard query should be generated. This optimization is controlled by a GUC `citus.enable_local_execution_local_plan` which is on by default. A new regress test `local_execution_local_plan` tests both row-sharding and schema sharding. Negative tests are added to `local_shard_execution_dropped_column` to verify that the optimization is not taken when the shard is local but there is a difference between the shard and distributed table because of a dropped column. |
|
|
|
3da9096d53
|
Bump black from 24.2.0 to 24.3.0 in /src/test/regress (#8062)
Bumps [black](https://github.com/psf/black) from 24.2.0 to 24.3.0.
Release notes (sourced from [black's releases](https://github.com/psf/black/releases)):
### 24.3.0
#### Highlights
This release is a milestone: it fixes Black's first CVE security vulnerability. If you run Black on untrusted input, or if you habitually put thousands of leading tab characters in your docstrings, you are strongly encouraged to upgrade immediately to fix [CVE-2024-21503](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2024-21503).
This release also fixes a bug in Black's AST safety check that allowed Black to make incorrect changes to certain f-strings that are valid in Python 3.12 and higher.
#### Stable style
- Don't move comments along with delimiters, which could cause crashes ([#4248](https://redirect.github.com/psf/black/issues/4248))
- Strengthen AST safety check to catch more unsafe changes to strings. Previous versions of Black would incorrectly format the contents of certain unusual f-strings containing nested strings with the same quote type. Now, Black will crash on such strings until support for the new f-string syntax is implemented. ([#4270](https://redirect.github.com/psf/black/issues/4270))
- Fix a bug where line-ranges exceeding the last code line would not work as expected ([#4273](https://redirect.github.com/psf/black/issues/4273))
#### Performance
- Fix catastrophic performance on docstrings that contain large numbers of leading tab characters. This fixes [CVE-2024-21503](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2024-21503). ([#4278](https://redirect.github.com/psf/black/issues/4278))
#### Documentation
- Note what happens when `--check` is used with `--quiet` ([#4236](https://redirect.github.com/psf/black/issues/4236)) |
|
|
|
743c9bbf87
|
fix #7715 - add assign hook for CDC library path adjustment (#8025)
DESCRIPTION: Automatically updates dynamic_library_path when CDC is enabled
fix: #7715
According to the documentation and `pg_settings`, the context of the `citus.enable_change_data_capture` parameter is user. However, changing this parameter, even as a superuser, doesn't work as expected: while the initial copy phase works correctly, subsequent change events are not propagated.
This appears to be because `dynamic_library_path` is only updated to `$libdir/citus_decoders:$libdir` when the server is restarted and the `_PG_init` function is invoked.
To address this, I added an `EnableChangeDataCaptureAssignHook` that automatically updates `dynamic_library_path` at runtime when `citus.enable_change_data_capture` is enabled, ensuring that the CDC decoder libraries are properly loaded.
Note that `dynamic_library_path` is already a `superuser`-context parameter in base PostgreSQL, so updating it from within the assign hook should be safe and consistent with PostgreSQL’s configuration model.
If there’s any reason this approach might be problematic or if there’s a preferred alternative, I’d appreciate any feedback.
cc. @jy-min
---------
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Co-authored-by: ibrahim halatci <ihalatci@gmail.com> |
|
|
|
a8900b57e6
|
PG18 - Strip decimal fractions from actual rows counts in normalize.sed (#8041)
Fixes #8040
```
- Custom Scan (Citus Adaptive) (actual rows=0 loops=1)
+ Custom Scan (Citus Adaptive) (actual rows=0.00 loops=1)
```
Add a normalization rule to the pg_regress `normalize.sed` script that
strips any trailing decimal fraction from actual rows= counts (e.g.
turning `actual rows=0.00` into `actual rows=0`). This silences noise
diffs introduced by the new PostgreSQL 18 beta’s planner output.
commit
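The normalization described above can be sketched as a small string rewrite. The actual rule lives in `normalize.sed` and is not shown in this message, so the C function below is only an illustration of the same transformation; `StripActualRowsFraction()` is a hypothetical name.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Copies `input` to `output`, dropping a ".<digits>" fraction that
 * directly follows "actual rows=<digits>". The output buffer must hold
 * at least strlen(input) + 1 bytes.
 */
static void
StripActualRowsFraction(const char *input, char *output)
{
    static const char marker[] = "actual rows=";
    const size_t markerLength = sizeof(marker) - 1;

    assert(input != NULL && output != NULL);

    while (*input != '\0')
    {
        if (strncmp(input, marker, markerLength) != 0)
        {
            *output++ = *input++;
            continue;
        }

        /* copy the marker itself */
        memcpy(output, input, markerLength);
        input += markerLength;
        output += markerLength;

        /* copy the integer part of the row count */
        const char *digitsStart = input;
        while (isdigit((unsigned char) *input))
        {
            *output++ = *input++;
        }

        /* skip ".<digits>" only when it follows an integer part */
        if (input > digitsStart && *input == '.' &&
            isdigit((unsigned char) input[1]))
        {
            input++;
            while (isdigit((unsigned char) *input))
            {
                input++;
            }
        }
    }
    *output = '\0';
}
```

Applied to the second line of the diff above, this turns `actual rows=0.00 loops=1` back into `actual rows=0 loops=1`, while other decimals on the line, such as cost figures, are left alone.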
|
|
|
|
5deaf9a616
|
Bump werkzeug from 2.3.7 to 3.0.6 in /src/test/regress (#8003)
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.3.7 to 3.0.6.
Release notes (sourced from [werkzeug's releases](https://github.com/pallets/werkzeug/releases)):
### 3.0.6
This is the Werkzeug 3.0.6 security fix release, which fixes security issues but does not otherwise change behavior and should not result in breaking changes.
PyPI: https://pypi.org/project/Werkzeug/3.0.6/
Changes: https://werkzeug.palletsprojects.com/en/stable/changes/#version-3-0-6
- Fix how `max_form_memory_size` is applied when parsing large non-file fields. [GHSA-q34m-jh98-gwm2](https://github.com/advisories/GHSA-q34m-jh98-gwm2)
- `safe_join` catches certain paths on Windows that were not caught by `ntpath.isabs` on Python < 3.11. [GHSA-f9vj-2wh5-fj8j](https://github.com/advisories/GHSA-f9vj-2wh5-fj8j)
### 3.0.5
This is the Werkzeug 3.0.5 fix release, which fixes bugs but does not otherwise change behavior and should not result in breaking changes.
PyPI: https://pypi.org/project/Werkzeug/3.0.5/
Changes: https://werkzeug.palletsprojects.com/en/stable/changes/#version-3-0-5
Milestone: https://github.com/pallets/werkzeug/milestone/37?closed=1
- The Watchdog reloader ignores file closed no write events. [#2945](https://redirect.github.com/pallets/werkzeug/issues/2945)
- Logging works with client addresses containing an IPv6 scope. [#2952](https://redirect.github.com/pallets/werkzeug/issues/2952)
- Ignore invalid authorization parameters. [#2955](https://redirect.github.com/pallets/werkzeug/issues/2955)
- Improve type annotation for `SharedDataMiddleware`. [#2958](https://redirect.github.com/pallets/werkzeug/issues/2958)
- Compatibility with Python 3.13 when generating debugger pin and the current UID does not have an associated name. [#2957](https://redirect.github.com/pallets/werkzeug/issues/2957)
### 3.0.4
This is the Werkzeug 3.0.4 fix release, which fixes bugs but does not otherwise change behavior and should not result in breaking changes.
PyPI: https://pypi.org/project/Werkzeug/3.0.4/
Changes: https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-4
Milestone: https://github.com/pallets/werkzeug/milestone/36?closed=1
- Restore behavior where parsing `multipart/x-www-form-urlencoded` data with invalid UTF-8 bytes in the body results in no form data parsed rather than a 413 error. [#2930](https://redirect.github.com/pallets/werkzeug/issues/2930)
- Improve `parse_options_header` performance when parsing unterminated quoted string values. [#2904](https://redirect.github.com/pallets/werkzeug/issues/2904)
- Debugger pin auth is synchronized across threads/processes when tracking failed entries. [#2916](https://redirect.github.com/pallets/werkzeug/issues/2916)
- Dev server handles unexpected `SSLEOFError` due to issue in Python < 3.13. [#2926](https://redirect.github.com/pallets/werkzeug/issues/2926)
- Debugger pin auth works when the URL already contains a query string. [#2918](https://redirect.github.com/pallets/werkzeug/issues/2918)
### 3.0.3
This is the Werkzeug 3.0.3 security release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes.
PyPI: https://pypi.org/project/Werkzeug/3.0.3/
Changes: https://werkzeug.palletsprojects.com/en/3.0.x/changes/#version-3-0-3
Milestone: https://github.com/pallets/werkzeug/milestone/35?closed=1
- Only allow `localhost`, `.localhost`, `127.0.0.1`, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger UI makes requests using the full URL rather than only the path. GHSA-2g68-c3qc-8985
- Make reloader more robust when `""` is in `sys.path`. [#2823](https://redirect.github.com/pallets/werkzeug/issues/2823)
... (truncated)
Changelog (sourced from [werkzeug's changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)):
### Version 3.0.6
Released 2024-10-25
- Fix how `max_form_memory_size` is applied when parsing large non-file fields. GHSA-q34m-jh98-gwm2
- `safe_join` catches certain paths on Windows that were not caught by `ntpath.isabs` on Python < 3.11. GHSA-f9vj-2wh5-fj8j
### Version 3.0.5
Released 2024-10-24
- The Watchdog reloader ignores file closed no write events. #2945
- Logging works with client addresses containing an IPv6 scope. #2952
- Ignore invalid authorization parameters. #2955
- Improve type annotation for `SharedDataMiddleware`. #2958
- Compatibility with Python 3.13 when generating debugger pin and the current UID does not have an associated name. #2957
### Version 3.0.4
Released 2024-08-21
- Restore behavior where parsing `multipart/x-www-form-urlencoded` data with invalid UTF-8 bytes in the body results in no form data parsed rather than a 413 error. #2930
- Improve `parse_options_header` performance when parsing unterminated quoted string values. #2904
- Debugger pin auth is synchronized across threads/processes when tracking failed entries. #2916
- Dev server handles unexpected `SSLEOFError` due to issue in Python < 3.13. #2926
- Debugger pin auth works when the URL already contains a query string. #2918
### Version 3.0.3
Released 2024-05-05
- Only allow `localhost`, `.localhost`, `127.0.0.1`, or the specified hostname when running the dev server, to make debugger requests. Additional hosts can be added by using the debugger middleware directly. The debugger
... (truncated) |
|
|
|
4cd8bb1b67 | Bump Citus version to 13.2devel | |
|
|
55a0d1f730
|
Add skip_qualify_public param to shard_name() to allow qualifying for "public" schema (#8014)
DESCRIPTION: Adds skip_qualify_public param to `shard_name()` UDF to allow qualifying for "public" schema when needed. |
|
|
|
5e37fe0c46
|
Bump cryptography from 42.0.3 to 44.0.1 in /src/test/regress (#7996)
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.3 to 44.0.1.

Changelog (sourced from [cryptography's changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)):

44.0.1 - 2025-02-11

* Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL 3.4.1.
* We now build `armv7l` `manylinux` wheels and publish them to PyPI.
* We now build `manylinux_2_34` wheels and publish them to PyPI.

44.0.0 - 2024-11-27

* **BACKWARDS INCOMPATIBLE:** Dropped support for LibreSSL < 3.9.
* Deprecated Python 3.7 support. Python 3.7 is no longer supported by the Python core team. Support for Python 3.7 will be removed in a future `cryptography` release.
* Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL 3.4.0.
* macOS wheels are now built against the macOS 10.13 SDK. Users on older versions of macOS should upgrade, or they will need to build `cryptography` themselves.
* Enforce the RFC 5280 requirement that extended key usage extensions must not be empty.
* Added support for timestamp extraction to the `cryptography.fernet.MultiFernet` class.
* Relax the Authority Key Identifier requirements on root CA certificates during X.509 verification to allow fields permitted by RFC 5280 but forbidden by the CA/Browser BRs.
* Added support for `cryptography.hazmat.primitives.kdf.argon2.Argon2id` when using OpenSSL 3.2.0+.
* Added support for the `cryptography.x509.Admissions` certificate extension.
* Added basic support for PKCS7 decryption (including S/MIME 3.2) via `cryptography.hazmat.primitives.serialization.pkcs7.pkcs7_decrypt_der`, `cryptography.hazmat.primitives.serialization.pkcs7.pkcs7_decrypt_pem`, and `cryptography.hazmat.primitives.serialization.pkcs7.pkcs7_decrypt_smime`.

43.0.3 - 2024-10-18

* Fixed release metadata for `cryptography-vectors`

43.0.2 - 2024-10-18

* Fixed compilation when using LibreSSL 4.0.0.

... (truncated) |
|
|
|
e8c3179b4d
|
Bump tornado from 6.4.2 to 6.5.1 in /src/test/regress (#8001)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.2 to 6.5.1. |
|
|
|
92dc7f36fc
|
Bump jinja2 from 3.1.3 to 3.1.6 in /src/test/regress (#8002)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.3 to 3.1.6.

Release notes (sourced from [jinja2's releases](https://github.com/pallets/jinja/releases) and [jinja2's changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)):

## 3.1.6 (released 2025-03-05)

This is the Jinja 3.1.6 security release, which fixes security issues but does not otherwise change behavior and should not result in breaking changes compared to the latest feature release.

PyPI: https://pypi.org/project/Jinja2/3.1.6/
Changes: https://jinja.palletsprojects.com/en/stable/changes/#version-3-1-6

* The `|attr` filter does not bypass the environment's attribute lookup, allowing the sandbox to apply its checks. [GHSA-cpwx-vrp4-4pq7](https://github.com/pallets/jinja/security/advisories/GHSA-cpwx-vrp4-4pq7)

## 3.1.5 (released 2024-12-21)

This is the Jinja 3.1.5 security fix release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes compared to the latest feature release.

PyPI: https://pypi.org/project/Jinja2/3.1.5/
Changes: https://jinja.palletsprojects.com/changes/#version-3-1-5
Milestone: https://github.com/pallets/jinja/milestone/16?closed=1

* The sandboxed environment handles indirect calls to `str.format`, such as by passing a stored reference to a filter that calls its argument. [GHSA-q2x7-8rv6-6q7h](https://github.com/pallets/jinja/security/advisories/GHSA-q2x7-8rv6-6q7h)
* Escape template name before formatting it into error messages, to avoid issues with names that contain f-string syntax. #1792, [GHSA-gmj6-6f8f-6699](https://github.com/pallets/jinja/security/advisories/GHSA-gmj6-6f8f-6699)
* Sandbox does not allow `clear` and `pop` on known mutable sequence types. #2032
* Calling sync `render` for an async template uses `asyncio.run`. #1952
* Avoid unclosed `auto_aiter` warnings. #1960
* Return an `aclose`-able `AsyncGenerator` from `Template.generate_async`. #1960
* Avoid leaving `root_render_func()` unclosed in `Template.generate_async`. #1960
* Avoid leaving async generators unclosed in blocks, includes and extends. #1960
* The runtime uses the correct `concat` function for the current environment when calling block references. #1701
* Make `|unique` async-aware, allowing it to be used after another async-aware filter. #1781
* `|int` filter handles `OverflowError` from scientific notation. #1921
* Make compiling deterministic for tuple unpacking in a `{% set ... %}` call. #2021
* Fix dunder protocol (`copy`/`pickle`/etc) interaction with `Undefined` objects. #2025
* Fix `copy`/`pickle` support for the internal `missing` object. #2027
* `Environment.overlay(enable_async)` is applied correctly. #2061
* The error message from `FileSystemLoader` includes the paths that were searched. #1661
* `PackageLoader` shows a clearer error message when the package does not contain the templates directory. #1705
* Improve annotations for methods returning copies. #1880
* `urlize` does not add `mailto:` to values like `@a@b`. #1870
* Tests decorated with `@pass_context` can be used with the `|select` filter. #1624
* Using `set` for multiple assignment (`a, b = 1, 2`) does not fail when the target is a namespace attribute. #1413
* Using `set` in all branches of `{% if %}{% elif %}{% else %}` blocks does not cause the variable to be considered initially undefined. #1253

## 3.1.4

This is the Jinja 3.1.4 security release, which fixes security issues and bugs but does not otherwise change behavior and should not result in breaking changes.

PyPI: https://pypi.org/project/Jinja2/3.1.4/
Changes: https://jinja.palletsprojects.com/en/3.1.x/changes/#version-3-1-4

* The `xmlattr` filter does not allow keys with `/` solidus, `>` greater-than sign, or `=` equals sign, in addition to disallowing spaces. Regardless of any validation done by Jinja, user input should never be used as keys to this filter, or must be separately validated first. GHSA-h75v-3vvj-5mfj

... (truncated) |
|
|
|
c7f5e2b975
|
Bump tornado from 6.4 to 6.4.2 in /src/test/regress (#7984)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4 to 6.4.2. |
|
|
|
088ba75057
|
Add citus_nodes view (#7968)
DESCRIPTION: Adds `citus_nodes` view that displays the node name, port, role, and "active" status for nodes in the cluster.

This PR adds the `citus_nodes` view to the `pg_catalog` schema. The view is created in the `citus` schema and then moved to the `pg_catalog` schema; it displays the node name, port, role, and active status of each node in the `pg_dist_node` table. `SELECT` permission on the view is granted to the `PUBLIC` role.

Test cases were added to the `multi_cluster_management` tests. structs.py was modified to add whitespace as required by `citus_indent`.

---------
Co-authored-by: Alper Kocatas <alperkocatas@microsoft.com> |
|
|
|
a18040869a
|
Error out for queries with outer joins and pseudoconstant quals in PG<17 (#7937)
PG15 commit d1ef5631e620f9a5b6480a32bb70124c857af4f1 and PG16 commit 695f5deb7902865901eb2d50a70523af655c3a00 disallow replacing joins with scans in queries with pseudoconstant quals. These PG commits prevent `set_join_pathlist_hook` from being called if any of the join restrictions is a pseudoconstant. In these cases Citus has no info on the join, never sees that the query has an outer join, and ends up producing an incorrect plan. PG17 fixes this in commit 9e9931d2bf40e2fea447d779c2e133c2c1256ef3. Therefore, we take this extra measure only for PG versions less than 17: since `hasOuterJoin` can never be true when `set_join_pathlist_hook` is not called, Citus errors out for such queries instead. |
|
|
|
a4040ba5da
|
Planner: lift volatile target‑list items in `WrapSubquery` to coordinator (prevents sequence‑leap in distributed `INSERT … SELECT`) (#7976)
This PR fixes #7784 and refactors the `WrapSubquery(Query *subquery)` function to improve clarity and correctness when handling volatile expressions in subqueries during Citus insert-select rewriting.

### Background

The `WrapSubquery` function rewrites a query of the form:

```sql
INSERT INTO target_table SELECT ... FROM ...
```

...by wrapping the `SELECT` in a subquery:

```sql
SELECT <outer-TL> FROM ( <subquery with volatile expressions replaced with NULL> ) citus_insert_select_subquery
```

This transformation allows:

* **Volatile expressions** (e.g., `nextval`, `now`) **not used in `GROUP BY` or `ORDER BY`** to be evaluated **exactly once on the coordinator**.
* **Stable/immutable or sort-relevant expressions** to remain in the worker-executed subquery.
* Placeholder `NULL`s to maintain column alignment in the inner subquery.

### Fix Details

* Restructured the code into labeled logical sections:
  1. Build wrapper query (`SELECT … FROM (subquery)`)
  2. Rewrite target lists with volatility analysis
  3. Assign and return updated query trees
* Preserved existing behavior, focusing on clarity and maintainability.

### How the new code handles volatile items

stage | what we look for | what we do | why
-- | -- | -- | --
scan target list once | 1. `expr_is_volatile(te->expr)` 2. `te->ressortgroupref != 0` (is the column used in GROUP BY / ORDER BY?) | decide whether to hoist or keep | we must not hoist an expression the inner query still needs for sorting/grouping, otherwise its `SortGroupClause` breaks
volatile & not used in sort/group | | deep‑copy the expression into the outer target list | executes once on the coordinator
 | | leave a typed `NULL` placeholder (visible, not `resjunk`) in the inner target list | keeps column numbering stable for helpers that already ran (reorder, cast); the worker sends a cheap constant
stable / immutable, or volatile but used in sort/group | | keep the original expression in the inner list; outer list references it via a `Var` | workers can evaluate it safely and, if needed, the inner ORDER BY still works

### Example

Given this query:

```sql
INSERT INTO t SELECT nextval('s'), 42 FROM generate_series(1, 2);
```

The planner rewrites it as:

```sql
SELECT nextval('s'), col2 FROM (SELECT NULL::bigint AS col1, 42 AS col2 FROM generate_series(1, 2)) citus_insert_select_subquery;
```

This ensures `nextval('s')` is evaluated only once per row on the **coordinator**, not on each worker node, preserving correct sequence semantics.

#### **Outer‑Var guard (`FindReferencedTableColumn`)**

Because `WrapSubquery` adds an extra query level, lots of Vars that the old code never expected become “outer” Vars; without teaching `FindReferencedTableColumn` to climb that extra level reliably, Citus would intermittently reject valid foreign keys and even hit asserts.

* Re‑implemented the outer‑Var guard so that the function:
  * **Walks deterministically up the query stack** when `skipOuterVars = false` (default for FK / UNION checks). A new while‑loop copies (rather than truncates) `parentQueryList` on each hop, eliminating list‑aliasing that made *issue 5248* fail intermittently in parallel regressions.
  * Handles multi‑level `varlevelsup` in a single loop; never mutates the caller’s list in place. |
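The hoist-or-keep decision described in this commit message can be modelled in a few lines. The following is a minimal Python sketch, not the Citus implementation (which is C operating on PostgreSQL parse trees): `TargetEntry` and `wrap_target_list` are hypothetical names, and expressions are represented as plain SQL strings purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class TargetEntry:
    """Hypothetical stand-in for a PostgreSQL target-list entry."""
    name: str                 # output column name
    expr: str                 # expression, as SQL text (illustration only)
    type_name: str            # type used for the NULL placeholder
    is_volatile: bool         # models expr_is_volatile(te->expr)
    ressortgroupref: int = 0  # non-zero if used in GROUP BY / ORDER BY


def wrap_target_list(entries):
    """Split a target list into (inner, outer) lists of SQL fragments.

    Volatile entries not referenced by sort/group are hoisted to the outer
    list and replaced by a typed NULL in the inner list; everything else
    stays in the inner list and is referenced by name from the outer list.
    """
    inner, outer = [], []
    for te in entries:
        if te.is_volatile and te.ressortgroupref == 0:
            outer.append(te.expr)
            inner.append(f"NULL::{te.type_name} AS {te.name}")
        else:
            inner.append(f"{te.expr} AS {te.name}")
            outer.append(te.name)
    return inner, outer


if __name__ == "__main__":
    entries = [
        TargetEntry("col1", "nextval('s')", "bigint", is_volatile=True),
        TargetEntry("col2", "42", "integer", is_volatile=False),
    ]
    inner, outer = wrap_target_list(entries)
    print(f"SELECT {', '.join(outer)} FROM (SELECT {', '.join(inner)} ...)")
```

Fed the two target entries from the commit's example, the sketch yields the same inner and outer lists as the rewritten query shown there. A volatile entry with a non-zero `ressortgroupref` stays in the inner list, which is the case the table calls "volatile but used in sort/group".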
|
|
|
d4dd44e715
|
Propagate SECURITY LABEL on tables and columns. (#7956)
Issue #7709 asks for security labels on columns to be propagated, to support the `anon` extension. Previously, Citus supported security labels on roles (#7735); this PR adds support for propagating security labels on tables and columns.

All scenarios that involve propagating metadata for a Citus table now include the security labels on the table and on the columns of the table. These scenarios are:

- When a table becomes distributed using `create_distributed_table()` or `create_reference_table()`, its security labels (if any) are propagated.
- When a security label is defined on a distributed table, or one of its columns, the label is propagated.
- When a node is added to a Citus cluster, all distributed tables have their security labels propagated.
- When a column of a distributed table is dropped, any security labels on the column are also dropped.
- When a column is added to a distributed table, security labels can be defined on the column and are propagated.
- Security labels on a distributed table or its columns are not propagated when `citus.enable_metadata_sync` is enabled.

Regress test `seclabel` is extended with tests to cover these scenarios.

The implementation is somewhat involved because it impacts DDL propagation of Citus tables, but can be broken down as follows:

- distributed_object_ops has `Role_SecLabel`, `Table_SecLabel` and `Column_SecLabel` to take care of security labels on roles, tables and columns. `Any_SecLabel` is used for all other security labels and is essentially a nop.
- Deparser support: `DeparseRoleSecLabelStmt()`, `DeparseTableSecLabelStmt()` and `DeparseColumnSecLabelStmt()` take care of deparsing security label statements on roles, tables and columns respectively.
- When reconstructing the DDL for a Citus table, security labels on the table or its columns are included by having `GetPreLoadTableCreationCommands()` call a new function `CreateSecurityLabelCommands()` to take care of any security labels on the table or its columns.
- When changing a distributed table name to a shard name before running a command locally on a worker, function `RelayEventExtendNames()` checks for security labels on a table or its columns. |
|
|
|
3d61c4dc71
|
Add citus_stat_counters view and citus_stat_counters_reset() function to reset it (#7917)
DESCRIPTION: Adds citus_stat_counters view that can be used to query stat counters that Citus collects while the feature is enabled, which is controlled by citus.enable_stat_counters. citus_stat_counters() can be used to query the stat counters for the provided database oid and citus_stat_counters_reset() can be used to reset them for the provided database oid or for the current database if nothing or 0 is provided.

Today we don't persist stat counters on server shutdown. In other words, stat counters are automatically reset in case of a server restart.

Details on the underlying design can be found in the header comment of stat_counters.c and in the technical readme.

-------

Here are the details about what we track as of this PR:

For connection management, we have three statistics about the inter-node connections initiated by the node itself:

* **connection_establishment_succeeded**
* **connection_establishment_failed**
* **connection_reused**

While the first two are relatively easy to understand, the third one covers the case where a connection is reused. This can happen when a connection was already established to the desired node, Citus decided to cache it for some time (see citus.max_cached_conns_per_worker & citus.max_cached_connection_lifetime), and then reused it for a new remote operation.

Here are the other important details about these connection statistics:

1. connection_establishment_failed doesn't count the connections that we could establish but that were lost later in the transaction. Plus, we cannot guarantee that the connections that are counted in connection_establishment_succeeded were not lost later.
2. connection_establishment_failed doesn't count the optional connections (see OPTIONAL_CONNECTION flag) that we gave up establishing because of the connection throttling rules we follow (see citus.max_shared_pool_size & citus.local_shared_pool_size). The reason for this is that we didn't even try to establish these connections.
3. For the rest of the cases where a connection failed for some reason, we always increment connection_establishment_failed even if the caller was okay with the failure and knows how to recover from it (e.g., the adaptive executor knows how to fall back to local execution when the target node is the local node and it cannot establish a connection to the local node). The reason is that even if it's likely that we can still serve the operation, we still failed to establish the connection and we want to track this.
4. Finally, the connection failures that we count in connection_establishment_failed might be caused by any of the following reasons and for now we prefer to _not_ further distinguish them for simplicity:
   a. remote node is down or cannot accept any more connections, or overloaded such that citus.node_connection_timeout is not enough to establish a connection
   b. any internal Citus error that might result in preparing a bad connection string so that libpq fails when parsing the connection string even before actually trying to establish a connection via connect() call
   c. broken citus.node_conninfo or such Citus configuration that was incorrectly set by the user can also result in similar outcomes as in b
   d. internal waitevent set / poll errors or OOM in local node

We also track two more statistics for query execution:

* **query_execution_single_shard**
* **query_execution_multi_shard**

More importantly, both query_execution_single_shard and query_execution_multi_shard are tracked not only for the top-level queries but also for the subplans etc. The reason is that for some queries, e.g., the ones that go through recursive planning, after Citus performs the heavy work as part of subplans, the work that needs to be done for the top-level query becomes quite straightforward. For such query types, it would be deceiving if we only incremented the query stat counters for the top-level query. Similarly, for non-pushable INSERT .. SELECT and MERGE queries, we perform separate counter increments for the SELECT / source part of the query besides the final INSERT / MERGE query. |
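The connection-counter rules above can be summarised as a small decision function. This is a minimal Python sketch of the bookkeeping as described in the commit message, not the actual C code in stat_counters.c; the function name `record_connection_attempt` and its keyword flags are hypothetical.

```python
from collections import Counter


def record_connection_attempt(counters, *, reused=False,
                              skipped_optional=False, established=False):
    """Update connection counters for one attempt (illustrative model only).

    reused:            a cached connection was picked up again
    skipped_optional:  an optional connection was given up due to throttling,
                       so no attempt was made and nothing is counted
    established:       a fresh connection attempt succeeded
    """
    if reused:
        counters["connection_reused"] += 1
    elif skipped_optional:
        pass  # never attempted, so neither success nor failure is recorded
    elif established:
        counters["connection_establishment_succeeded"] += 1
    else:
        # Counted even when the caller can recover, e.g. by falling back
        # to local execution.
        counters["connection_establishment_failed"] += 1
    return counters


if __name__ == "__main__":
    counters = Counter()
    record_connection_attempt(counters, established=True)
    record_connection_attempt(counters, reused=True)
    record_connection_attempt(counters, skipped_optional=True)
    record_connection_attempt(counters)  # a real failure
    print(dict(counters))
```

Note that the sketch counts only at establishment time; a connection lost later in the transaction changes nothing, matching point 1 above.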
|
|
|
37e23f44b4
|
Add Support for CASCADE/RESTRICT in REVOKE statements (#7958)
Fixes #7105. DESCRIPTION: Fixes a bug that causes omitting CASCADE clause for the commands sent to workers for REVOKE commands on tables. --------- Co-authored-by: ThomasC02 <thomascantrell02@gmail.com> Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com> Co-authored-by: Tiago Silva <tiagos3373@gmail.com> |
|
|
|
1dc60e38bb
|
Propagates GRANT/REVOKE rights on table columns (#7918)
This commit adds support for GRANT/REVOKE on table columns. It extends propagated DDL according to this logic: https://github.com/citusdata/citus/tree/main/src/backend/distributed#ddl

* Unchanged pre-existing behavior related to splitting DDL per relation during propagation.
* Changed the way ACLs are checked in some cases (see `EnsureTablePermissions()` and associated commits)
* Rewrote `pg_get_table_grants` to include column grants as well
* Added missing `pfree()` in `pg_get_table_grants()`

Fixes https://github.com/citusdata/citus/issues/7287
Also checks a box in https://github.com/citusdata/citus/issues/4812 |
|
|
|
a7e686c106
|
Make sure to prevent INSERT INTO ... SELECT queries involving subfield or sublink (#7912)
DESCRIPTION: Makes sure to prevent `INSERT INTO ... SELECT` queries involving subfield or sublink, to avoid crashes

The following query was crashing the backend:

```
INSERT INTO field_indirection_test_1 ( int_col, ct1_col.int_1,ct1_col.int_2 )
SELECT 0, 1, 2; -- crash
```

While at it, added more tests with sublink in distributed_types and found another query with wrong behavior:

```
INSERT INTO domain_indirection_test (f1,f3.if1) SELECT 0, 1;
ERROR: could not find a conversion path from type 23 to 17619 -- not the expected ERROR
```

Fixed both by calling `strip_implicit_coercions()` on the target entry expression before checking for the presence of a subscript or fieldstore; otherwise we fail to find the existing ones and wrongly accept an unsafe query for execution. |
|
|
|
4b4fa22b64
|
Fix mis-deparsing of shard query in "output-table column" name conflict (#7932)
DESCRIPTION: Fixes a bug in deparsing of shard query in case of
"output-table column" name conflict
If an `ORDER BY` item in `SELECT` is a bare identifier, the parser
_first seeks it as an output column name_ of the `SELECT` (for SQL92
compatibility). However, ruleutils.c is expecting the SQL99
interpretation _where such a name is an input column name_. So it's
possible to produce an incorrect display of a view in the (admittedly
pretty ill-advised) case where some other column is renamed in the
`SELECT` output list to match an `ORDER BY` column.
The `DISTINCT ON` expressions are interpreted using the same rules as
for `ORDER BY`.
We had an issue reported that actually uses `DISTINCT ON`: #7684
Since Citus uses ruleutils deparsing logic to create the shard queries,
it would not table-qualify the column names as needed.
PG17 fixed this https://github.com/postgres/postgres/commit/a7eb633563c
by table-qualifying such names in the dumped view text. Therefore,
Citus doesn't reproduce the issue in PG17, since PG17 table-qualifies
the column names when needed, and the produced shard queries are
correct.
This PR applies the PG17 patch to `ruleutils_15.c` and `ruleutils_16.c`.
Even though we generally try to avoid modifying the ruleutils files, in
this case we are applying a Postgres patch that `ruleutils_17.c` already has:
|
|
|
|
1c09469dd2
|
Adds a method to determine if current node is primary (#7720)
DESCRIPTION: Adds citus_is_primary_node() UDF to determine if the current node is a primary node in the cluster. --------- Co-authored-by: German Eichberger <geeichbe@microsoft.com> Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com> |
|
|
|
680b870d45
|
Add STYLEGUIDE.md and update some other md files on best practices (#7347) | |
|
|
1d0bdbd749 | Bump Citus into 13.1devel | |
|
|
be75c0ec4c |
Use datlocale in check_database_on_all_nodes function for PG17
This commit also has to do with renaming of
daticulocale to datlocale
Relevant PG commit:
f696c0cd5f299f1b51e214efc55a22a782cc175d
|
|
|
|
ed40a0ad02 |
fix issue #7676: wrong handler around MULTIEXPR (#7914)
DESCRIPTION: Fixes a bug with `UPDATE SET (...) = (SELECT some_func(),... )` (#7676) Citus was checking for presence of sublink, but forgot to manage multiexpr while evaluating clauses during planning. At this stage (citus planner), it's not always possible to call PostgreSQL code because the tree is not yet ready for PostgreSQL pure executor. Fixes https://github.com/citusdata/citus/issues/7676. Fixed by adding a new function to check sublink or multiexpr in the tree. --------- Co-authored-by: Colm <colmmchugh@microsoft.com> |
|
|
|
e50563fbd8 |
Issue 7887 Enhance AddInsertSelectCasts for Identity Columns (#7920)
## Enhance `AddInsertSelectCasts` for Identity Columns
This PR fixes #7887 and improves the behavior of partial inserts into **identity columns** by modifying the **`AddInsertSelectCasts`** function. Specifically, we introduce **special-case handling** for `nextval(...)` calls (represented in the parse tree as `NextValueExpr`) to ensure that if the identity column’s declared type differs from `nextval`’s default return type (`int8`), we **cast** the expression properly. This prevents mismatches like `int8` → `int4` from causing “invalid string enlargement” errors or other type-related failures.
When `INSERT ... SELECT` is processed, `AddInsertSelectCasts` reconciles each target column’s type with the corresponding SELECT expression’s type. Historically, for identity columns that rely on `nextval(...)`, we can end up with a mismatch:
- `nextval` returns **`int8`**,
- The identity column might be **`int4`**, **`bigint`**, or another integer type.
Without a correct cast, Postgres or Citus can produce plan-time or runtime errors. By **detecting** `NextValueExpr` and applying a cast to the column’s type, the final plan ensures consistent insertion without errors.
## What Changed
1. **Check for `NextValueExpr`**: In `AddInsertSelectCasts`, we now have a code block:
```c
if (IsA(selectEntry->expr, NextValueExpr))
{
    Oid nextvalType = GetNextvalReturnTypeCatalog();
    ...
    // If (targetType != nextvalType), build a cast from int8 -> targetType
}
else
{
    // fallback to generic mismatch logic
}
```
This short-circuits any expression that’s a `nextval(...)` call, letting us explicitly cast to the correct type.
2. **Fallback Generic Logic**: If it isn’t a `NextValueExpr` (i.e. a normal column or expression mismatch), we still rely on the existing path that compares `sourceType` vs. `targetType` and calls `CastExpr(...)` if they differ.
3. **`GetNextvalReturnTypeCatalog`**: We added or refined a helper function to confirm that `nextval` returns `int8`, or do a `LookupFuncName("nextval", ...)` to discover the function’s return type from `pg_proc`, making it robust if future changes happen.
## Benefits
- **Partial inserts** into identity columns no longer fail with type mismatches.
- When `nextval` yields `int8` but the identity column is `int4` (or another type), we properly cast to the column’s type in the plan.
- Preserves the **existing** approach for other columns; only identity calls get the specialized `NextValueExpr` logic.
## Testing
- Extended `generatedidentity.sql` test scenario to cover partial inserts into both `GENERATED ALWAYS` and `GENERATED BY DEFAULT` identity columns, including tests for the `OVERRIDING SYSTEM VALUE` clause and partial inserts referencing foreign-key columns. |
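The cast decision described in this commit can be sketched in isolation. The following is a minimal, standalone illustration and not the Citus implementation: `SelectEntry`, `ExprKind`, and `NeedsCast` are hypothetical names, and the type constants only mirror the numeric values of PostgreSQL's `INT8OID`, `INT2OID`, and `INT4OID` so that the file compiles without PostgreSQL headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for PostgreSQL type OIDs (same numeric values as
 * INT8OID, INT2OID and INT4OID), defined locally for this sketch. */
typedef uint32_t TypeId;
enum { TYPE_INT8 = 20, TYPE_INT2 = 21, TYPE_INT4 = 23 };

/* Hypothetical classification of a SELECT target expression. */
typedef enum { EXPR_NEXTVAL, EXPR_OTHER } ExprKind;

typedef struct SelectEntry
{
    ExprKind kind;   /* is this a nextval(...) call or something else? */
    TypeId exprType; /* type the expression produces */
} SelectEntry;

/*
 * Returns true when the entry would have to be wrapped in a cast to
 * targetType. A nextval(...) call is treated as producing int8 no matter
 * what the entry claims; everything else uses the generic comparison.
 */
bool
NeedsCast(const SelectEntry *entry, TypeId targetType)
{
    if (entry->kind == EXPR_NEXTVAL)
    {
        /* special case: nextval yields int8, so cast unless the column is int8 */
        return targetType != TYPE_INT8;
    }

    /* generic path: cast only when source and target types differ */
    return entry->exprType != targetType;
}
```

Under these assumptions, a `nextval` entry inserted into an `int4` identity column needs a cast, while the same entry inserted into an `int8` column does not.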
|
|
|
95da74c47f |
Fix Deadlock with transaction recovery is possible during Citus upgrades (#7910)
DESCRIPTION: Fixes deadlock with transaction recovery that is possible during Citus upgrades. Fixes #7875.
This commit addresses two interrelated deadlock issues uncovered during Citus upgrades:
1. Local Deadlock:
- **Problem:** In `RecoverWorkerTransactions()`, a new connection is created for each worker node to perform transaction recovery by locking the `pg_dist_transaction` catalog table until the end of the transaction. When `RecoverTwoPhaseCommits()` calls this function for each worker node, the order of acquiring locks on `pg_dist_authinfo` and `pg_dist_transaction` can alternate. This reversal can lead to a deadlock if any concurrent process requires locks on these tables.
- **Fix:** Pre-establish all worker node connections upfront so that `RecoverWorkerTransactions()` operates with a single, consistent connection. This ensures that locks on `pg_dist_authinfo` and `pg_dist_transaction` are always acquired in the correct order, thereby preventing the local deadlock.
2. Distributed Deadlock:
- **Problem:** After resolving the local deadlock, a distributed deadlock issue emerges. The maintenance daemon calls `RecoverWorkerTransactions()` on each worker node (including the local node), which leads to a complex locking sequence:
  - A RowExclusiveLock is taken on the `pg_dist_transaction` table in `RecoverWorkerTransactions()`.
  - An extension update then tries to acquire an AccessExclusiveLock on the same table, getting blocked by the RowExclusiveLock.
  - A subsequent query (e.g., a SELECT on `pg_prepared_xacts`) issued using a separate connection on the local node gets blocked due to locks held during a call to `BuildCitusTableCacheEntry()`.
  - The maintenance daemon waits for this query, resulting in a circular wait and stalling the entire cluster.
- **Fix:** Avoid cache lookups for internal PostgreSQL tables by implementing an early bailout for relation IDs below `FirstNormalObjectId` (system objects). This eliminates unnecessary calls to `BuildCitusTableCacheEntry()`, reducing lock contention and mitigating the distributed deadlock. Furthermore, this optimization improves performance in fast connect→query_catalog→disconnect cycles by eliminating redundant cache creation and lookups.
3. Also reverts the commit that disabled the relevant test cases. |
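The early bailout for system objects can be illustrated with a small standalone sketch. It is not the Citus code: `IsCitusTableSketch`, `ExpensiveCacheLookup`, and `cacheLookupCount` are hypothetical names, and the constant is redefined locally with the same value PostgreSQL uses for `FirstNormalObjectId` (16384) so the example builds without PostgreSQL headers.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/* Same value as PostgreSQL's FirstNormalObjectId; OIDs below it belong to
 * objects created during bootstrap/initdb (system objects). */
#define FIRST_NORMAL_OBJECT_ID ((Oid) 16384)

/* Hypothetical counter showing how often the expensive path is taken. */
int cacheLookupCount = 0;

/*
 * Hypothetical stand-in for the cache lookup that may build a cache entry
 * and take locks. For the sketch, only relation 16500 is a "Citus table".
 */
bool
ExpensiveCacheLookup(Oid relationId)
{
    cacheLookupCount++;
    return relationId == 16500;
}

/*
 * Early bailout: a system catalog can never be a Citus table, so answer
 * immediately without touching the cache.
 */
bool
IsCitusTableSketch(Oid relationId)
{
    if (relationId < FIRST_NORMAL_OBJECT_ID)
    {
        return false;
    }

    return ExpensiveCacheLookup(relationId);
}
```

With this shape, a check on `pg_class` (OID 1259) never reaches the lookup, which is the property the fix relies on to avoid taking locks for catalog queries.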
|
|
|
4139370a1d |
#7782 - catch when Postgres planning removes all Citus tables (#7907)
DESCRIPTION: fix a planning error caused by a redundant WHERE clause
Fix a Citus planning glitch that occurs in a DML query when the WHERE
clause of the query is of the form:
` WHERE true OR <expression with 1 or more citus tables> `
and this is the only place in the query referencing a citus table.
Postgres' standard planner transforms the WHERE clause to:
` WHERE true `
So the query now has no citus tables, confusing the Citus planner as
described in issues #7782 and #7783. The fix is to check, after Postgres
standard planner, if the Query has been transformed as shown, and re-run
the check of whether or not the query needs distributed planning.
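To see why the re-check is needed, here is a toy model of the transformation, assuming a drastically simplified expression tree. `Node`, `SimplifyOr`, and `ReferencesCitusTable` are hypothetical names for illustration only; the real planner operates on PostgreSQL `Query` and expression nodes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression tree: constant TRUE, a reference to a Citus table, or OR. */
typedef enum { NODE_CONST_TRUE, NODE_TABLE_REF, NODE_OR } NodeKind;

typedef struct Node
{
    NodeKind kind;
    const struct Node *left;  /* used only by NODE_OR */
    const struct Node *right; /* used only by NODE_OR */
} Node;

const Node CONST_TRUE_NODE = { NODE_CONST_TRUE, NULL, NULL };

/* Folds "true OR x" and "x OR true" to constant true, mimicking the effect
 * of constant folding in the standard planner. */
const Node *
SimplifyOr(const Node *node)
{
    if (node->kind != NODE_OR)
    {
        return node;
    }

    const Node *left = SimplifyOr(node->left);
    const Node *right = SimplifyOr(node->right);

    if (left->kind == NODE_CONST_TRUE || right->kind == NODE_CONST_TRUE)
    {
        return &CONST_TRUE_NODE;
    }

    return node;
}

/* Walks the tree and reports whether any Citus table is still referenced. */
bool
ReferencesCitusTable(const Node *node)
{
    if (node == NULL)
    {
        return false;
    }

    if (node->kind == NODE_TABLE_REF)
    {
        return true;
    }

    return ReferencesCitusTable(node->left) || ReferencesCitusTable(node->right);
}
```

Before simplification the clause references a Citus table; afterwards it does not, so a decision about distributed planning made on the original tree no longer matches the tree that is actually planned.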
|
|
|
|
87ec3def55 |
Fix 0-Task Plans in Single-Shard Router When Updating a Local Table with Reference Table in Subquery (#7897)
This PR fixes issue #7891 in the Citus planner, where an `UPDATE` on a local table with a subquery referencing a reference table could produce a 0-task plan. Historically, the planner sometimes failed to detect that both the target and referenced tables were effectively “local,” assigning `INVALID_SHARD_ID` and yielding a no-op plan.
### Root Cause
- In the Citus router logic (`PlanRouterQuery`), we relied on `shardId` to determine whether a query should be routed to a single shard.
- If `shardId == INVALID_SHARD_ID`, but we also had not marked the query as a “local table modification,” the code path would produce zero tasks.
- Local + reference tables do not require multi-shard routing. Failing to detect this “purely local” scenario caused Citus to incorrectly route to zero tasks.
### Changes
**Enhanced Local Table Detection**
- Updated `IsLocalTableModification` and related checks to consider both local and reference tables as “local” for planning, preventing the 0-task scenario.
- Expanded `ContainsOnlyLocalOrReferenceTables` to return true if there are no fully distributed tables in the query.
**Added Regress Test**
- Introduced a new regress test (`issue_7891.sql`) which reproduces the scenario.
- Verifies we get a valid single- or local-task plan rather than a 0-task plan. |
|
|
|
ec141f696a |
Enhance MERGE .. WHEN NOT MATCHED BY SOURCE for repartitioned source (#7900)
DESCRIPTION: Ensure that a MERGE command on a distributed table with a `WHEN NOT MATCHED BY SOURCE` clause runs against all shards of the distributed table.
The Postgres MERGE command updates a table using a table or a query as a data source. It provides three ways to match the target table with the source: `WHEN MATCHED` means that there is a row in both the target and source; `WHEN NOT MATCHED` means that there is a row in the source that has no match (is not present) in the target; and, as of PG17, `WHEN NOT MATCHED BY SOURCE` means that there is a row in the target that has no match in the source.
In Citus, when a MERGE command updates a distributed table using a local/reference table or a distributed query as source, that source is repartitioned, and for each repartitioned shard that has data (i.e. 1 or more rows) the MERGE is run against the corresponding distributed table shard. Suppose the distributed table has 32 shards, and the source repartitions into 4 shards that have data, with the remaining 28 shards being empty; then the MERGE command is performed on the 4 corresponding shards of the distributed table. However, the semantics of `WHEN NOT MATCHED BY SOURCE` are that the specified action must be performed on the target for each row in the target that is not in the source; so if the source is empty, all target rows should be updated. To see this, consider the following MERGE command:
```
MERGE INTO target AS t
USING source AS s ON t.id = s.id
WHEN NOT MATCHED BY SOURCE THEN UPDATE SET col1 = 100
```
If the source has zero rows then every row in the target is updated so that its col1 value is 100. Currently in Citus a MERGE on a distributed table with a local/reference table or a distributed query as source ignores shards of the distributed table when the corresponding shard of the repartitioned source has zero rows. However, if the MERGE command specifies a `WHEN NOT MATCHED BY SOURCE` clause, then the MERGE should be performed on all shards of the distributed table, to ensure that the specified action is performed on the target for each row in the target that is not in the source.
This PR enhances Citus MERGE execution so that when a repartitioned source shard has zero rows, and the MERGE command specifies a `WHEN NOT MATCHED BY SOURCE` clause, the MERGE is performed against the corresponding shard of the distributed table using an empty (zero row) relation as source, by generating a query of the form:
```
MERGE INTO target_shard_0002 AS t
USING (SELECT id FROM (VALUES (NULL) ) source_0002(id) WHERE FALSE) AS s ON t.id = s.id
WHEN NOT MATCHED BY SOURCE THEN UPDATE SET col1 = 100
```
This works because each row in the target shard will be updated, and `WHEN MATCHED` and `WHEN NOT MATCHED`, if specified, will be no-ops because the source has zero rows.
When the source is a local or reference table, implementing this involves teaching function `ExecuteSourceAtCoordAndRedistribution()` in `merge_executor.c` not to prune tasks when the query has `WHEN NOT MATCHED BY SOURCE`, but instead to replace the task's query with one that uses an empty relation as source. And when the source is a distributed query, function `ExecuteMergeSourcePlanIntoColocatedIntermediateResults()` (also in `merge_executor.c`), instead of skipping empty tasks, now generates a query that uses an empty relation as source for the corresponding target shard of the distributed table, but again only when the query has `WHEN NOT MATCHED BY SOURCE`. A new function `BuildEmptyResultQuery()` is added to `recursive_planning.c` and it is used by both the aforementioned functions in `merge_executor.c` to build an empty relation to use as the source. It applies the appropriate type to each column of the empty relation so the join with the target makes sense to the query compiler. |
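The per-shard decision described above can be sketched as a small standalone function. This is an illustration of the rule only, not the code in `merge_executor.c`; `TaskAction`, `ChooseTaskAction`, and `CountExecutedTasks` are hypothetical names.

```c
#include <stdbool.h>

/* What to do with the MERGE task for one target shard. */
typedef enum
{
    TASK_SKIP,                 /* source shard is empty, nothing to do */
    TASK_RUN_WITH_SOURCE,      /* source shard has rows, run as usual */
    TASK_RUN_WITH_EMPTY_SOURCE /* source shard is empty, but NOT MATCHED BY
                                * SOURCE must still see every target row */
} TaskAction;

TaskAction
ChooseTaskAction(long sourceRowCount, bool hasNotMatchedBySource)
{
    if (sourceRowCount > 0)
    {
        return TASK_RUN_WITH_SOURCE;
    }

    return hasNotMatchedBySource ? TASK_RUN_WITH_EMPTY_SOURCE : TASK_SKIP;
}

/* Counts how many target shards the MERGE would actually run against. */
int
CountExecutedTasks(const long *sourceRowCounts, int shardCount,
                   bool hasNotMatchedBySource)
{
    int executed = 0;

    for (int i = 0; i < shardCount; i++)
    {
        if (ChooseTaskAction(sourceRowCounts[i], hasNotMatchedBySource) != TASK_SKIP)
        {
            executed++;
        }
    }

    return executed;
}
```

For the 32-shard example in the commit message, where only 4 repartitioned source shards have data, this rule runs 4 tasks without the clause and all 32 with it.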
|
|
|
ccd7ddee36 |
Custom Scan (ColumnarScan): exclude outer_join_rels from CandidateRelids (#7703)
DESCRIPTION: Fixes a crash in columnar custom scan that happens when a columnar table is used in a join. Fixes issue #7647. Co-authored-by: Ольга Сергеева <ob-sergeeva@it-serv.ru> |
|
|
|
89674d9630 |
[Bug Fix] SEGV on query with Left Outer Join (#7787) (#7901)
DESCRIPTION: Fixes a crash in left outer joins that can happen when there is an aggregate on a column from the inner side of the join. Fix the SEGV seen in #7787 and #7899; it occurs because a column in the targetlist of a worker subquery can contain a non-empty varnullingrels field if the column is from the inner side of a left outer join. The issue can also occur with the columns in the HAVING clause, and this is also tested in the fix. The issue was triggered by the introduction of varnullingrels to Vars in Postgres 16 (2489d76c). There is a related issue, #7705, where a non-empty varnullingrels was incorrectly copied into the query tree for the combine query. Here, a non-empty varnullingrels field of a var is incorrectly copied into the query tree for a worker subquery. The regress file from #7705 is used (and renamed) to also test this (#7787). An alternative test output file is required for Postgres 15 because of an optimization to DISTINCT in Postgres 16 (1349d2790bf). |
|
|
|
2b5dfbbd08 | Bump Citus version to 13.0.1 (#7872) | |
|
|
7004295065 |
Revert "Release RowExclusiveLock on pg_dist_transaction as soon as remote xacts are recovered"
This reverts commit
|
|
|
|
3b1c082791 |
Drops PG14 support (#7753)
DESCRIPTION: Drops PG14 support
1. Remove "$version_num" != 'xx' from configure file
2. delete all PG_VERSION_NUM = PG_VERSION_XX references in the code
3. Look at pg_version_compat.h file, remove all _compat functions etc defined specifically for PGXX differences
4. delete all PG_VERSION_NUM >= PG_VERSION_(XX+1), PG_VERSION_NUM < PG_VERSION_(XX+1) ifs in the codebase
5. delete ruleutils_xx.c file
6. cleanup normalize.sed file from pg14 specific lines
7. delete all alternative output files for that particular PG version, server_version_ge variable helps here |
|
|
|
d5618b6b4c |
Release RowExclusiveLock on pg_dist_transaction as soon as remote xacts are recovered
As of this commit, after recovering the remote transactions, we release the lock on pg_dist_transaction while closing it, to avoid deadlocks that might occur when trying to acquire a lock on pg_dist_authinfo while holding a lock on pg_dist_transaction. Such a scenario can only cause a deadlock if another transaction is trying to acquire a strong lock on pg_dist_transaction while holding a lock on pg_dist_authinfo. As of today, we (implicitly) acquire a strong lock on pg_dist_transaction only when upgrading Citus to 11.3-1, and this happens when creating a REPLICA IDENTITY on pg_dist_transaction. And regardless of the code-path we are in, it should be okay to release the lock there, because all we do after that point is abort the prepared transactions that are not part of an in-progress distributed transaction, and releasing the lock before doing so should be just fine. This also changes the blocking behavior between citus_create_restore_point and the transaction recovery code-path, in the sense that citus_create_restore_point now doesn't wait until transaction recovery finishes aborting the prepared transactions that are not part of an in-progress distributed transaction. However, this should be fine because this was already possible before, e.g., if transaction recovery fails to open a remote connection to a node. |
|
|
|
85739b34bf |
Fix pg17 test (#7857)
error merged in
|
|
|
|
1bb6c7e95f |
PG17 Compatibility - Fix crash when pg_class is used in MERGE (#7853)
This pull request addresses Issue #7846, where specific MERGE queries on
non-distributed and distributed tables can result in crashes in certain
scenarios. The issue stems from the usage of `pg_class` catalog table,
and the `FilterShardsFromPgclass` function in Citus. This function goes
through the query's jointree to hide the shards. However, in PG17,
MERGE's join quals are in a separate structure called
`mergeJoinCondition`. Therefore FilterShardsFromPgclass was not
filtering correctly in a `MERGE` command that involves `pg_class`. To
fix the issue, we handle `mergeJoinCondition` separately in PG17.
Relevant PG commit:
|
|
|
|
a18f8990be |
Update tdigest_aggregate_support output for PG15+ (#7849)
Regress test tdigest_aggregate_support has been failing since at least Citus 12.0, when tdigest extension is installed in Postgres. This appears to be because of an omission by commit |
|
|
|
0642a4dc08 |
Propagate MERGE ... WHEN NOT MATCHED BY SOURCE (#7807)
DESCRIPTION: Propagates MERGE ... WHEN NOT MATCHED BY SOURCE It seems like there is not much needed to be done here. `get_merge_query_def` from `ruleutils_17` is updated with "WHEN NOT MATCHED BY SOURCE" therefore `deparse_shard_query` parses the merge query for execution on the shard correctly. Relevant PG commit: https://github.com/postgres/postgres/commit/0294df2f1 |
|
|
|
74d945f5ae |
PG17 - Propagate EXPLAIN options: MEMORY and SERIALIZE (#7802)
DESCRIPTION: Propagates MEMORY and SERIALIZE options of EXPLAIN The options for `MEMORY` can be true or false. Default is false. The options for `SERIALIZE` can be none, text or binary. Default is none. I referred to how we added support for WAL option in this PR [Support EXPLAIN(ANALYZE, WAL)](https://github.com/citusdata/citus/pull/4196). For the tests however, I used the same tests as Postgres, not like the tests in the WAL PR. I used exactly the same tests as Postgres does, I simply distributed the table beforehand. See below the relevant Postgres commits from where you can see the tests added as well: - [Add EXPLAIN (MEMORY)](https://github.com/postgres/postgres/commit/5de890e36) - [Invent SERIALIZE option for EXPLAIN.](https://github.com/postgres/postgres/commit/06286709e) This PR required a lot of copying of Postgres static functions regarding how `EXPLAIN` works for `MEMORY` and `SERIALIZE` options. Specifically, these copy-pastes were required for updating `ExplainWorkerPlan()` function, which is in fact based on postgres' `ExplainOnePlan()`: ```C /* copied from explain.c to update ExplainWorkerPlan() in citus according to ExplainOnePlan() in postgres */ #define BYTES_TO_KILOBYTES(b) typedef struct SerializeMetrics static bool peek_buffer_usage(ExplainState *es, const BufferUsage *usage); static void show_buffer_usage(ExplainState *es, const BufferUsage *usage); static void show_memory_counters(ExplainState *es, const MemoryContextCounters *mem_counters); static void ExplainIndentText(ExplainState *es); static void ExplainPrintSerialize(ExplainState *es, SerializeMetrics *metrics); static SerializeMetrics GetSerializationMetrics(DestReceiver *dest); ``` _Note_: it looks like we were missing some `buffers` option details as well. I put them together with the memory option, like the code in Postgres explain.c, as I didn't want to change the copied code. 
However, I tested locally and there is no big deal in previous Citus versions, and you can also see that existing Citus tests with `buffers true` didn't change. Therefore, I prefer not to backport "buffers" changes to previous versions. |
|
|
|
7682d135a4 |
PG17 - Add Regression Test for REINDEX support in event triggers (#7819)
This PR adds regression tests to verify REINDEX support with event triggers. The tests validate trigger execution, shard placement consistency, and distributed index rebuilding without disruption. |
|
|
|
08d94f9eb6 |
PG17 - Add Regression Test for Access Method Behavior on Partitioned Tables (#7818)
This PR adds a regression test to verify the behavior of access methods for partitioned and distributed tables, including: - Creating partitioned tables with heap. - Distributing tables using create_distributed_table. - Switching access methods to columnar with ALTER TABLE. - Validating access method inheritance for new partitions. Relevant PG17 commit: https://github.com/postgres/postgres/commit/374c7a229 |
|
|
|
8f436e4a48 |
Add tests with xmltext() and random(min, max) (#7824)
xmltext() converts text into xml text nodes. Test with columnar and citus tables. Relevant PG17 commit: https://github.com/postgres/postgres/commit/526fe0d79 random(min, max) generates random numbers in a specified range Add tests like the ones for random() in aggregate_support.sql References: https://github.com/citusdata/citus/blob/main/src/test/regress/sql/aggregate_support.sql#L493-L532 https://github.com/citusdata/citus/pull/7183 Relevant PG17 commit: https://github.com/postgres/postgres/commit/e6341323a |
|
|
|
1d57a36ecc |
Add pg17 jsonpath methods tests (#7820)
various jsonpath methods were added in PG17 Relevant PG commit: https://github.com/postgres/postgres/commit/66ea94e8e Here we add the same test as in pg15_jsonpath.sql for the new additions |
|
|
|
658632642a |
Disallow infinite values for partition interval in create_time_partitions udf (#7822)
PG17 added +/- infinity values for the interval data type Relevant PG commit: https://github.com/postgres/postgres/commit/519fc1bd9 |
|
|
|
3e96a19606 |
Adds JSON_TABLE() support, and SQL/JSON constructor/query functions tests (#7816)
DESCRIPTION: Adds JSON_TABLE() support
PG17 has added basic `JSON_TABLE()` functionality
`JSON_TABLE()` allows `JSON` data to be converted into a relational view
and thus used, for example, in a `FROM` clause, like other tabular data.
We treat `JSON_TABLE` the same as correlated functions (e.g., recurring
tuples). In the end, for multi-shard `JSON_TABLE` commands, we apply the
same restrictions as reference tables (e.g., cannot perform a lateral
outer join when a distributed subquery references a (reference
table)/(json table) etc.)
Relevant PG17 commits:
[basic JSON
table](https://github.com/postgres/postgres/commit/de3600452), [nested
paths in json
table](https://github.com/postgres/postgres/commit/bb766cde6)
Onder had previously added json table support for PG15BETA1, but we
reverted that commit because json table was reverted in PG15.
|
|
|
|
2112aa1860 |
Add tests for inserting with AT LOCAL operator (#7815)
PG17 has added support for the AT LOCAL operator, which converts the given time type to a timestamp using the session's TimeZone value as the time zone. Here we add tests that validate that we can use AT LOCAL in INSERT commands. Relevant PG commit: https://github.com/postgres/postgres/commit/97957fdba With the tests, we verify that we evaluate AT LOCAL at the coordinator and then perform the insert remotely. |
|
|
|
1cf5c190aa |
Error out for ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION (#7814)
PG17 added support for ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION. Relevant PG commit: https://github.com/postgres/postgres/commit/5d06e99a3 We currently don't support propagating this command for Citus tables. It is added to future work. This PR disallows `ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION` on all Citus table types (local, distributed, and partitioned distributed) by adding an error check in `ErrorIfUnsupportedAlterTableStmt`. A new regression test verifies that each table type fails with a consistent error message when attempting to set an expression. |
|
|
|
24585a8c04 |
Error out for ALTER TABLE ... SET ACCESS METHOD DEFAULT (#7803)
PG17 introduced ALTER TABLE ... SET ACCESS METHOD DEFAULT This PR introduces and enforces an error check preventing ALTER TABLE ... SET ACCESS METHOD DEFAULT on both Citus local tables (added via citus_add_local_table_to_metadata) and distributed/partitioned distributed tables. The regression tests now demonstrate that each table type raises an error advising users to explicitly specify an access method, rather than relying on DEFAULT. This ensures consistent behavior across local and distributed environments in Citus. The reason why we currently don't support this is that we can't simply propagate the command as it is, because the default table access method may be different across Citus cluster nodes. Relevant PG commit: https://github.com/postgres/postgres/commit/d61a6cad6 |
|
|
|
b7d04038cb |
Add tests for FORCE_NULL * and FORCE_NOT_NULL * options for COPY FROM (#7812)
These options already existed before PG17, and we support them and have tests for them in `multi_copy.sql`. In PG17, their capability was extended to specify ALL columns at once using *. Citus performs the COPY correctly, as is validated by the added tests in this PR. Relevant PG commit: https://github.com/postgres/postgres/commit/f6d4c9cf1
Copy-pasting from the Postgres documentation what these options do, so that the reviewer may better understand the added tests:
`FORCE_NOT_NULL`: Do not match the specified columns' values against the null string. In the default case where the null string is empty, this means that empty values will be read as zero-length strings rather than nulls, even when they are not quoted. If * is specified, the option will be applied to all columns. This option is allowed only in `COPY FROM`, and only when using `CSV` format.
`FORCE_NULL`: Match the specified columns' values against the null string, even if it has been quoted, and if a match is found set the value to `NULL`. In the default case where the null string is empty, this converts a quoted empty string into `NULL`. If * is specified, the option will be applied to all columns. This option is allowed only in `COPY FROM`, and only when using `CSV` format.
`FORCE_NULL` and `FORCE_NOT_NULL` can be used simultaneously on the same column. This results in converting quoted null strings to null values and unquoted null strings to empty strings.
Explain it to me like I'm a 5-year-old, for a text column: `FORCE_NULL` looks for empty strings and registers them as `NULL`; `FORCE_NOT_NULL` looks for null values and registers them as empty strings. |
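The interaction of the two options can be modeled with a small standalone function. This is a sketch of the documented CSV behavior quoted above, not the PostgreSQL or Citus COPY code; `CsvFieldIsNull` and its parameters are hypothetical names.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Decides whether one CSV field ends up as SQL NULL.
 *
 * raw          field text with the surrounding quotes already removed
 * wasQuoted    whether the field was quoted in the input
 * nullString   the COPY null string (empty by default in CSV format)
 * forceNull    FORCE_NULL applies to this column
 * forceNotNull FORCE_NOT_NULL applies to this column
 */
bool
CsvFieldIsNull(const char *raw, bool wasQuoted, const char *nullString,
               bool forceNull, bool forceNotNull)
{
    bool matchesNullString = strcmp(raw, nullString) == 0;

    /* default CSV rule: only an unquoted match of the null string is NULL */
    bool isNull = matchesNullString && !wasQuoted;

    if (isNull && forceNotNull)
    {
        /* FORCE_NOT_NULL: keep the unquoted null string as literal text */
        return false;
    }

    if (!isNull && forceNull && matchesNullString)
    {
        /* FORCE_NULL: a quoted null string becomes NULL as well */
        return true;
    }

    return isNull;
}
```

With both options set on the same column and the default empty null string, a quoted empty field becomes `NULL` and an unquoted empty field stays an empty string, which matches the combined behavior described in the documentation excerpt.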
|
|
|
5e9f8d838c |
Error for COPY FROM ... on_error, log_verbosity with Citus tables (#7811)
PG17 added the new ON_ERROR option for COPY FROM. When this option is specified, COPY skips soft errors and continues copying. Relevant PG commits: -- https://github.com/postgres/postgres/commit/9e2d87011 -- https://github.com/postgres/postgres/commit/b725b7eec I tried it locally with Citus tables. Without further implementation, it doesn't work correctly. Therefore, we error out for now, and add it to future work. PG17 also added log_verbosity option, which controls the amount of messages emitted during processing. This is currently used in COPY FROM when ON_ERROR option is set to ignore. Therefore, we error out for this option as well. Relevant PG17 commit: https://github.com/postgres/postgres/commit/f5a227895 |
|
|
|
202ad077bd |
PG17: ALTER INDEX ALTER COLUMN SET STATISTICS DEFAULT (#7808)
DESCRIPTION: Propagates ALTER INDEX ALTER COLUMN SET STATISTICS DEFAULT We automatically support this. Adding tests only. We currently don't support ALTER TABLE ALTER COLUMN SET STATISTICS Relevant PG commit: https://github.com/postgres/postgres/commit/4f622503d |
|
|
|
a383ef6831 |
Adds PG17.1 support - Regression tests sanity (#7661)
This is the final commit that adds PG17 compatibility with Citus's current capabilities. You can use Citus community, release-13.0 branch, with PG17.1. --------- Specifically, this commit: - Enables PG17 in the configure script. - Adds PG17 tests to CI using test images that have 17.1 - Fixes an upgrade test: see below for details In `citus_prepare_upgrade()`, don't drop any_value when upgrading from PG16+, because PG16+ has its own any_value function. Attempting to do so results in the error seen in [pg16-pg17 upgrade](https://github.com/citusdata/citus/actions/runs/11768444117/job/32778340003?pr=7661): ``` ERROR: cannot drop function any_value(anyelement) because it is required by the database system CONTEXT: SQL statement "DROP AGGREGATE IF EXISTS pg_catalog.any_value(anyelement)" ``` When 16 becomes the minimum supported Postgres version, the drop statements can be removed. --------- Several PG17 Compatibility commits have been merged before this final one. All these subtasks are done https://github.com/citusdata/citus/issues/7653 See the list below: Compilation PR: https://github.com/citusdata/citus/pull/7699 Ruleutils PR: https://github.com/citusdata/citus/pull/7725 Sister PR for tests: https://github.com/citusdata/the-process/pull/159 Helpful smaller PRs: - https://github.com/citusdata/citus/pull/7714 - https://github.com/citusdata/citus/pull/7726 - https://github.com/citusdata/citus/pull/7731 - https://github.com/citusdata/citus/pull/7732 - https://github.com/citusdata/citus/pull/7733 - https://github.com/citusdata/citus/pull/7738 - https://github.com/citusdata/citus/pull/7745 - https://github.com/citusdata/citus/pull/7747 - https://github.com/citusdata/citus/pull/7748 - https://github.com/citusdata/citus/pull/7749 - https://github.com/citusdata/citus/pull/7752 - https://github.com/citusdata/citus/pull/7755 - https://github.com/citusdata/citus/pull/7757 - https://github.com/citusdata/citus/pull/7759 - https://github.com/citusdata/citus/pull/7760 - 
https://github.com/citusdata/citus/pull/7761 - https://github.com/citusdata/citus/pull/7762 - https://github.com/citusdata/citus/pull/7765 - https://github.com/citusdata/citus/pull/7766 - https://github.com/citusdata/citus/pull/7768 - https://github.com/citusdata/citus/pull/7769 - https://github.com/citusdata/citus/pull/7771 - https://github.com/citusdata/citus/pull/7774 - https://github.com/citusdata/citus/pull/7776 - https://github.com/citusdata/citus/pull/7780 - https://github.com/citusdata/citus/pull/7781 - https://github.com/citusdata/citus/pull/7785 - https://github.com/citusdata/citus/pull/7788 - https://github.com/citusdata/citus/pull/7793 - https://github.com/citusdata/citus/pull/7796 --------- Co-authored-by: Colm <colmmchugh@microsoft.com> |
|
|
|
28b0b0e7a8 |
Bump Citus version into 13.0.0 (#7792)
We are using `release-13.0` branch for both development and release, to deliver PG17 support in Citus. Afterwards, we will (probably) merge this branch into main. Some potential changes for main branch, after we are done working on release-13.0: - Merge changes from `release-13.0` to `main` - Figure out what changes were there on 12.2, move them to 13.1 version. In a nutshell: rename `12.1--12.2` to `13.0--13.1` and fix issues. - Set version to 13.1devel |
|
|
|
80c6479408 |
PG17 compatibility: Fix Test Failure in multi_alter_table_add_const (#7733)
In earlier versions of PostgreSQL, exclusion constraints were not allowed on partitioned tables. This is why the error in your regression test (ERROR: exclusion constraints are not supported on partitioned tables) was raised in PostgreSQL 16. In PostgreSQL 17, exclusion constraints are now allowed on partitioned tables, which is why the error no longer appears when you attempt to add an exclusion constraint. The constraint exclusion mechanism, described in the documentation, relies on CHECK constraints to decide which partitions or child tables need to be queried. [CHECK constraints](https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION) ```diff -- Check "ADD EXCLUDE" errors out for partitioned table since the postgres does not allow it ALTER TABLE AT_AddConstNoName.citus_local_partitioned_table ADD EXCLUDE(partition_col WITH =); -ERROR: exclusion constraints are not supported on partitioned tables -- Check "ADD CHECK" SET client_min_messages TO DEBUG1; ALTER TABLE AT_AddConstNoName.citus_local_partitioned_table ADD CHECK (dist_col > 0); DEBUG: the constraint name on the shards of the partition is too long, switching to sequential and local execution mode to prevent self deadlocks: longlonglonglonglonglonglonglonglonglonglonglo_537570f5_5_check DEBUG: verifying table "longlonglonglonglonglonglonglonglonglonglonglonglonglonglongabc" DEBUG: verifying table "p1" RESET client_min_messages; SELECT con.conname FROM pg_catalog.pg_constraint con INNER JOIN pg_catalog.pg_class rel ON rel.oid = con.conrelid INNER JOIN pg_catalog.pg_namespace nsp ON nsp.oid = connamespace WHERE rel.relname = 'citus_local_partitioned_table'; conname -------------------------------------------------- + citus_local_partitioned_table_partition_col_excl citus_local_partitioned_table_check -(1 row) +(2 rows) ``` |
|
|
|
29bd3dc41c |
PG17 compatibility: Fix Isolation Test Failure in isolation_multiuser_locking (#7714)
This PR enhances `isolation_multiuser_locking.spec` test compatibility across multiple PostgreSQL versions by handling differences in error messages and behavior. Key updates include: - **Error Message Handling:** Adjustments to manage version-specific error messages, ensuring consistent test results. - Modified to address variations in locking behavior across PostgreSQL versions, ensuring test stability in multiuser scenarios. - **REINDEX Behavior Adjustment**: This PR accounts for a behavioral change introduced in PostgreSQL by commit ecb0fd337, which alters how REINDEX interacts with system catalogs. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ecb0fd337 --------- Co-authored-by: Mehmet YILMAZ <mehmet.yilmaz@microsoft.com> |
|
|
|
09e96831b3 |
Fix pg17 test (#7797)
Broken from this commit
|
|
|
|
c662e68e44 |
Remove redundant normalize (#7794)
Redundant from this commit
|
|
|
|
915276ee7f |
PG17 compatibility: Fix Test Failure in local_table_join (#7732)
PostgreSQL 17 seems to have introduced improvements in how correlated subqueries are handled during plan generation. Instead of generating a trivial subplan with WHERE true, it now applies more specific filtering (WHERE (key = 5)), which makes the execution plan more efficient. https://github.com/postgres/postgres/commit/b262ad44 ``` diff -dU10 -w /__w/citus/citus/src/test/regress/expected/local_table_join.out /__w/citus/citus/src/test/regress/results/local_table_join.out --- /__w/citus/citus/src/test/regress/expected/local_table_join.out.modified 2024-11-05 09:53:50.423970699 +0000 +++ /__w/citus/citus/src/test/regress/results/local_table_join.out.modified 2024-11-05 09:53:50.463971296 +0000 @@ -1420,32 +1420,32 @@ ) as subq_1 ) as subq_2; DEBUG: Wrapping relation "custom_pg_type" to a subquery DEBUG: generating subplan 204_1 for subquery SELECT typdefault FROM local_table_join.custom_pg_type WHERE true ERROR: direct joins between distributed and local tables are not supported HINT: Use CTE's or subqueries to select from local tables and use them in joins -- correlated sublinks are not yet supported because of #4470, unless we convert not-correlated table SELECT COUNT(*) FROM distributed_table d1 JOIN postgres_table using(key) WHERE d1.key IN (SELECT key FROM distributed_table WHERE d1.key = key and key = 5); DEBUG: Wrapping relation "postgres_table" to a subquery -DEBUG: generating subplan XXX_1 for subquery SELECT key FROM local_table_join.postgres_table WHERE true +DEBUG: generating subplan 206_1 for subquery SELECT key FROM local_table_join.postgres_table WHERE (key OPERATOR(pg_catalog.=) 5) ``` Co-authored-by: Naisila Puka <37271756+naisila@users.noreply.github.com> |
|
|
|
3935710c17 |
PG17 compatibility: Fix Test Failure in local_dist_join_mixed (#7731)
PostgreSQL 16 adds an extra condition (id IS NOT NULL) to the subquery. This condition is likely used to ensure that no null values are processed in the subquery. Instead of using the condition id IS NOT NULL, PostgreSQL 17 generates the subplan with a trivial condition (WHERE true), indicating that it does not need to explicitly check for non-null values. PostgreSQL 17 likely includes optimizations to handle null checks more efficiently. The WHERE (id IS NOT NULL) condition that was present in PostgreSQL 16 may now be considered redundant by the planner, as it is implicitly handled by the query execution engine. https://github.com/postgres/postgres/commit/b262ad44 ```diff SELECT foo1.id FROM (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo9, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo8, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo7, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo6, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo5, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo4, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo3, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo2, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo10, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo1 WHERE foo1.id = foo9.id AND foo1.id = foo8.id AND foo1.id = foo7.id AND foo1.id = foo6.id AND foo1.id = foo5.id AND foo1.id = foo4.id AND foo1.id = foo3.id AND foo1.id = foo2.id AND foo1.id = foo10.id AND foo1.id = foo1.id ORDER BY 1; ... 
-DEBUG: generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE (id IS NOT NULL) +DEBUG: generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE true ... ``` |
|
|
|
11f76cb4bb |
PG17 compatibility: ensure get_progress() output is consistent (#7793)
in regress test isolation_progress_monitoring, with an ORDER BY. The implementation of get_progress() uses a tuplestore to hold the step and progress values, and tuplestore does not provide any guarantee on the ordering of the tuples so ORDER BY ensures stable test output. Also make the output more user friendly by including the column names. Fixing occasional failures seen in isolation_progress_monitoring.  |
|
|
|
35d1160ace |
PG17 Compatibility: Support MERGE features in Citus with clean exceptions (#7781)
- Adapted `pgmerge.sql` tests from PostgreSQL community's `merge.sql` to Citus by converting tables into Citus local tables. - Identified two new PostgreSQL 17 MERGE features (`RETURNING` support and MERGE on updatable views) not yet supported by Citus. - Implemented changes to detect unsupported features and raise clean exceptions, ensuring pgmerge tests pass without diffs. - Addressed breaking changes caused by `MERGE ... WHEN NOT MATCHED BY SOURCE` restructuring, reducing diffs in pgmerge tests. - Segregated unsupported test cases into `merge_unsupported.sql` to maintain clarity and avoid large diffs in test files. - Prepared the Citus MERGE planner to handle new PostgreSQL changes, reducing remaining test discrepancies. All merge tests now pass cleanly, with unsupported cases clearly isolated. Relevant PG commits: c649fa24a https://github.com/postgres/postgres/commit/c649fa24a 0294df2f1 https://github.com/postgres/postgres/commit/0294df2f1 --------- Co-authored-by: naisila <nicypp@gmail.com> |
|
|
|
088731e9db |
PG17 compatibility: account for identity columns in partitioned tables. (#7785)
PG17 added support for identity columns in partitioned tables: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=699586315 A consequence is that a table with an identity column cannot be attached as a partition. But Citus on Postgres 17 will generate identity column for the partitions if the parent table has one (or more) identity columns when propagating distributed table DDL to worker nodes, as happens in the `generated_identity` regress test in #7768: ``` CREATE TABLE partitioned_table ( a bigint CONSTRAINT myconname GENERATED BY DEFAULT AS IDENTITY (START WITH 10 INCREMENT BY 10), b bigint GENERATED ALWAYS AS IDENTITY (START WITH 10 INCREMENT BY 10), c int ) PARTITION BY RANGE (c); CREATE TABLE partitioned_table_1_50 PARTITION OF partitioned_table FOR VALUES FROM (1) TO (50); CREATE TABLE partitioned_table_50_500 PARTITION OF partitioned_table FOR VALUES FROM (50) TO (1000); SELECT create_distributed_table('partitioned_table', 'a'); - create_distributed_table ---------------------------------------------------------------------- - -(1 row) - +ERROR: table "partitioned_table_1_50" being attached contains an identity column "a" +DETAIL: The new partition may not contain an identity column. ``` It is the Citus-generated ATTACH PARTITION statement that errors out, because the Citus-generated CREATE TABLE for the partitions included identity column definitions. The fix is straightforward - when propagating the CREATE TABLE ddl for a partition of a table with an identity column, don't include the identity column(s), they will be inherited on attaching the partition. In Citus on Postgres 16 (or less) partitions do not inherit identity; the partitions in the example would not have any identity columns so it was not an issue previously. |
|
|
|
c3d21b807a |
PG17 compatibility: fix plan diffs in multi_explain (#7780)
Regress test `multi_explain` has two queries that have a different query plan with PG17. Here is part of the plan diff for the query labelled _Union and left join subquery pushdown_ in `multi_explain.sql` (for the complete diff, search for `multi_explain` [here](https://github.com/citusdata/citus/actions/runs/12158205599/attempts/1)): ``` -> Sort Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone, events.event_time - -> Hash Left Join - Hash Cond: (users.composite_id = subquery_2.composite_id) - -> HashAggregate - Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time + -> Nested Loop Left Join + Join Filter: (users.composite_id = subquery_2.composite_id) + -> Unique + -> Sort + Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time -> Append ``` The change is the same in both queries; a hash left join with subquery_1 on the outer and subquery_2 on the inner side of the join is now a nested loop left join with subquery_1 on the outer and subquery_2 on the inner; additionally, the chosen method of uniquifying the UNION in subquery_1 has changed from hashed grouping to sort followed by unique, as shown in the diff above. The PG17 commit that caused this plan change is likely _[Fix MergeAppend to more accurately compute the number of rows that need to be sorted](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9d1a5354f)_ because it impacts the estimated rows counts of UNION paths. Comparing a costed plan of the query between PG16 and PG17 I noticed that with PG16 the rows estimate for the UNION in subquery_1 is 4, whereas with PG17 the rows estimate is 2. 
A lower rows estimate in the outer side of the join may result in nested loop looking cheaper than hash join for the left outer join, hence the plan change in the two queries where there is a UNION on the outer side of a left outer join. The proposed fix achieves a consistent plan across all supported postgres versions by temporarily disabling nested loop join and sort for the two impacted queries; the postgres optimizer selects hash join for the outer left join and hashed aggregation for the UNION operation. I investigated tweaking the queries, but was not able to arrive at a consistent plan, and I believe the SQL operator (e.g. join, group by, union) implementations are orthogonal to the intent of the test, so this should be a satisfactory solution, particularly as it avoids introducing a second alternative output file for `multi_explain`. |
|
|
|
592416250c |
PG17 compatibility: account for MAINTAIN privilege in regress tests (#7774)
This PR addresses regress tests impacted by the introduction of [the MAINTAIN privilege in PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ecb0fd337). The impacted tests include `generated_identity`, `create_single_shard_table`, `grant_on_sequence_propagation`, `grant_on_foreign_server_propagation`, `single_node_enterprise`, `multi_multiuser_master_protocol`, `multi_alter_table_row_level_security`, `shard_move_constraints` which show the following error: ``` SELECT start_metadata_sync_to_node('localhost', :worker_2_port); - start_metadata_sync_to_node ---------------------------------------------------------------------- - -(1 row) - +ERROR: unrecognized aclright: 16384 ``` and `multi_multiuser_master_protocol`, where the `pg_class.relacl` column has 'm' for MAINTAIN if applicable: ``` relname | rolname | relacl ---------------------+-------------+------------------------------------------------------------ trivial_full_access | full_access | - trivial_postgres | postgres | {postgres=arwdDxt/postgres,full_access=arwdDxt/postgres} + trivial_postgres | postgres | {postgres=arwdDxtm/postgres,full_access=arwdDxtm/postgres} ``` The PR updates function `convert_aclright_to_string()` in citus_ruleutils.c to include a case for `ACL_MAINTAIN`. Per the comment on `convert_aclright_to_string()` in citus_ruleutils.c, it is a copy of `convert_aclright_to_string()` in Postgres (where it is in `src/backend/utils/adt/acl.c`), so requires updating to be consistent with Postgres. With this change Citus can recognize the MAINTAIN privilege, and will not emit the `unrecognized aclright` error. The PR also adds an alternative goldfile for `multi_multiuser_master_protocol`. Note that `convert_aclright_to_string()` in Postgres includes access types SET and ALTER SYSTEM on system parameters (aka GUCs), added by [this PG16 commit](https://github.com/postgres/postgres/commit/a0ffa885e). 
If Citus were to have a requirement to support granting SET and ALTER SYSTEM we would need to update `convert_aclright_to_string()` in citus_ruleutils.c with SET and ALTER SYSTEM. |
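The privilege lookup described above can be sketched in Python, as an illustration only. The real function is the C function `convert_aclright_to_string()` in citus_ruleutils.c, which is not shown here. The name `aclright_to_string`, the dictionary, and the bit positions are assumptions modeled on PostgreSQL's `ACL_*` bit flags, and only the table-level privileges visible in the `relacl` output above are listed. The value 16384 for MAINTAIN is taken from the error message in the diff.

```python
# Assumed to mirror PostgreSQL's ACL_* bit flags; only the table-level
# privileges that appear in the relacl string "arwdDxtm" are listed here.
ACL_RIGHT_NAMES = {
    1 << 0: "INSERT",
    1 << 1: "SELECT",
    1 << 2: "UPDATE",
    1 << 3: "DELETE",
    1 << 4: "TRUNCATE",
    1 << 5: "REFERENCES",
    1 << 6: "TRIGGER",
    1 << 14: "MAINTAIN",  # new in PG17; 1 << 14 == 16384
}


def aclright_to_string(aclright: int) -> str:
    """Map a single privilege bit to its name, or fail like the pre-fix code."""
    try:
        return ACL_RIGHT_NAMES[aclright]
    except KeyError:
        # Same wording as the error seen in the failing regress tests.
        raise ValueError(f"unrecognized aclright: {aclright}") from None
```

Without the `1 << 14` entry, a lookup of 16384 falls through to the error branch, which is the failure mode the tests hit before the fix.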
|
|
|
beb222ea8d |
PG17 compatibility: fix multi-1 diffs caused by PG17 optimizer enhancements (#7769)
This fix ensures that the expected DEBUG error messages from the router
planner in `multi_router_planner`, `multi_router_planner_fast_path` and
`query_single_shard_table` are present with PG17.
In `query_single_shard_table` the diff:
```
SELECT COUNT(*) FROM citus_local_table t1
WHERE t1.b IN (
SELECT b+1 FROM nullkey_c1_t1 t2 WHERE t2.b = t1.a
);
-DEBUG: router planner does not support queries that reference non-colocated distributed tables
+DEBUG: Local tables cannot be used in distributed queries.
```
occurred because of [this PG17
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639)
which enables the optimizer to pull up a correlated ANY subquery to a
join. The fix inhibits subquery pull up by including a volatile function
in the predicate involving the ANY subquery, preserving the pre-PG17
optimizer treatment of the query.
In the case of `multi_router_planner` and
`multi_router_planner_fast_path` the diffs:
```
-- partition_column is null clause does not prune out any shards,
-- all shards remain after shard pruning, not router plannable
SELECT *
FROM articles_hash a
WHERE a.author_id is null;
-DEBUG: Router planner cannot handle multi-shard select queries
+DEBUG: Creating router plan
```
are because of [this PG17
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b262ad440),
which enables the optimizer to detect and remove redundant IS (NOT) NULL
expressions. The fix is to adjust the table definition so the column
used for distribution is not marked NOT NULL, thus preserving the
pre-PG17 query planning behavior.
Finally, a rule is added to `normalize.sed` to ignore DEBUG logging in CREATE MATERIALIZED
VIEW AS statements introduced by [this PG17
commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b4da732fd64);
_when creating materialized views, use REFRESH logic to load data_, a
consequence of which is that with `client_min_messages` at `DEBUG2`
Postgres emits extra detail for CREATE MATERIALIZED VIEW AS statements.
```
CREATE MATERIALIZED VIEW mv_articles_hash_empty AS
SELECT * FROM articles_hash WHERE author_id = 1;
DEBUG: Creating router plan
DEBUG: query has a single distribution column value: 1
+DEBUG: drop auto-cascades to type multi_router_planner.pg_temp_61391
+DEBUG: drop auto-cascades to type multi_router_planner.pg_temp_61391[]
```
The rule can be changed to a normalization, or possibly dropped, when 17 becomes the minimum supported version.
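As an illustration only, the effect of such a rule can be sketched in Python. The actual rule lives in `normalize.sed` and is not shown in this message; the pattern and the function name `drop_matview_debug` are assumptions based solely on the two extra DEBUG lines in the diff above.

```python
import re

# Hypothetical pattern: matches the extra PG17 DEBUG lines about temporary
# types (and their array types) emitted for CREATE MATERIALIZED VIEW AS.
AUTO_CASCADE_DEBUG = re.compile(
    r"^DEBUG:\s+drop auto-cascades to type \S+\.pg_temp_\d+(\[\])?\s*$"
)


def drop_matview_debug(lines):
    """Return the lines with the PG17-only DEBUG messages removed."""
    return [line for line in lines if not AUTO_CASCADE_DEBUG.match(line)]


output = [
    "DEBUG:  Creating router plan",
    "DEBUG:  query has a single distribution column value: 1",
    "DEBUG:  drop auto-cascades to type multi_router_planner.pg_temp_61391",
    "DEBUG:  drop auto-cascades to type multi_router_planner.pg_temp_61391[]",
]
normalized = drop_matview_debug(output)
```

Dropping whole lines keeps the expected output identical across versions, whereas a normalization would rewrite the numeric suffix and keep the lines.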
|
|
|
|
1797ab8a4f |
PG17 compatibility: Fix check-style, broken by PG17 columnar test fix… (#7776)
… (
|
|
|
|
808626ea78 |
PG17 compatibility (#7653): Fix test diffs in columnar schedule (#7768)
This PR fixes diffs in `columnar_chunk_filtering` and `columnar_paths`
tests.
In `columnar_chunk_filtering` an expression `(NOT (SubPlan 1))` changed
to `(NOT (ANY (a = (SubPlan 1).col1)))`. This is due to [a PG17
commit](https://github.com/postgres/postgres/commit/fd0398fc) that
improved how scalar subqueries (InitPlans) and ANY subqueries (SubPlans)
are EXPLAINed in expressions. The fix uses a helper function which
converts the PG17 format to the pre-PG17 format. It is done this way
because pre-PG17 EXPLAIN does not provide enough context to convert to
the PG17 format. The helper function can (and should) be retired when 17
becomes the minimum supported PG.
In `columnar_paths`, a merge join changed to a hash join. This is due to
[this PG17
commit](
|
|
|
|
6254ad81fc |
PG17 compatibility: revert #7764 (#7775)
Revert PG17 compatibility fix #7764 |
|
|
|
1074035446 |
PG17 compatibility: fix some tests outputs (#7765)
There are two commits in this PR: 1) Remove domain_default column since it has been removed from PG17 Relevant PG commit: |
|
|
|
0de7b5a240 |
PG17 compatibility: fix diff in tableam (#7771)
Test `tableam` expects that this CREATE TABLE statement: `CREATE TABLE test_partitioned(id int, p int, val int) PARTITION BY RANGE (p) USING fake_am;` will produce this error: `specifying a table access method is not supported on a partitioned table` but as of [this PG commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=374c7a229) it is possible to specify an access method on a partitioned table. This fix moves the CREATE TABLE statement to pg17, and adds an additional test to show parent access method is inherited. |
|
|
|
9615b52863 |
PG17 compatibility: Fix Test Failure in multi_name_lengths multi_create_table_constraints (#7726)
PG17 removes the outer parentheses from CHECK constraints;
we add them back for pg15 and pg16 compatibility,
e.g. change CHECK other_col >= 100 to CHECK (other_col >= 100).
Relevant PG commit:
e59fcbd712c777eb2987d7c9ad542a7e817954ec
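A minimal Python sketch of adding the outer parentheses back follows. This message does not show how the fix is implemented, so the function names `is_fully_parenthesized` and `add_check_parens` are hypothetical and the code only illustrates the text transformation. It assumes the CHECK expression runs to the end of the line and does not account for parentheses inside string literals.

```python
import re


def is_fully_parenthesized(expr: str) -> bool:
    """True if one balanced pair of parentheses wraps the whole expression."""
    expr = expr.strip()
    if not (expr.startswith("(") and expr.endswith(")")):
        return False
    depth = 0
    for i, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            # The first "(" closed before the end, so it is not an outer pair.
            if depth == 0 and i != len(expr) - 1:
                return False
    return depth == 0


def add_check_parens(line: str) -> str:
    """Wrap the CHECK expression in parentheses unless it already is wrapped."""
    m = re.search(r"\bCHECK\s+(.*)$", line)
    if m is None:
        return line
    expr = m.group(1).strip()
    if is_fully_parenthesized(expr):
        return line
    return line[: m.start(1)] + "(" + expr + ")"
```

With this sketch, `CHECK other_col >= 100` becomes `CHECK (other_col >= 100)`, while a line that already has the outer parentheses is returned unchanged.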
|
|
|
|
a74bb6280c |
PG17 regress sanity: fix error unrecognized alter database option tablespace seen in database vanilla test (#7764)
Disable DDL propagation for the vanilla test suite. This enables the vanilla `database` test to pass, where previously it was correctly returning `ERROR: unrecognized ALTER DATABASE option: tablespace` because release-13.0 does not propagate this ALTER DATABASE variant. We (Citus team) discussed cherry picking [#7253](https://github.com/citusdata/citus/pull/7253) from main to release-13.0 because it does propagate the ALTER DATABASE tablespace option (as well as a couple of others) but decided fixing the regress test was not the proper context for that. The fix disables `citus.enable_metadata_sync` when running vanilla; we discussed disabling `citus.enable_create_database_propagation` but this is not in release-13.0. |
|
|
|
6043fcb263 |
PG17 regress test sanity: fix diffs in union_pushdown. (#7762)
Preserve the test error message by adjusting the query so that PG17 cannot pull it up to a join. Another instance of a subquery that can be pulled up to a join with PG17 (#7745) This should have been fixed in, but slipped by, #7745 |
|
|
|
ed71e65333 |
PG17 compatibility: Adjust print_extension_changes function for extra type outputs in PG17 (#7761)
In PG17, auto-generated array types, multirange types, and relation
rowtypes are treated as dependent objects, hence changing the output of the
print_extension_changes function.
Relevant PG commit:
e5bc9454e527b1cba97553531d8d4992892fdeef
|
|
|
|
ae104f06a6 |
PG17 compatibility: fix backend type orders in test (#7760)
This work was already done by @m3hm3t and approved as part of https://github.com/citusdata/citus/pull/7722 I separated it in this PR since the previous one contained other changes which we don't currently want to merge. Relevant PG commit: --------- Co-authored-by: Mehmet YILMAZ <mehmety87@gmail.com> |
|
|
|
b46d311e30 |
PG17 compatibility: Normalize COPY error messages (#7759)
A recent Postgres commit (*) that refactored error messages is the cause
of the diffs in pg16 regress test when running Citus on Postgres 17. The
fix changes the pg16 goldfile and includes a normalization rule for the
error messages so pg16 will pass when running with version 16 of
Postgres.
(*)
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=498ee9ee2f
|
|
|
|
4c080c48cd |
PG17 compatibility: add helper function for EXPLAIN diffs in scalar subquery output (#7757)
PG17 changed how scalar subquery outputs appear in EXPLAIN output (*). This commit changes impacted regress goldfiles to the PG17 format, and adds a helper function to convert pre-PG17 plans to the PG17 format. The conversion is required when testing Citus on PG versions prior to 17. The helper function can and should be removed when 17 becomes the minimum supported version. (*) https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=fd0398fcb |
|
|
|
81bda6fb8e |
PG17 compatibility: add/fix tests with correlated subqueries that can be pulled to a join (#7745)
Fix Test Failure in subquery_in_where, set_operations, dml_recursive in PG17 #7741 The test failures are caused by [this commit in PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639), which enables correlated subqueries to be pulled up to a join. Prior to this, the correlated subquery was implemented as a subplan. In Citus, it is not possible to push down a correlated subplan, but with a different plan in PG17 the query can be executed, per the test diff from `subquery_in_where`: ``` 37,39c37,41 < DEBUG: generating subplan XXX_1 for CTE event_id: SELECT user_id AS events_user_id, "time" AS events_time, event_type FROM public.events_table < DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count(*) AS count FROM ... < ERROR: correlated subqueries are not supported when the FROM clause contains a CTE or subquery --- > count > --------------------------------------------------------------------- > 0 > (1 row) > ``` This is because with pg17 `= ANY subquery` in the queries can be implemented as a join, instead of as a subplan filter on a table scan. 
For example, `SELECT * FROM test a WHERE x IN (SELECT x FROM test b UNION SELECT y FROM test c WHERE a.x = c.x) ORDER BY 1,2` (from set_operations) has this plan in pg17; note that the subquery is the inner side of a nested loop join: ``` ┌───────────────────────────────────────────────────┐ │ QUERY PLAN │ ├───────────────────────────────────────────────────┤ │ Sort │ │ Sort Key: a.x, a.y │ │ -> Nested Loop │ │ -> Seq Scan on test a │ │ -> Subquery Scan on "ANY_subquery" │ │ Filter: (a.x = "ANY_subquery".x) │ │ -> HashAggregate │ │ Group Key: b.x │ │ -> Append │ │ -> Seq Scan on test b │ │ -> Seq Scan on test c │ │ Filter: (a.x = x) │ └───────────────────────────────────────────────────┘ ``` and this plan in pg16 (and previous pg versions); the subquery is a correlated subplan filter on a table scan: ``` ┌───────────────────────────────────────────────┐ │ QUERY PLAN │ ├───────────────────────────────────────────────┤ │ Sort │ │ Sort Key: a.x, a.y │ │ -> Seq Scan on test a │ │ Filter: (SubPlan 1) │ │ SubPlan 1 │ │ -> HashAggregate │ │ Group Key: b.x │ │ -> Append │ │ -> Seq Scan on test b │ │ -> Seq Scan on test c │ │ Filter: (a.x = x) │ └───────────────────────────────────────────────┘ ``` The fix modifies the queries causing the test failures so that an ANY subquery is not folded to a join, preserving the expected output of the tests. A similar approach was taken for existing regress tests in the [postgres commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639). See the `join` regress test, for example. We also add pg17 specific tests that leverage this improvement in Postgres with Citus distributed planning as well. |
|
|
|
9dcd812a40 |
PG17 compatibility: Preserve DEBUG output in cte_inline (#7755)
Regression test cte_inline has the following diff:
```
DEBUG: CTE cte_1 is going to be inlined via distributed planning
DEBUG: CTE cte_1 is going to be inlined via distributed planning
DEBUG: Creating router plan
-DEBUG: query has a single distribution column value: 1
```
DEBUG message `query has a single distribution column value` does not
appear with PG17. This is because PG17 can recognize when a Result node
does not need to have an input node, so the predicate on the
distribution column is not present in the query plan. Comparing the
query plan obtained before PG17:
```
│ Result │
│ One-Time Filter: false │
│ -> GroupAggregate │
│ -> Seq Scan on public.test_table │
│ Filter: (test_table.key = 1) │
```
with the PG17 query plan:
```
┌──────────────────────────────────┐
│ QUERY PLAN │
├──────────────────────────────────┤
│ Result │
│ One-Time Filter: false │
└──────────────────────────────────┘
```
we see that the Result node in the PG16 plan has an Aggregate node, but
the Result node in the PG17 plan does not have any input node; PG17
recognizes it is not needed given a Filter that evaluates to False at
compile-time. The Result node is present in both plans because PG in
both versions can recognize when a combination of predicates equate to
false at compile time; this is because the successive predicates in
the test query (key=6, key=5, key=4, etc) become contradictory when the
CTEs are inlined. Here is an example query showing the effect of the CTE
inlining:
```
select count(*), key FROM test_table WHERE key = 1 AND key = 2 GROUP BY key;
```
In this case, the WHERE clause obviously evaluates to False. The PG16
query plan for this query is:
```
┌────────────────────────────────────┐
│ QUERY PLAN │
├────────────────────────────────────┤
│ GroupAggregate │
│ -> Result │
│ One-Time Filter: false │
│ -> Seq Scan on test_table │
│ Filter: (key = 1) │
└────────────────────────────────────┘
```
The PG17 query plan is:
```
┌────────────────────────────────┐
│ QUERY PLAN │
├────────────────────────────────┤
│ GroupAggregate │
│ -> Result │
│ One-Time Filter: false │
└────────────────────────────────┘
```
In both plans the PG optimizer is able to derive the predicate 1=2 from
the equivalence class { key, 1, 2 } and then constant fold this to
False. But, in the PG16 plan the Result node has an input node (a
sequential scan on test_table), while in the PG17 plan the Result node
does not have any input. This is because PG17 recognizes that when the
Result filter resolves to False at compile time it is not necessary to
set an input on the Result. I think this is a consequence of this PG17
commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b262ad440
which handles redundant IS [NOT] NULL predicates, but also refactored
evaluating of predicates to true/false at compile-time, enabling
optimizations such as those seen here.
Given the reason for the diff, the fix preserves the test output by
modifying the query so the predicates are not contradictory when the
CTEs are inlined.
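The contradiction that the optimizer detects can be illustrated with a toy Python model. This is only a sketch of the idea that one column cannot equal two different constants; it is not how the Postgres planner represents equivalence classes, and the function name `has_contradictory_equalities` is hypothetical.

```python
def has_contradictory_equalities(predicates):
    """Return True if some column is required to equal two different constants.

    `predicates` is a list of (column, constant) pairs, each standing for a
    WHERE condition of the form `column = constant`, all combined with AND.
    """
    required = {}
    for column, constant in predicates:
        if column in required and required[column] != constant:
            # e.g. key = 1 AND key = 2 implies 1 = 2, which is false.
            return True
        required.setdefault(column, constant)
    return False


# WHERE key = 1 AND key = 2, as in the example query above.
always_false = has_contradictory_equalities([("key", 1), ("key", 2)])
```

When the predicates are not contradictory, no such one-time false filter applies, which is why the fix rewrites the test query that way.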
|
|
|
|
51c2e63c30 |
PG17 compatibility: add COLLPROVIDER_BUILTIN option and fix tests (#7752)
PG17 adds a builtin C.UTF-8 locale option; we add it in the code to avoid "unknown collation provider" errors in vanilla tests. Relevant PG commit: |
|
|
|
c8d9a1bd10 |
PG17 compatibility: Fix -1/Null diff in attstattarget test output (#7749)
Changed `attstattarget` in `pg_attribute` to use `NullableDatum`,
allowing null representation for default statistics target in PostgreSQL
17.
Relevant PG commit:
6a004f1be87d34cfe51acf2fe2552d2b08a79273
|
|
|
|
7e8bff034f |
PG17 compatibility: Fix -1/Null diff in stxstattarget test output (#7748)
Changed stxstattarget in pg_statistic_ext to use nullable
representation, removing explicit -1 for default statistics target in
PostgreSQL 17.
Relevant PG commit:
012460ee93c304fbc7220e5b55d9d0577fc766ab
|
|
|
|
26ad52713c
|
Check for Citus table in worker_copy_table_to_node (#7662)
Fixes #6795 The `worker_copy_table_to_node` function is not supposed to be called for Citus tables. When this function was initially introduced in #6098, it had the respective check. But the check was omitted, since `worker_copy_table_to_node` called for a Citus table finishes with an error anyway: ``` ERROR: cannot execute a distributed query from a query on a shard DETAIL: Executing a distributed query in a function call that may be pushed to a remote node can lead to incorrect results. ``` It turns out that in some cases this error does not occur. See #6795 I suggest restoring that check. Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com> |
|
|
|
117bd1d04f
|
Disable nonmaindb interface (#7905)
DESCRIPTION: The PR disables the non-main db related features. The non-main db related features were introduced in https://github.com/citusdata/citus/pull/7203. |
|
|
|
711aec80fa
|
Fix system_queries test to actually test the problem (#7613)
The test added in #7604 doesn't reach the `HasRangeTableRef` function and thus doesn't test what it should. Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com> |
|
|
|
2d8be01853 |
Disable 2PC recovery while executing ALTER EXTENSION cmd during Citus upgrade tests
(cherry picked from commit
|
|
|
|
b6e3f39583 |
Fix flaky citus upgrade test
(cherry picked from commit
|