citus

Commit Graph

Author	SHA1	Message	Date
Onur Tirtir	3d61c4dc71	Add citus_stat_counters view and citus_stat_counters_reset() function to reset it (#7917 ) DESCRIPTION: Adds citus_stat_counters view that can be used to query stat counters that Citus collects while the feature is enabled, which is controlled by citus.enable_stat_counters. citus_stat_counters() can be used to query the stat counters for the provided database oid and citus_stat_counters_reset() can be used to reset them for the provided database oid or for the current database if nothing or 0 is provided. Today we don't persist stat counters on server shutdown. In other words, stat counters are automatically reset in case of a server restart. Details on the underlying design can be found in header comment of stat_counters.c and in the technical readme. ------- Here are the details about what we track as of this PR: For connection management, we have three statistics about the inter-node connections initiated by the node itself: * connection_establishment_succeeded * connection_establishment_failed * connection_reused While the first two are relatively easier to understand, the third one covers the case where a connection is reused. This can happen when a connection was already established to the desired node, Citus decided to cache it for some time (see citus.max_cached_conns_per_worker & citus.max_cached_connection_lifetime), and then reused it for a new remote operation. Here are the other important details about these connection statistics: 1. connection_establishment_failed doesn't care about the connections that we could establish but are lost later in the transaction. Plus, we cannot guarantee that the connections that are counted in connection_establishment_succeeded were not lost later. 2. 
connection_establishment_failed doesn't care about the optional connections (see OPTIONAL_CONNECTION flag) that we gave up establishing because of the connection throttling rules we follow (see citus.max_shared_pool_size & citus.local_shared_pool_size). The reason for this is that we didn't even try to establish these connections. 3. For the rest of the cases where a connection failed for some reason, we always increment connection_establishment_failed even if the caller was okay with the failure and knows how to recover from it (e.g., the adaptive executor knows how to fall back to local execution when the target node is the local node and if it cannot establish a connection to the local node). The reason is that even if it's likely that we can still serve the operation, we still failed to establish the connection and we want to track this. 4. Finally, the connection failures that we count in connection_establishment_failed might be caused by any of the following reasons and for now we prefer to _not_ further distinguish them for simplicity: a. remote node is down or cannot accept any more connections, or overloaded such that citus.node_connection_timeout is not enough to establish a connection b. any internal Citus error that might result in preparing a bad connection string so that libpq fails when parsing the connection string even before actually trying to establish a connection via connect() call c. broken citus.node_conninfo or such Citus configuration that was incorrectly set by the user can also result in similar outcomes as in b d. internal waitevent set / poll errors or OOM in local node We also track two more statistics for query execution: * query_execution_single_shard * query_execution_multi_shard And more importantly, both query_execution_single_shard and query_execution_multi_shard are not only tracked for the top-level queries but also for the subplans etc. 
The reason is that for some queries, e.g., the ones that go through recursive planning, after Citus performs the heavy work as part of subplans, the work that needs to be done for the top-level query becomes quite straightforward. And for such query types, it would be deceiving if we only incremented the query stat counters for the top-level query. Similarly, for non-pushable INSERT .. SELECT and MERGE queries, we perform separate counter increments for the SELECT / source part of the query besides the final INSERT / MERGE query.	2025-04-28 12:23:52 +00:00
ThomasC02	37e23f44b4	Add Support for CASCADE/RESTRICT in REVOKE statements (#7958 ) Fixes #7105. DESCRIPTION: Fixes a bug that causes omitting CASCADE clause for the commands sent to workers for REVOKE commands on tables. --------- Co-authored-by: ThomasC02 <thomascantrell02@gmail.com> Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com> Co-authored-by: Tiago Silva <tiagos3373@gmail.com>	2025-04-26 01:13:41 +03:00
Karina	48d89c9c1b	Adjust max_prepared_transactions only when it is default (#7712 ) DESCRIPTION: Adjusts max_prepared_transactions only when it's set to default on PG >= 16 Fixes #7711. Change AdjustMaxPreparedTransactions to really check if max_prepared_transactions is explicitly set by user, and only adjust max_prepared_transactions when it is default. This fixes 021_twophase test failure with loaded Citus library after postgres/postgres@b39c5272. Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>	2025-04-24 11:11:49 +00:00
manaldush	0e6127c4f6	AddressSanitizer: stack-use-after-scope on distributed_planner:HasUnresolvedExternParamsWalker (#7948 ) The variable externParamPlaceholder is created on the stack, and its address is used for paramFetch. Postgres code returns the address of externParamPlaceholder to externParam; the code flow then leaves the variable's scope and dereferences the pointer to the now out-of-scope stack variable. Fixes https://github.com/citusdata/citus/issues/7941. --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2025-04-04 13:27:56 +00:00
manaldush	f084b79a4b	AddressSanitizer: stack-use-after-scope on address in CreateBackgroundJob (#7949 ) The variable jobTypeName is created on the stack and its value is used through a pointer in heap_form_tuple, so the stack variable is used after it goes out of scope. The issue was detected with AddressSanitizer. Fixes #7943.	2025-04-04 13:03:41 +00:00
Cédric Villemain	1dc60e38bb	Propagates GRANT/REVOKE rights on table columns (#7918 ) This commit adds support for GRANT/REVOKE on table columns. It extends propagated DDL according to this logic: https://github.com/citusdata/citus/tree/main/src/backend/distributed#ddl * Unchanged pre-existing behavior related to splitting ddl per relation during propagation. * Changed the way ACL are checked in some cases (see `EnsureTablePermissions()` and associated commits) * Rewrite `pg_get_table_grants` to include column grants as well * Add missing `pfree()` in `pg_get_table_grants()` Fixes https://github.com/citusdata/citus/issues/7287 Also check a box in https://github.com/citusdata/citus/issues/4812	2025-04-04 11:54:16 +03:00
Cédric Villemain	a7e686c106	Make sure to prevent INSERT INTO ... SELECT queries involving subfield or sublink (#7912 ) DESCRIPTION: Makes sure to prevent `INSERT INTO ... SELECT` queries involving subfield or sublink, to avoid crashes The following query was crashing the backend: ``` INSERT INTO field_indirection_test_1 ( int_col, ct1_col.int_1,ct1_col.int_2 ) SELECT 0, 1, 2; -- crash ``` En passant, added more tests with sublink in distributed_types and found another query with wrong behavior: ``` INSERT INTO domain_indirection_test (f1,f3.if1) SELECT 0, 1; ERROR: could not find a conversion path from type 23 to 17619 -- not the expected ERROR ``` Fixed them by using `strip_implicit_coercions()` on target entry expression before checking for the presence of a subscript or fieldstore, else we fail to find the existing ones and wrongly accept to execute unsafe query.	2025-03-27 09:39:43 +00:00
Naisila Puka	4b4fa22b64	Fix mis-deparsing of shard query in "output-table column" name conflict (#7932 ) DESCRIPTION: Fixes a bug in deparsing of shard query in case of "output-table column" name conflict If an `ORDER BY` item in `SELECT` is a bare identifier, the parser _first seeks it as an output column name_ of the `SELECT` (for SQL92 compatibility). However, ruleutils.c is expecting the SQL99 interpretation _where such a name is an input column name_. So it's possible to produce an incorrect display of a view in the (admittedly pretty ill-advised) case where some other column is renamed in the `SELECT` output list to match an `ORDER BY` column. The `DISTINCT ON` expressions are interpreted using the same rules as for `ORDER BY`. We had an issue reported that actually uses `DISTINCT ON`: #7684 Since Citus uses ruleutils deparsing logic to create the shard queries, it would not table-qualify the column names as needed. PG17 fixed this https://github.com/postgres/postgres/commit/a7eb633563c by table-qualifying such names in the dumped view text. Therefore, Citus doesn't reproduce the issue in PG17, since PG17 table-qualifies the column names when needed, and the produced shard queries are correct. This PR applies the PG17 patch to `ruleutils_15.c` and `ruleutils_16.c`. Even though we generally try to avoid modifying the ruleutils files, in this case we are applying a Postgres patch that `ruleutils_17.c` already has: `897d996b8f` Thanks @c2main for your discussion and idea in the issue. Fixes #7684	2025-03-19 14:21:30 +03:00
German Eichberger	1c09469dd2	Adds a method to determine if current node is primary (#7720 ) DESCRIPTION: Adds citus_is_primary_node() UDF to determine if the current node is a primary node in the cluster. --------- Co-authored-by: German Eichberger <geeichbe@microsoft.com> Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2025-03-18 15:12:42 +00:00
Onur Tirtir	680b870d45	Add STYLEGUIDE.md and update some other md files on best practices (#7347 )	2025-03-14 15:42:59 +00:00
naisila	52bf7a1d03	Fix ObjectClass declaration for PG17 since it was removed Relevant PG commit: `89e5ef7e21` 89e5ef7e21812916c9cf9fcf56e45f0f74034656 We had already provided a fix for this in the following commit `da2624cee8` However, this solution wasn't enough for the commits on main. Specifically, we had issues with the following commit: `1d55debb98` Problem: https://github.com/citusdata/citus/actions/runs/13806825532/attempts/1#summary-38619483894 This new solution is better anyway. We define exactly what was previously defined in PG<17.	2025-03-13 15:13:56 +03:00
naisila	1d0bdbd749	Bump Citus into 13.1devel	2025-03-13 15:13:56 +03:00
naisila	be75c0ec4c	Use datlocale in check_database_on_all_nodes function for PG17 This commit also has to do with renaming of daticulocale to datlocale Relevant PG commit: f696c0cd5f299f1b51e214efc55a22a782cc175d `f696c0cd5f` Keeping this commit separate from the previous one because these changes will be different once we drop PG15 support. For now I renamed pg_ge_15_options to pg_ge_15_17_options and together with it I changed the meaning of the variable. However when we drop PG14 support, we will use pg_ge_17_options and delete pg_ge_15_options altogether	2025-03-13 15:13:56 +03:00
naisila	caceb35eba	Some cleanup from dropping pg14	2025-03-13 15:13:56 +03:00
naisila	08913e27d7	PG17 renamed Anum_pg_database_daticulocale to Anum_pg_database_datlocale	2025-03-13 15:13:56 +03:00
naisila	17b4122e84	Rename some more foreach_ptr to foreach_declared_ptr	2025-03-13 15:13:56 +03:00
naisila	c02d899b6c	Change StaticAssertStmt for node-wide objects to pg17	2025-03-13 15:13:56 +03:00
Cédric Villemain	ed40a0ad02	fix issue #7676 : wrong handler around MULTIEXPR (#7914 ) DESCRIPTION: Fixes a bug with `UPDATE SET (...) = (SELECT some_func(),... )` (#7676) Citus was checking for presence of sublink, but forgot to manage multiexpr while evaluating clauses during planning. At this stage (citus planner), it's not always possible to call PostgreSQL code because the tree is not yet ready for PostgreSQL pure executor. Fixes https://github.com/citusdata/citus/issues/7676. Fixed by adding a new function to check sublink or multiexpr in the tree. --------- Co-authored-by: Colm <colmmchugh@microsoft.com>	2025-03-12 16:03:30 +03:00
Mehmet YILMAZ	e50563fbd8	Issue 7887 Enhance AddInsertSelectCasts for Identity Columns (#7920 ) ## Enhance `AddInsertSelectCasts` for Identity Columns This PR fixes #7887 and improves the behavior of partial inserts into identity columns by modifying the `AddInsertSelectCasts` function. Specifically, we introduce special-case handling for `nextval(...)` calls (represented in the parse tree as `NextValueExpr`) to ensure that if the identity column’s declared type differs from `nextval`’s default return type (`int8`), we cast the expression properly. This prevents mismatches like `int8` → `int4` from causing “invalid string enlargement” errors or other type-related failures. When `INSERT ... SELECT` is processed, `AddInsertSelectCasts` reconciles each target column’s type with the corresponding SELECT expression’s type. Historically, for identity columns that rely on `nextval(...)`, we can end up with a mismatch: - `nextval` returns `int8`, - The identity column might be `int4`, `bigint`, or another integer type. Without a correct cast, Postgres or Citus can produce plan-time or runtime errors. By detecting `NextValueExpr` and applying a cast to the column’s type, the final plan ensures consistent insertion without errors. ## What Changed 1. Check for `NextValueExpr`: In `AddInsertSelectCasts`, we now have a code block: ```c if (IsA(selectEntry->expr, NextValueExpr)) { Oid nextvalType = GetNextvalReturnTypeCatalog(); ... // If (targetType != nextvalType), build a cast from int8 -> targetType } else { // fallback to generic mismatch logic } ``` This short-circuits any expression that’s a `nextval(...)` call, letting us explicitly cast to the correct type. 2. Fallback Generic Logic: If it isn’t a `NextValueExpr` (i.e. a normal column or expression mismatch), we still rely on the existing path that compares `sourceType` vs. `targetType` and calls `CastExpr(...)` if they differ. 3. 
`GetNextvalReturnTypeCatalog`: We added or refined a helper function to confirm that `nextval` returns `int8`, or do a `LookupFuncName("nextval", ...)` to discover the function’s return type from `pg_proc`—making it robust if future changes happen. ## Benefits - Partial inserts into identity columns no longer fail with type mismatches. - When `nextval` yields `int8` but the identity column is `int4` (or another type), we properly cast to the column’s type in the plan. - Preserves the existing approach for other columns—only identity calls get the specialized `NextValueExpr` logic. ## Testing - Extended `generatedidentity.sql` test scenario to cover partial inserts into both `GENERATED ALWAYS` and `GENERATED BY DEFAULT` identity columns, including tests for the `OVERRIDING SYSTEM VALUE` clause and partial inserts referencing foreign-key columns.	2025-03-12 12:43:01 +03:00
Muhammad Usama	95da74c47f	Fix Deadlock with transaction recovery is possible during Citus upgrades (#7910 ) DESCRIPTION: Fixes deadlock with transaction recovery that is possible during Citus upgrades. Fixes #7875. This commit addresses two interrelated deadlock issues uncovered during Citus upgrades: 1. Local Deadlock: - Problem: In `RecoverWorkerTransactions()`, a new connection is created for each worker node to perform transaction recovery by locking the `pg_dist_transaction` catalog table until the end of the transaction. When `RecoverTwoPhaseCommits()` calls this function for each worker node, the order of acquiring locks on `pg_dist_authinfo` and `pg_dist_transaction` can alternate. This reversal can lead to a deadlock if any concurrent process requires locks on these tables. - Fix: Pre-establish all worker node connections upfront so that `RecoverWorkerTransactions()` operates with a single, consistent connection. This ensures that locks on `pg_dist_authinfo` and `pg_dist_transaction` are always acquired in the correct order, thereby preventing the local deadlock. 2. Distributed Deadlock: - Problem: After resolving the local deadlock, a distributed deadlock issue emerges. The maintenance daemon calls `RecoverWorkerTransactions()` on each worker node— including the local node—which leads to a complex locking sequence: - A RowExclusiveLock is taken on the `pg_dist_transaction` table in `RecoverWorkerTransactions()`. - An update extension then tries to acquire an AccessExclusiveLock on the same table, getting blocked by the RowExclusiveLock. - A subsequent query (e.g., a SELECT on `pg_prepared_xacts`) issued using a separate connection on the local node gets blocked due to locks held during a call to `BuildCitusTableCacheEntry()`. - The maintenance daemon waits for this query, resulting in a circular wait and stalling the entire cluster. 
- Fix: Avoid cache lookups for internal PostgreSQL tables by implementing an early bailout for relation IDs below `FirstNormalObjectId` (system objects). This eliminates unnecessary calls to `BuildCitusTableCache`, reducing lock contention and mitigating the distributed deadlock. Furthermore, this optimization improves performance in fast connect→query_catalog→disconnect cycles by eliminating redundant cache creation and lookups. 3. Also reverts the commit that disabled the relevant test cases.	2025-03-12 12:43:01 +03:00
Colm	4139370a1d	#7782 - catch when Postgres planning removes all Citus tables (#7907 ) DESCRIPTION: fix a planning error caused by a redundant WHERE clause Fix a Citus planning glitch that occurs in a DML query when the WHERE clause of the query is of the form: ` WHERE true OR <expression with 1 or more citus tables> ` and this is the only place in the query referencing a citus table. Postgres' standard planner transforms the WHERE clause to: ` WHERE true ` So the query now has no citus tables, confusing the Citus planner as described in issues #7782 and #7783. The fix is to check, after Postgres standard planner, if the Query has been transformed as shown, and re-run the check of whether or not the query needs distributed planning.	2025-03-12 12:43:01 +03:00
Mehmet YILMAZ	87ec3def55	Fix 0-Task Plans in Single-Shard Router When Updating a Local Table with Reference Table in Subquery (#7897 ) This PR fixes issue #7891 in the Citus planner where an `UPDATE` on a local table with a subquery referencing a reference table could produce a 0-task plan. Historically, the planner sometimes failed to detect that both the target and referenced tables were effectively “local,” assigning `INVALID_SHARD_ID` and yielding a no-op plan. ### Root Cause - In the Citus router logic (`PlanRouterQuery`), we relied on `shardId` to determine whether a query should be routed to a single shard. - If `shardId == INVALID_SHARD_ID`, but we also had not marked the query as a “local table modification,” the code path would produce zero tasks. - Local + reference tables do not require multi-shard routing. Failing to detect this “purely local” scenario caused Citus to incorrectly route to zero tasks. ### Changes Enhanced Local Table Detection - Updated `IsLocalTableModification` and related checks to consider both local and reference tables as “local” for planning, preventing the 0-task scenario. - Expanded `ContainsOnlyLocalOrReferenceTables` to return true if there are no fully distributed tables in the query. Added Regress Test - Introduced a new regress test (`issue_7891.sql`) which reproduces the scenario. - Verifies we get a valid single- or local-task plan rather than a 0-task plan.	2025-03-12 12:43:01 +03:00
Colm	ec141f696a	Enhance MERGE .. WHEN NOT MATCHED BY SOURCE for repartitioned source (#7900 ) DESCRIPTION: Ensure that a MERGE command on a distributed table with a `WHEN NOT MATCHED BY SOURCE` clause runs against all shards of the distributed table. The Postgres MERGE command updates a table using a table or a query as a data source. It provides three ways to match the target table with the source: `WHEN MATCHED` means that there is a row in both the target and source; `WHEN NOT MATCHED` means that there is a row in the source that has no match (is not present) in the target; and, as of PG17, `WHEN NOT MATCHED BY SOURCE` means that there is a row in the target that has no match in the source. In Citus, when a MERGE command updates a distributed table using a local/reference table or a distributed query as source, that source is repartitioned, and for each repartitioned shard that has data (i.e. 1 or more rows) the MERGE is run against the corresponding distributed table shard. Suppose the distributed table has 32 shards, and the source repartitions into 4 shards that have data, with the remaining 28 shards being empty; then the MERGE command is performed on the 4 corresponding shards of the distributed table. However, the semantics of `WHEN NOT MATCHED BY SOURCE` are that the specified action must be performed on the target for each row in the target that is not in the source; so if the source is empty, all target rows should be updated. To see this, consider the following MERGE command: ``` MERGE INTO target AS t USING source AS s ON t.id = s.id WHEN NOT MATCHED BY SOURCE THEN UPDATE SET col1 = 100 ``` If the source has zero rows then every row in the target is updated so that its col1 value is 100. Currently in Citus a MERGE on a distributed table with a local/reference table or a distributed query as source ignores shards of the distributed table when the corresponding shard of the repartitioned source has zero rows. 
However, if the MERGE command specifies a `WHEN NOT MATCHED BY SOURCE` clause, then the MERGE should be performed on all shards of the distributed table, to ensure that the specified action is performed on the target for each row in the target that is not in the source. This PR enhances Citus MERGE execution so that when a repartitioned source shard has zero rows, and the MERGE command specifies a `WHEN NOT MATCHED BY SOURCE` clause, the MERGE is performed against the corresponding shard of the distributed table using an empty (zero row) relation as source, by generating a query of the form: ``` MERGE INTO target_shard_0002 AS t USING (SELECT id FROM (VALUES (NULL) ) source_0002(id) WHERE FALSE) AS s ON t.id = s.id WHEN NOT MATCHED BY SOURCE THEN UPDATE SET col1 = 100 ``` This works because each row in the target shard will be updated, and `WHEN MATCHED` and `WHEN NOT MATCHED`, if specified, will be no-ops because the source has zero rows. Implementing this when the source is a local or reference table involves teaching function `ExecuteSourceAtCoordAndRedistribution()` in `merge_executor.c` to not prune tasks when the query has `WHEN NOT MATCHED BY SOURCE` but to instead replace the task's query with one that uses an empty relation as source. And when the source is a distributed query, function `ExecuteMergeSourcePlanIntoColocatedIntermediateResults()` (also in `merge_executor.c`), instead of skipping empty tasks, now generates a query that uses an empty relation as source for the corresponding target shard of the distributed table, but again only when the query has `WHEN NOT MATCHED BY SOURCE`. A new function `BuildEmptyResultQuery()` is added to `recursive_planning.c` and it is used by both the aforementioned functions in `merge_executor.c` to build an empty relation to use as the source. It applies the appropriate type to each column of the empty relation so the join with the target makes sense to the query compiler.	2025-03-12 12:43:01 +03:00
OlgaSergeyevaB	ccd7ddee36	Custom Scan (ColumnarScan): exclude outer_join_rels from CandidateRelids (#7703 ) DESCRIPTION: Fixes a crash in columnar custom scan that happens when a columnar table is used in a join. Fixes issue #7647. Co-authored-by: Ольга Сергеева <ob-sergeeva@it-serv.ru>	2025-03-12 12:43:01 +03:00
Colm	89674d9630	[Bug Fix] SEGV on query with Left Outer Join (#7787 ) (#7901 ) DESCRIPTION: Fixes a crash in left outer joins that can happen when there is an aggregate on a column from the inner side of the join. Fix the SEGV seen in #7787 and #7899; it occurs because a column in the targetlist of a worker subquery can contain a non-empty varnullingrels field if the column is from the inner side of a left outer join. The issue can also occur with the columns in the HAVING clause, and this is also tested in the fix. The issue was triggered by the introduction of the varnullingrels to Vars in Postgres 16 (2489d76c) There is a related issue, #7705, where a non-empty varnullingrels was incorrectly copied into the query tree for the combine query. Here, a non-empty varnullingrels field of a var is incorrectly copied into the query tree for a worker subquery. The regress file from #7705 is used (and renamed) to also test this (#7787). An alternative test output file is required for Postgres 15 because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).	2025-03-12 12:43:01 +03:00
Naisila Puka	2b5dfbbd08	Bump Citus version to 13.0.1 (#7872 )	2025-03-12 12:43:01 +03:00
Onur Tirtir	7004295065	Revert "Release RowExclusiveLock on pg_dist_transaction as soon as remote xacts are recovered" This reverts commit `684b4c6b96`.	2025-03-12 12:43:01 +03:00
Naisila Puka	3b1c082791	Drops PG14 support (#7753 ) DESCRIPTION: Drops PG14 support 1. Remove "$version_num" != 'xx' from configure file 2. delete all PG_VERSION_NUM = PG_VERSION_XX references in the code 3. Look at pg_version_compat.h file, remove all _compat functions etc defined specifically for PGXX differences 4. delete all PG_VERSION_NUM >= PG_VERSION_(XX+1), PG_VERSION_NUM < PG_VERSION_(XX+1) ifs in the codebase 5. delete ruleutils_xx.c file 6. cleanup normalize.sed file from pg14 specific lines 7. delete all alternative output files for that particular PG version, server_version_ge variable helps here	2025-03-12 12:43:01 +03:00
Onur Tirtir	d5618b6b4c	Release RowExclusiveLock on pg_dist_transaction as soon as remote xacts are recovered As of this commit, after recovering the remote transactions, now we release the lock on pg_dist_transaction while closing it to avoid deadlocks that might occur because of trying to acquire a lock on pg_dist_authinfo while holding a lock on pg_dist_transaction. Such a scenario can only cause a deadlock if another transaction is trying to acquire a strong lock on pg_dist_transaction while holding a lock on pg_dist_authinfo. As of today, we (implicitly) acquire a strong lock on pg_dist_transaction only when upgrading Citus to 11.3-1 and this happens when creating a REPLICA IDENTITY on pg_dist_transaction. And regardless of the code-path we are in, it should be okay to release the lock there because all we do after that point is to abort the prepared transactions that are not part of an in-progress distributed transaction and releasing the lock before doing so should be just fine. This also changes the blocking behavior between citus_create_restore_point and the transaction recovery code-path in the sense that now citus_create_restore_point doesn't wait until transaction recovery completes aborting the prepared transactions that are not part of an in-progress distributed transaction. However, this should be fine because this was possible even before this change, e.g., if transaction recovery fails to open a remote connection to a node.	2025-03-12 12:43:01 +03:00
Naisila Puka	85739b34bf	Fix pg17 test (#7857 ) error merged in `ab7c3b7804`	2025-03-12 12:43:01 +03:00
Mehmet YILMAZ	1bb6c7e95f	PG17 Compatibility - Fix crash when pg_class is used in MERGE (#7853 ) This pull request addresses Issue #7846, where specific MERGE queries on non-distributed and distributed tables can result in crashes in certain scenarios. The issue stems from the usage of `pg_class` catalog table, and the `FilterShardsFromPgclass` function in Citus. This function goes through the query's jointree to hide the shards. However, in PG17, MERGE's join quals are in a separate structure called `mergeJoinCondition`. Therefore FilterShardsFromPgclass was not filtering correctly in a `MERGE` command that involves `pg_class`. To fix the issue, we handle `mergeJoinCondition` separately in PG17. Relevant PG commit: `0294df2f1f` Non-Distributed Tables: A MERGE query involving a non-distributed table using `pg_catalog.pg_class` as the source may execute successfully but needs testing to ensure stability. Distributed Tables: Performing a MERGE on a distributed table using `pg_catalog.pg_class` as the source raises an error: `ERROR: MERGE INTO a distributed table from Postgres table is not yet supported` However, in some cases, this can lead to a server crash if the unsupported operation is not properly handled. This is the test output from the same test conducted prior to the code changes being implemented. ``` -- Issue #7846: Test crash scenarios with MERGE on non-distributed and distributed tables -- Step 1: Connect to a worker node to verify shard visibility \c postgresql://postgres@localhost::worker_1_port/regression?application_name=psql SET search_path TO pg17; -- Step 2: Create and test a non-distributed table CREATE TABLE non_dist_table_12345 (id INTEGER); -- Test MERGE on the non-distributed table MERGE INTO non_dist_table_12345 AS target_0 USING pg_catalog.pg_class AS ref_0 ON target_0.id = ref_0.relpages WHEN NOT MATCHED THEN DO NOTHING; SSL SYSCALL error: EOF detected connection to server was lost ```	2025-03-12 12:43:01 +03:00
Colm	a18f8990be	Update tdigest_aggregate_support output for PG15+ (#7849 ) Regress test tdigest_aggregate_support has been failing since at least Citus 12.0, when the tdigest extension is installed in Postgres. This appears to be because of an omission by commit `03832f3` and a change in the implementation of Postgres random() function (pg commit [d4f109e4a](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d4f109e4a)). To reproduce the test diff: - Check out [tdigest](https://github.com/tvondra/tdigest) and run `make; make install` - In citus regress directory run `make check-multi` or `./citus_tests/run_test.py tdigest_aggregate_support` There are two parts to this commit: 1. Revert `Output: xxxxx` in EXPLAIN VERBOSE. Citus commit `fe4ac51` normalized EXPLAIN VERBOSE output because of a change between pg12 and pg13. When pg12 support was no longer required, the rule was removed from normalize.sed and `Output: xxxx` was reverted in the impacted regress output files (`03832f3`), but `tdigest_aggregate_support` was omitted. 2. Adjust the query results; the tdigest_aggregate_support test file has a comment _verifying results - should be stable due to seed while inserting the data, if failure due to data these queries could be removed or check for certain ranges_ but the result values in this commit are consistent across citus 12.0 (pg 15), citus 12.1 (pg 16) and citus 13.0 (pg 17), i.e., since Postgres changed its [implementation of random](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d4f109e4a), so proposing to go with these results.	2025-03-12 12:43:01 +03:00
Naisila Puka	0642a4dc08	Propagate MERGE ... WHEN NOT MATCHED BY SOURCE (#7807 ) DESCRIPTION: Propagates MERGE ... WHEN NOT MATCHED BY SOURCE It seems like there is not much needed to be done here. `get_merge_query_def` from `ruleutils_17` is updated with "WHEN NOT MATCHED BY SOURCE" therefore `deparse_shard_query` parses the merge query for execution on the shard correctly. Relevant PG commit: https://github.com/postgres/postgres/commit/0294df2f1	2025-03-12 12:43:00 +03:00
Naisila Puka	74d945f5ae	PG17 - Propagate EXPLAIN options: MEMORY and SERIALIZE (#7802 ) DESCRIPTION: Propagates MEMORY and SERIALIZE options of EXPLAIN The options for `MEMORY` can be true or false. Default is false. The options for `SERIALIZE` can be none, text or binary. Default is none. I referred to how we added support for WAL option in this PR [Support EXPLAIN(ANALYZE, WAL)](https://github.com/citusdata/citus/pull/4196). For the tests however, I used the same tests as Postgres, not like the tests in the WAL PR. I used exactly the same tests as Postgres does, I simply distributed the table beforehand. See below the relevant Postgres commits from where you can see the tests added as well: - [Add EXPLAIN (MEMORY)](https://github.com/postgres/postgres/commit/5de890e36) - [Invent SERIALIZE option for EXPLAIN.](https://github.com/postgres/postgres/commit/06286709e) This PR required a lot of copying of Postgres static functions regarding how `EXPLAIN` works for `MEMORY` and `SERIALIZE` options. Specifically, these copy-pastes were required for updating `ExplainWorkerPlan()` function, which is in fact based on postgres' `ExplainOnePlan()`: ```C /* copied from explain.c to update ExplainWorkerPlan() in citus according to ExplainOnePlan() in postgres */ #define BYTES_TO_KILOBYTES(b) typedef struct SerializeMetrics static bool peek_buffer_usage(ExplainState *es, const BufferUsage *usage); static void show_buffer_usage(ExplainState *es, const BufferUsage *usage); static void show_memory_counters(ExplainState *es, const MemoryContextCounters *mem_counters); static void ExplainIndentText(ExplainState *es); static void ExplainPrintSerialize(ExplainState *es, SerializeMetrics *metrics); static SerializeMetrics GetSerializationMetrics(DestReceiver *dest); ``` _Note_: it looks like we were missing some `buffers` option details as well. I put them together with the memory option, like the code in Postgres explain.c, as I didn't want to change the copied code. 
However, I tested locally and found no significant issue in previous Citus versions, and you can also see that existing Citus tests with `buffers true` didn't change. Therefore, I prefer not to backport "buffers" changes to previous versions.	2025-03-12 12:43:00 +03:00
Mehmet YILMAZ	7682d135a4	PG17 - Add Regression Test for REINDEX support in event triggers (#7819 ) This PR adds regression tests to verify REINDEX support with event triggers. The tests validate trigger execution, shard placement consistency, and distributed index rebuilding without disruption.	2025-03-12 12:43:00 +03:00
Mehmet YILMAZ	08d94f9eb6	PG17 - Add Regression Test for Access Method Behavior on Partitioned Tables (#7818 ) This PR adds a regression test to verify the behavior of access methods for partitioned and distributed tables, including: - Creating partitioned tables with heap. - Distributing tables using create_distributed_table. - Switching access methods to columnar with ALTER TABLE. - Validating access method inheritance for new partitions. Relevant PG17 commit: https://github.com/postgres/postgres/commit/374c7a229	2025-03-12 12:43:00 +03:00
Naisila Puka	8f436e4a48	Add tests with xmltext() and random(min, max) (#7824 ) xmltext() converts text into xml text nodes. Test with columnar and citus tables. Relevant PG17 commit: https://github.com/postgres/postgres/commit/526fe0d79 random(min, max) generates random numbers in a specified range Add tests like the ones for random() in aggregate_support.sql References: https://github.com/citusdata/citus/blob/main/src/test/regress/sql/aggregate_support.sql#L493-L532 https://github.com/citusdata/citus/pull/7183 Relevant PG17 commit: https://github.com/postgres/postgres/commit/e6341323a	2025-03-12 12:43:00 +03:00
Naisila Puka	8940665d17	Allow configuring sslnegotiation using citus.node_conninfo (#7821 ) Relevant PG commit: https://github.com/postgres/postgres/commit/d39a49c1e PR similar to https://github.com/citusdata/citus/pull/5203	2025-03-12 12:26:06 +03:00
Naisila Puka	1d57a36ecc	Add pg17 jsonpath methods tests (#7820 ) Various jsonpath methods were added in PG17. Relevant PG commit: https://github.com/postgres/postgres/commit/66ea94e8e Here we add the same tests as in pg15_jsonpath.sql for the new additions.	2025-03-12 12:26:06 +03:00
Naisila Puka	658632642a	Disallow infinite values for partition interval in create_time_partitions udf (#7822 ) PG17 added +/- infinity values for the interval data type Relevant PG commit: https://github.com/postgres/postgres/commit/519fc1bd9	2025-03-12 12:26:06 +03:00
Naisila Puka	3e96a19606	Adds JSON_TABLE() support, and SQL/JSON constructor/query functions tests (#7816 ) DESCRIPTION: Adds JSON_TABLE() support PG17 has added basic `JSON_TABLE()` functionality `JSON_TABLE()` allows `JSON` data to be converted into a relational view and thus used, for example, in a `FROM` clause, like other tabular data. We treat `JSON_TABLE` the same as correlated functions (e.g., recurring tuples). In the end, for multi-shard `JSON_TABLE` commands, we apply the same restrictions as reference tables (e.g., cannot perform a lateral outer join when a distributed subquery references a (reference table)/(json table) etc.) Relevant PG17 commits: [basic JSON table](https://github.com/postgres/postgres/commit/de3600452), [nested paths in json table](https://github.com/postgres/postgres/commit/bb766cde6) Onder had previously added json table support for PG15BETA1, but we reverted that commit because json table was reverted in PG15. `ce7f1a530f` Previous relevant PG15Beta1 commit: https://github.com/postgres/postgres/commit/4e34747c8 Therefore, I referred to Onder's commit for this commit as well, with a few changes due to some differences between PG15/PG17: 1) In PG15Beta1, we had also `PLAN` clauses for `JSON_TABLE` https://github.com/postgres/postgres/commit/fadb48b00, and Onder's commit includes tests for those as well. However, `PLAN` nodes are _not_ added in PG17. Therefore, I didn't include the `json_table_select_only` test, which had mostly queries involving `PLAN`. I only included the last query from that test. 2) In PG15 timeline (Citus 11.1), we didn't support outer joins where the outer rel is a recurring one and the inner one is a non-recurring one. However, [Onur added support for that one in Citus 11.2](https://github.com/citusdata/citus/pull/6512), therefore I updated the tests from Onder's commit accordingly. 
3) PG17 json table has nested paths and columns, therefore I added a test with a distributed table, which is exactly the same as the one in sqljson_jsontable in PG17. https://github.com/postgres/postgres/commit/bb766cde6 This pull request also adds some basic tests on validation of SQL/JSON constructor functions JSON(), JSON_SCALAR(), and JSON_SERIALIZE(), and also SQL/JSON query functions JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE(). The relevant PG commits are the following: [JSON(), JSON_SCALAR(), JSON_SERIALIZE()](https://github.com/postgres/postgres/commit/03734a7fe) [JSON_EXISTS(), JSON_VALUE(), JSON_QUERY()](https://github.com/postgres/postgres/commit/6185c9737)	2025-03-12 12:26:05 +03:00
Naisila Puka	2112aa1860	Add tests for inserting with AT LOCAL operator (#7815 ) PG17 has added support for the AT LOCAL operator, which converts the given time type to a timestamp using the session's TimeZone value as the time zone. Here we add tests that validate that we can use AT LOCAL in INSERT commands. Relevant PG commit: https://github.com/postgres/postgres/commit/97957fdba With the tests, we verify that we evaluate AT LOCAL at the coordinator and then perform the insert remotely.	2025-03-12 12:25:49 +03:00
Mehmet YILMAZ	1cf5c190aa	Error out for ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION (#7814 ) PG17 added support for ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION. Relevant PG commit: https://github.com/postgres/postgres/commit/5d06e99a3 We currently don't support propagating this command for Citus tables. It is added to future work. This PR disallows `ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION` on all Citus table types (local, distributed, and partitioned distributed) by adding an error check in `ErrorIfUnsupportedAlterTableStmt`. A new regression test verifies that each table type fails with a consistent error message when attempting to set an expression.	2025-03-12 12:25:49 +03:00
Mehmet YILMAZ	24585a8c04	Error out for ALTER TABLE ... SET ACCESS METHOD DEFAULT (#7803 ) PG17 introduced ALTER TABLE ... SET ACCESS METHOD DEFAULT This PR introduces and enforces an error check preventing ALTER TABLE ... SET ACCESS METHOD DEFAULT on both Citus local tables (added via citus_add_local_table_to_metadata) and distributed/partitioned distributed tables. The regression tests now demonstrate that each table type raises an error advising users to explicitly specify an access method, rather than relying on DEFAULT. This ensures consistent behavior across local and distributed environments in Citus. The reason why we currently don't support this is that we can't simply propagate the command as it is, because the default table access method may be different across Citus cluster nodes. Relevant PG commit: https://github.com/postgres/postgres/commit/d61a6cad6	2025-03-12 12:25:49 +03:00
Naisila Puka	b7d04038cb	Add tests for FORCE_NULL * and FORCE_NOT_NULL * options for COPY FROM (#7812 ) These options already existed before PG17, and we support them and have tests for them in `multi_copy.sql`. In PG17, their capability was extended to specify ALL columns at once using `*`. Citus performs the COPY correctly, as is validated by the added tests in this PR. Relevant PG commit: https://github.com/postgres/postgres/commit/f6d4c9cf1 Copy-pasting from Postgres documentation what these options do, such that the reviewer may better understand the tests added: `FORCE_NOT_NULL`: Do not match the specified columns' values against the null string. In the default case where the null string is empty, this means that empty values will be read as zero-length strings rather than nulls, even when they are not quoted. If `*` is specified, the option will be applied to all columns. This option is allowed only in `COPY FROM`, and only when using `CSV` format. `FORCE_NULL`: Match the specified columns' values against the null string, even if it has been quoted, and if a match is found set the value to `NULL`. In the default case where the null string is empty, this converts a quoted empty string into `NULL`. If `*` is specified, the option will be applied to all columns. This option is allowed only in `COPY FROM`, and only when using `CSV` format. `FORCE_NULL` and `FORCE_NOT_NULL` can be used simultaneously on the same column. This results in converting quoted null strings to null values and unquoted null strings to empty strings. Explain it to me like I'm a 5-year-old, for a text column: `FORCE_NULL` looks for empty strings and registers them as `NULL` `FORCE_NOT_NULL` looks for null values and registers them as empty strings.	2025-03-12 12:25:49 +03:00
Naisila Puka	5e9f8d838c	Error for COPY FROM ... on_error, log_verbosity with Citus tables (#7811 ) PG17 added the new ON_ERROR option for COPY FROM. When this option is specified, COPY skips soft errors and continues copying. Relevant PG commits: -- https://github.com/postgres/postgres/commit/9e2d87011 -- https://github.com/postgres/postgres/commit/b725b7eec I tried it locally with Citus tables. Without further implementation, it doesn't work correctly. Therefore, we error out for now, and add it to future work. PG17 also added log_verbosity option, which controls the amount of messages emitted during processing. This is currently used in COPY FROM when ON_ERROR option is set to ignore. Therefore, we error out for this option as well. Relevant PG17 commit: https://github.com/postgres/postgres/commit/f5a227895	2025-03-12 12:25:49 +03:00
Naisila Puka	202ad077bd	PG17: ALTER INDEX ALTER COLUMN SET STATISTICS DEFAULT (#7808 ) DESCRIPTION: Propagates ALTER INDEX ALTER COLUMN SET STATISTICS DEFAULT We automatically support this. Adding tests only. We currently don't support ALTER TABLE ALTER COLUMN SET STATISTICS Relevant PG commit: https://github.com/postgres/postgres/commit/4f622503d	2025-03-12 12:25:49 +03:00
Naisila Puka	a383ef6831	Adds PG17.1 support - Regression tests sanity (#7661 ) This is the final commit that adds PG17 compatibility with Citus's current capabilities. You can use Citus community, release-13.0 branch, with PG17.1. --------- Specifically, this commit: - Enables PG17 in the configure script. - Adds PG17 tests to CI using test images that have 17.1 - Fixes an upgrade test: see below for details In `citus_prepare_upgrade()`, don't drop any_value when upgrading from PG16+, because PG16+ has its own any_value function. Attempting to do so results in the error seen in [pg16-pg17 upgrade](https://github.com/citusdata/citus/actions/runs/11768444117/job/32778340003?pr=7661): ``` ERROR: cannot drop function any_value(anyelement) because it is required by the database system CONTEXT: SQL statement "DROP AGGREGATE IF EXISTS pg_catalog.any_value(anyelement)" ``` When 16 becomes the minimum supported Postgres version, the drop statements can be removed. --------- Several PG17 Compatibility commits have been merged before this final one. 
All these subtasks are done https://github.com/citusdata/citus/issues/7653 See the list below: Compilation PR: https://github.com/citusdata/citus/pull/7699 Ruleutils PR: https://github.com/citusdata/citus/pull/7725 Sister PR for tests: https://github.com/citusdata/the-process/pull/159 Helpful smaller PRs: - https://github.com/citusdata/citus/pull/7714 - https://github.com/citusdata/citus/pull/7726 - https://github.com/citusdata/citus/pull/7731 - https://github.com/citusdata/citus/pull/7732 - https://github.com/citusdata/citus/pull/7733 - https://github.com/citusdata/citus/pull/7738 - https://github.com/citusdata/citus/pull/7745 - https://github.com/citusdata/citus/pull/7747 - https://github.com/citusdata/citus/pull/7748 - https://github.com/citusdata/citus/pull/7749 - https://github.com/citusdata/citus/pull/7752 - https://github.com/citusdata/citus/pull/7755 - https://github.com/citusdata/citus/pull/7757 - https://github.com/citusdata/citus/pull/7759 - https://github.com/citusdata/citus/pull/7760 - https://github.com/citusdata/citus/pull/7761 - https://github.com/citusdata/citus/pull/7762 - https://github.com/citusdata/citus/pull/7765 - https://github.com/citusdata/citus/pull/7766 - https://github.com/citusdata/citus/pull/7768 - https://github.com/citusdata/citus/pull/7769 - https://github.com/citusdata/citus/pull/7771 - https://github.com/citusdata/citus/pull/7774 - https://github.com/citusdata/citus/pull/7776 - https://github.com/citusdata/citus/pull/7780 - https://github.com/citusdata/citus/pull/7781 - https://github.com/citusdata/citus/pull/7785 - https://github.com/citusdata/citus/pull/7788 - https://github.com/citusdata/citus/pull/7793 - https://github.com/citusdata/citus/pull/7796 --------- Co-authored-by: Colm <colmmchugh@microsoft.com>	2025-03-12 12:25:49 +03:00
Naisila Puka	28b0b0e7a8	Bump Citus version into 13.0.0 (#7792 ) We are using `release-13.0` branch for both development and release, to deliver PG17 support in Citus. Afterwards, we will (probably) merge this branch into main. Some potential changes for main branch, after we are done working on release-13.0: - Merge changes from `release-13.0` to `main` - Figure out what changes were there on 12.2, move them to 13.1 version. In a nutshell: rename `12.1--12.2` to `13.0--13.1` and fix issues. - Set version to 13.1devel	2025-03-12 12:25:49 +03:00
Mehmet YILMAZ	80c6479408	PG17 compatibility: Fix Test Failure in multi_alter_table_add_const (#7733 ) In earlier versions of PostgreSQL, exclusion constraints were not allowed on partitioned tables. This is why the error in your regression test (ERROR: exclusion constraints are not supported on partitioned tables) was raised in PostgreSQL 16. In PostgreSQL 17, exclusion constraints are now allowed on partitioned tables, which is why the error no longer appears when you attempt to add an exclusion constraint. The constraint exclusion mechanism, described in the documentation, relies on CHECK constraints to decide which partitions or child tables need to be queried. [CHECK constraints](https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION) ```diff -- Check "ADD EXCLUDE" errors out for partitioned table since the postgres does not allow it ALTER TABLE AT_AddConstNoName.citus_local_partitioned_table ADD EXCLUDE(partition_col WITH =); -ERROR: exclusion constraints are not supported on partitioned tables -- Check "ADD CHECK" SET client_min_messages TO DEBUG1; ALTER TABLE AT_AddConstNoName.citus_local_partitioned_table ADD CHECK (dist_col > 0); DEBUG: the constraint name on the shards of the partition is too long, switching to sequential and local execution mode to prevent self deadlocks: longlonglonglonglonglonglonglonglonglonglonglo_537570f5_5_check DEBUG: verifying table "longlonglonglonglonglonglonglonglonglonglonglonglonglonglongabc" DEBUG: verifying table "p1" RESET client_min_messages; SELECT con.conname FROM pg_catalog.pg_constraint con INNER JOIN pg_catalog.pg_class rel ON rel.oid = con.conrelid INNER JOIN pg_catalog.pg_namespace nsp ON nsp.oid = connamespace WHERE rel.relname = 'citus_local_partitioned_table'; conname -------------------------------------------------- + citus_local_partitioned_table_partition_col_excl citus_local_partitioned_table_check -(1 row) +(2 rows) ```	2025-03-12 12:25:49 +03:00


4669 Commits (3d61c4dc71ef5f2377139662694746ace06216b2)