6876 Commits (v13.0.5)

Author SHA1 Message Date
naisila 2433c980e1 Bump version to 13.0.5 2025-10-02 12:57:32 +03:00
naisila 88ad8de24a Add changelog for 13.0.5 2025-10-02 12:57:32 +03:00
Onur Tirtir f87bdc0678 Properly detect no-op shard-key updates via UPDATE / MERGE (#8214)
DESCRIPTION: Fixes a bug that allowed UPDATE / MERGE queries
that may change the distribution column value.

Fixes: #8087.

Probably as of #769, we were not properly checking if UPDATE
may change the distribution column.

In #769, we had these checks:
```c
	if (targetEntry->resno != column->varattno)
	{
		/* target entry of the form SET some_other_col = <x> */
		isColumnValueChanged = false;
	}
	else if (IsA(setExpr, Var))
	{
		Var *newValue = (Var *) setExpr;
		if (newValue->varattno == column->varattno)
		{
			/* target entry of the form SET col = table.col */
			isColumnValueChanged = false;
		}
	}
```

However, what we check in "if" and in the "else if" are not so
different in the sense they both attempt to verify if SET expr
of the target entry points to the attno of given column. So, in
Also see this PR comment from #5220:
https://github.com/citusdata/citus/pull/5220#discussion_r699230597.
In #769, probably we actually wanted to first check whether both
SET expr of the target entry and given variable are pointing to the
same range var entry, but this wasn't what the "if" was checking,
so removed.

As a result, in the cases that are mentioned in the linked issue,
we were incorrectly concluding that the SET expr of the target
entry won't change given column just because it's pointing to the
same attno as given variable, regardless of what range var entries
the column and the SET expr are pointing to. Then we also started
using the same function to check for such cases for update action
of MERGE, so we have the same bug there as well.

So with this PR, we properly check for such cases by comparing
varno as well in TargetEntryChangesValue(). However, then some of
the existing tests started failing where the SET expr doesn't
directly assign the column to itself but the "where" clause could
actually imply that the distribution column won't change. Even before,
we were not attempting to verify if "where" clause quals could imply a
no-op assignment for the SET expr in such cases, but that was not a
problem. This is because, in most cases, we were always qualifying
such SET expressions as a no-op update as long as the SET expr's
attno is the same as given column's. For this reason, to prevent
regressions, this PR also adds some extra logic as well to understand
if the "where" clause quals could imply that SET expr for the
distribution key is a no-op.
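
A minimal sketch of the varno-aware comparison described above. It uses a simplified stand-in for PostgreSQL's `Var` node and is not the actual body of TargetEntryChangesValue(); `SimpleVar` and `SetExprIsSameColumn` are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-in for PostgreSQL's Var: which range table entry
 * (varno) and which attribute of it (varattno) the node refers to.
 */
typedef struct SimpleVar
{
	int varno;
	int varattno;
} SimpleVar;

/*
 * Hypothetical helper: returns true only when the SET expression refers
 * to the same column of the same range table entry, i.e. when the
 * assignment has the form SET col = table.col.
 */
bool
SetExprIsSameColumn(const SimpleVar *setExpr, const SimpleVar *column)
{
	assert(setExpr != NULL && column != NULL);

	/* comparing varattno alone would also match another table's column */
	return setExpr->varno == column->varno &&
		   setExpr->varattno == column->varattno;
}
```

With only the varattno comparison, a column of a different table that happens to share the attribute number would be treated as a self-assignment, which is the situation this fix addresses.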

Ideally, we should instead use "relation restriction equivalence"
mechanism to understand if the "where" clause implies a no-op
update. This is because, for instance, right now we're not able to
deduce that the update is a no-op when the "where" clause transitively
implies a no-op update, as in the case where we're setting "column a"
to "column c" and where clause looks like:
  "column a = column b AND column b = column c".
If this means a regression for some users, we can consider doing it
that way. Until then, as a workaround, we can suggest adding additional
quals to "where" clause that would directly imply equivalence.
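
To illustrate what a transitive deduction would involve, here is a small union-find sketch. It is not how the "relation restriction equivalence" mechanism in Citus is implemented; the `ColumnEquivalence` type, the helper names, and the integer column identifiers are all assumptions made for this example:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_COLUMNS 16

/* parent[i] links column i towards the representative of its class */
typedef struct ColumnEquivalence
{
	int parent[MAX_COLUMNS];
} ColumnEquivalence;

/* every column starts in its own equivalence class */
void
InitColumnEquivalence(ColumnEquivalence *eq)
{
	for (int i = 0; i < MAX_COLUMNS; i++)
	{
		eq->parent[i] = i;
	}
}

/* returns the representative of the class that the column belongs to */
int
FindClass(ColumnEquivalence *eq, int column)
{
	assert(column >= 0 && column < MAX_COLUMNS);

	while (eq->parent[column] != column)
	{
		/* path halving keeps the chains short */
		eq->parent[column] = eq->parent[eq->parent[column]];
		column = eq->parent[column];
	}
	return column;
}

/* records a qual of the form "left = right" by merging the two classes */
void
AddEqualityQual(ColumnEquivalence *eq, int left, int right)
{
	eq->parent[FindClass(eq, left)] = FindClass(eq, right);
}

/* true if the recorded quals imply left = right, directly or transitively */
bool
ColumnsAreEquivalent(ColumnEquivalence *eq, int left, int right)
{
	return FindClass(eq, left) == FindClass(eq, right);
}
```

Recording "a = b" and "b = c" puts all three columns into one class, so a check for "a = c" succeeds even though no qual states it directly.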

Also, after fixing TargetEntryChangesValue(), we started successfully
deducing that the update action is a no-op for such MERGE queries:
```sql
MERGE INTO dist_1
USING dist_1 src
ON (dist_1.a = src.b)
WHEN MATCHED THEN UPDATE SET a = src.b;
```
However, we then started seeing below error for above query even
though now the update is qualified as a no-op update:
```
ERROR:  Unexpected column index of the source list
```
This was because of #8180 and #8201 fixed that.

In summary, with this PR:

* We disallow such queries,
  ```sql
  -- attno for dist_1.a, dist_1.b: 1, 2
  -- attno for dist_different_order_1.a, dist_different_order_1.b: 2, 1
  UPDATE dist_1 SET a = dist_different_order_1.b
  FROM dist_different_order_1
  WHERE dist_1.a = dist_different_order_1.a;

  -- attno for dist_1.a, dist_1.b: 1, 2
  -- but ON (..) doesn't imply a no-op update for SET expr
  MERGE INTO dist_1
  USING dist_1 src
  ON (dist_1.a = src.b)
  WHEN MATCHED THEN UPDATE SET a = src.a;
  ```

* .. and allow such queries,
  ```sql
  MERGE INTO dist_1
  USING dist_1 src
  ON (dist_1.a = src.b)
  WHEN MATCHED THEN UPDATE SET a = src.b;
  ```

(cherry picked from commit 5eb1d93be1)
(cherry picked from commit 2fd20b3bb5dcc4d24cdee5985cf97c2e37a2b5e6)
2025-09-30 14:23:29 +03:00
Onur Tirtir 9a32f49dcb Fix unexpected column index error for repartitioned merge (#8201)
DESCRIPTION: Fixes a bug that causes an unexpected error when executing
repartitioned merge.

Fixes #8180.

This was happening because of a bug in
SourceResultPartitionColumnIndex(). And to fix it, this PR avoids
using DistributionColumnIndex() in SourceResultPartitionColumnIndex().
Instead, invents FindTargetListEntryWithVarExprAttno(), which finds
the index of the target entry in the source query's target list that
can be used to repartition the source for a repartitioned merge. In
short, to find the source target entry that references the Var used in
ON (..) clause and that references the source rte, we should check the
varattno of the underlying expr, which presumably is always a Var for
repartitioned merge as we always wrap the source rte with a subquery,
where all target entries point to the columns of the original source
relation.

Using DistributionColumnIndex() prior to 13.0 wasn't causing such an
issue because prior to 13.0, the varattno of the underlying expr of
the source target entries was almost (*1) always equal to resno of the
target entry as we were including all target entries of the source
relation. However, starting with #7659, which is merged to main before
13.0, we started using CreateFilteredTargetListForRelation() instead of
CreateAllTargetListForRelation() to compute the target entry list for
the source rte to fix another bug. So we cannot revert to using
CreateAllTargetListForRelation() because otherwise we would re-introduce
the bug that it helped fix, so we instead had to find a way to properly
deal with the "filtered target list"s, as in this commit. Plus (*1),
even before #7659, probably we would still fail when the source relation
has dropped attributes or such because that would probably also cause
such a mismatch between the varattno of the underlying expr of the
target entry and its resno.
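
The difference between looking up by resno and by the underlying varattno can be sketched as follows. This is a simplified model and not the code of FindTargetListEntryWithVarExprAttno(); `SimpleTargetEntry` and `FindEntryIndexByVarAttno` are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified target entry: resno is its 1-based position in the target
 * list, varattno is the source column that its underlying Var refers to.
 */
typedef struct SimpleTargetEntry
{
	int resno;
	int varattno;
} SimpleTargetEntry;

/*
 * Hypothetical lookup: returns the 0-based index of the entry whose
 * underlying Var has the given varattno, or -1 if there is none.
 */
int
FindEntryIndexByVarAttno(const SimpleTargetEntry *targetList, size_t length,
						 int varattno)
{
	assert(targetList != NULL || length == 0);

	for (size_t i = 0; i < length; i++)
	{
		if (targetList[i].varattno == varattno)
		{
			return (int) i;
		}
	}
	return -1;
}
```

In a filtered target list that keeps only source columns 2 and 4, the entry for column 4 sits at position 2, so treating the attribute number as a list position would point past the end of the list.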

(cherry picked from commit 83b25e1fb1)
2025-09-30 14:23:29 +03:00
naisila 31e19e8b19 Fix HaveRegisteredOrActiveSnapshot() crashes
part of ce7ddc0d3d
2025-09-26 15:50:12 +03:00
naisila cdec663c60 Bump PG to 15.14, 16.10, 17.6 for Citus 13.0 2025-09-26 15:50:12 +03:00
naisila 65c90aeb7d Add check to identify views converted to RTE_SUBQUERY in 15.13
Relevant PG15 commit:
https://github.com/postgres/postgres/commit/317aba70e

Previously, when views were converted to RTE_SUBQUERY the relid
would be cleared in PG15. In this patch of PG15, relid is retained.
Therefore, we add a check with the "relkind and rtekind" to
identify the converted views in 15.13

Part of:
c98341e4ed
2025-09-26 15:50:12 +03:00
Colm 6ac5ad500e Fix bug in redundant WHERE clause detection. (#8162)
Need to also check Postgres plan's rangetables for relations used in Initplans.

DESCRIPTION: Fix a bug in redundant WHERE clause detection; we need to
additionally check the Postgres plan's range tables for the presence of
citus tables, to account for relations that are referenced from scalar
subqueries.

There is a fundamental flaw in 4139370: the assumption that, after
Postgres planning has completed, all tables used in a query can be
obtained by walking the query tree. This is not the case for scalar
subqueries, which will be referenced by `PARAM` nodes. The fix adds an
additional check of the Postgres plan range tables; if there is at least
one citus table in there we do not need to change the needs distributed
planning flag.

Fixes #8159
2025-09-03 08:32:09 +00:00
SongYoungUk 46fd74933c fix #7715 - add assign hook for CDC library path adjustment (#8025)
DESCRIPTION: Automatically updates dynamic_library_path when CDC is
enabled

fix : #7715

According to the documentation and `pg_settings`, the context of the
`citus.enable_change_data_capture` parameter is user.

However, changing this parameter — even as a superuser — doesn't work as
expected: while the initial copy phase works correctly, subsequent
change events are not propagated.

This appears to be due to the fact that `dynamic_library_path` is only
updated to `$libdir/citus_decoders:$libdir` when the server is restarted
and the `_PG_init` function is invoked.

To address this, I added an `EnableChangeDataCaptureAssignHook` that
automatically updates `dynamic_library_path` at runtime when
`citus.enable_change_data_capture` is enabled, ensuring that the CDC
decoder libraries are properly loaded.
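
A rough sketch of the path adjustment such a hook could perform, written as plain C without PostgreSQL headers. `BuildLibraryPath` is a hypothetical helper and not the body of `EnableChangeDataCaptureAssignHook`; only the `$libdir/citus_decoders:$libdir` value is taken from the description above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define CDC_DECODER_PATH "$libdir/citus_decoders"

/*
 * Hypothetical helper: writes the library path to use into "result".
 * When CDC is enabled and the decoder directory is not listed yet, the
 * directory is prepended; otherwise the current path is kept as-is.
 * "result" must not overlap with "currentPath".
 */
void
BuildLibraryPath(bool cdcEnabled, const char *currentPath,
				 char *result, size_t resultSize)
{
	assert(currentPath != NULL && result != NULL && resultSize > 0);

	if (cdcEnabled && strstr(currentPath, CDC_DECODER_PATH) == NULL)
	{
		snprintf(result, resultSize, "%s:%s", CDC_DECODER_PATH, currentPath);
	}
	else
	{
		snprintf(result, resultSize, "%s", currentPath);
	}
}
```

Checking for the decoder directory first keeps the operation idempotent, so enabling the setting repeatedly would not grow the path.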

Note that `dynamic_library_path` is already a `superuser`-context
parameter in base PostgreSQL, so updating it from within the assign hook
should be safe and consistent with PostgreSQL’s configuration model.

If there’s any reason this approach might be problematic or if there’s a
preferred alternative, I’d appreciate any feedback.

cc. @jy-min

---------

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Co-authored-by: ibrahim halatci <ihalatci@gmail.com>
(cherry picked from commit 743c9bbf87)
2025-07-18 13:06:12 +00:00
Jelte Fennema-Nio c2762af5a5 Run github actions on main (#7292)
We want the nice looking green checkmark on our main branch too.

This PR includes running on pushes to release branches too, but that
won't come into effect until we have release branches with this
workflow file.

(cherry picked from commit 2bccb58157)
2025-07-18 12:38:37 +00:00
Alper Kocatas 1e5decad75
Bump Citus version to 13.0.4 (#8008)
Bump Citus version to 13.0.4

Co-authored-by: ibrahim halatci <ihalatci@gmail.com>
2025-05-30 15:58:31 +03:00
Alper Kocatas e4db621256
Add changelog for 13.0.4 (#8005)
Add changelog entries for 13.0.4
2025-05-30 15:09:50 +03:00
Naisila Puka e254ced1db Error out for queries with outer joins and pseudoconstant quals in PG<17 (#7937)
PG15 commit d1ef5631e620f9a5b6480a32bb70124c857af4f1
and PG16 commit 695f5deb7902865901eb2d50a70523af655c3a00
disallow replacing joins with scans in queries with pseudoconstant quals.
This commit prevents the set_join_pathlist_hook from being called
if any of the join restrictions is a pseudo-constant.
So in these cases, citus has no info on the join, never sees that
the query has an outer join, and ends up producing an incorrect plan.
PG17 fixes this by commit 9e9931d2bf40e2fea447d779c2e133c2c1256ef3
Therefore, we take this extra measure here for PG versions less than 17.
hasOuterJoin can never be true when set_join_pathlist_hook is absent.
2025-05-20 16:56:52 +02:00
Muhammad Usama 902cef3745
Fix make install for OS/X: cherry picked from commit 0f28a69f12 (#7936)
Use the $(DLSUFFIX) instead of hard coded extensions for cdc (#7221)

cherry picked from commit 0f28a69f12

Co-authored-by: Nils Dijk <nils@citusdata.com>
Co-authored-by: ibrahim halatci <ihalatci@gmail.com>
2025-05-07 13:59:23 +03:00
Maksim Melnikov e940295202 AddressSanitizer: stack-use-after-scope on distributed_planner.c 2025-04-25 15:05:15 +03:00
manaldush 20a2d33f2a
AddressSanitizer: stack-use-after-scope on address in CreateBackground(backport to release-13.0) (#7966)
Backport of #7943 to release-13.0
Fixes #7964
2025-04-25 11:29:47 +00:00
ibrahim halatci da6cab83d8
update workflow base OS to ubuntu-latest (#7970)
previous config was using ubuntu-20.04 which was end of support in GH
2025-04-25 11:14:28 +03:00
naisila 4c7004df42 Bump Citus version to 13.0.3 2025-03-20 12:56:05 +03:00
naisila bbe0539df2 Update changelog for 13.0.3 2025-03-20 12:56:05 +03:00
ibrahim halatci 12651a2320
updated change log for the 13.0.2 patch release (#7924)
updated change log for the 13.0.2 patch release

---------

Co-authored-by: Ibrahim Halatci <ihalatci@microsoft.com>
2025-03-12 14:06:10 +00:00
Cédric Villemain 759b49f69d
fix issue #7676: wrong handler around MULTIEXPR (#7914)
DESCRIPTION: Fixes a bug with `UPDATE SET (...) = (SELECT
some_func(),... )` (#7676)

Citus was checking for presence of sublink, but forgot to manage
multiexpr while evaluating clauses during planning. At this stage (citus
planner), it's not always possible to call PostgreSQL code because the
tree is not yet ready for PostgreSQL pure executor.

Fixes https://github.com/citusdata/citus/issues/7676.

Fixed by adding a new function to check sublink or multiexpr in the
tree.

---------

Co-authored-by: Colm <colmmchugh@microsoft.com>
2025-03-12 11:01:33 +00:00
Mehmet YILMAZ 80dc4bd90c
Issue 7887 Enhance AddInsertSelectCasts for Identity Columns (#7920)
## Enhance `AddInsertSelectCasts` for Identity Columns


This PR fixes #7887 and improves the behavior of partial inserts into
**identity columns** by modifying the **`AddInsertSelectCasts`**
function. Specifically, we introduce **special-case handling** for
`nextval(...)` calls (represented in the parse tree as `NextValueExpr`)
to ensure that if the identity column’s declared type differs from
`nextval`’s default return type (`int8`), we **cast** the expression
properly. This prevents mismatches like `int8` → `int4` from causing
“invalid string enlargement” errors or other type-related failures.

When `INSERT ... SELECT` is processed, `AddInsertSelectCasts` reconciles
each target column’s type with the corresponding SELECT expression’s
type. Historically, for identity columns that rely on `nextval(...)`, we
can end up with a mismatch:
- `nextval` returns **`int8`**,
- The identity column might be **`int4`**, **`bigint`**, or another
integer type.

Without a correct cast, Postgres or Citus can produce plan-time or
runtime errors. By **detecting** `NextValueExpr` and applying a cast to
the column’s type, the final plan ensures consistent insertion without
errors.

## What Changed

1. **Check for `NextValueExpr`**:  
   In `AddInsertSelectCasts`, we now have a code block:
   ```c
   if (IsA(selectEntry->expr, NextValueExpr))
   {
       Oid nextvalType = GetNextvalReturnTypeCatalog();
       ...
       // If (targetType != nextvalType), build a cast from int8 -> targetType
   }
   else
   {
       // fallback to generic mismatch logic
   }
   ```
This short-circuits any expression that’s a `nextval(...)` call, letting
us explicitly cast to the correct type.

2. **Fallback Generic Logic**:  
If it isn’t a `NextValueExpr` (i.e. a normal column or expression
mismatch), we still rely on the existing path that compares `sourceType`
vs. `targetType` and calls `CastExpr(...)` if they differ.

3. **`GetNextvalReturnTypeCatalog`**:  
We added or refined a helper function to confirm that `nextval` returns
`int8`, or do a `LookupFuncName("nextval", ...)` to discover the
function’s return type from `pg_proc`—making it robust if future changes
happen.

## Benefits

- **Partial inserts** into identity columns no longer fail with type
mismatches.
- When `nextval` yields `int8` but the identity column is `int4` (or
another type), we properly cast to the column’s type in the plan.
- Preserves the **existing** approach for other columns—only identity
calls get the specialized `NextValueExpr` logic.

## Testing

- Extended `generatedidentity.sql` test scenario to cover partial
inserts into both `GENERATED ALWAYS` and `GENERATED BY DEFAULT` identity
columns, including tests for the `OVERRIDING SYSTEM VALUE` clause and
partial inserts referencing foreign-key columns.
2025-03-10 13:54:30 +03:00
Mehmet YILMAZ 1d0b15723b
Remove citus-tools subproject and add gitignore (#7916) 2025-03-06 15:51:01 +03:00
Muhammad Usama 43f3786c1f
Fix Deadlock with transaction recovery is possible during Citus upgrades (#7910)
DESCRIPTION: Fixes deadlock with transaction recovery that is possible
during Citus upgrades.

Fixes #7875.

This commit addresses two interrelated deadlock issues uncovered during Citus
upgrades:
1. Local Deadlock:
   - **Problem:**
     In `RecoverWorkerTransactions()`, a new connection is created for each worker
     node to perform transaction recovery by locking the
     `pg_dist_transaction` catalog table until the end of the transaction. When
     `RecoverTwoPhaseCommits()` calls this function for each worker node, the order
     of acquiring locks on `pg_dist_authinfo` and `pg_dist_transaction` can alternate.
     This reversal can lead to a deadlock if any concurrent process requires locks on
     these tables.
   - **Fix:**
     Pre-establish all worker node connections upfront so that
     `RecoverWorkerTransactions()` operates with a single, consistent connection.
     This ensures that locks on `pg_dist_authinfo` and `pg_dist_transaction` are always
     acquired in the correct order, thereby preventing the local deadlock.

2. Distributed Deadlock:
   - **Problem:**
     After resolving the local deadlock, a distributed deadlock issue emerges. The
     maintenance daemon calls `RecoverWorkerTransactions()` on each worker node—
     including the local node—which leads to a complex locking sequence:
       - A RowExclusiveLock is taken on the `pg_dist_transaction` table in
         `RecoverWorkerTransactions()`.
       - An update extension then tries to acquire an AccessExclusiveLock on the same
         table, getting blocked by the RowExclusiveLock.
       - A subsequent query (e.g., a SELECT on `pg_prepared_xacts`) issued using a
         separate connection on the local node gets blocked due to locks held during a
         call to `BuildCitusTableCacheEntry()`.
       - The maintenance daemon waits for this query, resulting in a circular wait and
         stalling the entire cluster.
   - **Fix:**
     Avoid cache lookups for internal PostgreSQL tables by implementing an early bailout
     for relation IDs below `FirstNormalObjectId` (system objects). This eliminates
     unnecessary calls to `BuildCitusTableCache`, reducing lock contention and mitigating
     the distributed deadlock.
     Furthermore, this optimization improves performance in fast
     connect→query_catalog→disconnect cycles by eliminating redundant
     cache creation and lookups.

3. Also reverts the commit that disabled the relevant test cases.
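
The early bailout from the distributed deadlock fix can be sketched as below. `CanSkipCitusTableCacheLookup` is a hypothetical name, and `Oid` and `FirstNormalObjectId` are redefined here only so the example compiles without PostgreSQL headers:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

/* first OID that PostgreSQL assigns to user-created objects */
#define FirstNormalObjectId 16384

/*
 * Hypothetical guard: relation IDs below FirstNormalObjectId are treated
 * as system objects, so the cache lookup (and the locks that it would
 * take) can be skipped for them.
 */
bool
CanSkipCitusTableCacheLookup(Oid relationId)
{
	return relationId < FirstNormalObjectId;
}
```

For example, `pg_class` has OID 1259 and would take the early exit, while the first user-created relation would still go through the cache.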
2025-03-04 15:11:01 +05:00
Colm 86107ca191
#7782 - catch when Postgres planning removes all Citus tables (#7907)
DESCRIPTION: fix a planning error caused by a redundant WHERE clause

Fix a Citus planning glitch that occurs in a DML query when the WHERE
clause of the query is of the form:
    ` WHERE true OR <expression with 1 or more citus tables> `
and this is the only place in the query referencing a citus table.
Postgres' standard planner transforms the WHERE clause to:
    ` WHERE true `
So the query now has no citus tables, confusing the Citus planner as
described in issues #7782 and #7783. The fix is to check, after Postgres
standard planner, if the Query has been transformed as shown, and re-run
the check of whether or not the query needs distributed planning.
2025-02-27 10:54:39 +00:00
Mehmet YILMAZ 2b964228bc
Fix 0-Task Plans in Single-Shard Router When Updating a Local Table with Reference Table in Subquery (#7897)
This PR fixes an issue #7891 in the Citus planner where an `UPDATE` on a
local table with a subquery referencing a reference table could produce
a 0-task plan. Historically, the planner sometimes failed to detect that
both the target and referenced tables were effectively “local,”
assigning `INVALID_SHARD_ID` and yielding a no-op plan.

### Root Cause

- In the Citus router logic (`PlanRouterQuery`), we relied on `shardId`
to determine whether a query should be routed to a single shard.
- If `shardId == INVALID_SHARD_ID`, but we also had not marked the query
as a “local table modification,” the code path would produce zero tasks.
- Local + reference tables do not require multi-shard routing. Failing
to detect this “purely local” scenario caused Citus to incorrectly route
to zero tasks.

### Changes

**Enhanced Local Table Detection**

- Updated `IsLocalTableModification` and related checks to consider both
local and reference tables as “local” for planning, preventing the
0-task scenario.
- Expanded `ContainsOnlyLocalOrReferenceTables` to return true if there
are no fully distributed tables in the query.

**Added Regress Test**

- Introduced a new regress test (`issue_7891.sql`) which reproduces the
scenario.
- Verifies we get a valid single- or local-task plan rather than a
0-task plan.
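
A simplified sketch of the "no fully distributed tables" check described under Changes. It models tables as an enum rather than range table entries, so `SimpleTableKind` and `OnlyLocalOrReferenceTables` are illustrative names and not the actual `ContainsOnlyLocalOrReferenceTables` code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum SimpleTableKind
{
	TABLE_KIND_LOCAL,
	TABLE_KIND_REFERENCE,
	TABLE_KIND_DISTRIBUTED
} SimpleTableKind;

/*
 * Hypothetical check: a query counts as "purely local" when none of the
 * tables it references is a distributed table.
 */
bool
OnlyLocalOrReferenceTables(const SimpleTableKind *tables, size_t count)
{
	assert(tables != NULL || count == 0);

	for (size_t i = 0; i < count; i++)
	{
		if (tables[i] == TABLE_KIND_DISTRIBUTED)
		{
			return false;
		}
	}
	return true;
}
```

An `UPDATE` on a local table with a reference table in a subquery would pass this check and could then be planned as a single local task instead of ending up with zero tasks.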
2025-02-25 20:49:32 +03:00
Colm c1f5762645
Enhance MERGE .. WHEN NOT MATCHED BY SOURCE for repartitioned source (#7900)
DESCRIPTION: Ensure that a MERGE command on a distributed table with a
`WHEN NOT MATCHED BY SOURCE` clause runs against all shards of the
distributed table.

The Postgres MERGE command updates a table using a table or a query as a
data source. It provides three ways to match the target table with the
source: `WHEN MATCHED` means that there is a row in both the target and
source; `WHEN NOT MATCHED` means that there is a row in the source that
has no match (is not present) in the target; and, as of PG17, `WHEN NOT
MATCHED BY SOURCE` means that there is a row in the target that has no
match in the source.

In Citus, when a MERGE command updates a distributed table using a
local/reference table or a distributed query as source, that source is
repartitioned, and for each repartitioned shard that has data (i.e. 1 or
more rows) the MERGE is run against the corresponding distributed table
shard. Suppose the distributed table has 32 shards, and the source
repartitions into 4 shards that have data, with the remaining 28 shards
being empty; then the MERGE command is performed on the 4 corresponding
shards of the distributed table. However, the semantics of `WHEN NOT
MATCHED BY SOURCE` are that the specified action must be performed on
the target for each row in the target that is not in the source; so if
the source is empty, all target rows should be updated. To see this,
consider the following MERGE command:
```
MERGE INTO target AS t
USING source AS s ON t.id = s.id
WHEN NOT MATCHED BY SOURCE THEN UPDATE SET col1 = 100;
```
If the source has zero rows then every row in the target is updated s.t.
its col1 value is 100. Currently in Citus a MERGE on a distributed table
with a local/reference table or a distributed query as source ignores
shards of the distributed table when the corresponding shard of the
repartitioned source has zero rows. However, if the MERGE command
specifies a `WHEN NOT MATCHED BY SOURCE` clause, then the MERGE should
be performed on all shards of the distributed table, to ensure that the
specified action is performed on the target for each row in the target
that is not in the source. This PR enhances Citus MERGE execution so
that when a repartitioned source shard has zero rows, and the MERGE
command specifies a `WHEN NOT MATCHED BY SOURCE` clause, the MERGE is
performed against the corresponding shard of the distributed table using
an empty (zero row) relation as source, by generating a query of the
form:
```
MERGE INTO target_shard_0002 AS t
USING (SELECT id FROM (VALUES (NULL) ) source_0002(id) WHERE FALSE) AS s ON t.id = s.id
WHEN NOT MATCHED BY SOURCE THEN UPDATE SET col1 = 100
```
This works because each row in the target shard will be updated, and
`WHEN MATCHED` and `WHEN NOT MATCHED`, if specified, will be no-ops
because the source has zero rows.

To implement this when the source is a local or reference table involves
teaching function `ExecuteSourceAtCoordAndRedistribution()` in
`merge_executor.c` to not prune tasks when the query has `WHEN NOT
MATCHED BY SOURCE` but to instead replace the task's query to one that
uses an empty relation as source. And when the source is a distributed
query, function
`ExecuteMergeSourcePlanIntoColocatedIntermediateResults()` (also in
`merge_executor.c`) instead of skipping empty tasks now generates a
query that uses an empty relation as source for the corresponding target
shard of the distributed table, but again only when the query has `WHEN
NOT MATCHED BY SOURCE`. A new function `BuildEmptyResultQuery()` is
added to `recursive_planning.c` and it is used by both the
aforementioned functions in `merge_executor.c` to build an empty
relation to use as the source. It applies the appropriate type to each
column of the empty relation so the join with the target makes sense to
the query compiler.
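
The per-shard decision described above can be summarized in a small sketch. The `MergeTaskAction` enum and `ChooseMergeTaskAction` are hypothetical and only model the rule; the actual logic lives in the `merge_executor.c` functions named above:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum MergeTaskAction
{
	MERGE_TASK_SKIP,              /* no task for this target shard */
	MERGE_TASK_USE_SOURCE_SHARD,  /* run MERGE against the repartitioned source */
	MERGE_TASK_USE_EMPTY_SOURCE   /* run MERGE with a zero-row relation as source */
} MergeTaskAction;

/*
 * Hypothetical decision rule: an empty source shard is only skipped when
 * the MERGE has no WHEN NOT MATCHED BY SOURCE clause.
 */
MergeTaskAction
ChooseMergeTaskAction(long sourceRowCount, bool hasNotMatchedBySource)
{
	assert(sourceRowCount >= 0);

	if (sourceRowCount > 0)
	{
		return MERGE_TASK_USE_SOURCE_SHARD;
	}

	return hasNotMatchedBySource ? MERGE_TASK_USE_EMPTY_SOURCE
								 : MERGE_TASK_SKIP;
}
```

In the 32-shard example, the 4 shards with data keep using their repartitioned source, and the remaining 28 are run with an empty source only if `WHEN NOT MATCHED BY SOURCE` is present.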
2025-02-24 09:11:19 +00:00
OlgaSergeyevaB 459c283e7d
Custom Scan (ColumnarScan): exclude outer_join_rels from CandidateRelids (#7703)
DESCRIPTION: Fixes a crash in columnar custom scan that happens when a
columnar table is used in a join. Fixes issue #7647.

Co-authored-by: Ольга Сергеева <ob-sergeeva@it-serv.ru>
2025-02-18 20:58:02 +00:00
Colm 8f3d9deffe
[Bug Fix] SEGV on query with Left Outer Join (#7787) (#7901)
DESCRIPTION: Fixes a crash in left outer joins that can happen when
there is an aggregate on a column from the inner side of the join.

Fix the SEGV seen in #7787 and #7899; it occurs because a column in the
targetlist of a worker subquery can contain a non-empty varnullingrels
field if the column is from the inner side of a left outer join. The
issue can also occur with the columns in the HAVING clause, and this is
also tested in the fix. The issue was triggered by the introduction of
the varnullingrels to Vars in Postgres 16 (2489d76c)

There is a related issue, #7705, where a non-empty varnullingrels was
incorrectly copied into the query tree for the combine query. Here, a
non-empty varnullingrels field of a var is incorrectly copied into the
query tree for a worker subquery.

The regress file from #7705 is used (and renamed) to also test this
(#7787). An alternative test output file is required for Postgres 15
because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).
2025-02-18 12:41:34 +00:00
Naisila Puka d28a5eae6c
Changelog entries for v13.0.1 (#7873) 2025-02-04 12:55:35 +00:00
Naisila Puka e5a1c17134
Bump Citus version to 13.0.1 (#7872) 2025-02-04 15:15:05 +03:00
Onur Tirtir b6b73e2f4c Disable 2PC recovery while executing ALTER EXTENSION cmd during Citus upgrade tests 2025-02-04 14:00:31 +03:00
Onur Tirtir 0b4896f7b4 Revert "Release RowExclusiveLock on pg_dist_transaction as soon as remote xacts are recovered"
This reverts commit 684b4c6b96.
2025-02-04 14:00:31 +03:00
Gürkan İndibay ee76c4423e Updates github checkout actions to v4 (#7611)
(cherry picked from commit 3fe22406e62fb40da12a0d91f3ecc0cba81cdb24)
2025-02-04 11:18:35 +03:00
Naisila Puka 9a7f6d6c59
Drops PG14 support (#7753)
DESCRIPTION: Drops PG14 support

1. Remove "$version_num" != 'xx' from configure file 
2. delete all PG_VERSION_NUM = PG_VERSION_XX references in the code 
3. Look at pg_version_compat.h file, remove all _compat functions etc
defined specifically for PGXX differences
4. delete all PG_VERSION_NUM >= PG_VERSION_(XX+1), PG_VERSION_NUM <
PG_VERSION_(XX+1) ifs in the codebase
5. delete ruleutils_xx.c file 
6. cleanup normalize.sed file from pg14 specific lines 
7. delete all alternative output files for that particular PG version,
server_version_ge variable helps here
2025-02-03 17:13:40 +03:00
Naisila Puka 6b70724b31
Fix mixed Citus upgrade tests (#7218) (#7871)
When testing rolling Citus upgrades, coordinator should not be upgraded
until we upgrade all the workers.

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
(cherry picked from commit 27ac44eb2a)

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2025-02-03 16:49:18 +03:00
Onur Tirtir 26f16a7654 Avoid publishing artifacts with conflicting names
.. as documented in actions/upload-artifact#480.
2025-02-01 01:49:09 +03:00
Onur Tirtir 24758c39a1 Fix flaky citus upgrade test 2025-02-01 01:49:09 +03:00
Onur Tirtir 684b4c6b96 Release RowExclusiveLock on pg_dist_transaction as soon as remote xacts are recovered
As of this commit, after recovering the remote transactions, now we release the lock
on pg_dist_transaction while closing it to avoid deadlocks that might occur because
of trying to acquire a lock on pg_dist_authinfo while holding a lock on
pg_dist_transaction. Such a scenario can only cause a deadlock if another transaction
is trying to acquire a strong lock on pg_dist_transaction while holding a lock on
pg_dist_authinfo. As of today, we (implicitly) acquire a strong lock on
pg_dist_transaction only when upgrading Citus to 11.3-1 and this happens when creating
a REPLICA IDENTITY on pg_dist_transaction.

And regardless of the code-path we are in, it should be okay to release the lock there
because all we do after that point is to abort the prepared transactions that are not
part of an in-progress distributed transaction and releasing the lock before doing so
should be just fine.

This also changes the blocking behavior between citus_create_restore_point and the
transaction recovery code-path in the sense that now citus_create_restore_point doesn't
wait until transaction recovery completes aborting the prepared transactions that are not
part of an in-progress distributed transaction. However, this should be fine because
even before this was possible, e.g., if transaction recovery fails to open a remote
connection to a node.
2025-02-01 01:49:09 +03:00
Onur Tirtir cbe0de33a6 Upgrade download-artifacts action to 4.1.8 2025-02-01 01:49:09 +03:00
Onur Tirtir b886cfa223 Upgrade upload-artifacts action to 4.6.0 2025-02-01 01:49:09 +03:00
Naisila Puka 548395fd77
fix changelog date (#7859) 2025-01-22 14:28:46 +03:00
Naisila Puka cba8e57737
Changelog entries for 13.0.0 (#7858) 2025-01-22 13:17:35 +03:00
Naisila Puka 23d5207701
Fix pg17 test (#7857)
error merged in
ab7c3b7804
2025-01-22 12:54:52 +03:00
Mehmet YILMAZ ab7c3b7804
PG17 Compatibility - Fix crash when pg_class is used in MERGE (#7853)
This pull request addresses Issue #7846, where specific MERGE queries on
non-distributed and distributed tables can result in crashes in certain
scenarios. The issue stems from the usage of `pg_class` catalog table,
and the `FilterShardsFromPgclass` function in Citus. This function goes
through the query's jointree to hide the shards. However, in PG17,
MERGE's join quals are in a separate structure called
`mergeJoinCondition`. Therefore FilterShardsFromPgclass was not
filtering correctly in a `MERGE` command that involves `pg_class`. To
fix the issue, we handle `mergeJoinCondition` separately in PG17.

Relevant PG commit:

0294df2f1f

**Non-Distributed Tables:**
A MERGE query involving a non-distributed table using
`pg_catalog.pg_class` as the source may execute successfully but needs
testing to ensure stability.

**Distributed Tables:**
Performing a MERGE on a distributed table using `pg_catalog.pg_class` as
the source raises an error:
`ERROR: MERGE INTO a distributed table from Postgres table is not yet
supported`
However, in some cases, this can lead to a server crash if the
unsupported operation is not properly handled.

This is the test output from the same test conducted prior to the code
changes being implemented.

```
-- Issue #7846: Test crash scenarios with MERGE on non-distributed and distributed tables
-- Step 1: Connect to a worker node to verify shard visibility
\c postgresql://postgres@localhost::worker_1_port/regression?application_name=psql
SET search_path TO pg17;
-- Step 2: Create and test a non-distributed table
CREATE TABLE non_dist_table_12345 (id INTEGER);
-- Test MERGE on the non-distributed table
MERGE INTO non_dist_table_12345 AS target_0
USING pg_catalog.pg_class AS ref_0
ON target_0.id = ref_0.relpages
WHEN NOT MATCHED THEN DO NOTHING;
SSL SYSCALL error: EOF detected
connection to server was lost
```
2025-01-21 17:48:06 +03:00
Colm c2bc7aca4a
Update tdigest_aggregate_support output for PG15+ (#7849)
Regress test tdigest_aggregate_support has been failing since at least
Citus 12.0, when tdigest extension is installed in Postgres. This
appears to be because of an omission by commit 03832f3 and a change in
the implementation of Postgres random() function (pg commit
[d4f109e4a](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d4f109e4a)).
To reproduce the test diff:
- Checkout [tdigest](https://github.com/tvondra/tdigest) and run `make;
make install`
- In citus regress directory run `make check-multi` or
`./citus_tests/run_test.py tdigest_aggregate_support`

There are two parts to this commit:

1. Revert `Output: xxxxx` in EXPLAIN VERBOSE. Citus commit fe4ac51
normalized EXPLAIN VERBOSE output because of a change between pg12 and
pg13. When pg12 support was no longer required, the rule was removed
from normalize.sed and `Output: xxxx` was reverted in the impacted
regress output files (03832f3), but `tdigest_aggregate_support` was
omitted.

2. Adjust the query results; the tdigest_aggregate_support test file has
a comment _verifying results - should be stable due to seed while
inserting the data, if failure due to data these queries could be
removed or check for certain ranges_ but the result values in this
commit are consistent across citus 12.0 (pg 15), citus 12.1 (pg 16) and
citus 13.0 (pg 17), or since Postgres changed their
[implementation of random](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d4f109e4a),
so proposing to go with these results.
2025-01-20 22:00:33 +00:00
Naisila Puka fa8e867662
Bump to latest PG minors 17.2, 16.6, 15.10, 14.15 (#7843)
Similar to
5ef2cd67ed,
we use the commit sha of a local build of the images, pushed.
2025-01-13 22:35:11 +03:00
Emel Şimşek c55bc8c669 Propagates SECURITY LABEL ON ROLE stmt (#7304) (#7735)
Propagates SECURITY LABEL ON ROLE stmt (https://github.com/citusdata/citus/pull/7304)
We propagate `SECURITY LABEL [for provider] ON ROLE rolename IS
labelname` to the worker nodes.
We also make sure to run the relevant `SecLabelStmt` commands on a
newly added node by looking at roles found in `pg_shseclabel`.

See official docs for explanation on how this command works:
https://www.postgresql.org/docs/current/sql-security-label.html
This command stores the role label in the `pg_shseclabel` catalog table.

This commit also fixes the regex string in
`check_gucs_are_alphabetically_sorted.sh` script such that it escapes
the dot. Previously it was looking for all strings starting with "citus"
instead of "citus." as it should.
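
The effect of the unescaped dot can be reproduced with POSIX regular expressions. The patterns below are assumptions chosen for illustration, since the exact regex used in `check_gucs_are_alphabetically_sorted.sh` is not shown here, and `MatchesPattern` is a hypothetical helper:

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true if "text" matches the POSIX extended
 * regular expression "pattern", and false on no match or on a pattern
 * that fails to compile.
 */
bool
MatchesPattern(const char *pattern, const char *text)
{
	regex_t regex;

	assert(pattern != NULL && text != NULL);

	if (regcomp(&regex, pattern, REG_EXTENDED | REG_NOSUB) != 0)
	{
		return false;
	}

	bool matched = (regexec(&regex, text, 0, NULL, 0) == 0);
	regfree(&regex);
	return matched;
}
```

With `^citus.` the dot matches any character, so a string such as `citus_x` matches as well; with `^citus\.` only strings that start with the literal prefix `citus.` match.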

To test this feature, I currently make use of a special GUC to control
label provider registration in PG_init when creating the Citus extension.

(cherry picked from commit 0d1f18862b)

Co-authored-by: Naisila Puka <37271756+naisila@users.noreply.github.com>
(cherry picked from commit 686d2b46ca)
2025-01-13 19:56:01 +03:00
Nils Dijk 7e316c90c4 Shard moves/isolate report LSN's in lsn format (#7227)
DESCRIPTION: Shard moves/isolate report LSN's in lsn format

While investigating an issue with our catchup mechanism on certain
postgres versions we noticed we print LSN's in the format of the native
long type. This is an uncommon representation for LSN's in postgres
logs.

This patch changes the output of our log message to go from the long
type representation to the native LSN type representation. Making it
easier for postgres users to recognize and compare LSN's with other
related reports.

example of new output:
```
2023-09-25 17:28:47.544 CEST [11345] LOG:  The LSN of the target subscriptions on node localhost:9701 have increased from 0/0 to 0/E1ED20F8 at 2023-09-25 17:28:47.544165+02 where the source LSN is 1/415DCAD0
```
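
For reference, the LSN format splits the 64-bit value into two 32-bit halves printed in hexadecimal. A minimal standalone sketch, where `FormatLsn` is a hypothetical helper and not the code changed by this patch:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: formats a 64-bit LSN as "high/low", with both
 * 32-bit halves printed as uppercase hexadecimal numbers.
 */
void
FormatLsn(uint64_t lsn, char *buffer, size_t bufferSize)
{
	assert(buffer != NULL && bufferSize > 0);

	snprintf(buffer, bufferSize, "%" PRIX32 "/%" PRIX32,
			 (uint32_t) (lsn >> 32), (uint32_t) lsn);
}
```

A buffer of 18 bytes is enough for any value: up to 8 hex digits for each half, the slash, and the terminating null byte.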

(cherry picked from commit b87fbcbf79)
2025-01-13 17:47:47 +03:00
Teja Mupparti d2ca63fb8c For scenarios, such as, Bug 3697586: Server crashes when assigning distributed transaction: Raise an ERROR instead of a crash
(cherry picked from commit ab7c13beb5)
2025-01-13 17:47:18 +03:00