citus/src/backend/distributed
Onur Tirtir f87bdc0678 Properly detect no-op shard-key updates via UPDATE / MERGE (#8214)
DESCRIPTION: Fixes a bug that allowed UPDATE / MERGE queries
that may change the distribution column value.

Fixes: #8087.

Probably as of #769, we were not properly checking if UPDATE
may change the distribution column.

In #769, we had these checks:
```c
	if (targetEntry->resno != column->varattno)
	{
		/* target entry of the form SET some_other_col = <x> */
		isColumnValueChanged = false;
	}
	else if (IsA(setExpr, Var))
	{
		Var *newValue = (Var *) setExpr;
		if (newValue->varattno == column->varattno)
		{
			/* target entry of the form SET col = table.col */
			isColumnValueChanged = false;
		}
	}
```

However, what we check in "if" and in the "else if" are not so
different in the sense they both attempt to verify if SET expr
of the target entry points to the attno of given column. So, in
Also see this PR comment from #5220:
https://github.com/citusdata/citus/pull/5220#discussion_r699230597.
In #769, probably we actually wanted to first check whether both
SET expr of the target entry and given variable are pointing to the
same range var entry, but this wasn't what the "if" was checking,
so removed.

As a result, in the cases that are mentioned in the linked issue,
we were incorrectly concluding that the SET expr of the target
entry won't change given column just because it's pointing to the
same attno as given variable, regardless of what range var entries
the column and the SET expr are pointing to. Then we also started
using the same function to check for such cases for update action
of MERGE, so we have the same bug there as well.
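
The failure mode described above can be illustrated with a minimal sketch. `SketchVar` and both helper functions are hypothetical names made up for this illustration and are not the actual Citus code; only the `varno` / `varattno` field names follow PostgreSQL's `Var` node.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for PostgreSQL's Var node. */
typedef struct SketchVar
{
	int varno;    /* index of the range table entry the column belongs to */
	int varattno; /* attribute number of the column within that entry */
} SketchVar;

/* Attno-only comparison: ignores which range table entry is referenced. */
static bool
SameColumnByAttnoOnly(const SketchVar *setExpr, const SketchVar *column)
{
	return setExpr->varattno == column->varattno;
}

/* Stricter comparison: the range table entry has to match as well. */
static bool
SameColumnByVarnoAndAttno(const SketchVar *setExpr, const SketchVar *column)
{
	return setExpr->varno == column->varno &&
		   setExpr->varattno == column->varattno;
}
```

With a distribution column at `{varno = 1, varattno = 1}` and a SET expr pointing to another table's first column at `{varno = 2, varattno = 1}`, the attno-only helper reports a match, while the stricter helper does not.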

So with this PR, we properly check for such cases by comparing
varno as well in TargetEntryChangesValue(). However, then some of
the existing tests started failing where the SET expr doesn't
directly assign the column to itself but the "where" clause could
actually imply that the distribution column won't change. Even before,
we were not attempting to verify whether the "where" clause quals could
imply a no-op assignment for the SET expr in such cases, but that was not
a problem. This is because, in most cases, we were always qualifying
such SET expressions as a no-op update as long as the SET expr's
attno is the same as given column's. For this reason, to prevent
regressions, this PR also adds some extra logic to understand
if the "where" clause quals could imply that SET expr for the
distribution key is a no-op.
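
As a rough sketch of the idea, and not the actual implementation in this PR, the check could be modeled as a scan over equality quals looking for one that directly equates the distribution column with the SET expr. All type and function names below are hypothetical, and quals are reduced to simple column-to-column equalities.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified column reference (range table index + attno). */
typedef struct SketchVar
{
	int varno;
	int varattno;
} SketchVar;

/* Hypothetical qual of the form "left = right". */
typedef struct SketchEqualityQual
{
	SketchVar left;
	SketchVar right;
} SketchEqualityQual;

static bool
SketchVarEquals(SketchVar a, SketchVar b)
{
	return a.varno == b.varno && a.varattno == b.varattno;
}

/*
 * Returns true if "SET column = setExpr" cannot change the column, either
 * because setExpr is the column itself or because some qual directly
 * equates the two (in either order).
 */
static bool
SketchSetExprIsNoOp(SketchVar column, SketchVar setExpr,
					const SketchEqualityQual *quals, size_t qualCount)
{
	if (SketchVarEquals(column, setExpr))
	{
		return true;
	}

	for (size_t i = 0; i < qualCount; i++)
	{
		SketchEqualityQual qual = quals[i];

		if ((SketchVarEquals(qual.left, column) &&
			 SketchVarEquals(qual.right, setExpr)) ||
			(SketchVarEquals(qual.left, setExpr) &&
			 SketchVarEquals(qual.right, column)))
		{
			return true;
		}
	}

	return false;
}
```

Under these assumptions, a qual such as `dist_1.a = src.b` makes `SET a = src.b` a no-op, while `SET a = src.a` is still treated as a possible change.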

Ideally, we should instead use "relation restriction equivalence"
mechanism to understand if the "where" clause implies a no-op
update. This is because, for instance, right now we're not able to
deduce that the update is a no-op when the "where" clause transitively
implies a no-op update, as in the case where we're setting "column a"
to "column c" and where clause looks like:
  "column a = column b AND column b = column c".
If this means a regression for some users, we can consider doing it
that way. Until then, as a workaround, we can suggest adding additional
quals to "where" clause that would directly imply equivalence.
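
To make the transitive case concrete, here is a small union-find sketch that groups columns into equivalence classes. It is only an illustration of what "transitively implies" means, under the assumption that columns are numbered with small integer ids; it is not Citus's relation restriction equivalence code, and all names are hypothetical.

```c
#include <stdbool.h>

#define SKETCH_MAX_COLUMNS 8

/* Hypothetical equivalence classes over column ids 0..SKETCH_MAX_COLUMNS-1. */
typedef struct SketchEquivalence
{
	int parent[SKETCH_MAX_COLUMNS];
} SketchEquivalence;

/* Start with every column in its own class. */
static void
SketchInit(SketchEquivalence *eq)
{
	for (int i = 0; i < SKETCH_MAX_COLUMNS; i++)
	{
		eq->parent[i] = i;
	}
}

/* Find the representative of a column's class, halving paths on the way. */
static int
SketchFind(SketchEquivalence *eq, int column)
{
	while (eq->parent[column] != column)
	{
		eq->parent[column] = eq->parent[eq->parent[column]];
		column = eq->parent[column];
	}
	return column;
}

/* Record a qual of the form "left = right" by merging the two classes. */
static void
SketchAddEquality(SketchEquivalence *eq, int left, int right)
{
	int leftRoot = SketchFind(eq, left);
	int rightRoot = SketchFind(eq, right);

	eq->parent[leftRoot] = rightRoot;
}

/* True if the recorded quals imply left = right, directly or transitively. */
static bool
SketchImpliesEqual(SketchEquivalence *eq, int left, int right)
{
	return SketchFind(eq, left) == SketchFind(eq, right);
}
```

After recording "a = b" and "b = c", `SketchImpliesEqual` reports that a and c are equal, which is the deduction a direct qual lookup cannot make.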

Also, after fixing TargetEntryChangesValue(), we started successfully
deducing that the update action is a no-op for such MERGE queries:
```sql
MERGE INTO dist_1
USING dist_1 src
ON (dist_1.a = src.b)
WHEN MATCHED THEN UPDATE SET a = src.b;
```
However, we then started seeing the error below for the above query,
even though the update is now qualified as a no-op update:
```
ERROR:  Unexpected column index of the source list
```
This was caused by #8180, and #8201 fixed it.

In summary, with this PR:

* We disallow such queries,
  ```sql
  -- attno for dist_1.a, dist_1.b: 1, 2
  -- attno for dist_different_order_1.a, dist_different_order_1.b: 2, 1
  UPDATE dist_1 SET a = dist_different_order_1.b
  FROM dist_different_order_1
  WHERE dist_1.a = dist_different_order_1.a;

  -- attno for dist_1.a, dist_1.b: 1, 2
  -- but ON (..) doesn't imply a no-op update for SET expr
  MERGE INTO dist_1
  USING dist_1 src
  ON (dist_1.a = src.b)
  WHEN MATCHED THEN UPDATE SET a = src.a;
  ```

* .. and allow such queries,
  ```sql
  MERGE INTO dist_1
  USING dist_1 src
  ON (dist_1.a = src.b)
  WHEN MATCHED THEN UPDATE SET a = src.b;
  ```

(cherry picked from commit 5eb1d93be1)
(cherry picked from commit 2fd20b3bb5dcc4d24cdee5985cf97c2e37a2b5e6)
2025-09-30 14:23:29 +03:00
..
cdc Fix make install for OS/X: cherry picked from commit 0f28a69f12 (#7936) 2025-05-07 13:59:23 +03:00
clock Drops PG14 support (#7753) 2025-02-03 17:13:40 +03:00
commands Drops PG14 support (#7753) 2025-02-03 17:13:40 +03:00
connection Drops PG14 support (#7753) 2025-02-03 17:13:40 +03:00
deparser Drops PG14 support (#7753) 2025-02-03 17:13:40 +03:00
executor Enhance MERGE .. WHEN NOT MATCHED BY SOURCE for repartitioned source (#7900) 2025-02-24 09:11:19 +00:00
metadata Fix HaveRegisteredOrActiveSnapshot() crashes 2025-09-26 15:50:12 +03:00
operations Propagates SECURITY LABEL ON ROLE stmt (#7304) (#7735) 2025-01-13 19:56:01 +03:00
planner Properly detect no-op shard-key updates via UPDATE / MERGE (#8214) 2025-09-30 14:23:29 +03:00
progress Rename foreach_ macros to foreach_declared_ macros (#7700) 2024-10-16 17:01:39 +03:00
relay Rename foreach_ macros to foreach_declared_ macros (#7700) 2024-10-16 17:01:39 +03:00
replication Propagates SECURITY LABEL ON ROLE stmt (#7304) (#7735) 2025-01-13 19:56:01 +03:00
shardsplit Drops PG14 support (#7753) 2025-02-03 17:13:40 +03:00
sql Disallow infinite values for partition interval in create_time_partitions udf (#7822) 2024-12-30 20:27:28 +03:00
test Drops PG14 support (#7753) 2025-02-03 17:13:40 +03:00
transaction Fix HaveRegisteredOrActiveSnapshot() crashes 2025-09-26 15:50:12 +03:00
utils fix issue #7676: wrong handler around MULTIEXPR (#7914) 2025-03-12 11:01:33 +00:00
worker PG17 Compatibility - Fix crash when pg_class is used in MERGE (#7853) 2025-01-21 17:48:06 +03:00
.gitignore Change SQL migration build process for easier reviews (#2951) 2019-09-13 18:44:27 +02:00
Makefile Makefile changes to build CDC in builddir for pgoutput and wal2json. (#6827) 2023-04-06 17:03:12 +05:30
citus--11.1-1.control Support citus_columnar as separate extension (#5911) 2022-07-13 21:08:29 -07:00
citus.control Bump Citus version into 13.0.0 (#7792) 2024-12-24 11:40:59 +03:00
safeclib Compile safestringlib using regular configure 2020-03-19 11:52:20 +01:00
shared_library_init.c fix #7715 - add assign hook for CDC library path adjustment (#8025) 2025-07-18 13:06:12 +00:00