Nontrivial bump because of the following PG15.3 commit
317aba70e
https://github.com/postgres/postgres/commit/317aba70e
Previously, when views were converted to RTE_SUBQUERY the relid
would be cleared in PG15. In this patch of PG15, relid is retained.
Therefore, we add a check with the "relkind and rtekind" to
identify the converted views in 15.13
Sister PR https://github.com/citusdata/the-process/pull/164
Using dev image sha because I encountered the libpq
symlink issue again with "-v219b87c"
PG15 commit d1ef5631e620f9a5b6480a32bb70124c857af4f1
and PG16 commit 695f5deb7902865901eb2d50a70523af655c3a00
disallow replacing joins with scans in queries with pseudoconstant quals.
This commit prevents the set_join_pathlist_hook from being called
if any of the join restrictions is a pseudo-constant.
So in these cases, citus has no info on the join, never sees that
the query has an outer join, and ends up producing an incorrect plan.
PG17 fixes this by commit 9e9931d2bf40e2fea447d779c2e133c2c1256ef3
Therefore, we take this extra measure here for PG versions less than 17.
hasOuterJoin can never be true when set_join_pathlist_hook is absent.
This PR fixes#7784 and refactors the `WrapSubquery(Query *subquery)`
function to improve clarity and correctness when handling volatile
expressions in subqueries during Citus insert-select rewriting.
### Background
The `WrapSubquery` function rewrites a query of the form:
```sql
INSERT INTO target_table SELECT ... FROM ...
```
...by wrapping the `SELECT` in a subquery:
```sql
SELECT <outer-TL>
FROM ( <subquery with volatile expressions replaced with NULL> ) citus_insert_select_subquery
```
This transformation allows:
* **Volatile expressions** (e.g., `nextval`, `now`) **not used in `GROUP
BY` or `ORDER BY`** to be evaluated **exactly once on the coordinator**.
* **Stable/immutable or sort-relevant expressions** to remain in the
worker-executed subquery.
* Placeholder `NULL`s to maintain column alignment in the inner
subquery.
### Fix Details
* Restructured the code into labeled logical sections:
1. Build wrapper query (`SELECT … FROM (subquery)`)
2. Rewrite target lists with volatility analysis
3. Assign and return updated query trees
* Preserved existing behavior, focusing on clarity and maintainability.
### How the new code handles volatile items
stage | what we look for | what we do | why
-- | -- | -- | --
scan target list once | 1. `expr_is_volatile(te->expr)` 2.
`te->ressortgroupref != 0` (is the column used in GROUP BY / ORDER BY?)
| decide whether to hoist or keep | we must not hoist an expression the
inner query still needs for sorting/grouping, otherwise its
`SortGroupClause` breaks
volatile & not used in sort/group | deep‑copy the expression into the
outer target list | executes once on the coordinator |
| leave a typed `NULL `placeholder (visible, not `resjunk`) in the
inner target list | keeps column numbering stable for helpers that
already ran (reorder, cast); the worker sends a cheap constant |
stable / immutable, or volatile but used in sort/group | keep the
original expression in the inner list; outer list references it via a
`Var `| workers can evaluate it safely and, if needed, the inner
ORDER BY still works |
### Example
Given this query:
```sql
INSERT INTO t SELECT nextval('s'), 42 FROM generate_series(1, 2);
```
The planner rewrites it as:
```sql
SELECT nextval('s'), col2
FROM (SELECT NULL::bigint AS col1, 42 AS col2 FROM generate_series(1, 2)) citus_insert_select_subquery;
```
This ensures `nextval('s')` is evaluated only once per row on the
**coordinator**, not on each worker node, preserving correct sequence
semantics.
#### **Outer‑Var guard (`FindReferencedTableColumn`)**
Because `WrapSubquery` adds an extra query level, lots of Vars that the
old code never expected become “outer” Vars; without teaching
`FindReferencedTableColumn` to climb that extra level reliably, Citus
would intermittently reject valid foreign keys and even hit asserts.
* Re‑implemented the outer‑Var guard so that the function:
* **Walks deterministically up the query stack** when `skipOuterVars =
false` (default for FK / UNION checks). A new while‑loop copies — rather
than truncates — `parentQueryList` on each hop, eliminating
list‑aliasing that made *issue 5248* fail intermittently in parallel
regressions.
* Handles multi‑level `varlevelsup` in a single loop; never mutates the
caller’s list in place.
Var externParamPlaceholder is created on stack, and its address is used
for paramFetch. Postgres code return address of externParamPlaceholder
var to externParam, then code flow go out of scope and dereference
pointer on stack out of scope.
Fixes https://github.com/citusdata/citus/issues/7941.
---------
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
DESCRIPTION: Makes sure to prevent `INSERT INTO ... SELECT` queries involving subfield or sublink, to avoid crashes
The following query was crashing the backend:
```
INSERT INTO field_indirection_test_1 (
int_col, ct1_col.int_1,ct1_col.int_2
) SELECT 0, 1, 2;
-- crash
```
En passant, added more tests with sublink in distributed_types and found
another query with wrong behavior:
```
INSERT INTO domain_indirection_test (f1,f3.if1) SELECT 0, 1;
ERROR: could not find a conversion path from type 23 to 17619
-- not the expected ERROR
```
Fixed them by using `strip_implicit_coercions()` on target entry
expression before checking for the presence of a subscript or
fieldstore, else we fail to find the existing ones and wrongly accept to
execute unsafe query.
## Enhance `AddInsertSelectCasts` for Identity Columns
This PR fixes#7887 and improves the behavior of partial inserts into
**identity columns** by modifying the **`AddInsertSelectCasts`**
function. Specifically, we introduce **special-case handling** for
`nextval(...)` calls (represented in the parse tree as `NextValueExpr`)
to ensure that if the identity column’s declared type differs from
`nextval`’s default return type (`int8`), we **cast** the expression
properly. This prevents mismatches like `int8` → `int4` from causing
“invalid string enlargement” errors or other type-related failures.
When `INSERT ... SELECT` is processed, `AddInsertSelectCasts` reconciles
each target column’s type with the corresponding SELECT expression’s
type. Historically, for identity columns that rely on `nextval(...)`, we
can end up with a mismatch:
- `nextval` returns **`int8`**,
- The identity column might be **`int4`**, **`bigint`**, or another
integer type.
Without a correct cast, Postgres or Citus can produce plan-time or
runtime errors. By **detecting** `NextValueExpr` and applying a cast to
the column’s type, the final plan ensures consistent insertion without
errors.
## What Changed
1. **Check for `NextValueExpr`**:
In `AddInsertSelectCasts`, we now have a code block:
```c
if (IsA(selectEntry->expr, NextValueExpr))
{
Oid nextvalType = GetNextvalReturnTypeCatalog();
...
// If (targetType != nextvalType), build a cast from int8 -> targetType
}
else
{
// fallback to generic mismatch logic
}
```
This short-circuits any expression that’s a `nextval(...)` call, letting
us explicitly cast to the correct type.
2. **Fallback Generic Logic**:
If it isn’t a `NextValueExpr` (i.e. a normal column or expression
mismatch), we still rely on the existing path that compares `sourceType`
vs. `targetType` and calls `CastExpr(...)` if they differ.
3. **`GetNextvalReturnTypeCatalog`**:
We added or refined a helper function to confirm that `nextval` returns
`int8`, or do a `LookupFuncName("nextval", ...)` to discover the
function’s return type from `pg_proc`—making it robust if future changes
happen.
## Benefits
- **Partial inserts** into identity columns no longer fail with type
mismatches.
- When `nextval` yields `int8` but the identity column is `int4` (or
another type), we properly cast to the column’s type in the plan.
- Preserves the **existing** approach for other columns—only identity
calls get the specialized `NextValueExpr` logic.
## Testing
- Extended `generatedidentity.sql` test scenario to cover partial
inserts into both `GENERATED ALWAYS` and `GENERATED BY DEFAULT` identity
columns, including tests for the `OVERRIDING SYSTEM VALUE` clause and
partial inserts referencing foreign-key columns.
DESCRIPTION: fix a planning error caused by a redundant WHERE clause
Fix a Citus planning glitch that occurs in a DML query when the WHERE
clause of the query is of the form:
` WHERE true OR <expression with 1 or more citus tables> `
and this is the only place in the query referencing a citus table.
Postgres' standard planner transforms the WHERE clause to:
` WHERE true `
So the query now has no citus tables, confusing the Citus planner as
described in issues #7782 and #7783. The fix is to check, after Postgres
standard planner, if the Query has been transformed as shown, and re-run
the check of whether or not the query needs distributed planning.
This PR fixes an issue #7891 in the Citus planner where an `UPDATE` on a
local table with a subquery referencing a reference table could produce
a 0-task plan. Historically, the planner sometimes failed to detect that
both the target and referenced tables were effectively “local,”
assigning `INVALID_SHARD_ID `and yielding a no-op plan.
### Root Cause
- In the Citus router logic (`PlanRouterQuery`), we relied on `shardId`
to determine whether a query should be routed to a single shard.
- If `shardId == INVALID_SHARD_ID`, but we also had not marked the query
as a “local table modification,” the code path would produce zero tasks.
- Local + reference tables do not require multi-shard routing. Failing
to detect this “purely local” scenario caused Citus to incorrectly route
to zero tasks.
### Changes
**Enhanced Local Table Detection**
- Updated `IsLocalTableModification` and related checks to consider both
local and reference tables as “local” for planning, preventing the
0-task scenario.
- Expanded `ContainsOnlyLocalOrReferenceTables` to return true if there
are no fully distributed tables in the query.
**Added Regress Test**
- Introduced a new regress test (`issue_7891.sql`) which reproduces the
scenario.
- Verifies we get a valid single- or local-task plan rather than a
0-task plan.
DESCRIPTION: Ensure that a MERGE command on a distributed table with a
`WHEN NOT MATCHED BY SOURCE` clause runs against all shards of the
distributed table.
The Postgres MERGE command updates a table using a table or a query as a
data source. It provides three ways to match the target table with the
source: `WHEN MATCHED` means that there is a row in both the target and
source; `WHEN NOT MATCHED` means that there is a row in the source that
has no match (is not present) in the target; and, as of PG17, `WHEN NOT
MATCHED BY SOURCE` means that there is a row in the target that has no
match in the source.
In Citus, when a MERGE command updates a distributed table using a
local/reference table or a distributed query as source, that source is
repartitioned, and for each repartitioned shard that has data (i.e. 1 or
more rows) the MERGE is run against the corresponding distributed table
shard. Suppose the distributed table has 32 shards, and the source
repartitions into 4 shards that have data, with the remaining 28 shards
being empty; then the MERGE command is performed on the 4 corresponding
shards of the distributed table. However, the semantics of `WHEN NOT
MATCHED BY SOURCE` are that the specified action must be performed on
the target for each row in the target that is not in the source; so if
the source is empty, all target rows should be updated. To see this,
consider the following MERGE command:
```
MERGE INTO target AS t
USING source AS s ON t.id = s.id
WHEN NOT MATCHED BY SOURCE THEN UPDATE t SET t.col1 = 100
```
If the source has zero rows then every row in the target is updated s.t.
its col1 value is 100. Currently in Citus a MERGE on a distributed table
with a local/reference table or a distributed query as source ignores
shards of the distributed table when the corresponding shard of the
repartitioned source has zero rows. However, if the MERGE command
specifies a `WHEN NOT MATCHED BY SOURCE` clause, then the MERGE should
be performed on all shards of the distributed table, to ensure that the
specified action is performed on the target for each row in the target
that is not in the source. This PR enhances Citus MERGE execution so
that when a repartitioned source shard has zero rows, and the MERGE
command specifies a `WHEN NOT MATCHED BY SOURCE` clause, the MERGE is
performed against the corresponding shard of the distributed table using
an empty (zero row) relation as source, by generating a query of the
form:
```
MERGE INTO target_shard_0002 AS t
USING (SELECT id FROM (VALUES (NULL) ) source_0002(id) WHERE FALSE) AS s ON t.id = s.id
WHEN NOT MATCHED BY SOURCE THEN UPDATE t set t.col1 = 100
```
This works because each row in the target shard will be updated, and
`WHEN MATCHED` and `WHEN NOT MATCHED`, if specified, will be no-ops
because the source has zero rows.
To implement this when the source is a local or reference table involves
teaching function `ExcuteSourceAtCoordAndRedistribution()` in
`merge_executor.c` to not prune tasks when the query has `WHEN NOT
MATCHED BY SOURCE` but to instead replace the task's query to one that
uses an empty relation as source. And when the source is a distributed
query, function
`ExecuteMergeSourcePlanIntoColocatedIntermediateResults()` (also in
`merge_executor.c`) instead of skipping empty tasks now generates a
query that uses an empty relation as source for the corresponding target
shard of the distributed table, but again only when the query has `WHEN
NOT MATCHED BY SOURCE`. A new function `BuildEmptyResultQuery()` is
added to `recursive_planning.c` and it is used by both the
aforementioned functions in `merge_executor.c` to build an empty
relation to use as the source. It applies the appropriate type to each
column of the empty relation so the join with the target makes sense to
the query compiler.
DESCRIPTION: Fixes a crash in left outer joins that can happen when
there is an an aggregate on a column from the inner side of the join.
Fix the SEGV seen in #7787 and #7899; it occurs because a column in the
targetlist of a worker subquery can contain a non-empty varnullingrels
field if the column is from the inner side of a left outer join. The
issue can also occur with the columns in the HAVING clause, and this is
also tested in the fix. The issue was triggered by the introduction of
the varnullingrels to Vars in Postgres 16 (2489d76c)
There is a related issue, #7705, where a non-empty varnullingrels was
incorrectly copied into the query tree for the combine query. Here, a
non-empty varnullingrels field of a var is incorrectly copied into the
query tree for a worker subquery.
The regress file from #7705 is used (and renamed) to also test this
(#7787). An alternative test output file is required for Postgres 15
because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).
DESCRIPTION: Drops PG14 support
1. Remove "$version_num" != 'xx' from configure file
2. delete all PG_VERSION_NUM = PG_VERSION_XX references in the code
3. Look at pg_version_compat.h file, remove all _compat functions etc
defined specifically for PGXX differences
4. delete all PG_VERSION_NUM >= PG_VERSION_(XX+1), PG_VERSION_NUM <
PG_VERSION_(XX+1) ifs in the codebase
5. delete ruleutils_xx.c file
6. cleanup normalize.sed file from pg14 specific lines
7. delete all alternative output files for that particular PG version,
server_version_ge variable helps here
DESCRIPTION: Propagates MERGE ... WHEN NOT MATCHED BY SOURCE
It seems like there is not much needed to be done here.
`get_merge_query_def` from `ruleutils_17` is updated with "WHEN NOT
MATCHED BY SOURCE" therefore `deparse_shard_query` parses the merge
query for execution on the shard correctly.
Relevant PG commit:
https://github.com/postgres/postgres/commit/0294df2f1
DESCRIPTION: Propagates MEMORY and SERIALIZE options of EXPLAIN
The options for `MEMORY` can be true or false. Default is false.
The options for `SERIALIZE` can be none, text or binary. Default is
none.
I referred to how we added support for WAL option in this PR [Support
EXPLAIN(ANALYZE, WAL)](https://github.com/citusdata/citus/pull/4196).
For the tests however, I used the same tests as Postgres, not like the
tests in the WAL PR. I used exactly the same tests as Postgres does, I
simply distributed the table beforehand. See below the relevant Postgres
commits from where you can see the tests added as well:
- [Add EXPLAIN
(MEMORY)](https://github.com/postgres/postgres/commit/5de890e36)
- [Invent SERIALIZE option for
EXPLAIN.](https://github.com/postgres/postgres/commit/06286709e)
This PR required a lot of copying of Postgres static functions regarding
how `EXPLAIN` works for `MEMORY` and `SERIALIZE` options. Specifically,
these copy-pastes were required for updating `ExplainWorkerPlan()`
function, which is in fact based on postgres' `ExplainOnePlan()`:
```C
/* copied from explain.c to update ExplainWorkerPlan() in citus according to ExplainOnePlan() in postgres */
#define BYTES_TO_KILOBYTES(b)
typedef struct SerializeMetrics
static bool peek_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_memory_counters(ExplainState *es, const MemoryContextCounters *mem_counters);
static void ExplainIndentText(ExplainState *es);
static void ExplainPrintSerialize(ExplainState *es, SerializeMetrics *metrics);
static SerializeMetrics GetSerializationMetrics(DestReceiver *dest);
```
_Note_: it looks like we were missing some `buffers` option details as
well. I put them together with the memory option, like the code in
Postgres explain.c, as I didn't want to change the copied code. However,
I tested locally and there is no big deal in previous Citus versions,
and you can also see that existing Citus tests with `buffers true`
didn't change. Therefore, I prefer not to backport "buffers" changes to
previous versions.
DESCRIPTION: Adds JSON_TABLE() support
PG17 has added basic `JSON_TABLE()` functionality
`JSON_TABLE()` allows `JSON` data to be converted into a relational view
and thus used, for example, in a `FROM` clause, like other tabular data.
We treat `JSON_TABLE` the same as correlated functions (e.g., recurring
tuples). In the end, for multi-shard `JSON_TABLE` commands, we apply the
same restrictions as reference tables (e.g., cannot perform a lateral
outer join when a distributed subquery references a (reference
table)/(json table) etc.)
Relevant PG17 commits:
[basic JSON
table](https://github.com/postgres/postgres/commit/de3600452), [nested
paths in json
table](https://github.com/postgres/postgres/commit/bb766cde6)
Onder had previously added json table support for PG15BETA1, but we
reverted that commit because json table was reverted in PG15.
ce7f1a530f
Previous relevant PG15Beta1 commit:
https://github.com/postgres/postgres/commit/4e34747c8
Therefore, I referred to Onder's commit for this commit as well, with a
few changes due to some differences between PG15/PG17:
1) In PG15Beta1, we had also `PLAN` clauses for `JSON_TABLE`
https://github.com/postgres/postgres/commit/fadb48b00, and Onder's
commit includes tests for those as well. However, `PLAN` nodes are _not_
added in PG17. Therefore, I didn't include the `json_table_select_only`
test, which had mostly queries involving `PLAN`. I only included the
last query from that test.
2) In PG15 timeline (Citus 11.1), we didn't support outer joins where
the outer rel is a recurring one and the inner one is a non-recurring
one. However, [Onur added support for that one in Citus
11.2](https://github.com/citusdata/citus/pull/6512), therefore I updated
the tests from Onder's commit accordingly.
3) PG17 json table has nested paths and columns, therefore I added a
test
with a distributed table, which is exactly the same as the one in
sqljson_jsontable in PG17.
https://github.com/postgres/postgres/commit/bb766cde6
This pull request also adds some basic tests on validation of SQL/JSON
constructor functions JSON(), JSON_SCALAR(), and JSON_SERIALIZE(),
and also SQL/JSON query functions JSON_EXISTS(), JSON_QUERY(), and
JSON_VALUE(). The relevant PG commits are the following:
[JSON(), JSON_SCALAR(),
JSON_SERIALIZE()](https://github.com/postgres/postgres/commit/03734a7fe)
[JSON_EXISTS(), JSON_VALUE(),
JSON_QUERY()](https://github.com/postgres/postgres/commit/6185c9737)
- Adapted `pgmerge.sql` tests from PostgreSQL community's `merge.sql` to
Citus by converting tables into Citus local tables.
- Identified two new PostgreSQL 17 MERGE features (`RETURNING` support
and MERGE on updatable views) not yet supported by Citus.
- Implemented changes to detect unsupported features and raise clean
exceptions, ensuring pgmerge tests pass without diffs.
- Addressed breaking changes caused by `MERGE ... WHEN NOT MATCHED BY
SOURCE` restructuring, reducing diffs in pgmerge tests.
- Segregated unsupported test cases into `merge_unsupported.sql` to
maintain clarity and avoid large diffs in test files.
- Prepared the Citus MERGE planner to handle new PostgreSQL changes,
reducing remaining test discrepancies.
All merge tests now pass cleanly, with unsupported cases clearly
isolated.
Relevant PG commits:
c649fa24a
https://github.com/postgres/postgres/commit/c649fa24a
0294df2f1
https://github.com/postgres/postgres/commit/0294df2f1
---------
Co-authored-by: naisila <nicypp@gmail.com>
This PR provides successful compilation against PG17.0.
- Remove ExecFreeExprContext call
Relevant PG commit
d060e921ea5aa47b6265174c32e1128cebdbc3df
d060e921ea
- PG17 uses streaming IO in analyze, fix scan_analyze_next_block function
Relevant PG commit
041b96802efa33d2bc9456f2ad946976b92b5ae1
041b96802e
- Define ObjectClass for PG17+ only since it's removed
Relevant PG commit:
89e5ef7e21812916c9cf9fcf56e45f0f74034656
89e5ef7e21
- Remove ReorderBufferTupleBuf structure.
Relevant PG commit:
08e6344fd6423210b339e92c069bb979ba4e7cd6
08e6344fd6
- Define colliculocale and daticulocale since they have been renamed
Relevant PG commit:
f696c0cd5f299f1b51e214efc55a22a782cc175d
f696c0cd5f
- makeStringConst defined in PG17
Relevant PG commit:
de3600452b61d1bc3967e9e37e86db8956c8f577
de3600452b
- RangeVarCallbackOwnsTable was replaced by RangeVarCallbackMaintainsTable
Relevant PG commit:
ecb0fd33720fab91df1207e85704f382f55e1eb7
ecb0fd3372
- attstattarget is nullable, define pg compatible functions for it
Relevant PG commit:
4f622503d6de975ac87448aea5cea7de4bc140d5
4f622503d6
- stxstattarget is nullable in PG17, write compat functions for it
Relevant PG commit:
012460ee93c304fbc7220e5b55d9d0577fc766ab
012460ee93
- Use ResourceOwner to track WaitEventSet in PG17
Relevant PG commit:
50c67c2019ab9ade8aa8768bfe604cd802fe8591
50c67c2019
- getIdentitySequence now uses Relation instead of relation_id
Relevant PG commit:
509199587df73f06eda898ae13284292f4ae573a
509199587d
- Remove no-op tuplestore_donestoring function
Relevant PG commit:
75680c3d805e2323cd437ac567f0677fdfc7b680
75680c3d80
- MergeAction can have 3 merge kinds (now enum) in PG17, write compat
Relevant PG commit:
0294df2f1f842dfb0eed79007b21016f486a3c6c
0294df2f1f
- EXPLAIN (MEMORY) is added, make changes to ExplainOnePlan
Relevant PG commit:
5de890e3610d5a12cdaea36413d967cf5c544e20
5de890e361
- LIMIT_OPTION_DEFAULT has been removed as it's useless, use LIMIT_OPTION_COUNT
Relevant PG commit:
a6be0600ac3b71dda8277ab0fcbe59ee101ac1ce
a6be0600ac
- write compat for create_foreignscan_path bcs of more arguments in PG17
Relevant PG commit:
9e9931d2bf40e2fea447d779c2e133c2c1256ef3
9e9931d2bf
- pgprocno and lxid have been combined into a struct in PGPROC
Relevant PG commits:
28f3915b73f75bd1b50ba070f56b34241fe53fd1
28f3915b73
ab355e3a88de745607f6dd4c21f0119b5c68f2ad
ab355e3a88
024c521117579a6d356050ad3d78fdc95e44eefa
024c521117
- Simplify CitusNewNode (#7434)
postgres refactored newNode() in PG 17, the main point for doing this is
the original tricks is no longer neccessary for modern compilers[1].
This does the same for Citus.
This should have no backward compatibility issues since it just replaces
palloc0fast with palloc0.
This is good for forward compatibility since palloc0fast no longer
exists in PG 17.
[1]
https://www.postgresql.org/message-id/b51f1fa7-7e6a-4ecc-936d-90a8a1659e7c@iki.fi
(cherry picked from commit 4b295cc)
This is prep work for successful compilation with PG17
PG17added foreach_ptr, foreach_int and foreach_oid macros
Relevant PG commit
14dd0f27d7cd56ffae9ecdbe324965073d01a9ff
14dd0f27d7
We already have these macros, but they are different with the
PG17 ones because our macros take a DECLARED variable, whereas
the PG16 macros declare a locally-scoped loop variable themselves.
Hence I am renaming our macros to foreach_declared_
I am separating this into its own PR since it touches many files. The
main compilation PR is https://github.com/citusdata/citus/pull/7699
We thought we provided support for this in
b8c493f2c4
However the use of parameters in SQL is not supported in Citus. Since
generic plan queries use parameters, we can't support for now.
Relevant PG16 commit https://github.com/postgres/postgres/commit/3c05284Fixes#7813 with proper error message
In function MasterAggregateMutator(), when the original Node is a Var node use makeVar() instead
of copyObject() when constructing the Var node for the target list of the combine query.
The varnullingrels field of the original Var node is ignored because it is not relevant for the
combine query; copying this cause the problem in issue 7705, where a coordinator query had
a Var with a reference to a non-existent join relation.
DESCRIPTION: Add a check to see if the given limit is null.
Fixes a bug by checking if the limit given in the query is null when the
actual limit is computed with respect to the given offset.
Prior to this change, null is interpreted as 0 during the limit
calculation when both limit and offset are given.
Fixes#7663
Related to issue #7619, #7620
Merge command fails when source query is single sharded and source and
target are co-located and insert is not using distribution key of
source.
Example
```
CREATE TABLE source (id integer);
CREATE TABLE target (id integer );
-- let's distribute both table on id field
SELECT create_distributed_table('source', 'id');
SELECT create_distributed_table('target', 'id');
MERGE INTO target t
USING ( SELECT 1 AS somekey
FROM source
WHERE source.id = 1) s
ON t.id = s.somekey
WHEN NOT MATCHED
THEN INSERT (id)
VALUES (s.somekey)
ERROR: MERGE INSERT must use the source table distribution column value
HINT: MERGE INSERT must use the source table distribution column value
```
Author's Opinion: If join is not between source and target distributed
column, we should not force user to use source distributed column while
inserting value of target distributed column.
Fix: If user is not using distributed key of source for insertion let's
not push down query to workers and don't force user to use source
distributed column if it is not part of join.
This reverts commit fa4fc0b372.
Co-authored-by: paragjain <paragjain@microsoft.com>
Because we want to track PR numbers and to make backporting easy we
(pretty much always) use squash-merges when merging to master. We
accidentally used a rebase merge for PR #7620. This reverts those
changes so we can redo the merge using squash merge.
This reverts all commits from eedb607c to 9e71750fc.
When using a CASE WHEN expression in the body
of the function that is used in the DO block, a segmentation
fault occured. This fixes that.
Fixes#7381
---------
Co-authored-by: Konstantin Morozov <vzbdryn@yahoo.com>
DESCRIPTION: Remove a few small memory leaks
In #7440 one instance of a strdup was removed. But there were a few
more. This removes the ones that are left over, or adds a comment why
strdup is on purpose.
This change adds a script to programatically group all includes in a
specific order. The script was used as a one time invocation to group
and sort all includes throught our formatted code. The grouping is as
follows:
- System includes (eg. `#include<...>`)
- Postgres.h (eg. `#include "postgres.h"`)
- Toplevel imports from postgres, not contained in a directory (eg.
`#include "miscadmin.h"`)
- General postgres includes (eg . `#include "nodes/..."`)
- Toplevel citus includes, not contained in a directory (eg. `#include
"citus_verion.h"`)
- Columnar includes (eg. `#include "columnar/..."`)
- Distributed includes (eg. `#include "distributed/..."`)
Because it is quite hard to understand the difference between toplevel
citus includes and toplevel postgres includes it hardcodes the list of
toplevel citus includes. In the same manner it assumes anything not
prefixed with `columnar/` or `distributed/` as a postgres include.
The sorting/grouping is enforced by CI. Since we do so with our own
script there are not changes required in our uncrustify configuration.
When executing a prepared CALL, which is not pure SQL but available with
some drivers like npgsql and jpgdbc, Citus entered a code path where a
plan is not defined, while trying to increase its cost. Thus SIG11 when
plan is a NULL pointer.
Fix by only increasing plan cost when plan is not null.
However, it is a bit suspicious to get here with a NULL plan and maybe a
better change will be to not call
ShardPlacementForFunctionColocatedWithDistTable() with a NULL plan at
all (in call.c:134)
bug hit with for example:
```
CallableStatement proc = con.prepareCall("{CALL p(?)}");
proc.registerOutParameter(1, java.sql.Types.BIGINT);
proc.setInt(1, -100);
proc.execute();
```
where `p(bigint)` is a distributed "function" and the param the
distribution key (also in a distributed table), see #7242 for details
Fixes#7242
PG16 compatibility - part 11
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
part 8 2c50b5f7ff
part 9 b2291374b4
part 10 a2315fdc67
part 11 9fa72545e2
This commit is in the series of PG16 compatibility commits.
We already took care of the majority of necessary outer join checks
in part 4 7c6b4ce103
However, In RelationInfoContainsOnlyRecurringTuples,
we need to add one more check of whether we are dealing
with an outer join RTE using IsRelOptOuterJoin function.
This prevents an outer join crash in sqlancer_failures.sql test.
We expect one more commit of PG compatibility with Citus's current
features are regression tests sanity.
PG16 compatibility - Part 6
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
This commit is in the series of PG16 compatibility commits.
It handles the Permission Info changes in PG16. See below:
The main issue lies in the following entries of PlannedStmt: {
rtable
permInfos
}
Each rtable has an int perminfoindex, and its actual permission info is
obtained through the following:
permInfos[perminfoindex]
We had crashes because perminfoindexes were not updated in the finalized
planned statement after distributed planner hook.
So, basically, everywhere we set a query's or planned statement's rtable
entry, we need to set the rteperminfos/permInfos accordingly.
Relevant PG commits:
a61b1f7482
a61b1f74823c9c4f79c95226a461f1e7a367764b
b803b7d132
b803b7d132e3505ab77c29acf91f3d1caa298f95
More PG16 compatibility commits are coming soon ...
PG16 compatibility - Part 5
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
This commit is in the series of PG16 compatibility commits. Find the explanation below:
If we allow to adjust partitioning, we get a crash when accessing
amcostestimate of partitioned indexes, because amcostestimate is NULL
for them. The following PG commit is the culprit:
3c569049b7
3c569049b7b502bb4952483d19ce622ff0af5fd6
Previously, partitioned indexes would just be ignored.
Now, they are added in the list. However get_relation_info expects the
tables which have partitioned indexes to have the inh flag set properly.
AdjustPartitioningForDistributedPlanning plays with that flag, hence we
don't get the desired behaviour.
The hook is simply removing all partitioned indexes from the list.
More PG16 compatibility commits are coming soon ...
PG16 compatibility - Part 4
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
This commit is in the series of PG16 compatibility commits.
It adds some outer join checks to the planner,
the new password_required option to the subscription,
and a crash fix related to PGIOAlignedBlock, see below for more details:
- Fix PGIOAlignedBlock Assert crash in PG16
Relevant PG commit:
faeedbcefd
faeedbcefd40bfdf314e048c425b6d9208896d90
- Pass planner info as argument to make_simple_restrictinfo
Pre PG16 passing plannerInfo to make_simple_restrictinfo
was only needed for placeholder Vars, which is not the case
in this part of the codebase because we are building the
expression from shard intervals which don't have placeholder
vars.
However, PG16 is counting baserels appearing in clause_relids
and is deleting the rels mentioned in plannerinfo->outer_join_rels
Hence directly accessing plannerinfo.
We will crash if we leave it as NULL.
For reference
2489d76c49 (diff-e045c41eda9686451a7993e91518e40056b3739365e39eb1b70ae438dc1f7c76R207)
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d
- Add outer join checks, root->simple_rel_array
- fix rebalancer to include passwork_required option
Relevant PG commit:
c3afe8cf5a
c3afe8cf5a1e465bd71e48e4bc717f5bfdc7a7d6
More PG16 compatibility commits are coming soon ...
Similar to https://github.com/citusdata/citus/pull/7077.
As PG 16+ has changed the join restriction information for certain outer
joins, MERGE is also impacted given that is is also underlying an outer
join.
See #7077 for the details.
Tradionally our planner works in the following order:
router - > pushdown -> repartition -> pull to coordinator
However, for INSERT .. SELECT commands, we did not support "router".
In practice, that is not a big issue, because pushdown planning can
handle router case as well.
However, with PG 16, certain outer joins are converted to JOIN without
any conditions (e.g., JOIN .. ON (true)) and the filters are pushed down
to the tables.
When the filters are pushed down to the tables, router planner can
detect. However, pushdown planner relies on JOIN conditions.
An example query:
```
INSERT INTO agg_events (user_id)
SELECT raw_events_first.user_id
FROM raw_events_first LEFT JOIN raw_events_second
ON raw_events_first.user_id = raw_events_second.user_id
WHERE raw_events_first.user_id = 10;
```
As a side effect of this change, now we can also relax certain
limitation that "pushdown" planner emposes, but not "router". So, with
this PR, we also allow those.
Closes https://github.com/citusdata/citus/pull/6772
DESCRIPTION: Prevents unnecessarily pulling the data into coordinator
for some INSERT .. SELECT queries that target a single-shard group