In preparation of sorting and grouping all includes we wanted to move
this file to the toplevel includes for good grouping/sorting.
(cherry picked from commit 0dac63afc0)
PG16 compatibility - part 11
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
part 8 2c50b5f7ff
part 9 b2291374b4
part 10 a2315fdc67
part 11 9fa72545e2
This commit is in the series of PG16 compatibility commits.
We already took care of the majority of necessary outer join checks
in part 4 7c6b4ce103
However, In RelationInfoContainsOnlyRecurringTuples,
we need to add one more check of whether we are dealing
with an outer join RTE using IsRelOptOuterJoin function.
This prevents an outer join crash in sqlancer_failures.sql test.
We expect one more commit of PG compatibility with Citus's current
features are regression tests sanity.
PG16 compatibility - Part 4
Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
This commit is in the series of PG16 compatibility commits.
It adds some outer join checks to the planner,
the new password_required option to the subscription,
and a crash fix related to PGIOAlignedBlock, see below for more details:
- Fix PGIOAlignedBlock Assert crash in PG16
Relevant PG commit:
faeedbcefd
faeedbcefd40bfdf314e048c425b6d9208896d90
- Pass planner info as argument to make_simple_restrictinfo
Pre PG16 passing plannerInfo to make_simple_restrictinfo
was only needed for placeholder Vars, which is not the case
in this part of the codebase because we are building the
expression from shard intervals which don't have placeholder
vars.
However, PG16 is counting baserels appearing in clause_relids
and is deleting the rels mentioned in plannerinfo->outer_join_rels
Hence directly accessing plannerinfo.
We will crash if we leave it as NULL.
For reference
2489d76c49 (diff-e045c41eda9686451a7993e91518e40056b3739365e39eb1b70ae438dc1f7c76R207)
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d
- Add outer join checks, root->simple_rel_array
- fix rebalancer to include passwork_required option
Relevant PG commit:
c3afe8cf5a
c3afe8cf5a1e465bd71e48e4bc717f5bfdc7a7d6
More PG16 compatibility commits are coming soon ...
This commit is the second and last phase of dropping PG13 support.
It consists of the following:
- Removes all PG_VERSION_13 & PG_VERSION_14 from codepaths
- Removes pg_version_compat entries and columnar_version_compat entries
specific for PG13
- Removes alternative pg13 test outputs
- Removes PG13 normalize lines and fix the test outputs based on that
It is a continuation of 5bf163a27d
`PlaceHolderVar` is not relevant to be processed inside a restriction
clause. Otherwise, `pull_var_clause_default` would throw error. PG would
create the restriction to physical `Var` that `PlaceHolderVar` points to
anyway, so it is safe to skip this restriction.
DESCRIPTION: Fixes a bug related to WHERE clause list which contains
placeholder.
Fixes https://github.com/citusdata/citus/issues/6758
Today we allow planning the queries that reference non-colocated tables
if the shards that query targets are placed on the same node. However,
this may not be the case, e.g., after rebalancing shards because it's
not guaranteed to have those shards on the same node anymore.
This commit adds citus.enable_non_colocated_router_query_pushdown GUC
that can be used to disallow planning such queries via router planner,
when it's set to false. Note that the default value for this GUC will be
"true" for 11.3, but we will alter it to "false" on 12.0 to not
introduce
a breaking change in a minor release.
Closes#692.
Even more, allowing such queries to go through router planner also
causes
generating an incorrect plan for the DML queries that reference
distributed
tables that are sharded based on different replication factor settings.
For
this reason, #6779 can be closed after altering the default value for
this
GUC to "false", hence not now.
DESCRIPTION: Adds `citus.enable_non_colocated_router_query_pushdown` GUC
to ensure generating a consistent distributed plan for the queries that
reference non-colocated distributed tables (when set to "false", the
default is "true").
Because they're only interested in distributed tables. Even more,
this replaces HasDistributionKey() check with
IsCitusTableType(DISTRIBUTED_TABLE) because this doesn't make a
difference on main and sounds slightly more intuitive. Plus, this
would also allow safely using this function in
https://github.com/citusdata/citus/pull/6773.
Fixes#6672
2) Move all MERGE related routines to a new file merge_planner.c
3) Make ConjunctionContainsColumnFilter() static again, and rearrange the code in MergeQuerySupported()
4) Restore the original format in the comments section.
5) Add big serial test. Implement latest set of comments
This implements the phase - II of MERGE sql support
Support routable query where all the tables in the merge-sql are distributed, co-located, and both the source and
target relations are joined on the distribution column with a constant qual. This should be a Citus single-task
query. Below is an example.
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);
MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED THEN
INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)
Basically, MERGE checks to see if
There are a minimum of two distributed tables (source and a target).
All the distributed tables are indeed colocated.
MERGE relations are joined on the distribution column
MERGE .. USING .. ON target.dist_key = source.dist_key
The query should touch only a single shard i.e. JOIN AND with a constant qual
MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>
If any of the conditions are not met, it raises an exception.
(cherry picked from commit 44c387b978)
This implements MERGE phase3
Support pushdown query where all the tables in the merge-sql are Citus-distributed, co-located, and both
the source and target relations are joined on the distribution column. This will generate multiple tasks
which execute independently after pushdown.
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);
MERGE INTO t1
USING s1
ON t1.id = s1.id
WHEN MATCHED THEN
UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED THEN
INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)
*The only exception for both the phases II and III is, UPDATEs and INSERTs must be done on the same shard-group
as the joined key; for example, below scenarios are NOT supported as the key-value to be inserted/updated is not
guaranteed to be on the same node as the id distribution-column.
MERGE INTO target t
USING source s ON (t.customer_id = s.customer_id)
WHEN NOT MATCHED THEN - -
INSERT(customer_id, …) VALUES (<non-local-constant-key-value>, ……);
OR this scenario where we update the distribution column itself
MERGE INTO target t
USING source s On (t.customer_id = s.customer_id)
WHEN MATCHED THEN
UPDATE SET customer_id = 100;
(cherry picked from commit fa7b8949a8)
Now that we will soon add another table type having DISTRIBUTE_BY_NONE
as distribution method and that we want the code to interpret such
tables mostly as distributed tables, let's make the definition of those
other two table types more strict by removing
CITUS_TABLE_WITH_NO_DIST_KEY
macro.
And instead, use HasDistributionKey() check in the places where the
logic applies to all table types that have / don't have a distribution
key. In future PRs, we might want to convert some of those
HasDistributionKey() checks if logic only applies to Citus local /
reference tables, not the others.
And adding HasDistributionKey() also allows us to consider having
DISTRIBUTE_BY_NONE as the distribution method as a "table attribute"
that can apply to distributed tables too, rather something that
determines the table type.
Recursive planner should handle all the tree from bottom to top at
single pass. i.e. It should have already recursively planned all
required parts in its first pass. Otherwise, this means we have bug at
recursive planner, which needs to be handled. We add a check here and
return error.
DESCRIPTION: Fixes wrong results by throwing error in case recursive
planner multipass the query.
We found 3 different cases which causes recursive planner passes the
query multiple times.
1. Sublink in WHERE clause is planned at second pass after we
recursively planned a distributed table at the first pass. Fixed by PR
#6657.
2. Local-distributed joins are recursively planned at both the first and
the second pass. Issue #6659.
3. Some parts of the query is considered to be noncolocated at the
second pass as we do not generate attribute equivalances between
nondistributed and distributed tables. Issue #6653
This implements the phase - II of MERGE sql support
Support routable query where all the tables in the merge-sql are distributed, co-located, and both the source and
target relations are joined on the distribution column with a constant qual. This should be a Citus single-task
query. Below is an example.
SELECT create_distributed_table('t1', 'id');
SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’);
MERGE INTO t1
USING s1 ON t1.id = s1.id AND t1.id = 100
WHEN MATCHED THEN
UPDATE SET val = s1.val + 10
WHEN MATCHED THEN
DELETE
WHEN NOT MATCHED THEN
INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src)
Basically, MERGE checks to see if
There are a minimum of two distributed tables (source and a target).
All the distributed tables are indeed colocated.
MERGE relations are joined on the distribution column
MERGE .. USING .. ON target.dist_key = source.dist_key
The query should touch only a single shard i.e. JOIN AND with a constant qual
MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <>
If any of the conditions are not met, it raises an exception.
Comment from the code is clear on this:
/*
* The statistics objects of the distributed table are not relevant
* for the distributed planning, so we can override it.
*
* Normally, we should not need this. However, the combination of
* Postgres commit 269b532aef55a579ae02a3e8e8df14101570dfd9 and
* Citus function AdjustPartitioningForDistributedPlanning()
* forces us to do this. The commit expects statistics objects
* of partitions to have "inh" flag set properly. Whereas, the
* function overrides "inh" flag. To avoid Postgres to throw error,
* we override statlist such that Postgres does not try to process
* any statistics objects during the standard_planner() on the
* coordinator. In the end, we do not need the standard_planner()
* on the coordinator to generate an optimized plan. We call
* into standard_planner() for other purposes, such as generating the
* relationRestrictionContext here.
*
* AdjustPartitioningForDistributedPlanning() is a hack that we use
* to prevent Postgres' standard_planner() to expand all the partitions
* for the distributed planning when a distributed partitioned table
* is queried. It is required for both correctness and performance
* reasons. Although we can eliminate the use of the function for
* the correctness (e.g., make sure that rest of the planner can handle
* partitions), it's performance implication is hard to avoid. Certain
* planning logic of Citus (such as router or query pushdown) relies
* heavily on the relationRestrictionList. If
* AdjustPartitioningForDistributedPlanning() is removed, all the
* partitions show up in the, causing high planning times for
* such queries.
*/
* Remove if conditions with PG_VERSION_NUM < 13
* Remove server_above_twelve(&eleven) checks from tests
* Fix tests
* Remove pg12 and pg11 alternative test output files
* Remove pg12 specific normalization rules
* Some more if conditions in the code
* Change RemoteCollationIdExpression and some pg12/pg13 comments
* Remove some more normalization rules
* Fix UNION not being pushdown
Postgres optimizes column fields that are not needed in the output. We
were relying on these fields to understand if it is safe to push down a
union query.
This fix looks at the parse query, which has the original column fields
to detect if it is safe to push down a union query.
* Add more tests
* Simplify code and make it more robust
* Process varlevelsup > 0 in FindReferencedTableColumn
* Only look for outers vars in union path
* Add more comments
* Remove UNION ALL specific logic for pulling up childvars
With this commit, we make sure to prevent infinite recursion for queries
in the format: [subquery with a UNION ALL] JOIN [table or subquery]
Also, fixes a bug where we pushdown UNION ALL below a JOIN even if the
UNION ALL is not safe to pushdown.
* Use translated vars in postgres 13 as well
Postgres 13 removed translated vars with pg 13 so we had a special logic
for pg 13. However it had some bug, so now we copy the translated vars
before postgres deletes it. This also simplifies the logic.
* fix rtoffset with pg >= 13
It seems that we need to consider only pseudo constants while doing some
shortcuts in planning. For example there could be a false clause but it
can contribute to the result in which case it will not be a pseudo
constant.
Attribute number in a subquery RTE and relation RTE means different
things. In a relation attribute number will point to the column number
in the table definition including the dropped columns as well however in
subquery, it means the index in the target list. When we convert a
relation RTE to subquery RTE we should either correct all the relevant
attribute numbers or we can just add a dummy column for the dropped
columns. We choose the latter in this commit because it is practically
too vulnerable to update all the vars in a query.
Another thing this commit fixes is that in case a join restriction
clause list contains a false clause, we should just returns a false
clause instead of the whole list, because the whole list will contain
restrictions from other RTEs as well and this breaks the query, which
can be seen from the output changes, now it is much simpler.
Also instead of adding single tests for dropped columns, we choose to
run the whole mixed queries with tables with dropped columns, this
revealed some bugs already, which are fixed in this commit.
Baseinfo also has pushed down filters etc, so it makes more sense to use
BaseRestrictInfo to determine what columns have constant equality
filters.
Also RteIdentity is used for removing conversion candidates instead of
rteIndex.
AllDataLocallyAccessible and ContainsLocalTableSubqueryJoin are removed.
We can possibly remove ModifiesLocalTableWithRemoteCitusLocalTable as
well. Though this removal has a side effect that now when all the data
is locally available, we could still wrap a relation into a subquery, I
guess that should be resolved in the router planner itself.
Add more tests
We should not recursively plan an already routable plannable query. An
example of this is (SELECT * FROM local JOIN (SELECT * FROM dist) d1
USING(a));
So we let the recursive planner do all of its work and at the end we
convert the final query to to handle unsupported joins. While doing each
conversion, we check if it is router plannable, if so we stop.
Only consider range table entries that are in jointree
If a range table is not in jointree then there is no point in
considering that because we are trying to convert range table entries to
subqueries for join use case.
The logical planner cannot handle joins between local and distributed table.
Instead, we can recursively plan one side of the join and let the logical
planner handle the rest.
Our algorithm is a little smart, trying not to recursively plan distributed
tables, but favors local tables.
When a relation is used on an OUTER JOIN with FALSE filters,
set_rel_pathlist_hook may not be called for the table.
There might be other cases as well, so do not rely on the hook
for classification of the tables.
RemoveDuplicateJoinRestrictions() function was introduced with the aim of decrasing the overall planning times by eliminating the duplicate JOIN restriction entries (#1989). However, it turns out that the function itself is so CPU intensive with a very high algorithmic complexity, it hurts a lot more than it helps. The function is a clear example of premature optimization.
The table below shows the difference clearly:
"distributed query planning
time master" RemoveDuplicateJoinRestrictions() execution time on master "Remove the function RemoveDuplicateJoinRestrictions()
this PR"
5 table INNER JOIN 9 msec 2msec 7 msec
10 table INNER JOIN 227 msec 194 msec 29 msec
20 table INNER JOIN 1 sec 235 msec 1 sec 139 msec 90 msecs
50 table INNER JOIN 24 seconds 21 seconds 1.5 seconds
100 table INNER JOIN 2 minutes 16 secods 1 minute 53 seconds 23 seconds
250 table INNER JOIN Bottleneck on JoinClauseList 18 minutes 52 seconds Bottleneck on JoinClauseList
5 table INNER JOIN in subquery 9 msec 0 msec 6 msec
10 table INNER JOIN subquery 33 msec 10 msec 32 msec
20 table INNER JOIN subquery 132 msec 67 msec 123 msec
50 table INNER JOIN subquery 1.2 seconds 900 msec 500 msec
100 table INNER JOIN subquery 6 seconds 5 seconds 2 seconds
250 table INNER JOIN subquery 54 seconds 37 seconds 20 seconds
5 table LEFT JOIN 5 msec 0 msec 5 msec
10 table LEFT JOIN 11 msec 0 msec 13 msec
20 table LEFT JOIN 26 msec 2 msec 30 msec
50 table LEFT JOIN 150 msec 15 msec 193 msec
100 table LEFT JOIN 757 msec 71 msec 722 msec
250 table LEFT JOIN 8 seconds 600 msec 8 seconds
5 JOINs among 2 table JOINs 37 msec 11 msec 25 msec
10 JOINs among 2 table JOINs 536 msec 306 msec 352 msec
20 JOINs among 2 table JOINs 794 msec 181 msec 640 msec
50 JOINs among 2 table JOINs 25 seconds 2 seconds 22 seconds
100 JOINs among 2 table JOINs Bottleneck on JoinClauseList 9 seconds Bottleneck on JoinClauseList
150 JOINs among 2 table JOINs Bottleneck on JoinClauseList 46 seconds Bottleneck on JoinClauseList
On top of the performance penalty, the function had a critical bug #4255, and with #4254 we hit one more important bug. It should be fixed by adding the followig check to the ContextCoversJoinRestriction():
```
static bool
JoinRelIdsSame(JoinRestriction *leftRestriction, JoinRestriction *rightRestriction)
{
Relids leftInnerRelIds = leftRestriction->innerrel->relids;
Relids rightInnerRelIds = rightRestriction->innerrel->relids;
if (!bms_equal(leftInnerRelIds, rightInnerRelIds))
{
return false;
}
Relids leftOuterRelIds = leftRestriction->outerrel->relids;
Relids rightOuterRelIds = rightRestriction->outerrel->relids;
if (!bms_equal(leftOuterRelIds, rightOuterRelIds))
{
return false;
}
return true;
}
```
However, adding this eliminates all the benefits tha RemoveDuplicateJoinRestrictions() brings.
I've used the commands here to generate the JOINs mentioned in the PR: https://gist.github.com/onderkalaci/fe8654f9df5916c7af4c7c5eb892561e#file-gistfile1-txt
Inner and outer JOINs behave roughly the same, to simplify the table only added INNER joins.
Disallow `ON TRUE` outer joins with reference & distributed tables
when reference table is outer relation by fixing the logic bug made
when calling `LeftListIsSubset` function.
Also, be more defensive when removing duplicate join restrictions
when join clause is empty for non-inner joins as they might still
contain useful information for non-inner joins.
It seems like Postgres could call set_rel_pathlist() for
the same relation multiple times. This breaks the logic
where we assume relationCount eqauls to the number of
entries in relationRestrictionList.
In summary, relationRestrictionList may contain duplicate
entries.
Introduce table entry utility functions
Citus table cache entry utilities are introduced so that we can easily
extend existing functionality with minimum changes, specifically changes
to these functions. For example IsNonDistributedTableCacheEntry can be
extended for citus local tables without the need to scan the whole
codebase and update each relevant part.
* Introduce utility functions to find the type of tables
A table type can be a reference table, a hash/range/append distributed
table. Utility methods are created so that we don't have to worry about
how a table is considered as a reference table etc. This also makes it
easy to extend the table types.
* Add IsCitusTableType utilities
* Rename IsCacheEntryCitusTableType -> IsCitusTableTypeCacheEntry
* Change citus table types in some checks
This commit mostly adds pg_get_triggerdef_command to our ruleutils_13.
This doesn't add anything extra for ruleutils 13 so it is basically a copy
of the change on ruleutils_12
Rte index is increased by range table index offset in pg >= 13. The
offset is removed with the pg >= 13.
Currently pushdown for union all is disabled because translatedVars is
set to nil on postgres side, and we were using translatedVars to
figure out if partition key has the same index in both sides of union
all. This should be fixed.
Commit on postgres side:
6ef77cf46e81f45716ec981cb08781d426181378
fix union all pushdown logic for pg13
Before pg 13, there was a field, translatedVars, and we were using that
to understand if the partition key has the same index on both sides of
the union all. With pg13 there is a parent_colnos field in appendRelInfo
and we can use that to get the attribute numbers(varattnos) in union all
vars. We make use of parent_colnos instead of translatedVars in pg >=13.