We used to rely on PG function flatten_join_alias_vars
to resolve actual columns referenced in target entry list.
The function goes deep and finds the actual relation. This logic
usually works fine. However, when joins are given an alias, inner
relation names are not visible to target entry entry. Thus relation
resolving should stop when we the target entry column refers an
rte of an aliased join.
We stopped using PG function and provided our own flatten function.
The rule for infinite recursion is the following:
- If the query contains a subquery which is recursively planned, and
no other subqueries can be recursively planned due to correlation
(e.g., LATERAL joins), the planner keeps recursing again and again.
One interesting thing here is that even if a subquery contains only intermediate
result(s), we re-recursively plan that. In the end, the logic in the code does the following:
- Try recursive planning any of the subqueries in the query tree
- If any subquery is recursively planned, call the planner again
where the subquery is replaced with the intermediate result.
- Try recursively planning any of the queries
- If any subquery is recursively planned, call the planner again
where the subquery (in this case it is already intermediate result)
is replaced with the intermediate result.
- Try recursively planning any of the queries
- If any subquery is recursively planned, call the planner again
where the subquery (in this case it is already intermediate result)
is replaced with the intermediate result.
- Try recursively planning any of the queries
- If any subquery is recursively planned, call the planner again
where the subquery (in this case it is already intermediate result)
is replaced with the intermediate result.
......
Following scenario resulted in distributed deadlock before this commit:
CREATE TABLE partitioning_test(id int, time date) PARTITION BY RANGE (time);
CREATE TABLE partitioning_test_2009 (LIKE partitioning_test);
CREATE TABLE partitioning_test_reference(id int PRIMARY KEY, subid int);
SELECT create_distributed_table('partitioning_test_2009', 'id'),
create_distributed_table('partitioning_test', 'id'),
create_reference_table('partitioning_test_reference');
ALTER TABLE partitioning_test ADD CONSTRAINT partitioning_reference_fkey FOREIGN KEY (id) REFERENCES partitioning_test_reference(id) ON DELETE CASCADE;
ALTER TABLE partitioning_test_2009 ADD CONSTRAINT partitioning_reference_fkey_2009 FOREIGN KEY (id) REFERENCES partitioning_test_reference(id) ON DELETE CASCADE;
ALTER TABLE partitioning_test ATTACH PARTITION partitioning_test_2009 FOR VALUES FROM ('2009-01-01') TO ('2010-01-01');
Since flattening query may flatten outer joins' columns into coalesce expr that is
in the USING part, and that was not expected before this commit, these queries were
erroring out. It is fixed by this commit with considering coalesce expression as well.
Before this commit, round-robin task assignment policy was relying
on the taskId. Thus, even inside a transaction, the tasks were
assigned to different nodes. This was especially problematic
while reading from reference tables within transaction blocks.
Because, we had to expand the distributed transaction to many
nodes that are not necessarily already in the distributed transaction.
In this context, we define "Fast Path Planning for SELECT" as trivial
queries where Citus can skip relying on the standard_planner() and
handle all the planning.
For router planner, standard_planner() is mostly important to generate
the necessary restriction information. Later, the restriction information
generated by the standard_planner is used to decide whether all the shards
that a distributed query touches reside on a single worker node. However,
standard_planner() does a lot of extra things such as cost estimation and
execution path generations which are completely unnecessary in the context
of distributed planning.
There are certain types of queries where Citus could skip relying on
standard_planner() to generate the restriction information. For queries
in the following format, Citus does not need any information that the
standard_planner() generates:
SELECT ... FROM single_table WHERE distribution_key = X; or
DELETE FROM single_table WHERE distribution_key = X; or
UPDATE single_table SET value_1 = value_2 + 1 WHERE distribution_key = X;
Note that the queries might not be as simple as the above such that
GROUP BY, WINDOW FUNCIONS, ORDER BY or HAVING etc. are all acceptable. The
only rule is that the query is on a single distributed (or reference) table
and there is a "distribution_key = X;" in the WHERE clause. With that, we
could use to decide the shard that a distributed query touches reside on
a worker node.
Failure&Cancellation tests for initial start_metadata_sync() calls
to worker and DDL queries that send metadata syncing messages to an MX node
Also adds message type definitions for messages that are exchanged
during metadata syncing
-
We used to error out if there is a reference table
in the query participating a union. This has caused
pushdownable queries to be evaluated in coordinator.
Now we let reference tables inside union queries as long
as there is a distributed table in from clause.
Existing join checks (reference table on the outer part)
sufficient enought that we do not need check the join relation
of reference tables.
Previously we allowed task assignment policy to have affect on router queries
with only intermediate results. However, that is erroneous since the code-path
that assigns placements relies on shardIds and placements, which doesn't exists
for intermediate results.
With this commit, we do not apply task assignment policies when a router query
hits only intermediate results.
We disable bunch of planning options on the workers. This might be
risky if any concurrent test relies on EXPLAIN OUTPUT as well. Still,
we want to keep this test, so we should try to not parallelize this
test with such test.
Before this commit, Citus supported INSERT...SELECT queries with
ON CONFLICT or RETURNING clauses only for pushdownable ones, since
queries supported via coordinator were utilizing COPY infrastructure
of PG to send selected tuples to the target worker nodes.
After this PR, INSERT...SELECT queries with ON CONFLICT or RETURNING
clauses will be performed in two phases via coordinator. In the first
phase selected tuples will be saved to the intermediate table which
is colocated with target table of the INSERT...SELECT query. Note that,
a utility function to save results to the colocated intermediate result
also implemented as a part of this commit. In the second phase, INSERT..
SELECT query is directly run on the worker node using the intermediate
table as the source table.