This commit doesn't change any of the logic at all.
Instead, the goal is to:
* Get rid of any code duplication
* Incremental changes to the optimizer made it slightly hard
to follow the code, improve that and make it easier to
implement new features
* Simplify the code by moving each part of query processing (e.g.,
DISTINCT, LIMIT etc) into its own function
* Make the interaction between each part of the query more
obvious (e.g., How DISTINCT affects LIMIT etc)
In case a failure happens when a transaction is rollbacked,
we used to hit an assertion for ensuring there is no pending
activity on the connection. However, that's not true after the
changes in #2031. Thus, we've replaced the assertion with a more
generic function call to consume any pending activity, if exists.
- changes in ruleutils_11.c is reflected
- vacuum statement api change is handled. We now allow
multi-table vacuum commands.
- some other function header changes are reflected
- api conflicts between PG11 and earlier versions
are handled by adding shims in version_compat.h
- various regression tests are fixed due output and
functionality in PG1
- no change is made to support new features in PG11
they need to be handled by new commit
- Add install.pl to instal .sql files on Windows
- Remove a hack to PGDLLIMPORT some variables
- Add citus_version.o to the Makefile
- Fix pg_regress_multi's PATH generation on Windows
- Output regression.diffs when the tests fail
- Fix permissions in data directory, make sure postgres can play with it
Before this commit, we had code duplication in the
WorkerExtendedOpNode(). The duplication was
noticeable and any change is prone to bugs.
The PR consists of 4 commits. Each commit incrementally
fixes the problem by moving certain parts of the duplicated
code into smaller, better-documented functions.
Before this commit, we had a divergence among
the creation of master/worker extended op nodes.
This commit moves the related parts into a single place
and allows the creation of master/extended op nodes to
share a common data structure.
PostgreSQL might remove some of the subqueries when they do not
contribute to the query result at all. Citus should not try to
access such subqueries during planning.
Without this change we crash on Windows with COPYing into a table with
62 shards, and we ERROR when COPYing into a table with >62 shards:
ERROR: WaitForMutipleObjects() failed: error code 87
Without this change multi_real_time_transaction blocks forever (on
Windows) in the block where it repeatedly calls pg_advisory_lock(15).
This happens because the deadlock detector tries to cancel the backend
but the backend never processes that signal.
This PR adds support for multiple AND expressions in Having
for pushdown planner. We simply make a call to make_ands_explicit
from MultiLogicalPlanOptimize for the having qual in
workerExtendedOpNode.
After this commit large_table_shard_count wont be used to
check whether broadcast join, which is renamed as reference
join, can be applied. Reference join can only be applied over
reference tables.
We recently added partitionin support to Citus MX. We should not execute
DROP table commands from MX workers but at the moment we try to execute
such commands for partitioned tables. This PR fixes that problem by
adding check.
Previously, we prevented creation of partitioned tables on Citus MX.
We decided to not focus on this feature until there is a need. Since
now there are requests for this feature, we are implementing support
for partitioned tables on Citus MX.
After this change all the logic related to shard data fetch logic
will be removed. Planner won't plan any ShardFetchTask anymore.
Shard fetch related steps in real time executor and task-tracker
executor have been removed.
Pushing down limit and order by into workers may produce
wrong output when distinct on() clause has expressions,
aggregates, or window functions.
This checking allows pushing down of limits only if
distinct clause is a superset of group by clause. i.e. it contains all clauses in group by.
This commit checks the connection status right after any IO happens
on the socket.
This is necessary since before this commit we didn't pass any information
to the higher level functions whether we're done with the connection
(e.g., no IO required anymore) or an errors happened during the IO.
This is the first of series of window function work.
We can now support window functions that can be pushed down to workers.
Window function must have distribution column in the partition clause
to be pushed down.
We push down order by to worker query when limit is specified
(with some other additional checks). If the query has an expression
on an aggregate or avg aggregate by itself, and there is an order
by on this particular target we may send wrong order by to worker
query with potential to affect query result.
The fix creates a auxilary target entry in the worker query and
uses that target entry for sorting.
Before this PR, we were trusting on the columns of group by about
guaranteeing the uniqueness of the results. However, this assumption
is correct only if the columns in the group by is subset of columns
in the distinct clause. It can be wrong if we have part of group by
columns and some aggregation columns in the distinct clause. With
this PR, we add distinct plan on top of aggregate plan when necessary.
With #1804 (and related PRs), Citus gained the ability to
plan subqueries that are not safe to pushdown.
There are two high-level requirements for pushing down subqueries:
* Individual subqueries that require a merge step (i.e., GROUP BY
on non-distribution key, or LIMIT in the subquery etc). We've
handled such subqueries via #1876.
* Combination of subqueries that are not joined on distribution keys.
This commit aims to recursively plan some of such subqueries to make
the whole query safe to pushdown.
The main logic behind non colocated subquery joins is that we pick
an anchor range table entry and check for distribution key equality
of any other subqueries in the given query. If for a given subquery,
we cannot find distribution key equality with the anchor rte, we
recursively plan that subquery.
We also used a hacky solution for picking relations as the anchor range
table entries. The hack is that we wrap them into a subquery. This is only
necessary since some of the attribute equivalance checks are based on
queries rather than range table entries.
We used to only support pushdownable set operations inside a
subquery, however, we could easily expand the restriction
checks to cover top level set operations as well.