citus

Commit Graph

Author	SHA1	Message	Date
Onder Kalaci	633aa5686c	Pushdown only necessary projections in the recursive relation planning With this commit, we only pull&push the necessary columns. In this context, necessary columns means that the columns that are required for the query to be executed when the relation is wrapped into a subquery. We could potentially optimize things further: (a) If a column only appears as a qual filter, we don't need to pull it to the coordinator (b) We currently pull unnecessary columns as NULL. However, we could potentially adjust remaining of the query tree and do not add columns of the relation to the target entry.	2019-01-11 19:24:27 +03:00
Onder Kalaci	de5e36e850	Pushdown filters on recursive relation planning PostgreSQL already keeps track of the restrictions that are on the relation. With this commit, Citus uses that information and pushes down the filters to the subquery that is recursively planned for the table that is in considiration.	2019-01-11 19:24:27 +03:00
Onder Kalaci	4d1165848b	Basic implementation of recursive planning of non-colocated relation joins	2019-01-11 19:24:26 +03:00
Marco Slot	8893cc141d	Support INSERT...SELECT with ON CONFLICT or RETURNING via coordinator Before this commit, Citus supported INSERT...SELECT queries with ON CONFLICT or RETURNING clauses only for pushdownable ones, since queries supported via coordinator were utilizing COPY infrastructure of PG to send selected tuples to the target worker nodes. After this PR, INSERT...SELECT queries with ON CONFLICT or RETURNING clauses will be performed in two phases via coordinator. In the first phase selected tuples will be saved to the intermediate table which is colocated with target table of the INSERT...SELECT query. Note that, a utility function to save results to the colocated intermediate result also implemented as a part of this commit. In the second phase, INSERT.. SELECT query is directly run on the worker node using the intermediate table as the source table.	2018-11-30 15:29:12 +03:00
Marco Slot	f383e4f307	Description: Refactor code that handles DDL commands from one file into a module The file handling the utility functions (DDL) for citus organically grew over time and became unreasonably large. This refactor takes that file and refactored the functionality into separate files per command. Initially modeled after the directory and file layout that can be found in postgres. Although the size of the change is quite big there are barely any code changes. Only one two functions have been added for readability purposes: - PostProcessIndexStmt which is extracted from PostProcessUtility - PostProcessAlterTableStmt which is extracted from multi_ProcessUtility A README.md has been added to `src/backend/distributed/commands` describing the contents of the module and every file in the module. We need more documentation around the overloading of the COPY command, for now the boilerplate has been added for people with better knowledge to fill out.	2018-11-14 13:36:27 +01:00
mehmet furkan şahin	ef9f38b68d	ApplyLogRedaction noop func is added	2018-08-17 14:48:54 -07:00
Marco Slot	90cdfff602	Implement recursive planning for DML statements	2018-05-03 14:42:28 +02:00
velioglu	d9fa69c031	Refactor query pushdown related logic	2018-05-02 15:03:09 +03:00
Metin Doslu	bcf660475a	Add support for modifying CTEs	2018-02-27 15:08:32 +02:00
Onder Kalaci	1c930c96a3	Support non-co-located joins between subqueries With #1804 (and related PRs), Citus gained the ability to plan subqueries that are not safe to pushdown. There are two high-level requirements for pushing down subqueries: * Individual subqueries that require a merge step (i.e., GROUP BY on non-distribution key, or LIMIT in the subquery etc). We've handled such subqueries via #1876. * Combination of subqueries that are not joined on distribution keys. This commit aims to recursively plan some of such subqueries to make the whole query safe to pushdown. The main logic behind non colocated subquery joins is that we pick an anchor range table entry and check for distribution key equality of any other subqueries in the given query. If for a given subquery, we cannot find distribution key equality with the anchor rte, we recursively plan that subquery. We also used a hacky solution for picking relations as the anchor range table entries. The hack is that we wrap them into a subquery. This is only necessary since some of the attribute equivalance checks are based on queries rather than range table entries.	2018-02-26 13:50:37 +02:00
Onder Kalaci	7b57e0562a	Add infrastructure for detecting non-colocated subqueries	2018-02-26 13:28:25 +02:00
Onder Kalaci	4d70c86645	Leaf level recursive planning for non colocated subqueries With this commit, we enable recursive planning for the subqueries that are not joined on the distribution keys.	2018-02-26 13:28:24 +02:00
velioglu	195ac948d2	Recursively plan subqueries in WHERE clause when FROM recurs	2018-02-13 19:52:12 +03:00
Marco Slot	09c09f650f	Recursively plan set operations when leaf nodes recur	2017-12-26 13:46:55 +02:00
mehmet furkan şahin	57bc86e23d	new debug output for subplans	2017-12-25 09:50:51 +03:00
Murat Tuncer	a9cf0c3e66	Fix CTE column alias issue (#1893 ) We were creating intermediate query result's target names from subquery target list. Now we also check if cte re-defines its column name aliases, and create intermediate result query accordingly.	2017-12-22 09:39:40 +03:00
Onder Kalaci	0d5a4b9c72	Recursively plan subqueries that are not safe to pushdown With this commit, Citus recursively plans subqueries that are not safe to pushdown, in other words, requires a merge step. The algorithm is simple: Recursively traverse the query from bottom up (i.e., bottom meaning the leaf queries). On each level, check whether the query is safe to pushdown (or a single repartition subquery). If the answer is yes, do not touch that subquery. If the answer is no, plan the subquery seperately (i.e., create a subPlan for it) and replace the subquery with a call to `read_intermediate_results(planId, subPlanId)`. During the the execution, run the subPlans first, and make them avaliable to the next query executions. Some of the queries hat this change allows us: * Subqueries with LIMIT * Subqueries with GROUP BY/DISTINCT on non-partition keys * Subqueries involving re-partition joins, router queries * Mixed usage of subqueries and CTEs (i.e., use CTEs in subqueries as well). Nested subqueries as long as we support the subquery inside the nested subquery. * Subqueries with local tables (i.e., those subqueries has the limitation that they have to be leaf subqueries) * VIEWs on the distributed tables just works (i.e., the limitations mentioned below still applies to views) Some of the queries that is still NOT supported: * Corrolated subqueries that are not safe to pushdown * Window function on non-partition keys * Recursively planned subqueries or CTEs on the outer side of an outer join * Only recursively planned subqueries and CTEs in the FROM (i.e., not any distributed tables in the FROM) and subqueries in WHERE clause * Subquery joins that are not on the partition columns (i.e., each subquery is individually joined on partition keys but not the upper level subquery.) * Any limitation that logical planner applies such as aggregate distincts (except for count) when GROUP BY is on non-partition key, or array_agg with ORDER BY	2017-12-21 08:37:40 +02:00
Onder Kalaci	71ce42b936	Refactor RecursivelyPlanSubqueriesAndCTEs() to make it ready to work with subqueries	2017-12-20 09:03:47 +02:00
Marco Slot	5e0539efa3	Plan CTEs when subquery pushdown is on	2017-12-19 16:34:56 +01:00
Marco Slot	74bd33d0cc	Revert "Plan CTEs when subquery pushdown is on" This reverts commit `e3b953b8e3`.	2017-12-17 22:34:20 +01:00
Marco Slot	e3b953b8e3	Plan CTEs when subquery pushdown is on	2017-12-17 21:49:36 +01:00
Marco Slot	2e2b4e81fa	Add support for CTEs in distributed queries	2017-12-14 09:32:55 +01:00

22 Commits (633aa5686c36b03c51f4bf1e2a804f984451ac1c)