citus

Commit Graph

Author	SHA1	Message	Date
Onder Kalaci	0f462e6962	Fix tests after rebase onto pushdown outer joins	2019-01-11 19:32:07 +03:00
Onder Kalaci	736df67310	respect the regression tests for recursively plan inner parts of recurring tuple joins	2019-01-11 19:26:43 +03:00
Onder Kalaci	4d1165848b	Basic implementation of recursive planning of non-colocated relation joins	2019-01-11 19:24:26 +03:00
Hadi Moshayedi	86b12bc2d0	Always prefix operators with their namespace. (#2147 ) Previously we checked if an operator is in pg_catalog, and if it wasn't we prefixed it with namespace in worker queries. This can have a huge impact on performance of physical planner when using custom data types. This happened regardless of current search_path config, because Citus overrides the search path in get_query_def_extended(). When we do so, the check for existence of the operator in current search path in generate_operator_name() fails for any operators outside pg_catalog. This means that nothing gets cached, and in the following calls we will again recheck the system tables for existence of the operators, which took an additional 40-50ms for some of the usecases we were seeing. In this change we skip the pg_catalog check, and always prefix the operator with its namespace.	2018-05-05 13:27:26 -04:00
Onder Kalaci	1c930c96a3	Support non-co-located joins between subqueries With #1804 (and related PRs), Citus gained the ability to plan subqueries that are not safe to pushdown. There are two high-level requirements for pushing down subqueries: * Individual subqueries that require a merge step (i.e., GROUP BY on non-distribution key, or LIMIT in the subquery etc). We've handled such subqueries via #1876. * Combination of subqueries that are not joined on distribution keys. This commit aims to recursively plan some of such subqueries to make the whole query safe to pushdown. The main logic behind non colocated subquery joins is that we pick an anchor range table entry and check for distribution key equality of any other subqueries in the given query. If for a given subquery, we cannot find distribution key equality with the anchor rte, we recursively plan that subquery. We also used a hacky solution for picking relations as the anchor range table entries. The hack is that we wrap them into a subquery. This is only necessary since some of the attribute equivalance checks are based on queries rather than range table entries.	2018-02-26 13:50:37 +02:00
Marco Slot	09c09f650f	Recursively plan set operations when leaf nodes recur	2017-12-26 13:46:55 +02:00
Murat Tuncer	87c6f306f1	Fix join clause eq restrictions (#1884 ) We used to error out if the join clause includes filters like t1.a < t2.a even if other filter like t1.key = t2.key exists. Recently we lifted that restriction in subquery planning by not lifting that restriction and focusing on equivalance classes provided by postgres. This checkin forwards previously erroring out real-time queries due to join clauses to subquery planner and let it handle the join even if the query does not have a subquery. We are now pushing down queries that do not have any subqueries in it. Error message looked misleading, changed to a more descriptive one.	2017-12-22 12:16:14 +03:00
Onder Kalaci	e2a5124830	Add regression tests for recursive subquery planning	2017-12-21 08:37:40 +02:00
Onder Kalaci	0d5a4b9c72	Recursively plan subqueries that are not safe to pushdown With this commit, Citus recursively plans subqueries that are not safe to pushdown, in other words, requires a merge step. The algorithm is simple: Recursively traverse the query from bottom up (i.e., bottom meaning the leaf queries). On each level, check whether the query is safe to pushdown (or a single repartition subquery). If the answer is yes, do not touch that subquery. If the answer is no, plan the subquery seperately (i.e., create a subPlan for it) and replace the subquery with a call to `read_intermediate_results(planId, subPlanId)`. During the the execution, run the subPlans first, and make them avaliable to the next query executions. Some of the queries hat this change allows us: * Subqueries with LIMIT * Subqueries with GROUP BY/DISTINCT on non-partition keys * Subqueries involving re-partition joins, router queries * Mixed usage of subqueries and CTEs (i.e., use CTEs in subqueries as well). Nested subqueries as long as we support the subquery inside the nested subquery. * Subqueries with local tables (i.e., those subqueries has the limitation that they have to be leaf subqueries) * VIEWs on the distributed tables just works (i.e., the limitations mentioned below still applies to views) Some of the queries that is still NOT supported: * Corrolated subqueries that are not safe to pushdown * Window function on non-partition keys * Recursively planned subqueries or CTEs on the outer side of an outer join * Only recursively planned subqueries and CTEs in the FROM (i.e., not any distributed tables in the FROM) and subqueries in WHERE clause * Subquery joins that are not on the partition columns (i.e., each subquery is individually joined on partition keys but not the upper level subquery.) * Any limitation that logical planner applies such as aggregate distincts (except for count) when GROUP BY is on non-partition key, or array_agg with ORDER BY	2017-12-21 08:37:40 +02:00
Onder Kalaci	86b2d9420c	Treat recurring tuples as reference table for GROUP BY checks read_intermediate_results() and immutable functions are implemented. Empty join trees seems not applicable here.	2017-12-13 14:55:42 +02:00
mehmet furkan şahin	1b06b2b306	The data used in regression tests is reduced This commit reduces the size of the data in users_table.data and events_table.data from 10K rows to 100 rows.	2017-11-28 14:15:46 +03:00
Marco Slot	feffe86440	Subqueries containing functions go through subquery pushdown	2017-11-27 22:13:02 +01:00
Marco Slot	0ad39b36fe	Treat immutable table functions and constant subqueries as reference tables	2017-11-21 14:15:22 +01:00
Onder Kalaci	d558ebb923	Relax the checks on ensuring distribution columns for target entries With this commit, we allow pushing down subqueries with only reference tables where GROUP BY or DISTINCT clause or Window functions include only columns from reference tables.	2017-11-21 12:28:14 +02:00
Marco Slot	89eb833375	Use citus.next_shard_id where practical in regression tests	2017-11-15 10:12:05 +01:00
Onder Kalaci	498ac80d8b	Add window function support for SUBQUERY PUSHDOWN and INSERT INTO SELECT This commit provides the support for window functions in subquery and insert into select queries. Note that our support for window functions is still limited because it must have a partition by clause on the distribution key. This commit makes changes in the files insert_select_planner and multi_logical_planner. The required tests are also added with files multi_subquery_window_functions.out and multi_insert_select_window.out.	2017-10-04 15:33:07 +03:00
Onder Kalaci	6116c8e93d	Allow pushing down GROUP BYs when at least there is one distribution column in the target list	2017-09-15 19:15:06 +03:00
Onder Kalaci	a5b66912d4	Expand reference table support in subquery pushdown With this commit, we relax the restrictions put on the reference tables with subquery pushdown. We did three notable improvements: 1) Relax equi-join restrictions Previously, we always expected that the non-reference tables are equi joined with reference tables on the partition key of the non-reference table. With this commit, we allow any column of non-reference tables joined using non-equi joins as well. 2) Relax OUTER JOIN restrictions Previously Citus errored out if any reference table exists at any point of the outer part of an outer join. For instance, See the below sketch where (h) denotes a hash distributed relation, (r) denotes a reference table, (L) denotes LEFT JOIN and (I) denotes INNER JOIN. (L) / \ (I) h / \ r h Before this commit Citus would error out since a reference table appears on the left most part of an left join. However, that was too restrictive so that we only error out if the reference table is directly below and in the outer part of an outer join. 3) Bug fixes We've done some minor bugfixes in the existing implementation.	2017-09-14 20:59:22 +03:00
velioglu	b0efffae1c	Correct planner and add more tests	2017-08-11 10:16:13 +03:00
velioglu	ceba81ce35	Move physical planner checks to logical planner	2017-08-11 10:09:47 +03:00
velioglu	0359d03530	Add set operation check for reference tables	2017-08-11 10:09:47 +03:00
velioglu	c4e3b8b5e1	Add planner changes and tests for subquery on reference tables	2017-08-11 10:09:47 +03:00

22 Commits (0f462e6962f06614256070a4735bd38609fb17d2)