citus

Commit Graph

Author	SHA1	Message	Date
Onder Kalaci	c25de2cf22	Remove flag from As it doesn't make any sense anymore	2020-07-20 12:45:05 +02:00
Philip Dubé	720525cfda	Add support for window functions on coordinator. Some refactoring: Consolidate expression which decides whether GROUP BY/HAVING are pushed down; Rename early pullUpIntermediateRows to hasNonDistributableAggregates; Create WorkerColumnName to handle formatting WORKER_COLUMN_FORMAT; Ignore NULL StringInfo pointers to SafeToPushdownWindowFunction; Fix bug where SubqueryPushdownMultiNodeTree mutates supplied Query, SafeToPushdownWindowFunction requires the original query as it relies on rtable	2020-03-25 15:31:20 +00:00
Philip Dubé	863bf49507	Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default	2020-01-07 01:16:04 +00:00
Jelte Fennema	acd12a6de5	Normalize tests: s/read_intermediate_result\('[0-9]+_/read_intermediate_result('XXX_/g	2020-01-06 09:32:03 +01:00
Jelte Fennema	21dbd4e55d	Normalize tests: s/generating subplan [0-9]+\_/generating subplan XXX\_/g	2020-01-06 09:32:03 +01:00
Jelte Fennema	58723dd8b0	Normalize tests: s/DEBUG: Plan [0-9]+/DEBUG: Plan XXX/g	2020-01-06 09:32:03 +01:00
Jelte Fennema	7730bd449c	Normalize tests: Remove trailing whitespace	2020-01-06 09:32:03 +01:00
Jelte Fennema	7f3de68b0d	Normalize tests: header separator length	2020-01-06 09:32:03 +01:00
Marco Slot	ba39d72fe1	Fix incorrect union all pushdown issue	2020-01-01 09:03:50 +01:00
SaitTalhaNisanci	7ff4ce2169	Add adaptive executor support for repartition joins (#3169) * WIP * wip * add basic logic to run a single job with repartioning joins with adaptive executor * fix some warnings and return in ExecuteDependedTasks if there is none * Add the logic to run depended jobs in adaptive executor The execution of depended tasks logic is changed. With the current logic: - All tasks are created from the top level task list. - At one iteration: - CurTasks whose dependencies are executed are found. - CurTasks are executed in parallel with adapter executor main logic. - The iteration is repeated until all tasks are completed. * Separate adaptive executor repartioning logic * Remove duplicate parts * cleanup directories and schemas * add basic repartion tests for adaptive executor * Use the first placement to fetch data In task tracker, when there are replicas, we try to fetch from a replica for which a map task is succeeded. TaskExecution is used for this, however TaskExecution is not used in adaptive executor. So we cannot use the same thing as task tracker. Since adaptive executor fails when a map task fails (There is no retry logic yet). We know that if we try to execute a fetch task, all of its map tasks already succeeded, so we can just use the first one to fetch from. * fix clean directories logic * do not change the search path while creating a udf * Enable repartition joins with adaptive executor with only enable_reparitition_joins guc * Add comments to adaptive_executor_repartition * dont run adaptive executor repartition test in paralle with other tests * execute cleanup only in the top level execution * do cleanup only in the top level ezecution * not begin a transaction if repartition query is used * use new connections for repartititon specific queries New connections are opened to send repartition specific queries. The opened connections will be closed at the FinishDistributedExecution. While sending repartition queries no transaction is begun so that we can see all changes. * error if a modification was done prior to repartition execution * not start a transaction if a repartition query and sql task, and clean temporary files and schemas at each subplan level * fix cleanup logic * update tests * add missing function comments * add test for transaction with DDL before repartition query * do not close repartition connections in adaptive executor * rollback instead of commit in repartition join test * use close connection instead of shutdown connection * remove unnecesary connection list, ensure schema owner before removing directory * rename ExecuteTaskListRepartition * put fetch query string in planner not executor as we currently support only replication factor = 1 with adaptive executor and repartition query and we know the query string in the planner phase in that case * split adaptive executor repartition to DAG execution logic and repartition logic * apply review items * apply review items * use an enum for remote transaction state and fix cleanup for repartition * add outside transaction flag to find connections that are unclaimed instead of always opening a new transaction * fix style * wip * rename removejobdir to partition cleanup * do not close connections at the end of repartition queries * do repartition cleanup in pg catch * apply review items * decide whether to use transaction or not at execution creation * rename isOutsideTransaction and add missing comment * not error in pg catch while doing cleanup * use replication factor of the creation time, not current time to decide if task tracker should be chosen * apply review items * apply review items * apply review item	2019-12-17 19:09:45 +03:00
SaitTalhaNisanci	70e46703aa	Fix debug1 message in JobExecutorType (#3147) When citus.enable_repartition_joins guc is set to on, and we have adaptive executor, there was a typo in the debug message, which was saying realtime executor no adaptive executor.	2019-11-01 11:14:19 +03:00
Philip Dubé	84fe626378	multi_router_planner: refactor error propagation	2019-06-26 10:32:01 +02:00
Hadi Moshayedi	8e2d328530	Search all outer node levels for lateral join params.	2019-06-04 10:14:05 -07:00
Onder Kalaci	64b323d9eb	Add ORDER BY to set_operations	2019-04-23 11:51:58 +03:00
Hanefi Onaldi	1106e14385	Wrap functions in subqueries; remove debug logs to fix travis tests; Support RowType functions in joins; Regression tests for a custom type function in join	2019-02-04 19:19:29 +03:00
Hadi Moshayedi	86b12bc2d0	Always prefix operators with their namespace. (#2147) Previously we checked if an operator is in pg_catalog, and if it wasn't we prefixed it with namespace in worker queries. This can have a huge impact on performance of physical planner when using custom data types. This happened regardless of current search_path config, because Citus overrides the search path in get_query_def_extended(). When we do so, the check for existence of the operator in current search path in generate_operator_name() fails for any operators outside pg_catalog. This means that nothing gets cached, and in the following calls we will again recheck the system tables for existence of the operators, which took an additional 40-50ms for some of the usecases we were seeing. In this change we skip the pg_catalog check, and always prefix the operator with its namespace.	2018-05-05 13:27:26 -04:00
velioglu	698d585fb5	Remove broadcast join logic. After this change all the logic related to shard data fetch logic will be removed. Planner won't plan any ShardFetchTask anymore. Shard fetch related steps in real time executor and task-tracker executor have been removed.	2018-03-30 11:45:19 +03:00
Murat Tuncer	76f6883d5d	Add support for window functions that can be pushed down to worker (#2008) This is the first of series of window function work. We can now support window functions that can be pushed down to workers. Window function must have distribution column in the partition clause to be pushed down.	2018-03-01 19:07:07 +03:00
Onder Kalaci	1c930c96a3	Support non-co-located joins between subqueries. With #1804 (and related PRs), Citus gained the ability to plan subqueries that are not safe to pushdown. There are two high-level requirements for pushing down subqueries: * Individual subqueries that require a merge step (i.e., GROUP BY on non-distribution key, or LIMIT in the subquery etc). We've handled such subqueries via #1876. * Combination of subqueries that are not joined on distribution keys. This commit aims to recursively plan some of such subqueries to make the whole query safe to pushdown. The main logic behind non colocated subquery joins is that we pick an anchor range table entry and check for distribution key equality of any other subqueries in the given query. If for a given subquery, we cannot find distribution key equality with the anchor rte, we recursively plan that subquery. We also used a hacky solution for picking relations as the anchor range table entries. The hack is that we wrap them into a subquery. This is only necessary since some of the attribute equivalance checks are based on queries rather than range table entries.	2018-02-26 13:50:37 +02:00
Onder Kalaci	4d70c86645	Leaf level recursive planning for non colocated subqueries. With this commit, we enable recursive planning for the subqueries that are not joined on the distribution keys.	2018-02-26 13:28:24 +02:00
velioglu	195ac948d2	Recursively plan subqueries in WHERE clause when FROM recurs	2018-02-13 19:52:12 +03:00
Marco Slot	09c09f650f	Recursively plan set operations when leaf nodes recur	2017-12-26 13:46:55 +02:00

22 Commits (61bf2fb4774790bcd4ef929b03bb2a8965ff511f)