citus

Commit Graph

Author	SHA1	Message	Date
Jelte Fennema	58723dd8b0	Normalize tests: s/DEBUG: Plan [0-9]+/DEBUG: Plan XXX/g	2020-01-06 09:32:03 +01:00
Jelte Fennema	7730bd449c	Normalize tests: Remove trailing whitespace	2020-01-06 09:32:03 +01:00
Jelte Fennema	7f3de68b0d	Normalize tests: header separator length	2020-01-06 09:32:03 +01:00
Marco Slot	ba39d72fe1	Fix incorrect union all pushdown issue	2020-01-01 09:03:50 +01:00
SaitTalhaNisanci	7ff4ce2169	Add adaptive executor support for repartition joins (#3169 ) * WIP * wip * add basic logic to run a single job with repartioning joins with adaptive executor * fix some warnings and return in ExecuteDependedTasks if there is none * Add the logic to run depended jobs in adaptive executor The execution of depended tasks logic is changed. With the current logic: - All tasks are created from the top level task list. - At one iteration: - CurTasks whose dependencies are executed are found. - CurTasks are executed in parallel with adapter executor main logic. - The iteration is repeated until all tasks are completed. * Separate adaptive executor repartioning logic * Remove duplicate parts * cleanup directories and schemas * add basic repartion tests for adaptive executor * Use the first placement to fetch data In task tracker, when there are replicas, we try to fetch from a replica for which a map task is succeeded. TaskExecution is used for this, however TaskExecution is not used in adaptive executor. So we cannot use the same thing as task tracker. Since adaptive executor fails when a map task fails (There is no retry logic yet). We know that if we try to execute a fetch task, all of its map tasks already succeeded, so we can just use the first one to fetch from. * fix clean directories logic * do not change the search path while creating a udf * Enable repartition joins with adaptive executor with only enable_reparitition_joins guc * Add comments to adaptive_executor_repartition * dont run adaptive executor repartition test in paralle with other tests * execute cleanup only in the top level execution * do cleanup only in the top level ezecution * not begin a transaction if repartition query is used * use new connections for repartititon specific queries New connections are opened to send repartition specific queries. The opened connections will be closed at the FinishDistributedExecution. While sending repartition queries no transaction is begun so that we can see all changes. * error if a modification was done prior to repartition execution * not start a transaction if a repartition query and sql task, and clean temporary files and schemas at each subplan level * fix cleanup logic * update tests * add missing function comments * add test for transaction with DDL before repartition query * do not close repartition connections in adaptive executor * rollback instead of commit in repartition join test * use close connection instead of shutdown connection * remove unnecesary connection list, ensure schema owner before removing directory * rename ExecuteTaskListRepartition * put fetch query string in planner not executor as we currently support only replication factor = 1 with adaptive executor and repartition query and we know the query string in the planner phase in that case * split adaptive executor repartition to DAG execution logic and repartition logic * apply review items * apply review items * use an enum for remote transaction state and fix cleanup for repartition * add outside transaction flag to find connections that are unclaimed instead of always opening a new transaction * fix style * wip * rename removejobdir to partition cleanup * do not close connections at the end of repartition queries * do repartition cleanup in pg catch * apply review items * decide whether to use transaction or not at execution creation * rename isOutsideTransaction and add missing comment * not error in pg catch while doing cleanup * use replication factor of the creation time, not current time to decide if task tracker should be chosen * apply review items * apply review items * apply review item	2019-12-17 19:09:45 +03:00
Philip Dubé	261a9de42d	Fix typos: VAR_SET_VALUE_KIND -> VAR_SET_VALUE kind beginnig -> beginning plannig -> planning the the -> the er then -> er than	2019-11-25 23:24:13 +00:00
Önder Kalacı	ffd89e4e01	Include all relevant relations in the ExtractRangeTableRelationWalker (#3135 ) We've changed the logic for pulling RTE_RELATIONs in #3109 and non-colocated subquery joins and partitioned tables. @onurctirtir found this steps where I traced back and found the issues. While looking into it in more detail, we decided to expand the list in a way that the callers get all the relevant RTE_RELATIONs RELKIND_RELATION, RELKIND_PARTITIONED_TABLE, RELKIND_FOREIGN_TABLE and RELKIND_MATVIEW. These are all relation kinds that Citus planner is aware of.	2019-11-01 16:06:58 +01:00
SaitTalhaNisanci	70e46703aa	Fix debug1 message in JobExecutorType (#3147 ) When citus.enable_repartition_joins guc is set to on, and we have adaptive executor, there was a typo in the debug message, which was saying realtime executor no adaptive executor.	2019-11-01 11:14:19 +03:00
Philip Dubé	77efec04a0	Router Planner: accept SELECT_CMD ctes in modification queries	2019-06-26 10:32:01 +02:00
Philip Dubé	84fe626378	multi_router_planner: refactor error propagation	2019-06-26 10:32:01 +02:00
Philip Dubé	b5ced403d8	Also check rewrittenQuery jointree for outer join	2019-06-04 07:47:35 -07:00
Onder Kalaci	ad5ff1d01a	Some queries lead to infinite recursion with recurisve planning The rule for infinite recursion is the following: - If the query contains a subquery which is recursively planned, and no other subqueries can be recursively planned due to correlation (e.g., LATERAL joins), the planner keeps recursing again and again. One interesting thing here is that even if a subquery contains only intermediate result(s), we re-recursively plan that. In the end, the logic in the code does the following: - Try recursive planning any of the subqueries in the query tree - If any subquery is recursively planned, call the planner again where the subquery is replaced with the intermediate result. - Try recursively planning any of the queries - If any subquery is recursively planned, call the planner again where the subquery (in this case it is already intermediate result) is replaced with the intermediate result. - Try recursively planning any of the queries - If any subquery is recursively planned, call the planner again where the subquery (in this case it is already intermediate result) is replaced with the intermediate result. - Try recursively planning any of the queries - If any subquery is recursively planned, call the planner again where the subquery (in this case it is already intermediate result) is replaced with the intermediate result. ......	2019-03-18 10:35:00 +03:00
Marco Slot	1656b519c4	Plan outer joins through pushdown planning	2019-01-05 20:55:27 +01:00
Hadi Moshayedi	86b12bc2d0	Always prefix operators with their namespace. (#2147 ) Previously we checked if an operator is in pg_catalog, and if it wasn't we prefixed it with namespace in worker queries. This can have a huge impact on performance of physical planner when using custom data types. This happened regardless of current search_path config, because Citus overrides the search path in get_query_def_extended(). When we do so, the check for existence of the operator in current search path in generate_operator_name() fails for any operators outside pg_catalog. This means that nothing gets cached, and in the following calls we will again recheck the system tables for existence of the operators, which took an additional 40-50ms for some of the usecases we were seeing. In this change we skip the pg_catalog check, and always prefix the operator with its namespace.	2018-05-05 13:27:26 -04:00
Onder Kalaci	1c930c96a3	Support non-co-located joins between subqueries With #1804 (and related PRs), Citus gained the ability to plan subqueries that are not safe to pushdown. There are two high-level requirements for pushing down subqueries: * Individual subqueries that require a merge step (i.e., GROUP BY on non-distribution key, or LIMIT in the subquery etc). We've handled such subqueries via #1876. * Combination of subqueries that are not joined on distribution keys. This commit aims to recursively plan some of such subqueries to make the whole query safe to pushdown. The main logic behind non colocated subquery joins is that we pick an anchor range table entry and check for distribution key equality of any other subqueries in the given query. If for a given subquery, we cannot find distribution key equality with the anchor rte, we recursively plan that subquery. We also used a hacky solution for picking relations as the anchor range table entries. The hack is that we wrap them into a subquery. This is only necessary since some of the attribute equivalance checks are based on queries rather than range table entries.	2018-02-26 13:50:37 +02:00

15 Commits (58723dd8b03d2e2c68ed82d9dbb9f3d0f6e0f22b)