citus

Commit Graph

Author	SHA1	Message	Date
SaitTalhaNisanci	c6c31e0f1f	Fix int32 overflow and use PG macros for INT32_XX (#4061 ) * Use CalculateUniformHashRangeIndex in HashPartitionId INT32_MIN definition can change among different platforms hence it is possible to get overflow, we would see crashes because of this in debian distros. We have already solved a similar problem with introducing CalculateUniformHashRangeIndex method, hence to solve it we can use the same method, this also removes some duplication and has a single place to decide that. * Use PG_INT32_XX instead of INT32_XX to be safer (cherry picked from commit `ef841115de`) Conflicts: src/backend/distributed/commands/multi_copy.c	2020-07-27 11:59:33 +03:00
Nils Dijk	4ce6c9d8b9	Feature: tdigest aggregate (#3897) DESCRIPTION: Adds support to partially push down tdigest aggregates tdigest extensions: https://github.com/tvondra/tdigest This PR implements the partial pushdown of tdigest calculations when possible. The extension adds a tdigest type which can be combined into the same structure. There are several aggregate functions that can be used to get: - a quantile - a list of quantiles - the quantile of a hypothetical value - a list of quantiles for a list of hypothetical values These functions can work on both values and tdigest types. Since we can create tdigest values either by combining them, or based on a group of values we can rewrite the aggregates in such a way that most of the computation gets delegated to the compute on the shards. This both speeds up the percentile calculations because the values don't have to be sorted while at the same time making the transfer size from the shards to the coordinator significantly less. (cherry picked from commit `da8f2b0134`)	2020-06-17 14:31:37 +02:00
Philip Dubé	d693bc1b0c	Also check aggregates in havingQual when scanning for non pushdownable aggregates Came across this while coming up with test cases, 'result "68_1" does not exist' I'll seek to address in a future PR, for now avoid segfault (cherry picked from commit `4b68ee12c6`)	2020-03-25 13:38:29 +01:00
Nils Dijk	535b0804be	Fix left join shard pruning (#3569 ) DESCRIPTION: Fix left join shard pruning in pushdown planner Due to #2481 which moves outer join planning through the pushdown planner we caused a regression on the shard pruning behaviour for outer joins. In the pushdown planner we make a union of the placement groups for all shards accessed by a query based on the filters we see during planning. Unfortunately implicit filters for left joins are not available during this part. This causes the inner part of an outer join to not prune any shards away. When we take the union of the placement groups it shows the behaviour of not having any shards pruned. Since the inner part of an outer query will not return any rows if the outer part does not contain any rows we have observed we do not have to add the shard intervals of the inner part of an outer query to the list of shard intervals to query. Fixes: #3512 (cherry picked from commit `e5237b9e20`)	2020-03-25 13:38:07 +01:00
Philip Dubé	2452a899bd	9.2: First phase of addressing HAVING subquery issues (#3599 ) Add failing tests, make changes to avoid crashes at least Fix HAVING subquery pushdown ignoring reference table only subqueries, also include HAVING in recursive planning Given that we have a function IsDistributedTable which includes reference tables, it seems best to have IsDistributedTableRTE & QueryContainsDistributedTableRTE reflect that they do not include reference tables in their check Similarly SublinkList's name should reflect that it only scans WHERE contain_agg_clause asserts that we don't have SubLinks, use contain_aggs_of_level as suggested by pg sourcecode (cherry-picked from commit `81cfa05d3d`)	2020-03-25 10:19:18 +01:00
Jelte Fennema	a573e0df95	Really ignore -Wgnu-variable-sized-type-not-at-end (#3627 ) (cherry picked from commit `56863e8f0b`)	2020-03-25 09:23:09 +01:00
Jelte Fennema	e4e0c65203	Semmle: Check for NULL in some places where it might occur (#3509 ) Semmle reported quite some places where we use a value that could be NULL. Most of these are not actually a real issue, but better to be on the safe side with these things and make the static analysis happy. (cherry picked from commit `685b54b3de`)	2020-03-25 09:19:15 +01:00
Jelte Fennema	2f063d0316	Convert unsafe APIs to safe ones (cherry picked from commit `8de8b62669`)	2020-03-25 09:16:31 +01:00
Jelte Fennema	e0736d3da7	Remove READFUNCs (#3536 ) We don't actually use these functions anymore since merging #1477. Advantages of removing: 1. They add work whenever we add a new node. 2. They contain some usage of stdlib APIs that are banned by Microsoft. Removing it means we don't have to replace those with safe ones. (cherry picked from commit `2a9fccc7a0`)	2020-03-25 09:12:26 +01:00
Jelte Fennema	e2d49c6122	Semmle: Fix possible infinite loops caused by overflow (#3503) Comparison between differently sized integers in loop conditions can cause infinite loops. This can happen when doing something like this: ```c int64 very_big = MAX_INT32 + 1; for (int32 i = 0; i < very_big; i++) { // do something } // never reached because i overflows before it can reach the value of very_big ``` (cherry picked from commit `3f7c5a5cf6`)	2020-03-25 09:11:21 +01:00
Onder Kalaci	db1a0835f3	Improve definition of RelationInfoContainsOnlyRecurringTuples Before this commit, we considered !ContainsRecurringRTE() enough for NotContainsOnlyRecurringTuples. However, instead, we can check for the existence of any distributed table. DESCRIPTION: Fixes a bug that causes wrong results with complex outer joins	2020-03-09 17:29:13 +01:00
Hanefi Onaldi	8d979b4752	Fix early exit bug on intermediate result pruning There are 2 problems with our early exit strategy that this commit fixes: 1- When we decide that a subplan results are sent to all worker nodes, we used to skip traversing the whole distributed plan, instead of skipping only the subplan. 2- We used to consider all available nodes in the cluster (secondaries and inactive nodes as well as active primaries) when deciding on early exit strategy. This resulted in failures to early exit when there are secondaries or inactive nodes. (cherry picked from commit `c0ad44f975`)	2020-03-05 16:47:58 +03:00
Marco Slot	ca44697723	Refactor CitusBeginScan into separate DML / SELECT paths	2020-03-05 13:04:39 +01:00
Onder Kalaci	bcc675cf84	Do not prune shards if the distribution key is NULL The root of the problem is that, standard_planner() converts the following qual ``` {OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 100 :args ( {VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varcollid 100 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 45 } {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull true :location 51 :constvalue <> } ) :location 49 } ``` To ``` ( {CONST :consttype 16 :consttypmod -1 :constcollid 0 :constlen 1 :constbyval true :constisnull true :location -1 :constvalue <> } ) ``` So, Citus doesn't deal with NULL values in real-time or non-fast path router queries. And, in the FastPathRouter planner, we check constisnull in DistKeyInSimpleOpExpression(). However, in deferred pruning case, we do not check for isnull for const. Thus, the fix consists of two parts: - Let PruneShards() not crash when NULL parameter is passed - For deferred shard pruning in fast-path queries, explicitly check that we have CONST which is not NULL	2020-02-13 17:22:49 +01:00
Nils Dijk	d5433400f9	Fix: Unnecessary repartition on joins with more than 4 tables (#3473 ) DESCRIPTION: Fix unnecessary repartition on joins with more than 4 tables In 9.1 we have introduced support for all CH-benCHmark queries by widening our definitions of joins to include joins with expressions in them. This had the undesired side effect of Q5 regressing on its plan by implementing a repartition join. It turned out this regression was not directly related to widening of the join clause, nor the schema employed by CH-benCHmark. Instead it had to do with 4 or more tables being joined in a chain. A chain meaning: ```sql SELECT * FROM a,b,c,d WHERE a.part = b.part AND b.part = c.part AND .... ``` Due to how our join order planner was implemented it would only keep track of 1 of the partition columns when comparing if the join could be executed locally. This manifested in a join chain of 4 tables to _always_ be executed as a repartition join. 3 tables joined in a chain would have the middle table shared by the two outer tables causing the local join possibility to be found. With this patch we keep a unique list (or set) of all partition columns participating in the join. When a candidate table is checked for a possibility to execute a local join it will check if there is any partition column in that set that matches an equality join clause on the partition column of the candidate table. By taking into account all partition columns in the left relation it will now find the local join path on >= 4 tables joined in a chain. fixes: #3276	2020-02-06 15:07:07 +01:00
Philip Dubé	c252811884	dont: don't, wont: won't, acylic: acyclic	2020-02-05 17:32:22 +00:00
Onder Kalaci	c7e2309f4c	Improve single hash-repartitioning with numeric (or non-int) types We used to treat the shard interval array that we passed as numeric[]. However, it should be int[], as the shard ranges are int[].	2020-02-04 20:30:04 +01:00
Hadi Moshayedi	264530311a	Don't use distributed insert/select for repartitioned joins	2020-02-03 13:13:30 -08:00
Philip Dubé	d43c80d4d8	pullUpIntermediateRows should not be true when groupedByDisjointPartitionColumn is true This was causing 'SELECT id, stddev(y_int) FROM tbl GROUP BY id' to push down stddev without group by	2020-01-30 21:18:08 +00:00
Philip Dubé	5fccc56d3e	Expand the set of aggregates which cannot have LIMIT approximated Previously we only prevented AVG from being pushed down, but this is incorrect: - array_agg, while somewhat nonsensical to order by, will potentially be missing values - combinefunc aggregation will raise errors about cstrings not being comparable (while we also can't know if the aggregate is commutative) This commit limits approximating LIMIT pushdown when ordering by aggregates to: min, max, sum, count, bit_and, bit_or, every, any Which means of those we previously supported, we now exclude: avg, array_agg, jsonb_agg, jsonb_object_agg, json_agg, json_object_agg, hll_add, hll_union, topn_add, topn_union	2020-01-30 17:45:18 +00:00
Önder Kalacı	4519d3411d	Improve the representation of used sub plans (#3411 ) Previously, we've identified the usedSubPlans by only looking to the subPlanId. With this commit, we're expanding it to also include information on the location of the subPlan. This is useful to distinguish the cases where the subPlan is used either on only HAVING or both HAVING and any other part of the query.	2020-01-24 10:47:14 +01:00
Philip Dubé	50c5e814c8	CurrentDatabaseName: return const char* as we're borrowing from cache	2020-01-23 22:49:35 +00:00
Önder Kalacı	ef7d1ea91d	Locally execute queries that don't need any data access (#3410) * Update shardPlacement->nodeId to uint As the source of the shardPlacement->nodeId is always workerNode->nodeId, and that is uint32. We had this hack because of: `0ea4e52df5 (r266421409)` And, that is gone with: `90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)` So, we're safe to do it now. * Relax the restrictions on using the local execution Previously, whenever any local execution happened, we disabled further commands from doing any remote queries. The basic motivation for doing that is to prevent any accesses in the same transaction block from accessing the same placements over multiple sessions: one is a local session, the other is a remote session to the same placement. However, the current implementation does not distinguish whether local accesses are to a placement or not. For example, we could have local accesses that only touch intermediate results. In that case, we should not implement the same restrictions as they become useless. So, this is a pre-requisite for executing the intermediate result only queries locally. * Update the error messages As the underlying implementation has changed, reflect it in the error messages. * Keep track of connections to local node With this commit, we're adding infrastructure to track if any connection to the same local host is done or not. The main motivation for doing this is that we were previously more conservative about not choosing local execution. Simply, we disallowed local execution if any connection to any remote node is done. However, if we want to use local execution for intermediate result only queries, this'd be annoying because we expect all queries to touch a remote node before the final query. Note that this approach is still limiting in the Citus MX case, but for now we can ignore that. * Formalize the concept of Local Node Also some minor refactoring while creating the dummy placement * Write intermediate results locally when the results are only needed locally Before this commit, Citus used to always broadcast all the intermediate results to remote nodes. However, it is possible to skip pushing the results to remote nodes always. There are two notable cases for doing that: (a) When the query consists of only intermediate results (b) When the query is a zero shard query In both of the above cases, we don't need to access any data on the shards. So, it is a valuable optimization to skip pushing the results to remote nodes. The pattern mentioned in (a) is actually a common pattern that Citus users use in practice. For example, if you have the following query: WITH cte_1 AS (...), cte_2 AS (....), ... cte_n (...) SELECT ... FROM cte_1 JOIN cte_2 .... JOIN cte_n ...; The final query could be operating only on intermediate results. With this patch, the intermediate results of the ctes are not unnecessarily pushed to remote nodes. * Add specific regression tests As there are edge cases in Citus MX and with round-robin policy, use the same queries on those cases as well. * Fix failure tests By forcing not to use local execution for intermediate results since all the tests expect the results to be pushed remotely. * Fix flaky test * Apply code-review feedback Mostly style changes * Limit the max value of pg_dist_node_seq to reserve for internal use	2020-01-23 18:28:34 +01:00
Onder Kalaci	0bf1e81e33	Cache local plans on BeginScan	2020-01-17 16:02:57 +01:00
Onder Kalaci	3833a7e686	Fix issues for CTE inlining on Postgres 11 Comment from code: /* * We had to implement this hack because on Postgres11 and below, the originalQuery * and the query would have significant differences in terms of CTEs where CTEs * would not be inlined on the query (as standard_planner() wouldn't inline CTEs * on PG 11 and below). * * Instead, we prefer to pass the inlined query to the distributed planning. We rely * on the fact that the query includes subqueries, and it'd definitely go through * query pushdown planning. During query pushdown planning, the only relevant query * tree is the original query. */	2020-01-17 11:59:02 +01:00
Jelte Fennema	246435be7e	Lazy query deparsing executable queries (#3350 ) Deparsing and parsing a query can be heavy on CPU. When locally executing the query we don't need to do this in theory most of the time. This PR is the first step in allowing to skip deparsing and parsing the query in these cases, by lazily creating the query string and storing the query in the task. Future commits will make use of this and not deparse and parse the query anymore, but use the one from the task directly.	2020-01-17 11:49:43 +01:00
Hadi Moshayedi	8635396cea	Repartitioned INSERT/SELECT: Test rollback behaviour	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	97072c9eb1	INSERT/SELECT: show method in EXPLAIN output	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	b4e5f4b10a	Implement INSERT ... SELECT with repartitioning	2020-01-16 23:24:52 -08:00
Jelte Fennema	0ee1eab070	Make tests fail with a useful error message	2020-01-16 18:30:30 +01:00
Onder Kalaci	dc17c2658e	Defer shard pruning for fast-path router queries to execution This is purely to enable better performance with prepared statements. Before this commit, the fast path queries with prepared statements where the distribution key includes a parameter always went through distributed planning. After this change, we only go through distributed planning on the first 5 executions.	2020-01-16 16:59:36 +01:00
Onder Kalaci	64560b07be	Update regression tests-2 In this commit, we're introducing a way to prevent CTE inlining via a GUC. The GUC is used in all the tests where PG 11 and PG 12 tests would diverge otherwise. Note that, in PG 12, the restriction information for CTEs is generated. It means that for some queries involving CTEs, Citus planner (router planner/ pushdown planner) may behave differently. So, via the GUC, we prevent tests from diverging on PG 11 vs PG 12. When we drop PG 11 support, we should get rid of the GUC, and mark relevant ctes as MATERIALIZED, which does the same thing.	2020-01-16 12:28:15 +01:00
Onder Kalaci	5cb203b276	Update regression tests-1 This set of tests has changed in both PG 11 and PG 12. The changes are only about CTE inlining kicking in on both versions, and yielding the exact same distributed planning.	2020-01-16 12:28:15 +01:00
Onder Kalaci	efb1577d06	Handle CTE aliases accurately Basically, make sure to update the column name with the CTEs alias if we need to do so.	2020-01-16 12:28:15 +01:00
Onder Kalaci	05d600dd8f	Call CTE inlining in Citus planner The idea is simple: Inline CTEs (if any), try distributed planning. If the planning yields a successful distributed plan, simply return it. If the planning fails, fallback to distributed planning on the query tree where CTEs are not inlined. In that case, if the planning failed just because of the CTE inlining, via recursive planning, the same query would yield a successful plan. A very basic set of examples: WITH cte_1 AS (SELECT * FROM test_table) SELECT *, row_number() OVER () FROM cte_1; or WITH a AS (SELECT * FROM test_table), b AS (SELECT * FROM test_table) SELECT * FROM a JOIN b ON (a.value> b.value);	2020-01-16 12:28:15 +01:00
Onder Kalaci	01a5800ee8	Add Citus' CTE inlining functions With this commit we add the necessary Citus function to inline CTEs in a queryTree. You might ask, why do we need to inline CTEs if Postgres is already going to do it? Few reasons behind this decision: - One technical note here is that Citus does the recursive CTE planning by checking the originalQuery, which is the query that has not gone through the standard_planner(). CTEs in Citus are super powerful. They are practically key for full SQL coverage for multi-shard queries. With CTEs, you can always reduce any multi-shard query into a router query via recursive planning (thus full SQL coverage). We cannot let CTE inlining break that. The main idea is Citus should be able to retry planning if anything goes wrong after CTE inlining. So, by taking ownership of CTE inlining on the originalQuery, Citus can fallback to recursive planning of CTEs if the planning with the inlined query fails. It could have been a lot harder if we had relied on standard_planner() to have the inlined CTEs on the original query. - We want to have this feature in PostgreSQL 11 as well, but Postgres only inlines in version 12	2020-01-16 12:28:15 +01:00
Onder Kalaci	1856ab6cdd	Copy & paste code from Postgres source All the code in this commit is direct copy & paste from Postgres source code. We can classify the copy&paste code into two: - Copy paste from CTE inline patch from postgres (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=608b167f9f9c4553c35bb1ec0eab9ddae643989b) These include the functions inline_cte(), inline_cte_walker(), contain_dml(), contain_dml_walker(). It also includes the code in function PostgreSQLCTEInlineCondition(). We prefer to extract that code into a separate function, because (a) we'll re-use the logic later (b) we added one check for PG_11 Finally, the struct "inline_cte_walker_context" is also copied from the same Postgres commit. - Copy paste from the other parts of the Postgres code In order to implement CTE inlining in Postgres 12, the hackers modified the query_tree_walker()/range_table_walker() with commit `18c0da88a5` Since Citus needs to support the same logic in PG 11, we copy & pasted those functions (and related flags) with the names pg_12_query_tree_walker() and pg_12_range_table_walker()	2020-01-16 12:28:15 +01:00
Philip Dubé	4d9a733c2f	Fix inserting multiple values with row expression partition column causing the insert to be ignored Raise an error instead of silently inserting nothing if we hit this condition in the future	2020-01-15 21:10:50 +00:00
Philip Dubé	4b5d6c3ebe	Rename RelayFileState to ShardState Replace FILE_ prefix with SHARD_STATE_	2020-01-12 05:57:53 +00:00
Philip Dubé	e71386af33	Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT Also use stack allocation for walkerContext in multi_logical_optimizer	2020-01-10 16:54:00 +00:00
Philip Dubé	281aacce9b	Fix row-gather for subqueries being handled by task-tracker task-tracker has specific logic for MultiPartition when GROUP BY is missing We were ending up in this code path because row-gather removes GROUP BY	2020-01-10 01:51:37 +00:00
Hadi Moshayedi	f38d0e5b3f	Partitioned task list results.	2020-01-09 10:32:58 -08:00
Philip Dubé	bf7d86a3e8	Fix typo: aggragate -> aggregate	2020-01-07 01:16:09 +00:00
Philip Dubé	863bf49507	Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default	2020-01-07 01:16:04 +00:00
Jelte Fennema	5b0baea72c	Refactor distributed_planner for better understandability	2020-01-06 14:23:38 +01:00
Onder Kalaci	5a1e752726	Apply feedback - add fastPath field to plan	2020-01-06 12:42:43 +01:00
Onder Kalaci	13a9b55695	Skip expensive checks when fast-path query The definition of fast-path query is very strict. So, we don't need to do some extra checks.	2020-01-06 12:42:43 +01:00
Onder Kalaci	7f3ab7892d	Skip shard pruning when possible We're already traversing the queryTree and finding the distribution key value, so pass it to the later stages of the planning.	2020-01-06 12:42:43 +01:00
Onder Kalaci	ca293116fa	Reduce calls to FastPathRouterQuery() Before this commit, we called it twice during planning. Instead, we save the information and pass it.	2020-01-06 12:42:43 +01:00
Önder Kalacı	0c70a5470e	Allow RETURNING in fast-path queries (#3352 ) * Allow RETURNING in fast-path queries Because there is no specific reason for that.	2020-01-03 13:42:50 +00:00
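
Commit c6c31e0f1f above attributes crashes to int32 overflow when a hash value is mapped to a shard index. The following is a minimal sketch of an overflow-free mapping, assuming the 32-bit hash space is split into `shard_count` equally sized ranges. The name `uniform_hash_range_index` and its parameters are hypothetical, and this is not the body of CalculateUniformHashRangeIndex; it only illustrates widening to 64 bits before doing the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the Citus source): returns the index
 * of the hash range that hashed_value falls into, in [0, shard_count).
 */
int
uniform_hash_range_index(int32_t hashed_value, int shard_count)
{
	assert(shard_count > 0);

	/* widen first, so that subtracting INT32_MIN cannot overflow */
	int64_t normalized = (int64_t) hashed_value - (int64_t) INT32_MIN;

	/* 2^32 hash tokens split into equally sized ranges (truncating) */
	int64_t range_size = (INT64_C(1) << 32) / shard_count;

	int64_t index = normalized / range_size;

	/* truncation above can push the largest hash values past the last range */
	if (index >= shard_count)
	{
		index = shard_count - 1;
	}

	return (int) index;
}
```

With four shards, for example, `INT32_MIN` maps to index 0 and `INT32_MAX` to index 3. With three shards the clamp keeps `INT32_MAX` in the last range instead of producing an out-of-bounds index.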

471 Commits (c6c31e0f1fe5b8cc955b0da42264578dcdae16cc)
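
The loop-overflow hazard described in commit e2d49c6122 can be reproduced at small scale. This is a minimal sketch, not code from the Citus tree: it swaps the int32/int64 pair for `int8_t`/`int16_t` so it finishes instantly, and both function names are hypothetical. The narrow-counter version bails out at the point where the next increment would wrap, which is where the original pattern would start looping forever.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: iterates with a counter narrower than the limit.
 * Returns the iteration count, or -1 if the counter reaches INT8_MAX while
 * still below the limit (the next i++ would wrap and never reach it).
 */
long
count_with_narrow_counter(int16_t limit)
{
	assert(limit >= 0); /* this sketch only covers non-negative limits */

	long iterations = 0;

	for (int8_t i = 0; i < limit; i++)
	{
		iterations++;
		if (i == INT8_MAX)
		{
			return -1; /* bail out before the counter wraps */
		}
	}

	return iterations;
}

/*
 * Hypothetical fix: the counter has the same width as the limit, so it can
 * always reach the limit and the loop terminates.
 */
long
count_with_wide_counter(int16_t limit)
{
	assert(limit >= 0);

	long iterations = 0;

	for (int16_t i = 0; i < limit; i++)
	{
		iterations++;
	}

	return iterations;
}
```

The design point is only that the loop counter should be at least as wide as the bound it is compared against.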