citus

Commit Graph

Author	SHA1	Message	Date
Onder Kalaci	01a5800ee8	Add Citus' CTE inlining functions With this commit we add the necessary Citus function to inline CTEs in a queryTree. You might ask, why do we need to inline CTEs if Postgres is already going to do it? A few reasons behind this decision: - One technical note here is that Citus does the recursive CTE planning by checking the originalQuery, which is the query that has not gone through the standard_planner(). CTEs in Citus are super powerful. They are practically key for full SQL coverage for multi-shard queries. With CTEs, you can always reduce any multi-shard query into a router query via recursive planning (thus full SQL coverage). We cannot let CTE inlining break that. The main idea is Citus should be able to retry planning if anything goes wrong after CTE inlining. So, by taking ownership of CTE inlining on the originalQuery, Citus can fall back to recursive planning of CTEs if the planning with the inlined query fails. It could have been a lot harder if we had relied on standard_planner() to have the inlined CTEs on the original query. - We want to have this feature in PostgreSQL 11 as well, but Postgres only inlines in version 12	2020-01-16 12:28:15 +01:00
Onder Kalaci	1856ab6cdd	Copy & paste code from Postgres source All the code in this commit is direct copy & paste from Postgres source code. We can classify the copy & paste code into two: - Copy paste from CTE inline patch from postgres (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=608b167f9f9c4553c35bb1ec0eab9ddae643989b) These include the functions inline_cte(), inline_cte_walker(), contain_dml(), contain_dml_walker(). It also includes the code in function PostgreSQLCTEInlineCondition(). We prefer to extract that code into a separate function, because (a) we'll re-use the logic later (b) we added one check for PG_11. Finally, the struct "inline_cte_walker_context" is also copied from the same Postgres commit. - Copy paste from the other parts of the Postgres code In order to implement CTE inlining in Postgres 12, the hackers modified the query_tree_walker()/range_table_walker() with the commit `18c0da88a5`. Since Citus needs to support the same logic in PG 11, we copy & pasted those functions (and related flags) with the names pg_12_query_tree_walker() and pg_12_range_table_walker()	2020-01-16 12:28:15 +01:00
Philip Dubé	1cfebf9f41	Merge pull request #3387 from citusdata/multi_row_insert_bug Multi row insert bug	2020-01-16 05:48:56 +00:00
Philip Dubé	4d9a733c2f	Fix inserting multiple values with row expression partition column causing the insert to be ignored Raise an error instead of silently inserting nothing if we hit this condition in the future	2020-01-15 21:10:50 +00:00
Philip Dubé	f6d4df6da9	Merge pull request #3382 from citusdata/fix-error-on-repeated-placement-done PlacementExecutionDone: We may mark placements as failed multiple times	2020-01-15 18:52:58 +00:00
Philip Dubé	4989c9a15c	PlacementExecutionDone: We may mark placements as failed multiple times, but should only act the first time.	2020-01-15 18:20:01 +00:00
Marco Slot	fd5935d798	Always use NOTICE in log_remote_commands and avoid redaction wh… (#3339 ) Always use NOTICE in log_remote_commands and avoid redaction when possible	2020-01-14 11:36:56 +01:00
Marco Slot	f0d6ea1afb	Merge pull request #3261 from citusdata/remove_copy_from_worker Remove copy from worker for append-partitioned table	2020-01-14 09:21:19 +01:00
Marco Slot	90056f7d3c	Remove copy from worker for append-partitioned table	2020-01-13 23:03:40 -08:00
Philip Dubé	5ec644c691	Merge pull request #3381 from citusdata/mitm-threadsafe mitmscripts/fluent.py: use atomic increment	2020-01-14 06:32:31 +00:00
Philip Dubé	62524d152d	mitmscripts/fluent.py: use atomic increment	2020-01-13 20:35:08 +00:00
Marco Slot	f1a0582973	Make ApplyLogRedaction a macro and redefine ereport	2020-01-13 18:24:36 +01:00
Marco Slot	06709ee108	Always use NOTICE in log_remote_commands and avoid redaction when possible	2020-01-13 18:24:36 +01:00
Philip Dubé	b6975c7dcf	Merge pull request #3367 from citusdata/propagate-routine Propagate DROP ROUTINE, ALTER ROUTINE	2020-01-13 15:42:50 +00:00
Philip Dubé	ccabf19090	Propagate DROP ROUTINE, ALTER ROUTINE In two places I've made code more straightforward by using ROUTINE in our own codegen. Two changes which may seem extraneous: AppendFunctionName was updated to not use pg_get_function_identity_arguments. This is because that function includes ORDER BY when printing an aggregate like my_rank. While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres, ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not. Tests were updated to use macaddr over integer. Using integer is flaky, our logic could sometimes end up on tables like users_table. I originally wanted to use money, but money isn't hashable.	2020-01-13 15:37:46 +00:00
Philip Dubé	8b4429e2dd	Merge pull request #3375 from citusdata/rename-relayfilestate Rename RelayFileState to ShardState	2020-01-12 06:18:36 +00:00
Philip Dubé	4b5d6c3ebe	Rename RelayFileState to ShardState Replace FILE_ prefix with SHARD_STATE_	2020-01-12 05:57:53 +00:00
Philip Dubé	f1a4b97450	Merge pull request #3372 from citusdata/dont-palloc-walkercontext Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT	2020-01-10 17:06:37 +00:00
Philip Dubé	e71386af33	Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT Also use stack allocation for walkerContext in multi_logical_optimizer	2020-01-10 16:54:00 +00:00
Hadi Moshayedi	c7efbf9711	Merge pull request #3355 from citusdata/redistribute_results Redistribute task list results to correspond to a target relation's distribution	2020-01-09 23:52:28 -08:00
Hadi Moshayedi	40ba2cdd6e	Test RedistributeTaskListResult	2020-01-09 23:47:25 -08:00
Hadi Moshayedi	527d7d41c1	Implement RedistributeTaskListResult	2020-01-09 23:47:25 -08:00
Philip Dubé	d855faf2b2	Merge pull request #3371 from citusdata/fix-task-tracker-row-gather-subquery Fix row-gather for subqueries being handled by task-tracker	2020-01-10 02:03:38 +00:00
Philip Dubé	281aacce9b	Fix row-gather for subqueries being handled by task-tracker task-tracker has specific logic for MultiPartition when GROUP BY is missing We were ending up in this code path because row-gather removes GROUP BY	2020-01-10 01:51:37 +00:00
Hadi Moshayedi	e185b54cbc	Merge pull request #3363 from citusdata/redistribute_failure PartitionTasklistResults: Use different queries per placement	2020-01-09 11:20:44 -08:00
Hadi Moshayedi	e1e383cb59	Don't override xact id assigned by coordinator on workers. We might need to send commands from workers to other workers. In these cases we shouldn't override the xact id assigned by coordinator, or otherwise we won't read a consistent set of result files across the nodes.	2020-01-09 11:09:11 -08:00
Hadi Moshayedi	bb65669186	Failure tests for PartitionTasklistResults	2020-01-09 10:55:58 -08:00
Hadi Moshayedi	c7c460e843	PartitionTasklistResults: Use different queries per placement We need to know which placement succeeded in executing the worker_partition_query_result() call. Otherwise we wouldn't know which node to fetch from. This change allows that by introducing Task::perPlacementQueryStrings.	2020-01-09 10:55:58 -08:00
Hadi Moshayedi	08b5145765	Merge pull request #3353 from citusdata/partition_task_list_results Partitioned task list results. Implements PartitionTasklistResults(), which partitions results of given SELECT tasks based on shard ranges of a given relation.	2020-01-09 10:53:11 -08:00
Hadi Moshayedi	f38d0e5b3f	Partitioned task list results.	2020-01-09 10:32:58 -08:00
Philip Dubé	893b4538c2	Merge pull request #3364 from citusdata/fp-deparse Refactor deparsing/planning to use DistributeObjectOps struct	2020-01-09 18:29:32 +00:00
Philip Dubé	73c06fae3b	Introduce GetDistributeObjectOps to organize dispatch of logic dependent on node/object type	2020-01-09 18:24:29 +00:00
Önder Kalacı	22cc5b1240	Merge pull request #3366 from citusdata/normalize-plan-numbers-insert-select Normalize plan numbers in insert_select output	2020-01-07 10:02:01 +00:00
Jelte Fennema	9724e25065	Normalize plan numbers in insert_select output	2020-01-07 10:34:08 +01:00
Philip Dubé	0e227e391a	Merge pull request #3324 from citusdata/gather-row-aggregation Pull up intermediate rows to coordinator for aggregates we cannot push down	2020-01-07 01:26:39 +00:00
Philip Dubé	bf7d86a3e8	Fix typo: aggragate -> aggregate	2020-01-07 01:16:09 +00:00
Philip Dubé	863bf49507	Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default	2020-01-07 01:16:04 +00:00
Jelte Fennema	16b4140dc8	Use fewer CPU cycles on fast-path planning (#3332) Fast-path queries are introduced with #2606. The basic idea is that for very simple queries like SELECT count(*) FROM table WHERE dist_key = X, we can skip some parts of the distributed planning. The most notable thing to skip is standard_planner(), which was already done in #2606. With this commit, we do some further optimizations. First, we used to call the function which decides whether the query is fast path twice, which can be reduced to one. Second, we used to do shard pruning for every query, now we'll optimize it for some cases. Finally, since the definition of fast-path queries is very strict, we can skip some query traversals.	2020-01-06 14:54:11 +01:00
Jelte Fennema	5b0baea72c	Refactor distributed_planner for better understandability	2020-01-06 14:23:38 +01:00
Onder Kalaci	5a1e752726	Apply feedback - add fastPath field to plan	2020-01-06 12:42:43 +01:00
Onder Kalaci	13a9b55695	Skip expensive checks when fast-path query The definition of fast-path query is very strict. So, we don't need to do some extra checks.	2020-01-06 12:42:43 +01:00
Onder Kalaci	7f3ab7892d	Skip shard pruning when possible We're already traversing the queryTree and finding the distribution key value, so pass it to the later stages of the planning.	2020-01-06 12:42:43 +01:00
Onder Kalaci	ca293116fa	Reduce calls to FastPathRouterQuery() Before this commit, we called it twice during planning. Instead, we save the information and pass it.	2020-01-06 12:42:43 +01:00
Önder Kalacı	270571c106	Merge pull request #3333 from citusdata/fix_wrong_data Make sure to update shard states of partitions on failures	2020-01-06 11:37:40 +00:00
Onder Kalaci	c8f14c9f6c	Make sure to update shard states of partitions on failures Fixes #3331 In #2389, we've implemented support for partitioned tables with rep > 1. The implementation is limiting the use of modification queries on the partitions. In fact, we error out when any partition is modified via EnsurePartitionTableNotReplicated(). However, we seem to have forgotten an important case, where the parent table's partition is marked as INVALID. In that case, at least one of the partitions becomes INVALID. However, we do not mark partitions as INVALID ever. If the user queries the partition table directly, Citus could happily send the query to INVALID placements -- which are not marked as INVALID. This PR fixes it by marking the placements of the partitions as INVALID as well. The shard placement repair logic already re-creates all the partitions, so should be fine on that front.	2020-01-06 12:26:08 +01:00
Jelte Fennema	3c770516eb	Commenting out flaky intermediate data leak test (#3359 ) check-multi apparently has an intermediate data leak, so commenting out that test for now. This was introduced by #3349 Examples: - https://app.circleci.com/jobs/github/citusdata/citus/74675 - https://app.circleci.com/jobs/github/citusdata/citus/74683 - https://app.circleci.com/jobs/github/citusdata/citus/74763	2020-01-06 11:55:01 +01:00
Jelte Fennema	d29ce8965c	Actually check that test output normalization is applied in CI (#3358 ) Fixup of an issue with #3336 that caused CI not to check correctly that normalized test output was committed.	2020-01-06 10:37:34 +01:00
Jelte Fennema	de75243000	Commit normalized test output for better diffs (#3336) We have a `normalize.sed` script that before diffing test output normalizes the expected file and the actual file. This makes sure that we don't have random test failures and that we don't have to update our test output all the time. This PR takes that one step further and actually commits the normalized files. That way whenever we DO have to update our test output files only relevant changes will be visible in the diff. The other change that this PR makes is that it strips trailing whitespace during normalization. This works well with our editorconfig settings. As an added benefit of committing these files it's also much more visible what new normalization rules will result in. The original changes that were proposed here were a bit too wide and were changing output that was not intended to be changed: https://github.com/citusdata/citus/pull/3161#discussion_r360928922 Because these changes are now in the diff of the commit they are much easier to spot. Finally the Plan number normalization rules were also added to this PR, because they are useful even without the CTE inlining PR.	2020-01-06 09:56:31 +01:00
Jelte Fennema	4a20ba3bfc	Merge remote-tracking branch 'origin/master' into normalized-test-output	2020-01-06 09:36:04 +01:00
Jelte Fennema	2e4e1c030f	Make sure the expected .out file always exists when running diff on it	2020-01-06 09:32:03 +01:00
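
The test-output normalization described in commit de75243000 above (normalize both the expected and the actual output before diffing, and strip trailing whitespace) could look roughly like the sketch below. This is an illustration only, not the contents of `normalize.sed`: the rule list, the `Plan XXX` placeholder, and the function names `normalize` and `outputs_match` are assumptions made for the example.

```python
import re

# Hypothetical rules in the spirit of a normalize.sed script.
# Each entry is (compiled pattern, replacement); neither rule is taken from Citus.
RULES = [
    # Assumed example rule: hide plan numbers that vary between runs.
    (re.compile(r"Plan \d+"), "Plan XXX"),
    # Strip trailing spaces and tabs at the end of every line.
    (re.compile(r"[ \t]+$", re.MULTILINE), ""),
]


def normalize(text: str) -> str:
    """Apply every normalization rule, in order, to the given test output."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


def outputs_match(expected: str, actual: str) -> bool:
    """Compare expected and actual output after normalizing both sides."""
    return normalize(expected) == normalize(actual)
```

With rules like these, two outputs that differ only in plan numbers or trailing whitespace compare as equal, while any other difference still shows up in the diff.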

3119 Commits (01a5800ee831d644e39c335c43d4bbbf9439e5bd)