Commit Graph

3263 Commits (cd2a60699884b833eb577ceb51ba706368a6e36f)

Author SHA1 Message Date
Marco Slot fd5935d798
Always use NOTICE in log_remote_commands and avoid redaction when possible (#3339)
2020-01-14 11:36:56 +01:00
Marco Slot f0d6ea1afb
Merge pull request #3261 from citusdata/remove_copy_from_worker
Remove copy from worker for append-partitioned table
2020-01-14 09:21:19 +01:00
Marco Slot 90056f7d3c Remove copy from worker for append-partitioned table 2020-01-13 23:03:40 -08:00
Philip Dubé 5ec644c691
Merge pull request #3381 from citusdata/mitm-threadsafe
mitmscripts/fluent.py: use atomic increment
2020-01-14 06:32:31 +00:00
Philip Dubé 62524d152d mitmscripts/fluent.py: use atomic increment 2020-01-13 20:35:08 +00:00
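The race that an atomic increment avoids can be sketched as follows. This is only an illustration, not the code in mitmscripts/fluent.py: the `AtomicCounter` class and the `hammer` helper are hypothetical names, and guarding the counter with a lock is just one assumed way to make the increment atomic.

```python
import threading


class AtomicCounter:
    """Hypothetical counter whose read-modify-write runs under a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes "read, add one, store" a single indivisible step.
        with self._lock:
            self._value += 1
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, workers=8, per_worker=1000):
    """Increment the counter from several threads and return the final value."""
    def work():
        for _ in range(per_worker):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

Because every increment happens under the lock, no update can be lost when several threads bump the counter at once.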
Marco Slot f1a0582973 Make ApplyLogRedaction a macro and redefine ereport 2020-01-13 18:24:36 +01:00
Marco Slot 06709ee108 Always use NOTICE in log_remote_commands and avoid redaction when possible 2020-01-13 18:24:36 +01:00
Philip Dubé b6975c7dcf
Merge pull request #3367 from citusdata/propagate-routine
Propagate DROP ROUTINE, ALTER ROUTINE
2020-01-13 15:42:50 +00:00
Philip Dubé ccabf19090 Propagate DROP ROUTINE, ALTER ROUTINE
In two places I've made code more straightforward by using ROUTINE in our own codegen

Two changes which may seem extraneous:

AppendFunctionName was updated to not use pg_get_function_identity_arguments.
This is because that function includes ORDER BY when printing an aggregate like my_rank.
While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres,
ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not.

Tests were updated to use macaddr over integer. Using integer is flaky: our logic
could sometimes end up on tables like users_table. I originally wanted to use money,
but money isn't hashable.
2020-01-13 15:37:46 +00:00
Philip Dubé 8b4429e2dd
Merge pull request #3375 from citusdata/rename-relayfilestate
Rename RelayFileState to ShardState
2020-01-12 06:18:36 +00:00
Philip Dubé 4b5d6c3ebe Rename RelayFileState to ShardState
Replace FILE_ prefix with SHARD_STATE_
2020-01-12 05:57:53 +00:00
Philip Dubé f1a4b97450
Merge pull request #3372 from citusdata/dont-palloc-walkercontext
Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT
2020-01-10 17:06:37 +00:00
Philip Dubé e71386af33 Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT
Also use stack allocation for walkerContext in multi_logical_optimizer
2020-01-10 16:54:00 +00:00
Hadi Moshayedi c7efbf9711
Merge pull request #3355 from citusdata/redistribute_results
Redistribute task list results to correspond to a target relation's distribution
2020-01-09 23:52:28 -08:00
Hadi Moshayedi 40ba2cdd6e Test RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Hadi Moshayedi 527d7d41c1 Implement RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Philip Dubé d855faf2b2
Merge pull request #3371 from citusdata/fix-task-tracker-row-gather-subquery
Fix row-gather for subqueries being handled by task-tracker
2020-01-10 02:03:38 +00:00
Philip Dubé 281aacce9b Fix row-gather for subqueries being handled by task-tracker
task-tracker has specific logic for MultiPartition when GROUP BY is missing

We were ending up in this code path because row-gather removes GROUP BY
2020-01-10 01:51:37 +00:00
Hadi Moshayedi e185b54cbc
Merge pull request #3363 from citusdata/redistribute_failure
PartitionTasklistResults: Use different queries per placement
2020-01-09 11:20:44 -08:00
Hadi Moshayedi e1e383cb59 Don't override xact id assigned by coordinator on workers.
We might need to send commands from workers to other workers. In
these cases we shouldn't override the xact id assigned by coordinator,
or otherwise we won't read the consistent set of result files
across the nodes.
2020-01-09 11:09:11 -08:00
Hadi Moshayedi bb65669186 Failure tests for PartitionTasklistResults 2020-01-09 10:55:58 -08:00
Hadi Moshayedi c7c460e843 PartitionTasklistResults: Use different queries per placement
We need to know which placement succeeded in executing the worker_partition_query_result() call. Otherwise we wouldn't know which node to fetch from. This change allows that by introducing Task::perPlacementQueryStrings.
2020-01-09 10:55:58 -08:00
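The idea of per-placement query strings can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the Citus implementation: `run_on_first_successful_placement`, the `(placement, query)` pairs, and the `execute` callback are all hypothetical, and treating `ConnectionError` as the failure signal is an assumption made for the example.

```python
def run_on_first_successful_placement(per_placement_queries, execute):
    """Try each placement with its own query; return the one that succeeded.

    per_placement_queries: list of (placement, query_string) pairs (hypothetical shape)
    execute: callable(placement, query_string) that raises ConnectionError on failure
    """
    for placement, query in per_placement_queries:
        try:
            execute(placement, query)
        except ConnectionError:
            # This placement failed; fall through to the next one.
            continue
        # The caller now knows which node holds the partitioned result files.
        return placement
    raise RuntimeError("query failed on all placements")
```

Returning the placement, rather than only a success flag, is what lets the caller decide which node to fetch results from.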
Hadi Moshayedi 08b5145765
Merge pull request #3353 from citusdata/partition_task_list_results
Partitioned task list results.

Implements PartitionTasklistResults(), which partitions results of given SELECT tasks based on shard ranges of a given relation.
2020-01-09 10:53:11 -08:00
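Partitioning rows by the shard ranges of a target relation can be sketched as follows. This is a minimal illustration, not the code behind PartitionTasklistResults(): the function name, the representation of shard ranges as a sorted list of lower bounds, and the `key_hash` callback are assumptions made for the example.

```python
import bisect


def partition_by_shard_ranges(rows, shard_min_values, key_hash):
    """Group rows into one bucket per shard.

    shard_min_values: sorted lower bounds of each shard's range (assumed layout)
    key_hash: callable mapping a row to an integer in the sharded value space
    """
    buckets = [[] for _ in shard_min_values]
    for row in rows:
        # Find the last shard whose lower bound is <= the row's hash value.
        index = bisect.bisect_right(shard_min_values, key_hash(row)) - 1
        if index < 0:
            raise ValueError("value falls below the first shard range")
        buckets[index].append(row)
    return buckets
```

For example, with lower bounds `[0, 100, 200]` and an identity hash, the rows `[5, 150, 100, 250]` land in the buckets `[[5], [150, 100], [250]]`.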
Hadi Moshayedi f38d0e5b3f Partitioned task list results. 2020-01-09 10:32:58 -08:00
Philip Dubé 893b4538c2
Merge pull request #3364 from citusdata/fp-deparse
Refactor deparsing/planning to use DistributeObjectOps struct
2020-01-09 18:29:32 +00:00
Philip Dubé 73c06fae3b Introduce GetDistributeObjectOps to organize dispatch of logic dependent on node/object type 2020-01-09 18:24:29 +00:00
Önder Kalacı 22cc5b1240
Merge pull request #3366 from citusdata/normalize-plan-numbers-insert-select
Normalize plan numbers in insert_select output
2020-01-07 10:02:01 +00:00
Jelte Fennema 9724e25065 Normalize plan numbers in insert_select output 2020-01-07 10:34:08 +01:00
Philip Dubé 0e227e391a
Merge pull request #3324 from citusdata/gather-row-aggregation
Pull up intermediate rows to coordinator for aggregates we cannot push down
2020-01-07 01:26:39 +00:00
Philip Dubé bf7d86a3e8 Fix typo: aggragate -> aggregate 2020-01-07 01:16:09 +00:00
Philip Dubé 863bf49507 Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default 2020-01-07 01:16:04 +00:00
Jelte Fennema 16b4140dc8
Use fewer CPU cycles on fast-path planning (#3332)
Fast-path queries were introduced in #2606. The basic idea is that for very simple queries like SELECT count(*) FROM table WHERE dist_key = X, we can skip some parts of the distributed planning. The most notable thing to skip is standard_planner(), which was already done in #2606.

With this commit, we do some further optimizations. First, we used to call the function that decides whether the query is fast-path twice, which can be reduced to once. Second, we used to do shard pruning for every query; now we skip it in some cases. Finally, since the definition of fast-path queries is very strict, we can skip some query traversals.
2020-01-06 14:54:11 +01:00
Jelte Fennema 5b0baea72c Refactor distributed_planner for better understandability 2020-01-06 14:23:38 +01:00
Onder Kalaci 5a1e752726 Apply feedback - add fastPath field to plan 2020-01-06 12:42:43 +01:00
Onder Kalaci 13a9b55695 Skip expensive checks when fast-path query
The definition of fast-path query is very strict. So, we don't need
to do some extra checks.
2020-01-06 12:42:43 +01:00
Onder Kalaci 7f3ab7892d Skip shard pruning when possible
We're already traversing the queryTree and finding the distribution
key value, so pass it to the later stages of the planning.
2020-01-06 12:42:43 +01:00
Onder Kalaci ca293116fa Reduce calls to FastPathRouterQuery()
Before this commit, we called it twice during planning. Instead,
we save the information and pass it.
2020-01-06 12:42:43 +01:00
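The three commits above share one pattern: decide once whether a query is fast-path, capture the distribution key value during that same traversal, and pass both to later planning stages. A toy sketch of the pattern, assuming a drastically simplified query representation (plain dicts) and hypothetical names (`FastPathInfo`, `analyze_query`, `plan`, `shard_for_value`), none of which come from Citus:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FastPathInfo:
    is_fast_path: bool
    dist_key_value: Optional[Any] = None


def analyze_query(query, dist_key):
    """Single traversal: decide fast-path and capture the key value."""
    filters = query.get("where", {})
    simple = len(query.get("tables", [])) == 1 and dist_key in filters
    return FastPathInfo(simple, filters.get(dist_key) if simple else None)


def plan(query, dist_key, shard_for_value):
    # The decision is computed once here...
    info = analyze_query(query, dist_key)
    if info.is_fast_path:
        # ...and reused, so there is no second check and no separate pruning pass.
        return {"fast_path": True, "shard": shard_for_value(info.dist_key_value)}
    return {"fast_path": False, "shard": None}
```

Carrying the result in a small struct keeps later stages from re-deriving what the first traversal already learned.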
Önder Kalacı 270571c106
Merge pull request #3333 from citusdata/fix_wrong_data
Make sure to update shard states of partitions on failures
2020-01-06 11:37:40 +00:00
Onder Kalaci c8f14c9f6c Make sure to update shard states of partitions on failures
Fixes #3331

In #2389, we implemented support for partitioned tables with rep > 1.
The implementation limits the use of modification queries on the
partitions. In fact, we error out when any partition is modified via
EnsurePartitionTableNotReplicated().

However, we seem to have forgotten an important case, where the parent table's
partition is marked as INVALID. In that case, at least one of the partitions
becomes INVALID. However, we never mark partitions as INVALID.

If the user queries the partition table directly, Citus could happily send
the query to INVALID placements -- which are not marked as INVALID.

This PR fixes it by marking the placements of the partitions as INVALID
as well.

The shard placement repair logic already re-creates all the partitions,
so we should be fine on that front.
2020-01-06 12:26:08 +01:00
Jelte Fennema 3c770516eb
Commenting out flaky intermediate data leak test (#3359)
check-multi apparently has an intermediate data leak, so commenting out
that test for now. This was introduced by #3349.

Examples:
- https://app.circleci.com/jobs/github/citusdata/citus/74675
- https://app.circleci.com/jobs/github/citusdata/citus/74683
- https://app.circleci.com/jobs/github/citusdata/citus/74763
2020-01-06 11:55:01 +01:00
Jelte Fennema d29ce8965c
Actually check that test output normalization is applied in CI (#3358)
Fixup of an issue with #3336 that caused CI not to check correctly that
normalized test output was committed.
2020-01-06 10:37:34 +01:00
Jelte Fennema de75243000
Commit normalized test output for better diffs (#3336)
We have a `normalize.sed` script that normalizes the expected file and the
actual file before diffing test output. This makes sure that we don't have random
test failures and that we don't have to update our test output all the time. This PR
takes that one step further and actually commits the normalized files. That way
whenever we DO have to update our test output files only relevant changes will
be visible in the diff.

The other change that this PR makes is that it strips trailing whitespace during
normalization. This works well with our editorconfig settings.

As an added benefit of committing these files it's also much more visible what
new normalization rules will result in. The original changes that were proposed
here were a bit too wide and were changing output that was not intended to
be changed: https://github.com/citusdata/citus/pull/3161#discussion_r360928922
Because these changes are now in the diff of the commit they are much easier to
spot.

Finally the Plan number normalization rules were also added to this PR, because
they are useful even without the CTE inlining PR.
2020-01-06 09:56:31 +01:00
Jelte Fennema 4a20ba3bfc Merge remote-tracking branch 'origin/master' into normalized-test-output 2020-01-06 09:36:04 +01:00
Jelte Fennema 2e4e1c030f Make sure the expected .out file always exists when running diff on it 2020-01-06 09:32:03 +01:00
Jelte Fennema 16bcf15e16 Remove unused normalization rule 2020-01-06 09:32:03 +01:00
Jelte Fennema 634ea80009 Add a basic testing README including normalization explanation 2020-01-06 09:32:03 +01:00
Jelte Fennema 7c3e8e150e Normalize tests: s/Subplan [0-9]+\_/Subplan XXX\_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema acd12a6de5 Normalize tests: s/read_intermediate_result\('[0-9]+_/read_intermediate_result('XXX_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema 21dbd4e55d Normalize tests: s/generating subplan [0-9]+\_/generating subplan XXX\_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema 58723dd8b0 Normalize tests: s/DEBUG: Plan [0-9]+/DEBUG: Plan XXX/g 2020-01-06 09:32:03 +01:00
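The four normalization rules named in the commit titles above, together with the trailing-whitespace stripping described in #3336, can be approximated in Python as below. This is a sketch for illustration only: the project applies these rules with `normalize.sed`, and the `RULES` list and `normalize_line` function are hypothetical names. The patterns are copied from the commit titles as they are shown in this listing.

```python
import re

# (pattern, replacement) pairs mirroring the sed rules in the commit titles.
RULES = [
    (re.compile(r"Subplan [0-9]+_"), "Subplan XXX_"),
    (re.compile(r"read_intermediate_result\('[0-9]+_"), "read_intermediate_result('XXX_"),
    (re.compile(r"generating subplan [0-9]+_"), "generating subplan XXX_"),
    (re.compile(r"DEBUG: Plan [0-9]+"), "DEBUG: Plan XXX"),
]


def normalize_line(line):
    """Replace run-specific plan numbers with XXX and strip trailing whitespace."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line.rstrip()
```

Running both the expected and the actual output through the same function before diffing means that a changed plan number alone never shows up as a difference.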