2027 Commits (bc1a800f7095c8fe29f0c95e21c97dc008eca225)

Author SHA1 Message Date
Hadi Moshayedi 3258d87f3e Isolation tests for INSERT/SELECT repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 8b27a9a195 More range partitioned tests 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 8635396cea Repartitioned INSERT/SELECT: Test rollback behaviour 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 43218eebf6 Failure tests for INSERT/SELECT repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 665b33dca1 MX tests for INSERT/SELECT repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi af2349f21f Repartitioned INSERT/SELECT: Add a prepared statement test 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 97072c9eb1 INSERT/SELECT: show method in EXPLAIN output 2020-01-16 23:24:52 -08:00
Hadi Moshayedi b143d9588a Repartitioned INSERT/SELECT: Test GROUP BY 2020-01-16 23:24:52 -08:00
Hadi Moshayedi fe548b762f Repartitioned INSERT/SELECT: Test CTEs 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 494cc383cc Repartitioned INSERT/SELECT: Enable RETURNING 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 4b14347fc3 Tests for DML followed by insert/select repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 44a2aede16 Don't start a coordinated transaction on workers.
Otherwise transaction hooks of Citus kick in and might cause unwanted errors.
2020-01-16 23:24:52 -08:00
Hadi Moshayedi 42c3c03b85 Handle extra columns added in ExpandWorkerTargetEntry() in repartitioned INSERT/SELECT 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 89463f9760 Repartitioned INSERT/SELECT: cast columns in SELECT targets 2020-01-16 23:24:52 -08:00
Hadi Moshayedi d67a384350 Enable repartitioned INSERT/SELECT ON CONFLICT. 2020-01-16 23:24:52 -08:00
Hadi Moshayedi b4e5f4b10a Implement INSERT ... SELECT with repartitioning 2020-01-16 23:24:52 -08:00
Hadi Moshayedi ced876358d INSERT/SELECT: Refactor out AddInsertSelectCasts 2020-01-16 23:24:52 -08:00
Hadi Moshayedi d449c1857c INSERT/SELECT: Use ExecutePlan* instead of ExecuteSelect* 2020-01-16 23:24:52 -08:00
Hadi Moshayedi e30580e2bd Add ORDER BY to multi_row_insert.sql 2020-01-16 15:20:39 -08:00
Jelte Fennema 0ee1eab070 Make tests fail with a useful error message 2020-01-16 18:30:30 +01:00
Jelte Fennema cb5154cf03 Add more failing tests, of which some have bad error messages 2020-01-16 18:30:30 +01:00
Marco Slot 82f1fffa28 Fix epoll_ctl() error message on connection error 2020-01-16 06:40:57 +01:00
Onder Kalaci dc17c2658e Defer shard pruning for fast-path router queries to execution
This is purely to enable better performance with prepared statements.
Before this commit, the fast path queries with prepared statements
where the distribution key includes a parameter always went through
distributed planning. After this change, we only go through distributed
planning on the first 5 executions.
2020-01-16 16:59:36 +01:00
Onder Kalaci 933d666c0d Do not forget to copy fastPathRouterPlan@DistributedPlan 2020-01-16 16:39:20 +01:00
Halil Ozan Akgul c5539d20d9 Adds alter table schema propagation 2020-01-16 17:04:16 +03:00
Nils Dijk b6e09eb691
Fix: distributed function with table reference in declare (#3384)
DESCRIPTION: Fixes a problem when adding a new node due to tables referenced in a function's body

Fixes #3378 

It was reported that `master_add_node` would fail if a distributed function has a table name referenced in its declare section of the body. By default postgres validates the body of a function on creation. This is not a problem in the normal case as tables are replicated to the workers when we distribute functions.

However, when a new node is added we first create dependencies on the workers before we try to create any tables, and the original tables get created out of band when the metadata gets synced to the new node. This causes the function body validator to raise an error because the table is not on the worker.

To mitigate this issue we set `check_function_bodies` to `off` right before we create the function.

The added test shows this does resolve the issue. (issue can be reproduced on the commit without the fix)
2020-01-16 14:21:54 +01:00
Jelte Fennema e76281500c
Replace shardId lock with lock on colocation+shardIntervalIndex (#3374)
This new locking pattern makes sure that some deadlocks that could
happen during rebalancing cannot occur anymore.
2020-01-16 13:14:01 +01:00
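A minimal sketch of the idea behind the lock change above, not the Citus implementation. The `ShardInterval` class, the key layout, and the shard numbers are hypothetical, and it assumes that colocated shards share the same colocation id and shard interval index. Under that assumption, a lock keyed on the pair gives one lock per colocated shard group, where per-shard locks would give several locks that different sessions could acquire in different orders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardInterval:
    # Hypothetical, simplified view of a shard for illustration only.
    shard_id: int
    colocation_id: int
    shard_interval_index: int


def old_lock_key(shard):
    # Old pattern: one lock per individual shard.
    return ("shard", shard.shard_id)


def new_lock_key(shard):
    # New pattern: one lock per (colocation group, shard interval index).
    return ("colocation", shard.colocation_id, shard.shard_interval_index)


# Two colocated shards covering the same hash range, plus a neighbouring shard.
orders_shard = ShardInterval(shard_id=102008, colocation_id=7, shard_interval_index=0)
items_shard = ShardInterval(shard_id=102040, colocation_id=7, shard_interval_index=0)
other_shard = ShardInterval(shard_id=102009, colocation_id=7, shard_interval_index=1)

# Colocated shards collapse to a single lock key under the new pattern.
same_lock = new_lock_key(orders_shard) == new_lock_key(items_shard)
```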
Jelte Fennema 86343bcc8f Re-add test that broke with GUC workaround 2020-01-16 12:34:50 +01:00
Jelte Fennema 6b9b633695 Add more tests for prepared statements 2020-01-16 12:28:15 +01:00
Jelte Fennema 43a3fdd12f Fix comment 2020-01-16 12:28:15 +01:00
Jelte Fennema fe3827e499 Add tests for [NOT] MATERIALIZED 2020-01-16 12:28:15 +01:00
Onder Kalaci 326dfab44a Fix a query which triggers an existing bug, see https://github.com/citusdata/citus/issues/3189#issuecomment-571497051 2020-01-16 12:28:15 +01:00
Onder Kalaci 81d8178625 Note that we'll drop the GUC after PG 11 support is dropped 2020-01-16 12:28:15 +01:00
Onder Kalaci c653923960 Update regression tests 6
Local execution and CTE pushdown
2020-01-16 12:28:15 +01:00
Onder Kalaci 3818be45a6 Update regression tests-5
Failure tests that rely on intermediate results
2020-01-16 12:28:15 +01:00
Onder Kalaci 1e85938b46 Update regression tests-4
Update the MX tests. Similar to the previous commits, prevent CTE
inlining in some cases to prevent divergent test outputs.
2020-01-16 12:28:15 +01:00
Onder Kalaci fc07bd7c5b Update regression tests-3
Update the regression tests which only change in PG 12.
2020-01-16 12:28:15 +01:00
Onder Kalaci 64560b07be Update regression tests-2
In this commit, we're introducing a way to prevent CTE inlining via a GUC.

The GUC is used in all the tests where PG 11 and PG 12 tests would diverge
otherwise.

Note that, in PG 12, the restriction information for CTEs is generated. It
means that for some queries involving CTEs, Citus planner (router planner/
pushdown planner) may behave differently. So, via the GUC, we prevent
tests from diverging on PG 11 vs PG 12.

When we drop PG 11 support, we should get rid of the GUC, and mark
relevant CTEs as MATERIALIZED, which does the same thing.
2020-01-16 12:28:15 +01:00
Onder Kalaci 5cb203b276 Update regression tests-1
This set of tests has changed in both PG 11 and PG 12.
The changes are only about CTE inlining kicking in on both
versions, and yielding the exact same distributed planning.
2020-01-16 12:28:15 +01:00
Onder Kalaci 421bf68516 Add the specific regression tests
With this commit, we're adding the specific tests for CTE inlining.
The test has a different output file for pg 11, because as mentioned
in the previous commits, PG 12 generates more restriction information
for CTEs.
2020-01-16 12:28:15 +01:00
Onder Kalaci efb1577d06 Handle CTE aliases accurately
Basically, make sure to update the column name with the CTE's alias
if we need to do so.
2020-01-16 12:28:15 +01:00
Onder Kalaci 05d600dd8f Call CTE inlining in Citus planner
The idea is simple: Inline CTEs (if any), try distributed planning.
If the planning yields a successful distributed plan, simply return
it.

If the planning fails, fall back to distributed planning on the query
tree where CTEs are not inlined. In that case, if the planning failed
just because of the CTE inlining, via recursive planning, the same
query would yield a successful plan.

A very basic set of examples:

WITH cte_1 AS (SELECT * FROM test_table)
SELECT
	*, row_number() OVER ()
FROM
	cte_1;

or

WITH a AS (SELECT * FROM test_table),
b AS (SELECT * FROM test_table)
SELECT * FROM  a JOIN b ON (a.value> b.value);
2020-01-16 12:28:15 +01:00
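The try-then-fall-back flow described in the commit above can be sketched roughly as follows. This is a toy model, not the Citus planner: `inline_ctes`, `plan_distributed`, `plan_with_fallback`, the dictionary query representation, and the rule that an inlined query with a window function fails to plan are all assumptions made for illustration.

```python
class PlanningError(Exception):
    """Raised by the toy planner when it cannot produce a plan."""


def inline_ctes(query):
    # Toy stand-in: return a copy flagged as inlined; the original is untouched.
    return {**query, "inlined": True}


def plan_distributed(query):
    # Toy rule (an assumption for illustration): an inlined query that uses
    # a window function cannot be planned directly.
    if query.get("inlined") and query.get("has_window_function"):
        raise PlanningError("cannot plan inlined query")
    strategy = "inlined" if query.get("inlined") else "recursive"
    return {"strategy": strategy, "sql": query["sql"]}


def plan_with_fallback(original_query):
    # First attempt: plan the query with its CTEs inlined.
    try:
        return plan_distributed(inline_ctes(original_query))
    except PlanningError:
        # Fallback: plan the original tree, where CTEs are not inlined.
        return plan_distributed(original_query)


join_query = {"sql": "WITH a AS (...), b AS (...) SELECT ...", "has_window_function": False}
window_query = {"sql": "WITH cte_1 AS (...) SELECT *, row_number() OVER () ...", "has_window_function": True}

join_plan = plan_with_fallback(join_query)
window_plan = plan_with_fallback(window_query)
```

The original query object is never mutated, so the fallback always sees the tree without inlined CTEs.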
Onder Kalaci 01a5800ee8 Add Citus' CTE inlining functions
With this commit we add the necessary Citus function to inline CTEs
in a queryTree.

You might ask, why do we need to inline CTEs if Postgres is already
going to do it?

A few reasons behind this decision:

- One technical note here is that Citus does the recursive CTE planning
  by checking the originalQuery, which is the query that has not gone
  through the standard_planner().

  CTEs in Citus are super powerful. They are practically key for full SQL
  coverage for multi-shard queries. With CTEs, you can always reduce
  any multi-shard query into a router query via recursive
  planning (thus full SQL coverage).
  We cannot let CTE inlining break that. The main idea is Citus should
  be able to retry planning if anything goes wrong after CTE inlining.

  So, by taking ownership of CTE inlining on the originalQuery, Citus
  can fall back to recursive planning of CTEs if the planning with the
  inlined query fails. It could have been a lot harder if we had relied
  on standard_planner() to have the inlined CTEs on the original query.

- We want to have this feature in PostgreSQL 11 as well, but Postgres
  only inlines in version 12
2020-01-16 12:28:15 +01:00
Onder Kalaci 1856ab6cdd Copy & paste code from Postgres source
All the code in this commit is direct copy & paste from Postgres
source code.

We can classify the copy&paste code into two groups:

- Copy paste from CTE inline patch from postgres
  (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=608b167f9f9c4553c35bb1ec0eab9ddae643989b)
  These include the functions inline_cte(), inline_cte_walker(),
  contain_dml(), contain_dml_walker().
  It also includes the code in function PostgreSQLCTEInlineCondition().
  We prefer to extract that code into a separate function, because
  (a) we'll re-use the logic later (b) we added one check for PG_11

  Finally, the struct "inline_cte_walker_context" is also copied from
  the same Postgres commit.

- Copy paste from the other parts of the Postgres code

  In order to implement CTE inlining in Postgres 12, the hackers
  modified query_tree_walker()/range_table_walker() in commit
  18c0da88a5

  Since Citus needs to support the same logic in PG 11, we copied and pasted
  those functions (and related flags) with the names pg_12_query_tree_walker()
  and pg_12_range_table_walker()
2020-01-16 12:28:15 +01:00
Philip Dubé 4d9a733c2f Fix inserting multiple values with row expression partition column causing the insert to be ignored
Raise an error instead of silently inserting nothing if we hit this condition in the future
2020-01-15 21:10:50 +00:00
Philip Dubé 4989c9a15c PlacementExecutionDone: We may mark placements as failed multiple times, but should only act the first time. 2020-01-15 18:20:01 +00:00
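The "act only the first time" behaviour mentioned in the commit above is a common idempotency guard. A minimal sketch, assuming a hypothetical `PlacementExecution` class with a `mark_failed` method (neither name is taken from the Citus source):

```python
class PlacementExecution:
    """Hypothetical placement state used only to illustrate the guard."""

    def __init__(self):
        self.failed = False
        self.failure_actions = 0  # how many times failure handling actually ran

    def mark_failed(self):
        # Marking an already-failed placement again is allowed but is a no-op.
        if self.failed:
            return False
        self.failed = True
        self.failure_actions += 1  # side effects run only on the first call
        return True


placement = PlacementExecution()
first = placement.mark_failed()
second = placement.mark_failed()
```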
Marco Slot f1a0582973 Make ApplyLogRedaction a macro and redefine ereport 2020-01-13 18:24:36 +01:00
Marco Slot 06709ee108 Always use NOTICE in log_remote_commands and avoid redaction when possible 2020-01-13 18:24:36 +01:00
Marco Slot 90056f7d3c Remove copy from worker for append-partitioned table 2020-01-13 23:03:40 -08:00
Philip Dubé 62524d152d mitmscripts/fluent.py: use atomic increment 2020-01-13 20:35:08 +00:00
Philip Dubé ccabf19090 Propagate DROP ROUTINE, ALTER ROUTINE
In two places I've made code more straightforward by using ROUTINE in our own codegen

Two changes which may seem extraneous:

AppendFunctionName was updated to not use pg_get_function_identity_arguments.
This is because that function includes ORDER BY when printing an aggregate like my_rank.
While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres,
ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not.

Tests were updated to use macaddr over integer. Using integer is flaky, our logic
could sometimes end up on tables like users_table. I originally wanted to use money,
but money isn't hashable.
2020-01-13 15:37:46 +00:00
Philip Dubé 4b5d6c3ebe Rename RelayFileState to ShardState
Replace FILE_ prefix with SHARD_STATE_
2020-01-12 05:57:53 +00:00
Philip Dubé e71386af33 Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT
Also use stack allocation for walkerContext in multi_logical_optimizer
2020-01-10 16:54:00 +00:00
Hadi Moshayedi 40ba2cdd6e Test RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Hadi Moshayedi 527d7d41c1 Implement RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Philip Dubé 281aacce9b Fix row-gather for subqueries being handled by task-tracker
task-tracker has specific logic for MultiPartition when GROUP BY is missing

We were ending up in this code path because row-gather removes GROUP BY
2020-01-10 01:51:37 +00:00
Hadi Moshayedi e1e383cb59 Don't override xact id assigned by coordinator on workers.
We might need to send commands from workers to other workers. In
these cases we shouldn't override the xact id assigned by coordinator,
or otherwise we won't read a consistent set of result files
across the nodes.
2020-01-09 11:09:11 -08:00
Hadi Moshayedi bb65669186 Failure tests for PartitionTasklistResults 2020-01-09 10:55:58 -08:00
Hadi Moshayedi c7c460e843 PartitionTasklistResults: Use different queries per placement
We need to know which placement succeeded in executing the worker_partition_query_result() call. Otherwise we wouldn't know which node to fetch from. This change allows that by introducing Task::perPlacementQueryStrings.
2020-01-09 10:55:58 -08:00
Hadi Moshayedi f38d0e5b3f Partitioned task list results. 2020-01-09 10:32:58 -08:00
Philip Dubé 73c06fae3b Introduce GetDistributeObjectOps to organize dispatch of logic dependent on node/object type 2020-01-09 18:24:29 +00:00
Jelte Fennema 9724e25065 Normalize plan numbers in insert_select output 2020-01-07 10:34:08 +01:00
Philip Dubé bf7d86a3e8 Fix typo: aggragate -> aggregate 2020-01-07 01:16:09 +00:00
Philip Dubé 863bf49507 Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default 2020-01-07 01:16:04 +00:00
Jelte Fennema 5b0baea72c Refactor distributed_planner for better understandability 2020-01-06 14:23:38 +01:00
Onder Kalaci 5a1e752726 Apply feedback - add fastPath field to plan 2020-01-06 12:42:43 +01:00
Onder Kalaci 13a9b55695 Skip expensive checks when fast-path query
The definition of fast-path query is very strict. So, we don't need
to do some extra checks.
2020-01-06 12:42:43 +01:00
Onder Kalaci 7f3ab7892d Skip shard pruning when possible
We're already traversing the queryTree and finding the distribution
key value, so pass it to the later stages of the planning.
2020-01-06 12:42:43 +01:00
Onder Kalaci ca293116fa Reduce calls to FastPathRouterQuery()
Before this commit, we called it twice during planning. Instead,
we save the information and pass it.
2020-01-06 12:42:43 +01:00
Onder Kalaci c8f14c9f6c Make sure to update shard states of partitions on failures
Fixes #3331

In #2389, we've implemented support for partitioned tables with rep > 1.
The implementation is limiting the use of modification queries on the
partitions. In fact, we error out when any partition is modified via
EnsurePartitionTableNotReplicated().

However, we seem to have forgotten an important case, where the parent table's
partition is marked as INVALID. In that case, at least one of the partitions
becomes INVALID. However, we do not mark partitions as INVALID ever.

If the user queries the partition table directly, Citus could happily send
the query to INVALID placements -- which are not marked as INVALID.

This PR fixes it by marking the placements of the partitions as INVALID
as well.

The shard placement repair logic already re-creates all the partitions,
so we should be fine on that front.
2020-01-06 12:26:08 +01:00
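A rough sketch of the fix described in the commit above: when a placement of the parent table's shard is marked INVALID, the placements of its partitions on the same node are marked INVALID too. The data model here (a dictionary keyed by shard name and node, the `partitions_of` mapping, and all shard and node names) is a hypothetical simplification, not the Citus metadata layout.

```python
VALID, INVALID = "VALID", "INVALID"


def mark_placement_invalid(states, partitions_of, shard, node):
    """Mark (shard, node) INVALID, plus that node's placements of its partitions."""
    states[(shard, node)] = INVALID
    for partition_shard in partitions_of.get(shard, []):
        states[(partition_shard, node)] = INVALID


# Replication factor 2: every shard has a placement on both workers.
states = {
    ("parent_102008", "worker-1"): VALID,
    ("parent_102008", "worker-2"): VALID,
    ("part_2020_102012", "worker-1"): VALID,
    ("part_2020_102012", "worker-2"): VALID,
}
partitions_of = {"parent_102008": ["part_2020_102012"]}

# A modification through the parent failed on worker-2.
mark_placement_invalid(states, partitions_of, "parent_102008", "worker-2")
```

With the partition's placement on worker-2 also marked, a query sent directly to the partition would no longer be routed to the stale copy.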
Jelte Fennema 3c770516eb
Commenting out flaky intermediate data leak test (#3359)
check-multi apparently has an intermediate data leak, so commenting out
that test for now. This was introduced by #3349 

Examples:
- https://app.circleci.com/jobs/github/citusdata/citus/74675
- https://app.circleci.com/jobs/github/citusdata/citus/74683
- https://app.circleci.com/jobs/github/citusdata/citus/74763
2020-01-06 11:55:01 +01:00
Jelte Fennema d29ce8965c
Actually check that test output normalization is applied in CI (#3358)
Fixup of an issue with #3336 that caused CI not to check correctly that
normalized test output was committed.
2020-01-06 10:37:34 +01:00
Jelte Fennema 4a20ba3bfc Merge remote-tracking branch 'origin/master' into normalized-test-output 2020-01-06 09:36:04 +01:00
Jelte Fennema 2e4e1c030f Make sure the expected .out file always exists when running diff on it 2020-01-06 09:32:03 +01:00
Jelte Fennema 16bcf15e16 Remove unused normalization rule 2020-01-06 09:32:03 +01:00
Jelte Fennema 634ea80009 Add a basic testing README including normalization explanation 2020-01-06 09:32:03 +01:00
Jelte Fennema 7c3e8e150e Normalize tests: s/Subplan [0-9]+\_/Subplan XXX\_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema acd12a6de5 Normalize tests: s/read_intermediate_result\('[0-9]+_/read_intermediate_result('XXX_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema 21dbd4e55d Normalize tests: s/generating subplan [0-9]+\_/generating subplan XXX\_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema 58723dd8b0 Normalize tests: s/DEBUG: Plan [0-9]+/DEBUG: Plan XXX/g 2020-01-06 09:32:03 +01:00
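The four substitution rules named in the commit titles above can be tried out with a small script. This is a sketch that ports the sed-style expressions to Python's `re.sub`; the `normalize` helper and the sample log lines are made up for illustration, and the patterns are copied as they appear in this log (whitespace in the real rules may differ).

```python
import re

# (pattern, replacement) pairs ported from the sed-style rules in the commit titles.
NORMALIZATION_RULES = [
    (r"Subplan [0-9]+_", "Subplan XXX_"),
    (r"read_intermediate_result\('[0-9]+_", "read_intermediate_result('XXX_"),
    (r"generating subplan [0-9]+_", "generating subplan XXX_"),
    (r"DEBUG: Plan [0-9]+", "DEBUG: Plan XXX"),
]


def normalize(line):
    # Apply every rule globally, like the trailing /g in the sed expressions.
    for pattern, replacement in NORMALIZATION_RULES:
        line = re.sub(pattern, replacement, line)
    return line


# Made-up sample lines, not real test output.
samples = [
    "DEBUG: Plan 42 query after replacing subqueries and CTEs",
    "generating subplan 17_1 for CTE cte_1",
    "SELECT * FROM read_intermediate_result('17_1'::text)",
]
normalized = [normalize(s) for s in samples]
```

Replacing run-specific plan numbers with a fixed token keeps expected output files stable across test runs.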
Jelte Fennema 34c5532e9c Add commented out rules to normalize Plan numbers 2020-01-06 09:32:03 +01:00
Jelte Fennema 38ac28b4b8 Normalize tests: intermediate_results 2020-01-06 09:32:03 +01:00
Jelte Fennema 0c6983a80e Normalize tests: pg12 changes 2020-01-06 09:32:03 +01:00
Jelte Fennema 7730bd449c Normalize tests: Remove trailing whitespace 2020-01-06 09:32:03 +01:00
Jelte Fennema 6353c9907f Normalize tests: Line info varies between versions 2020-01-06 09:32:03 +01:00
Jelte Fennema bf2c203908 Normalize tests: isolation_ref2ref_foreign_keys 2020-01-06 09:32:03 +01:00
Jelte Fennema 7b2c769a5d Normalize tests: normalize file names for partitioned files 2020-01-06 09:32:03 +01:00
Jelte Fennema 98bab9caab Normalize tests: ignore WAL warnings 2020-01-06 09:32:03 +01:00
Jelte Fennema 5c0f955ab9 Normalize tests: ignore could not consume warnings 2020-01-06 09:32:03 +01:00
Jelte Fennema dc3cff991f Normalize tests: normalize failed task ids 2020-01-06 09:32:03 +01:00
Jelte Fennema d0ade90cd0 Normalize tests: pkey constraints for multi_insert_select 2020-01-06 09:32:03 +01:00
Jelte Fennema 704e1d2bc8 Normalize tests: shard table names for multi_name_lengths 2020-01-06 09:32:03 +01:00
Jelte Fennema 1c4ea6836b Normalize tests: shard table names for multi_insert_select_conflict 2020-01-06 09:32:03 +01:00
Jelte Fennema 27997c054e Normalize tests: shard table names for foreign_key_restriction_enforcement 2020-01-06 09:32:03 +01:00
Jelte Fennema 432b5baac7 Normalize tests: shard table names for custom_aggregate_support 2020-01-06 09:32:03 +01:00
Jelte Fennema 0c23caeb75 Normalize tests: shard table names for multi_subtransactions 2020-01-06 09:32:03 +01:00
Jelte Fennema 883ee9121f Normalize tests: shard table names in foreign_key_to_reference_table 2020-01-06 09:32:03 +01:00
Jelte Fennema 7f3de68b0d Normalize tests: header separator length 2020-01-06 09:32:03 +01:00
Philip Dubé 566246ecd4 End regression tests with ensure_no_intermediate_data_leak
Also update tests to clean up jobs when they're directly testing job udfs
2020-01-03 18:59:02 +00:00
Önder Kalacı 0c70a5470e
Allow RETURNING in fast-path queries (#3352)
* Allow RETURNING in fast-path queries

Because there is no specific reason for that.
2020-01-03 13:42:50 +00:00