Commit Graph

534 Commits (aa16512a815d2ac263cb99a182f4d887f11f7ff3)

Author SHA1 Message Date
Murat Tuncer 23800f50f1 Update citus_stat_statements view and regression tests 2018-07-03 16:14:13 +03:00
Murat Tuncer e532755a6e Fix bug in partition column extraction
added strip_implicit_coercion prior to
checking if the expression is Const.
This is important to find values for types
like bigint.
2018-07-02 18:08:16 +03:00
Onder Kalaci 4ccabf9544 Increase timeout to keep appveyor happy 2018-06-25 18:40:40 +03:00
Onder Kalaci 8ccb8b679e Real-time executor marks multi shard relation accesses before opening connections 2018-06-25 18:40:31 +03:00
Onder Kalaci 2890154420 Make sure that TRUNCATE always opens a DDL access 2018-06-25 18:40:31 +03:00
Onder Kalaci 21038f0d0e Make sure that inter-shard DDL commands are always covers both tables 2018-06-25 18:40:30 +03:00
Onder Kalaci 2f01894589 Track relation accesses using the connection management infrastructure 2018-06-25 18:40:30 +03:00
Onder Kalaci d5472614df Use non-data connection for intermediate results
Make sure that intermediate results use a connection that is
not associated with any placement. That is useful in two ways:
    - More complex queries can be executed with CTEs
    - Safely use the same connections when there is a foreign key
      to reference table from a distributed table, which needs to
      use the same connection for modifications since the reference
      table might cascade to the distributed table.
2018-06-21 13:26:13 +03:00
Jason Petersen 7a75c2ed31 Add connparam invalidation trigger creation logic
This needs to live in Community, since we haven't yet added the com-
plication of having divergent upgrade scripts in Enterprise.
2018-06-20 14:13:18 -06:00
mehmet furkan şahin 2b2ce036eb create_distributed_table honors sequential mode 2018-06-19 17:33:45 +03:00
Onder Kalaci 8f5821493a Implement C interface for setting GUC
We need the ability to switch to sequential mode (e.g.,
 SET LOCAL citus.multi_shard_modify_mode = 'sequential'). This
commit enables that.
2018-06-19 10:23:43 +03:00
Marco Slot f3f2805978
Fix use-after-free that may occur for INSERT..SELECT in prepared statements 2018-06-18 22:55:06 -06:00
velioglu 53b2e81d01 Adds SELECT ... FOR UPDATE support for router plannable queries 2018-06-18 13:55:17 +03:00
Marco Slot 28860b2469 Remove volatile explain plan from regression tests 2018-06-15 00:21:52 +02:00
Marco Slot 04da0cf9b1 Remove costs from explain plans in window_functions tests 2018-06-14 23:51:46 +02:00
Jason Petersen 5bf7bc64ba Add pg_dist_authinfo schema and validation
This table will be used by Citus Enterprise to populate authentication-
related fields in outbound connections; Citus Community lacks support
for this functionality.
2018-06-13 11:16:26 -06:00
Onder Kalaci a5370f5bb0 Realtime executor honours multi_shard_modify_mode
We're relying on multi_shard_modify_mode GUC for real-time SELECTs.
The name of the GUC is unfortunate, but, adding one more GUC
(or renaming the GUC) would make the UX even worse. Given that this
mode is mostly important for transaction blocks that involve modification
/DDL queries along with real-time SELECTs, we can live with the confusion.
2018-06-06 14:59:54 +03:00
Onder Kalaci d918556dca INSERT .. SELECT pushdown honors multi_shard_modification_mode 2018-06-06 12:42:23 +03:00
Onder Kalaci 336044f2a8 master_modify_multiple_shards() and TRUNCATE honors multi_shard_modification_mode 2018-06-06 12:29:05 +03:00
Onder Kalaci 51cb24b39c Increase timeout to make the appveyor tests happy 2018-06-05 17:52:18 +03:00
Onder Kalaci df44956dc3 Make sure that sequential DDL opens a single connection to each node
After this commit DDL commands honour `citus.multi_shard_modify_mode`.

We preferred using the code-path that executes single task router
queries (e.g., ExecuteSingleModifyTask()) in order not to invent
a new executor that is only applicable for DDL commands that require
sequential execution.
2018-06-05 17:52:17 +03:00
Murat Tuncer ba50e3f33e Add handling for grant/revoke all tables in schema 2018-05-31 13:47:02 +03:00
Brian Cloutier a7e09d777b Increase deadlock timeout so we get fewer signals 2018-05-16 17:07:24 -07:00
Marco Slot 61d2c0f618 Stabilise output of multi_shard_update_delete test 2018-05-11 08:33:23 +02:00
mehmet furkan şahin b8c3197399 enterprise test fixes 2018-05-10 13:06:54 +03:00
mehmet furkan şahin 785a86ed0a Tests are updated to use create_distributed_table 2018-05-10 11:18:59 +03:00
Marco Slot 9438e5bde9 Ensure single-shard modifying CTEs are part of distributed transaction 2018-05-06 12:49:40 +02:00
velioglu caa27161ca Check volatile functions in modify queries 2018-05-08 11:16:40 +03:00
Marco Slot 2f9c8c6af0 Allow DML commands with unreferenced SELECT CTEs 2018-05-03 14:53:26 +02:00
Marco Slot f8cfe07fd1 Support intermediate results in distributed INSERT..SELECT 2018-05-03 14:42:28 +02:00
Marco Slot 90cdfff602 Implement recursive planning for DML statements 2018-05-03 14:42:28 +02:00
mehmet furkan şahin ef90122cd3 shard count for some of the tests are increased 2018-05-03 10:44:43 +03:00
Onder Kalaci 317dd02a2f Implement single repartitioning on hash distributed tables
* Change worker_hash_partition_table() such that the
     divergence between Citus planner's hashing and
     worker_hash_partition_table() becomes the same.

   * Rename single partitioning to single range partitioning.

   * Add single hash repartitioning. Basically, logical planner
     treats single hash and range partitioning almost equally.
     Physical planner, on the other hand, treats single hash and
     dual hash repartitioning almost equally (except for JoinPruning).

   * Add a new GUC to enable this feature
2018-05-02 18:50:55 +03:00
velioglu 32bcd610c1 Support modify queries with multiple tables
With this commit we begin to support modify queries with multiple
tables if these queries are pushdownable.
2018-05-02 16:22:26 +03:00
Brian Cloutier f8fb7a27fb Don't copyObject into the wrong memory context
utilityStmt sometimes (such as when it's inside of a plpgsql function)
comes from a cached plan, which is kept in a child of the
CacheMemoryContext. When we naively call copyObject we're copying it into
a statement-local context, which corrupts the cached plan when it's
thrown away.
2018-05-01 15:34:32 -07:00
Marco Slot 2559b84049 Drop shards as current user instead of super user 2018-05-01 09:57:20 +02:00
velioglu 121ff39b26 Removes large_table_shard_count GUC 2018-04-29 10:34:50 +02:00
mehmet furkan şahin a4153c6ab1 notice handler is implemented 2018-04-27 14:37:01 +03:00
Marco Slot 3d3c19a717
Improve messages for essential connection failures 2018-04-26 12:58:47 -06:00
Murat Tuncer a6fe5ca183 PG11 compatibility update
- changes in ruleutils_11.c is reflected
- vacuum statement api change is handled. We now allow
  multi-table vacuum commands.
- some other function header changes are reflected
- api conflicts between PG11 and earlier versions
  are handled by adding shims in version_compat.h
- various regression tests are fixed due output and
  functionality in PG1
- no change is made to support new features in PG11
  they need to be handled by new commit
2018-04-26 11:29:43 +03:00
Onder Kalaci 814f0e3acc Ensure Citus never try to access a not planned subquery
PostgreSQL might remove some of the subqueries when they do not
contribute to the query result at all. Citus should not try to
access such subqueries during planning.
2018-04-20 13:52:00 +03:00
Brian Cloutier d02f761d8e Change intermediate_results test to not crash 2018-04-17 15:14:02 -07:00
mehmet furkan şahin 00e786af00 Capital named schema support is added 2018-04-17 17:17:42 +03:00
mehmet furkan şahin e5a5502b16 Adds support for multiple ANDs in Having
This PR adds support for multiple AND expressions in Having
for pushdown planner. We simply make a call to make_ands_explicit
from MultiLogicalPlanOptimize for the having qual in
workerExtendedOpNode.
2018-04-16 14:14:48 +03:00
velioglu 82b2d21b0c Convert broadcast join to reference join
After this commit large_table_shard_count wont be used to
check whether broadcast join, which is renamed as reference
join, can be applied. Reference join can only be applied over
reference tables.
2018-04-13 12:58:14 +03:00
velioglu 1b92812be2 Add co-placement check to CoPartition function 2018-04-13 12:13:08 +03:00
Marco Slot 9318aeee6b Allow multiple size function calls per query 2018-04-12 14:16:17 +02:00
Burak Yucesoy b33b282030 Fix bug while DROPping partitioned table from worker
We recently added partitionin support to Citus MX. We should not execute
DROP table commands from MX workers but at the moment we try to execute
such commands for partitioned tables. This PR fixes that problem by
adding check.
2018-04-09 13:50:21 +03:00
Burak Yucesoy 0c283fa8a3 Add partitioning support to MX tables
Previously, we prevented creation of partitioned tables on Citus MX.
We decided to not focus on this feature until there is a need. Since
now there are requests for this feature, we are implementing support
for partitioned tables on Citus MX.
2018-04-06 12:47:06 +03:00
velioglu 72dfe4a289 Adds colocation check to local join 2018-04-04 22:49:27 +03:00
velioglu 698d585fb5 Remove broadcast join logic
After this change all the logic related to shard data fetch logic
will be removed. Planner won't plan any ShardFetchTask anymore.
Shard fetch related steps in real time executor and task-tracker
executor have been removed.
2018-03-30 11:45:19 +03:00
Brian Cloutier 9aff4384a1 Make tests platform independent
- Force all platforms to use the same collation
- Force all platforms to use the same locale
- Use /dev/null or NUL, depending on platform
- Use /tmp or %TEMP%, dpeending on platform
2018-03-27 14:18:48 -07:00
Murat Tuncer 1440caeef2
Fix incorrect limit pushdown when distinct clause is not superset of group by (#2035)
Pushing down limit and order by into workers may produce
wrong output when distinct on() clause has expressions,
aggregates, or window functions.

This checking allows pushing down of limits only if
distinct clause is a superset of group by clause. i.e. it contains all clauses in group by.
2018-03-07 13:24:56 +03:00
Onder Kalaci 40b898b59f Improve error messages for INSERT queries that have subqueries 2018-03-05 14:46:47 +02:00
Murat Tuncer 76f6883d5d
Add support for window functions that can be pushed down to worker (#2008)
This is the first of series of window function work.

We can now support window functions that can be pushed down to workers.
Window function must have distribution column in the partition clause
 to be pushed down.
2018-03-01 19:07:07 +03:00
Marco Slot dc7213a11c Use expressions in the ORDER BY in bool_agg 2018-02-27 23:52:44 +01:00
Marco Slot c723a1fa32 Add support for bool and bit aggregates 2018-02-27 23:48:25 +01:00
Murat Tuncer e13c5beced
Fix worker query when order by avg aggregate is used (#2024)
We push down order by to worker query when limit is specified
(with some other additional checks). If the query has an expression
on an aggregate or avg aggregate by itself, and there is an order
by on this particular target we may send wrong order by to worker
query with potential to affect query result.

The fix creates a auxilary target entry in the worker query and
uses that target entry for sorting.
2018-02-28 12:12:54 +03:00
Metin Doslu bcf660475a Add support for modifying CTEs 2018-02-27 15:08:32 +02:00
velioglu 78e6d990a2 Fix master plan of the query with distinct, aggregate and group by clauses.
Before this PR, we were trusting on the columns of group by about
guaranteeing the uniqueness of the results. However, this assumption
is correct only if the columns in the group by is subset of columns
in the distinct clause. It can be wrong if we have part of group by
columns and some aggregation columns in the distinct clause. With
this PR, we add distinct plan on top of aggregate plan when necessary.
2018-02-26 15:30:15 +03:00
Onder Kalaci 1c930c96a3 Support non-co-located joins between subqueries
With #1804 (and related PRs), Citus gained the ability to
plan subqueries that are not safe to pushdown.

There are two high-level requirements for pushing down subqueries:

   * Individual subqueries that require a merge step (i.e., GROUP BY
     on non-distribution key, or LIMIT in the subquery etc). We've
     handled such subqueries via #1876.

    * Combination of subqueries that are not joined on distribution keys.
      This commit aims to recursively plan some of such subqueries to make
      the whole query safe to pushdown.

The main logic behind non colocated subquery joins is that we pick
an anchor range table entry and check for distribution key equality
of any  other subqueries in the given query. If for a given subquery,
we cannot find distribution key equality with the anchor rte, we
recursively plan that subquery.

We also used a hacky solution for picking relations as the anchor range
table entries. The hack is that we wrap them into a subquery. This is only
necessary since some of the attribute equivalance checks are based on
queries rather than range table entries.
2018-02-26 13:50:37 +02:00
Onder Kalaci cdb8d429a7 Add regression tests for non-colocated leaf subqueries 2018-02-26 13:28:24 +02:00
Onder Kalaci 4d4648aabd Change single shard mx test tables to reference tables 2018-02-26 13:28:24 +02:00
Onder Kalaci 4d70c86645 Leaf level recursive planning for non colocated subqueries
With this commit, we enable recursive planning for the subqueries
that are not joined on the distribution keys.
2018-02-26 13:28:24 +02:00
Markus Sintonen 6202e80d06 Implemented jsonb_agg, json_agg, jsonb_object_agg, json_object_agg 2018-02-18 00:19:18 +02:00
velioglu 195ac948d2 Recursively plan subqueries in WHERE clause when FROM recurs 2018-02-13 19:52:12 +03:00
metdos 35f864bcaf Respect enable_hashagg in the master planner 2018-02-05 15:06:00 +02:00
Brian Cloutier b864d014ab
GetNextNodeId() incorrectly called PG_RETURN_DATUM
- Also stabilize the output of a multi_router_planner test
2018-01-29 15:32:36 -08:00
Dimitri Fontaine 1f088791bd Add DDL tests with non-public schema.
Citus sometimes have regressions around non-default schema support, meaning
not public and not in the search_path, per @marcocitus. This patch changes
some regression tests to use a non-default schema in order to cover more
cases.
2018-01-11 13:21:24 +01:00
Dimitri Fontaine e010238280 Implement ALTER TABLE ... RENAME TO ...
The implementation was already mostly in place, but the code was protected
by a principled check against the operation. Turns out there's a nasty
concurrency bug though with long identifier names, much as in #1664.

To prevent deadlocks from happening, we could either review the DDL
transaction management in shards and placements, or we can simply reject
names with (NAMEDATALEN - 1) chars or more — that's because of the
PostgreSQL array types being created with a one-char prefix: '_'.
2018-01-11 13:21:24 +01:00
Marco Slot 8f69973411 Fix cancellation issues in the real-time executor (#1905) 2018-01-01 23:10:29 -05:00
Marco Slot 3fd65cb91b Do not raise errors in the real-time executor (#1903) 2018-01-01 22:26:31 -05:00
Onder Kalaci a1bbdf2d44 Outer joins should also use subquery pushdown planner if join
clause is not supported

This change allows unsupported clauses to go through query pushdown
planner instead of erroring out as we already do for non-outer joins.
2017-12-29 16:40:47 +02:00
Marco Slot 09c09f650f Recursively plan set operations when leaf nodes recur 2017-12-26 13:46:55 +02:00
Onder Kalaci eb929e9001 Add some more basic regression tests, mostly for documentation purposes 2017-12-25 15:03:45 +02:00
mehmet furkan şahin 446893234a unsupported subquery error messages are fixed 2017-12-25 15:10:59 +03:00
Murat Tuncer 87c6f306f1
Fix join clause eq restrictions (#1884)
We used to error out if the join clause includes filters like
t1.a < t2.a even if other filter like t1.key = t2.key exists.

Recently we lifted that restriction in subquery planning by
not lifting that restriction and focusing on equivalance classes
provided by postgres.

This checkin forwards previously erroring out real-time queries
due to join clauses to subquery planner and let it handle the
join even if the query does not have a subquery.

We are now pushing down queries that do not have any
subqueries in it. Error message looked misleading, changed to a more descriptive one.
2017-12-22 12:16:14 +03:00
Murat Tuncer a9cf0c3e66
Fix CTE column alias issue (#1893)
We were creating intermediate query result's target
names from subquery target list. Now we also check
if cte re-defines its column name aliases, and create
intermediate result query accordingly.
2017-12-22 09:39:40 +03:00
mehmet furkan şahin fd546cf322 Intermediate result size limitation
This commit introduces a new GUC to limit the intermediate
result size which we handle when we use read_intermediate_result
function for CTEs and complex subqueries.
2017-12-21 14:26:56 +03:00
Onder Kalaci e2a5124830 Add regression tests for recursive subquery planning 2017-12-21 08:37:40 +02:00
Onder Kalaci 0d5a4b9c72 Recursively plan subqueries that are not safe to pushdown
With this commit, Citus recursively plans subqueries that
are not safe to pushdown, in other words, requires a merge
step.

The algorithm is simple: Recursively traverse the query from bottom
up (i.e., bottom meaning the leaf queries). On each level, check
whether the query is safe to pushdown (or a single repartition
subquery). If the answer is yes, do not touch that subquery. If the
answer is no, plan the subquery seperately (i.e., create a subPlan
for it) and replace the subquery with a call to
`read_intermediate_results(planId, subPlanId)`. During the the
execution, run the subPlans first, and make them avaliable to the
next query executions.

Some of the queries hat this change allows us:

   * Subqueries with LIMIT
   * Subqueries with GROUP BY/DISTINCT on non-partition keys
   * Subqueries involving re-partition joins, router queries
   * Mixed usage of subqueries and CTEs (i.e., use CTEs in
     subqueries as well). Nested subqueries as long as we
     support the subquery inside the nested subquery.
   * Subqueries with local tables (i.e., those subqueries
     has the limitation that they have to be leaf subqueries)

   * VIEWs on the distributed tables just works (i.e., the
     limitations mentioned below still applies to views)

Some of the queries that is still NOT supported:

  * Corrolated subqueries that are not safe to pushdown
  * Window function on non-partition keys
  * Recursively planned subqueries or CTEs on the outer
    side of an outer join
  * Only recursively planned subqueries and CTEs in the FROM
    (i.e., not any distributed tables in the FROM) and subqueries
    in WHERE clause
  * Subquery joins that are not on the partition columns (i.e., each
    subquery is individually joined on partition keys but not the upper
    level subquery.)
  * Any limitation that logical planner applies such as aggregate
    distincts (except for count) when GROUP BY is on non-partition key,
    or array_agg with ORDER BY
2017-12-21 08:37:40 +02:00
Marco Slot 6a6e986c2b Add EXPLAIN regression test with subplans 2017-12-19 16:34:56 +01:00
Marco Slot 9b520ae194 Add test for using transaction ID in parallel worker 2017-12-19 09:30:29 +01:00
Marco Slot 7dab078e67 Set cost estimates for read_intermediate_result 2017-12-18 16:23:44 +01:00
Marco Slot e49254f876 Revert "Add EXPLAIN regression test with subplans"
This reverts commit 8b6d641227.
2017-12-17 22:34:31 +01:00
Marco Slot 8b6d641227 Add EXPLAIN regression test with subplans 2017-12-17 22:00:25 +01:00
Marco Slot ea6b98fda4 Allow count(distinct) in queries with a subquery 2017-12-15 15:24:26 +01:00
Marco Slot 5a69fc1b17 Relax checks on recurring tuples in FROM with sublinks 2017-12-15 11:56:06 +01:00
mehmet furkan şahin 5851f71bfb Add CTE regression tests 2017-12-14 09:32:55 +01:00
Marco Slot fa73abe6d4 Regression test output changes after CTE support 2017-12-14 09:32:55 +01:00
Onder Kalaci 86b2d9420c Treat recurring tuples as reference table for GROUP BY checks
read_intermediate_results() and immutable functions are implemented.
Empty join trees seems not applicable here.
2017-12-13 14:55:42 +02:00
Marco Slot d1a470a52e Fix issue with multiple ANALYZE in transaction block 2017-12-12 10:28:48 +01:00
mehmet furkan şahin 3c941aedf1 adds citus.enable_repartition_joins GUC
The new GUC allows Citus to switch between task executors
when necessary
2017-12-11 09:36:37 +03:00
Marco Slot 5895c88552 Add materialized view regression tests 2017-12-07 16:20:23 +01:00
Marco Slot f8550b8c85 Fix issues with read_intermediate_result signature 2017-12-07 13:47:56 +01:00
metdos 12d5974d97 Increase sleep time in a regression test to give Valgrind tests enough time 2017-12-05 14:59:37 +02:00
Marco Slot 716448ddef Add regression tests for intermediate results 2017-12-04 14:50:11 +01:00
Marco Slot 4cdadfcab6 Add intermediate results infrastructure 2017-12-04 14:50:11 +01:00
Marco Slot 0d6a7f5884 Add real-time BEGIN regression tests 2017-11-30 12:59:09 +01:00
Marco Slot 3a4d5f8182 Remove filter checks on leaf queries 2017-11-30 12:25:14 +01:00