Commit Graph

563 Commits (c58bb37ad7ca2abd19eede620d3f8a62f59b9c8f)

Author SHA1 Message Date
velioglu 44fc9f46fc Add create_distributed_table (without data) failure tests 2018-08-13 09:31:15 +03:00
Onder Kalaci 974cbf11a5 Hide shard names on MX worker nodes
This commit by default enables hiding shard names on MX workers
by simply replacing `pg_table_is_visible()` calls with
`citus_table_is_visible()` calls on the MX worker nodes. The latter
function filters out tables that are known to be shards.

The main motivation of this change is a better UX. The functionality
can be opted out via a GUC.

We also added two views, namely citus_shards_on_worker and
citus_shard_indexes_on_worker such that users can query
them to see the shards and their corresponding indexes.

We also added debug messages such that the filtered tables can
be interactively seen by setting the level to DEBUG1.
2018-08-07 14:21:45 +03:00
mehmet furkan şahin c1f7631f98 failure tests on create_distributed_table nonempty 2018-08-03 12:41:25 -07:00
velioglu b21bd2d1a0 Add create_reference_table failure tests 2018-08-03 17:49:57 +03:00
velioglu bc27651dd9 Add failure test for copy on hash distributed table 2018-08-03 17:11:09 +03:00
Brian Cloutier 82fa85fa5b
Add tests for 1PC COPY on append and hash-distributed tables
2018-07-31 15:17:59 -07:00
Brian Cloutier f0f7a691a3
Prevent failure tests from hanging by using a port outside the ephemeral port range
- mitmdump now listens on port 9060
- Add some logging to fluent.py, making issues like this easier to debug in the future
- Fail the tests if something is already running on the port mitmProxy tries to use
- check-failure now works with VPATH builds
2018-07-31 14:30:56 -07:00
mehmet furkan şahin dde86cb731 Copy to reference table failure tests are added 2018-07-30 11:48:12 +03:00
mehmet furkan şahin bc757845eb Citus versioning fix 2018-07-26 10:56:34 +03:00
Brian Cloutier ace248d13c Remove unnecessary calls to 'conn.allow()' 2018-07-25 17:45:00 -07:00
mehmet furkan şahin 6d0fbbace7 ALTER TABLE %s ADD COLUMN constraint check is added 2018-07-24 15:53:05 +03:00
Nils Dijk 2d13900230
error on unsupported changing of distribution column in ON CONFLICT for INSERT ... SELECT 2018-07-23 15:18:21 +02:00
Marco Slot 69a3ebea5f Ensure StartPlacementListConnection connects with username supplied by the caller 2018-07-19 20:10:11 +02:00
Murat Tuncer a837dde1a0 Add failure tests for master add/remove/disable/active node 2018-07-13 18:06:24 +03:00
mehmet furkan şahin f854420079 truncate failure tests are added 2018-07-13 13:20:50 +03:00
Murat Tuncer 2795494758 Added failure test for create index concurrently 2018-07-13 11:53:49 +03:00
Onder Kalaci a446e71ee7 Add failure testing for DDL commands
This commit adds extensive failure testing, which covers quite
a few things and their combinations:
   - 1PC vs 2PC
   - Replication factor 1 and Replication factor 2
   - Network failures and query cancellations
   - Sequential vs Parallel query execution mode
2018-07-12 13:05:29 +03:00
Jason Petersen 318119910b
Add pg_dist_poolinfo table
For storing nodes' pool host/port overrides.
2018-07-10 09:30:22 -07:00
mehmet furkan şahin 3afa7f425d Topn aggregates are supported 2018-07-10 14:33:42 +03:00
Murat Tuncer a7277526fd Make citus_stat_statements_reset() super user function 2018-07-10 11:21:20 +03:00
Marco Slot 89870e76ce Add a select_opens_transaction_block GUC 2018-07-08 03:50:39 +02:00
Brian Cloutier a54f9a6d2c network proxy-based failure testing
- Lots of detail is in src/test/regress/mitmscripts/README
- Create a new target, make check-failure, which runs tests
- Tells travis how to install everything and run the tests
2018-07-06 12:38:53 -07:00
mehmet furkan şahin df11dda750 hll aggregates are tested 2018-07-05 08:19:01 +03:00
Onder Kalaci d83be3a33f Enforce foreign key restrictions inside transaction blocks
When a hash distributed table has a foreign key to a reference
table, there are a few restrictions we have to apply in order to
prevent distributed deadlocks or reading wrong results.

The necessity to apply the restrictions arises from the cascading
nature of foreign keys. When a foreign key on a reference table
cascades to a distributed table, a single operation over a single
connection can acquire locks on multiple shards of the distributed
table. Thus, any parallel operation on that distributed table in the
same transaction should not open parallel connections to the shards.
Otherwise, we'd either end up with a self-distributed deadlock or
read wrong results.

As briefly described above, the restrictions are applied
by tracking the distributed/reference relation accesses inside
transaction blocks, and acting accordingly when necessary.

The two main rules are as follows:
   - Whenever a parallel distributed relation access conflicts
     with a subsequent reference relation access, Citus errors
     out.
   - Whenever a reference relation access is followed by a
     conflicting parallel relation access, the execution mode
     is switched to sequential mode.

There are also some other notes to mention:
   - If the user does SET LOCAL citus.multi_shard_modify_mode
     TO 'sequential';, all the queries should simply work by
     using one connection per worker and sequentially executing
     the commands. That's obviously a slower approach than Citus'
     usual parallel execution. However, we at least have a way
     to run all commands successfully.

   - If an unrelated parallel query is executed on any distributed
     table, we cannot switch to sequential mode, because the essence
     of sequential mode is using one connection per worker. However,
     in the presence of a parallel connection, the connection manager
     picks those connections to execute the commands. That contradicts
     our purpose, thus we error out.

   - COPY to a distributed table cannot be executed in sequential mode.
     Thus, if we switch to sequential mode and COPY is executed, the
     operation fails and there is currently no way of implementing that.
     Note that when the local table is not empty and create_distributed_table
     is used, Citus uses COPY internally. Thus, in those cases,
     create_distributed_table() will also fail.

   - There is a GUC called citus.enforce_foreign_key_restrictions
     to disable all the checks. We added that GUC since the restrictions
     we apply are sometimes a bit more restrictive than necessary.
     The user might want to relax those. Similarly, if you don't have
     CASCADEing reference tables, you might consider disabling all the
     checks.
2018-07-03 17:05:55 +03:00
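
The two main rules and the "unrelated parallel query" note from the commit above can be sketched as a small state tracker. This is only a toy model written for illustration, not the Citus implementation: the class `TransactionAccessTracker`, the exception `ForeignKeyRestrictionError`, the `foreign_keys` mapping, and the table names are all hypothetical, and a "conflict" is simplified to "the distributed table has a foreign key to the reference table".

```python
from enum import Enum


class ExecutionMode(Enum):
    PARALLEL = "parallel"
    SEQUENTIAL = "sequential"


class ForeignKeyRestrictionError(Exception):
    """Hypothetical error for an access pattern that must be rejected."""


class TransactionAccessTracker:
    """Toy model of relation access tracking inside one transaction block."""

    def __init__(self, foreign_keys):
        # Assumption: maps a distributed table to the set of reference
        # tables it points to, e.g. {"orders": {"countries"}}.
        self.foreign_keys = foreign_keys
        self.mode = ExecutionMode.PARALLEL
        self.parallel_accessed = set()
        self.reference_accessed = set()

    def access_reference_table(self, ref_table):
        # Rule 1: a reference table access after a conflicting parallel
        # access to a referencing distributed table errors out.
        for dist_table in self.parallel_accessed:
            if ref_table in self.foreign_keys.get(dist_table, set()):
                raise ForeignKeyRestrictionError(
                    f"{ref_table} accessed after parallel access to {dist_table}"
                )
        self.reference_accessed.add(ref_table)

    def access_distributed_table(self, dist_table):
        referenced = self.foreign_keys.get(dist_table, set())
        needs_switch = bool(referenced & self.reference_accessed)
        if self.mode is ExecutionMode.PARALLEL and needs_switch:
            # Note from the commit: once any parallel access has happened,
            # switching to sequential mode is no longer possible.
            if self.parallel_accessed:
                raise ForeignKeyRestrictionError(
                    "cannot switch to sequential mode after a parallel access"
                )
            # Rule 2: reference access followed by a conflicting
            # distributed access switches the mode to sequential.
            self.mode = ExecutionMode.SEQUENTIAL
        if self.mode is ExecutionMode.PARALLEL:
            self.parallel_accessed.add(dist_table)
        return self.mode


# Usage: reference table first, then the referencing distributed table.
tracker = TransactionAccessTracker({"orders": {"countries"}})
tracker.access_reference_table("countries")
mode = tracker.access_distributed_table("orders")
```

In this model, accessing the tables in the opposite order raises the error instead, which mirrors the first rule.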
velioglu 6be6911ed9 Create foreign key relation graph and functions to query on it 2018-07-03 17:05:55 +03:00
mehmet furkan şahin 89a8d6ab95 FK from dist to ref is tested for partitioning, MX 2018-07-03 17:05:55 +03:00
mehmet furkan şahin 4db72c99f6 Specific DDLs are sequentialized when there is FK
-[x] drop constraint
-[x] drop column
-[x] alter column type
-[x] truncate

are sequentialized if there is a foreign key constraint from
a distributed table to a reference table on the relations
affected by the above commands.
2018-07-03 17:05:55 +03:00
mehmet furkan şahin e37f76c276 tests are added 2018-07-03 17:05:01 +03:00
mehmet furkan şahin 2fa4e38841 FK from dist to ref can be added with alter table 2018-07-03 17:05:01 +03:00
Murat Tuncer 23800f50f1 Update citus_stat_statements view and regression tests 2018-07-03 16:14:13 +03:00
Murat Tuncer e532755a6e Fix bug in partition column extraction
added strip_implicit_coercion prior to
checking if the expression is Const.
This is important to find values for types
like bigint.
2018-07-02 18:08:16 +03:00
Onder Kalaci 4ccabf9544 Increase timeout to keep appveyor happy 2018-06-25 18:40:40 +03:00
Onder Kalaci 8ccb8b679e Real-time executor marks multi shard relation accesses before opening connections 2018-06-25 18:40:31 +03:00
Onder Kalaci 2890154420 Make sure that TRUNCATE always opens a DDL access 2018-06-25 18:40:31 +03:00
Onder Kalaci 21038f0d0e Make sure that inter-shard DDL commands always cover both tables 2018-06-25 18:40:30 +03:00
Onder Kalaci 2f01894589 Track relation accesses using the connection management infrastructure 2018-06-25 18:40:30 +03:00
Onder Kalaci d5472614df Use non-data connection for intermediate results
Make sure that intermediate results use a connection that is
not associated with any placement. That is useful in two ways:
    - More complex queries can be executed with CTEs
    - Safely use the same connections when there is a foreign key
      to reference table from a distributed table, which needs to
      use the same connection for modifications since the reference
      table might cascade to the distributed table.
2018-06-21 13:26:13 +03:00
Jason Petersen 7a75c2ed31 Add connparam invalidation trigger creation logic
This needs to live in Community, since we haven't yet added the
complication of having divergent upgrade scripts in Enterprise.
2018-06-20 14:13:18 -06:00
mehmet furkan şahin 2b2ce036eb create_distributed_table honors sequential mode 2018-06-19 17:33:45 +03:00
Onder Kalaci 8f5821493a Implement C interface for setting GUC
We need the ability to switch to sequential mode (e.g.,
 SET LOCAL citus.multi_shard_modify_mode = 'sequential'). This
commit enables that.
2018-06-19 10:23:43 +03:00
Marco Slot f3f2805978
Fix use-after-free that may occur for INSERT..SELECT in prepared statements 2018-06-18 22:55:06 -06:00
velioglu 53b2e81d01 Adds SELECT ... FOR UPDATE support for router plannable queries 2018-06-18 13:55:17 +03:00
Marco Slot 28860b2469 Remove volatile explain plan from regression tests 2018-06-15 00:21:52 +02:00
Marco Slot 04da0cf9b1 Remove costs from explain plans in window_functions tests 2018-06-14 23:51:46 +02:00
Jason Petersen 5bf7bc64ba Add pg_dist_authinfo schema and validation
This table will be used by Citus Enterprise to populate authentication-
related fields in outbound connections; Citus Community lacks support
for this functionality.
2018-06-13 11:16:26 -06:00
Onder Kalaci a5370f5bb0 Realtime executor honours multi_shard_modify_mode
We're relying on multi_shard_modify_mode GUC for real-time SELECTs.
The name of the GUC is unfortunate, but, adding one more GUC
(or renaming the GUC) would make the UX even worse. Given that this
mode is mostly important for transaction blocks that involve
modification/DDL queries along with real-time SELECTs, we can live with the confusion.
2018-06-06 14:59:54 +03:00
Onder Kalaci d918556dca INSERT .. SELECT pushdown honors multi_shard_modify_mode 2018-06-06 12:42:23 +03:00
Onder Kalaci 336044f2a8 master_modify_multiple_shards() and TRUNCATE honor multi_shard_modify_mode 2018-06-06 12:29:05 +03:00
Onder Kalaci 51cb24b39c Increase timeout to make the appveyor tests happy 2018-06-05 17:52:18 +03:00
Onder Kalaci df44956dc3 Make sure that sequential DDL opens a single connection to each node
After this commit DDL commands honour `citus.multi_shard_modify_mode`.

We preferred using the code-path that executes single task router
queries (e.g., ExecuteSingleModifyTask()) in order not to invent
a new executor that is only applicable for DDL commands that require
sequential execution.
2018-06-05 17:52:17 +03:00
Murat Tuncer ba50e3f33e Add handling for grant/revoke all tables in schema 2018-05-31 13:47:02 +03:00
Brian Cloutier a7e09d777b Increase deadlock timeout so we get fewer signals 2018-05-16 17:07:24 -07:00
Marco Slot 61d2c0f618 Stabilise output of multi_shard_update_delete test 2018-05-11 08:33:23 +02:00
mehmet furkan şahin b8c3197399 enterprise test fixes 2018-05-10 13:06:54 +03:00
mehmet furkan şahin 785a86ed0a Tests are updated to use create_distributed_table 2018-05-10 11:18:59 +03:00
Marco Slot 9438e5bde9 Ensure single-shard modifying CTEs are part of distributed transaction 2018-05-06 12:49:40 +02:00
velioglu caa27161ca Check volatile functions in modify queries 2018-05-08 11:16:40 +03:00
Marco Slot 2f9c8c6af0 Allow DML commands with unreferenced SELECT CTEs 2018-05-03 14:53:26 +02:00
Marco Slot f8cfe07fd1 Support intermediate results in distributed INSERT..SELECT 2018-05-03 14:42:28 +02:00
Marco Slot 90cdfff602 Implement recursive planning for DML statements 2018-05-03 14:42:28 +02:00
mehmet furkan şahin ef90122cd3 shard count for some of the tests are increased 2018-05-03 10:44:43 +03:00
Onder Kalaci 317dd02a2f Implement single repartitioning on hash distributed tables
   * Change worker_hash_partition_table() such that the
     Citus planner's hashing and worker_hash_partition_table()
     no longer diverge.

   * Rename single partitioning to single range partitioning.

   * Add single hash repartitioning. Basically, logical planner
     treats single hash and range partitioning almost equally.
     Physical planner, on the other hand, treats single hash and
     dual hash repartitioning almost equally (except for JoinPruning).

   * Add a new GUC to enable this feature
2018-05-02 18:50:55 +03:00
velioglu 32bcd610c1 Support modify queries with multiple tables
With this commit we begin to support modify queries with multiple
tables if these queries are pushdownable.
2018-05-02 16:22:26 +03:00
Brian Cloutier f8fb7a27fb Don't copyObject into the wrong memory context
utilityStmt sometimes (such as when it's inside of a plpgsql function)
comes from a cached plan, which is kept in a child of the
CacheMemoryContext. When we naively call copyObject we're copying it into
a statement-local context, which corrupts the cached plan when it's
thrown away.
2018-05-01 15:34:32 -07:00
Marco Slot 2559b84049 Drop shards as current user instead of super user 2018-05-01 09:57:20 +02:00
velioglu 121ff39b26 Removes large_table_shard_count GUC 2018-04-29 10:34:50 +02:00
mehmet furkan şahin a4153c6ab1 notice handler is implemented 2018-04-27 14:37:01 +03:00
Marco Slot 3d3c19a717
Improve messages for essential connection failures 2018-04-26 12:58:47 -06:00
Murat Tuncer a6fe5ca183 PG11 compatibility update
- changes in ruleutils_11.c are reflected
- vacuum statement api change is handled. We now allow
  multi-table vacuum commands.
- some other function header changes are reflected
- api conflicts between PG11 and earlier versions
  are handled by adding shims in version_compat.h
- various regression tests are fixed due to output and
  functionality changes in PG11
- no change is made to support new features in PG11;
  they need to be handled by a new commit
2018-04-26 11:29:43 +03:00
Onder Kalaci 814f0e3acc Ensure Citus never tries to access an unplanned subquery
PostgreSQL might remove some of the subqueries when they do not
contribute to the query result at all. Citus should not try to
access such subqueries during planning.
2018-04-20 13:52:00 +03:00
Brian Cloutier d02f761d8e Change intermediate_results test to not crash 2018-04-17 15:14:02 -07:00
mehmet furkan şahin 00e786af00 Capital named schema support is added 2018-04-17 17:17:42 +03:00
mehmet furkan şahin e5a5502b16 Adds support for multiple ANDs in Having
This PR adds support for multiple AND expressions in Having
for pushdown planner. We simply make a call to make_ands_explicit
from MultiLogicalPlanOptimize for the having qual in
workerExtendedOpNode.
2018-04-16 14:14:48 +03:00
velioglu 82b2d21b0c Convert broadcast join to reference join
After this commit large_table_shard_count won't be used to
check whether broadcast join, which is renamed as reference
join, can be applied. Reference join can only be applied over
reference tables.
2018-04-13 12:58:14 +03:00
velioglu 1b92812be2 Add co-placement check to CoPartition function 2018-04-13 12:13:08 +03:00
Marco Slot 9318aeee6b Allow multiple size function calls per query 2018-04-12 14:16:17 +02:00
Burak Yucesoy b33b282030 Fix bug while DROPping partitioned table from worker
We recently added partitioning support to Citus MX. We should not execute
DROP table commands from MX workers but at the moment we try to execute
such commands for partitioned tables. This PR fixes that problem by
adding a check.
2018-04-09 13:50:21 +03:00
Burak Yucesoy 0c283fa8a3 Add partitioning support to MX tables
Previously, we prevented creation of partitioned tables on Citus MX.
We decided to not focus on this feature until there is a need. Since
now there are requests for this feature, we are implementing support
for partitioned tables on Citus MX.
2018-04-06 12:47:06 +03:00
velioglu 72dfe4a289 Adds colocation check to local join 2018-04-04 22:49:27 +03:00
velioglu 698d585fb5 Remove broadcast join logic
After this change, all the logic related to shard data fetching
is removed. The planner won't plan any ShardFetchTask anymore.
Shard fetch related steps in the real-time executor and task-tracker
executor have been removed.
2018-03-30 11:45:19 +03:00
Brian Cloutier 9aff4384a1 Make tests platform independent
- Force all platforms to use the same collation
- Force all platforms to use the same locale
- Use /dev/null or NUL, depending on platform
- Use /tmp or %TEMP%, depending on platform
2018-03-27 14:18:48 -07:00
Murat Tuncer 1440caeef2
Fix incorrect limit pushdown when distinct clause is not superset of group by (#2035)
Pushing down limit and order by into workers may produce
wrong output when distinct on() clause has expressions,
aggregates, or window functions.

This check allows pushing down limits only if the
distinct clause is a superset of the group by clause, i.e., it contains all clauses in the group by.
2018-03-07 13:24:56 +03:00
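
The superset condition described in the commit above can be illustrated with a short set comparison. This is a minimal sketch under simplifying assumptions, not the planner code: the function name `can_push_down_limit` is hypothetical, clauses are represented as plain strings, and it only models queries that actually have a DISTINCT clause.

```python
def can_push_down_limit(distinct_clauses, group_by_clauses):
    """Return True if every GROUP BY clause also appears in DISTINCT.

    Hypothetical helper: clauses are compared as plain strings, which
    ignores the expression, aggregate, and window function cases that
    the commit also mentions.
    """
    return set(group_by_clauses) <= set(distinct_clauses)


# DISTINCT covers the whole GROUP BY, so the limit may be pushed down.
covered = can_push_down_limit(["user_id", "day"], ["user_id"])

# DISTINCT misses a GROUP BY column, so the limit must stay on the coordinator.
not_covered = can_push_down_limit(["user_id"], ["user_id", "day"])
```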
Onder Kalaci 40b898b59f Improve error messages for INSERT queries that have subqueries 2018-03-05 14:46:47 +02:00
Murat Tuncer 76f6883d5d
Add support for window functions that can be pushed down to worker (#2008)
This is the first in a series of window function work.

We can now support window functions that can be pushed down to workers.
A window function must have the distribution column in its partition clause
 to be pushed down.
2018-03-01 19:07:07 +03:00
Marco Slot dc7213a11c Use expressions in the ORDER BY in bool_agg 2018-02-27 23:52:44 +01:00
Marco Slot c723a1fa32 Add support for bool and bit aggregates 2018-02-27 23:48:25 +01:00
Murat Tuncer e13c5beced
Fix worker query when order by avg aggregate is used (#2024)
We push down order by to worker query when limit is specified
(with some other additional checks). If the query has an expression
on an aggregate or avg aggregate by itself, and there is an order
by on this particular target we may send wrong order by to worker
query with potential to affect query result.

The fix creates an auxiliary target entry in the worker query and
uses that target entry for sorting.
2018-02-28 12:12:54 +03:00
Metin Doslu bcf660475a Add support for modifying CTEs 2018-02-27 15:08:32 +02:00
velioglu 78e6d990a2 Fix master plan of the query with distinct, aggregate and group by clauses.
Before this PR, we relied on the group by columns to
guarantee the uniqueness of the results. However, this assumption
is correct only if the columns in the group by are a subset of the columns
in the distinct clause. It can be wrong if we have part of the group by
columns and some aggregation columns in the distinct clause. With
this PR, we add a distinct plan on top of the aggregate plan when necessary.
2018-02-26 15:30:15 +03:00
Onder Kalaci 1c930c96a3 Support non-co-located joins between subqueries
With #1804 (and related PRs), Citus gained the ability to
plan subqueries that are not safe to pushdown.

There are two high-level requirements for pushing down subqueries:

   * Individual subqueries that require a merge step (e.g., GROUP BY
     on non-distribution key, or LIMIT in the subquery etc). We've
     handled such subqueries via #1876.

   * Combination of subqueries that are not joined on distribution keys.
     This commit aims to recursively plan some of such subqueries to make
     the whole query safe to pushdown.

The main logic behind non colocated subquery joins is that we pick
an anchor range table entry and check for distribution key equality
of any other subqueries in the given query. If for a given subquery,
we cannot find distribution key equality with the anchor rte, we
recursively plan that subquery.

We also used a hacky solution for picking relations as the anchor range
table entries. The hack is that we wrap them into a subquery. This is only
necessary since some of the attribute equivalence checks are based on
queries rather than range table entries.
2018-02-26 13:50:37 +02:00
Onder Kalaci cdb8d429a7 Add regression tests for non-colocated leaf subqueries 2018-02-26 13:28:24 +02:00
Onder Kalaci 4d4648aabd Change single shard mx test tables to reference tables 2018-02-26 13:28:24 +02:00
Onder Kalaci 4d70c86645 Leaf level recursive planning for non colocated subqueries
With this commit, we enable recursive planning for the subqueries
that are not joined on the distribution keys.
2018-02-26 13:28:24 +02:00
Markus Sintonen 6202e80d06 Implemented jsonb_agg, json_agg, jsonb_object_agg, json_object_agg 2018-02-18 00:19:18 +02:00
velioglu 195ac948d2 Recursively plan subqueries in WHERE clause when FROM recurs 2018-02-13 19:52:12 +03:00
metdos 35f864bcaf Respect enable_hashagg in the master planner 2018-02-05 15:06:00 +02:00
Brian Cloutier b864d014ab
GetNextNodeId() incorrectly called PG_RETURN_DATUM
- Also stabilize the output of a multi_router_planner test
2018-01-29 15:32:36 -08:00
Dimitri Fontaine 1f088791bd Add DDL tests with non-public schema.
Citus sometimes has regressions around non-default schema support, meaning
not public and not in the search_path, per @marcocitus. This patch changes
some regression tests to use a non-default schema in order to cover more
cases.
2018-01-11 13:21:24 +01:00
Dimitri Fontaine e010238280 Implement ALTER TABLE ... RENAME TO ...
The implementation was already mostly in place, but the code was protected
by a principled check against the operation. Turns out there's a nasty
concurrency bug though with long identifier names, much as in #1664.

To prevent deadlocks from happening, we could either review the DDL
transaction management in shards and placements, or we can simply reject
names with (NAMEDATALEN - 1) chars or more — that's because of the
PostgreSQL array types being created with a one-char prefix: '_'.
2018-01-11 13:21:24 +01:00
Marco Slot 8f69973411 Fix cancellation issues in the real-time executor (#1905) 2018-01-01 23:10:29 -05:00