Commit Graph

595 Commits (1656b519c49062dee5bb5fc8174963821dda9f17)

Author SHA1 Message Date
Murat Tuncer 6c66033455 Add failure tests for multi-shard update/delete
Failure tests for update/delete on hash distributed tables
using 1PC and 2PC
2018-10-03 15:43:48 +03:00
Murat Tuncer 9bdef67bab Do not create inherited constraints on worker shards
PG now allows foreign keys on partitioned tables.
Each foreign key constraint on a partitioned table
is propagated down to partitions.

We used to create all constraints on shards when creating
a new shard, or when moving a shard from one worker
to another. We also used the same logic when creating a copy of
a coordinator table on an MX node.

With this change we create the constraint on worker node only if
it is not an inherited constraint.
2018-09-28 14:14:51 +03:00
Onder Kalaci cdc0d1491c Make sure to use correct execution mode for TRUNCATE
We used to set the execution mode in the truncate trigger. However,
when multiple tables are truncated with a single command, we could
set the execution mode very late. Instead, now set the execution mode
on the utility hook.
2018-09-25 15:35:27 +03:00
Jason Petersen d7f10b0896 Rewrite parallel ID test to avoid costly JITting
By setting the CPU tuple cost so high, we were triggering JIT. Instead,
we should use parallel_tuple_cost.

See: rhaas.blogspot.com/2018/06/using-forceparallelmode-correctly.html
2018-09-24 09:29:53 +03:00
Onder Kalaci abc443d7fa Make sure that shard repair considers replication factor 2018-09-21 15:24:49 +03:00
Onder Kalaci c1b5a04f6e Allow partitioned tables with replication factor > 1
With this commit, we allow partitioned distributed tables with
replication factor > 1. However, we also have many restrictions.

In summary, we disallow all kinds of modifications (including DDLs)
on the partition tables. Instead, the user is allowed to run the
modifications over the parent table.

The necessity for such a restriction has two aspects:
   - We need to acquire shard resource locks appropriately
   - We need to handle marking partitions INVALID in case
     of any failures. Note that, in theory, the parent table
     should also become INVALID, which is too aggressive.
2018-09-21 14:40:41 +03:00
velioglu d7f75e5b48 Add citus_lock_waits to show locked distributed queries 2018-09-20 14:13:51 +03:00
Murat Tuncer 0f6e514bfb Fixes a bug on not being able to drop index on a partitioned table.
Reason for the failure is that PG11 introduced a new relation kind
RELKIND_PARTITIONED_INDEX to be used for partitioned indices.

We expanded our check to cover that case.
2018-09-19 13:15:05 +03:00
Marco Slot f34ab55389 Fix bug preventing rollback in stored procedure 2018-08-31 20:49:20 +02:00
Onder Kalaci 41d606b575 Use tree walker instead of mutator in relation visibility
This commit uses *_walker instead of *_mutator for performance reasons.
Given that we're only updating a functionId in the tree, the approach
seems fine.
2018-09-18 09:33:01 +03:00
Marco Slot 55f46acedf Support TABLESAMPLE in router queries 2018-08-31 13:22:38 +02:00
Brian Cloutier 2fae06056a Attempt to stabilize packet dumps and add them back in 2018-09-12 22:10:39 -06:00
Murat Tuncer ae0032dff8 Add regression tests for procedure calls
PG11 introduced the PROCEDURE concept, similar to FUNCTION.
Procedures allow commit/rollback behavior.

This commit adds regression tests for procedure calls.
2018-09-12 10:28:50 +03:00
velioglu d1f005daac Adds UDFs for testing MX functionalities with isolation tests 2018-09-12 07:04:16 +03:00
Onder Kalaci d657759c97 Views to provide some insight about the distributed transactions on Citus MX
With this commit, we implement two views that are very similar
to pg_stat_activity, but showing queries that are involved in
distributed queries:

    - citus_dist_stat_activity: Shows all the distributed queries
    - citus_worker_stat_activity: Shows all the queries on the shards
                                  that are initiated by distributed queries.

Both views have the same columns in the outputs. In very basic terms, both of the views
are meant to provide some useful insights about the distributed
transactions within the cluster. As the names reveal, both views are similar to pg_stat_activity.
Also note that these views can be pretty useful on Citus MX clusters.

Note that when the views are queried from the worker nodes, they'd not show the distributed
transactions that are initiated from the coordinator node. The reason is that the worker
nodes do not know the host/port of the coordinator. Thus, it is advisable to query the
views from the coordinator.

If we bucket the columns that the views return, we'd end up with the following:

- Hostnames and ports:
   - query_hostname, query_hostport: The node on which the query is running
   - master_query_host_name, master_query_host_port: The node in the cluster
                                                   that initiated the query.
    Note that for the citus_dist_stat_activity view, query_hostname-query_hostport
    is always the same as master_query_host_name-master_query_host_port. The
    distinction is mostly relevant for citus_worker_stat_activity. For example,
    on Citus MX, a user starts a transaction on Node-A, which starts worker
    transactions on Node-B and Node-C. In that case, the query hostnames would be
    Node-B and Node-C whereas the master_query_host_name would be Node-A.

- Distributed transaction related things:
    This is mostly the process_id, distributed transactionId and distributed transaction
    number.

- pg_stat_activity columns:
    These two views get all the columns from pg_stat_activity. We're basically joining
    pg_stat_activity with get_all_active_transactions on process_id.
2018-09-10 21:33:27 +03:00
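The join described in commit d657759c97 above can be pictured with a small Python sketch. It is only an illustrative model, not the actual view definition: the row dictionaries, the sample values and the `transaction_number` key are hypothetical, and only the join key (`pid` on the pg_stat_activity side, process_id on the get_all_active_transactions side) follows the commit message.

```python
def join_on_process_id(activity_rows, transaction_rows):
    """Inner-join two lists of row dicts on the backend process id."""
    # Index the distributed-transaction rows by process id for quick lookup.
    txn_by_pid = {row["process_id"]: row for row in transaction_rows}
    joined = []
    for activity in activity_rows:
        txn = txn_by_pid.get(activity["pid"])
        if txn is not None:
            # Keep every activity column and add the transaction columns.
            joined.append({**activity, **txn})
    return joined


# Hypothetical sample rows; the real views expose many more columns.
activity = [
    {"pid": 101, "query": "UPDATE t SET v = 1"},
    {"pid": 102, "query": "SELECT 1"},
]
transactions = [{"process_id": 101, "transaction_number": 7}]
rows = join_on_process_id(activity, transactions)
```

Backends that are not part of a distributed transaction (pid 102 here) drop out of the result, which matches the idea that the views only show queries involved in distributed transactions.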
Onder Kalaci 7de5e30432 Change flaky explain test to non-explain
This test's output changes depending on which worker is
picked for explain (e.g., worker port in the output changes).

Given that the test is only aiming to ensure that CTEs inside
CTEs work fine in DML queries, it should be fine to get rid of
the EXPLAIN. The output is verified to be correct as well.
2018-09-10 16:01:30 +03:00
Onder Kalaci 5cf8fbe7b6 Add infrastructure to relation if exists 2018-09-07 14:49:36 +03:00
Onder Kalaci bf28dd0cff Do not recover wrong distributed transactions in MX 2018-09-07 09:52:46 +03:00
Murat Tuncer d8279569b8 Add support for INCLUDE option in index creation
INCLUDE is a new feature in index creation in PG11.
Included column/expression parameters are now forwarded to shards.
2018-09-06 19:41:06 +03:00
Murat Tuncer 7d3f7c2bf4 Add regression tests related to new PG11 partitioning features 2018-09-06 19:06:28 +03:00
Murat Tuncer 55cf3e321c Add regression tests for new PG11 window functions
- <offset> preceding/following
- exclude
2018-09-04 10:48:04 +03:00
Onder Kalaci 1b3257816e Make sure that table is dropped before shards are dropped
This commit fixes a bug where a concurrent DROP TABLE deadlocks
with SELECT (or DML) when the SELECT is executed from the workers.

The problem was that Citus used to remove the metadata before
dropping the table on the workers. That creates a time window
where the SELECT starts running on some of the nodes and DROP
TABLE on some of the other nodes.
2018-09-04 08:57:20 +03:00
Onder Kalaci 2ab0e63b30 Fix flaky test 2018-09-03 14:06:32 +03:00
Onder Kalaci 26e308bf2a Support TRUNCATE from the MX worker nodes
This commit enables support for TRUNCATE on both
distributed tables and reference tables.

The basic idea is to acquire a lock on the relation by sending
the TRUNCATE command to all metadata worker nodes. We only
skip sending the TRUNCATE command to the node that actually
executes the command to prevent a self-distributed-deadlock.
2018-09-03 14:06:31 +03:00
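The node selection described in commit 26e308bf2a above can be sketched as follows. This is a toy Python model under stated assumptions, not the Citus implementation: the `Node` class, its fields and the function name are hypothetical, and the only rule taken from the commit message is "send to all metadata worker nodes except the one executing the command".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical description of a cluster node for this sketch.
    name: str
    port: int
    has_metadata: bool


def truncate_targets(nodes, local_node):
    """Return the metadata nodes that should receive the TRUNCATE command."""
    return [
        node
        for node in nodes
        # Skip the node executing the command to avoid a self-deadlock.
        if node.has_metadata and node != local_node
    ]


nodes = [
    Node("worker-a", 5432, True),
    Node("worker-b", 5432, True),
    Node("worker-c", 5432, False),
]
# The command is issued on worker-a, so only worker-b is contacted.
targets = truncate_targets(nodes, local_node=nodes[0])
```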
velioglu bd30e3e908 Add support for writing to reference tables from MX nodes 2018-08-27 18:15:04 +03:00
Onder Kalaci b8af8c359b Make sure that modifying CTEs always use the correct execution mode 2018-08-23 14:53:55 +03:00
Onder Kalaci cb481f55cf Prevent excessive number of unnecessary range table traversal 2018-08-22 11:45:00 +03:00
Jason Petersen c3c0d62ca6 Add test showing poolinfo validation works
In other words, that it errors out.
2018-08-16 20:14:18 -06:00
Nils Dijk 6cf4516fdb fix \d change for indexes in pg11 2018-08-15 23:27:31 -06:00
Nils Dijk 2a9d47e1a6 fix pg11 tests 2018-08-15 23:27:31 -06:00
mehmet furkan şahin 1a3b9f731e Make master_disable/activate_node runnable when superuser 2018-08-15 00:43:35 -07:00
Onder Kalaci 85d418412d Fix DDL execution problem on MX when search_path is used
Make sure that the coordinator sends the commands with the search
path synchronised with the coordinator's search_path. This is only
important when Citus sends commands that are directly relayed
to the worker nodes. For example, the deparsed DDL commands and
queries always add schema qualifications, so they do not require
this change.
2018-08-13 16:34:50 +03:00
velioglu 44fc9f46fc Add create_distributed_table (without data) failure tests 2018-08-13 09:31:15 +03:00
Onder Kalaci 974cbf11a5 Hide shard names on MX worker nodes
This commit by default enables hiding shard names on MX workers
by simply replacing `pg_table_is_visible()` calls with
`citus_table_is_visible()` calls on the MX worker nodes. The latter
function filters out tables that are known to be shards.

The main motivation of this change is a better UX. The functionality
can be opted out via a GUC.

We also added two views, namely citus_shards_on_worker and
citus_shard_indexes_on_worker such that users can query
them to see the shards and their corresponding indexes.

We also added debug messages such that the filtered tables can
be interactively seen by setting the level to DEBUG1.
2018-08-07 14:21:45 +03:00
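The filtering idea in commit 974cbf11a5 above can be sketched as a small predicate. This is a hedged Python model only: the real `citus_table_is_visible()` is a database function, and the parameter names here (`pg_visible`, `known_shards`, `hide_shards`) are hypothetical stand-ins for the PostgreSQL visibility result, the shard metadata and the opt-out GUC that the commit mentions without naming.

```python
def citus_table_is_visible_sketch(relation, pg_visible, known_shards, hide_shards=True):
    """Model of the visibility check: hide relations that are known shards."""
    if not pg_visible:
        # Never show something that plain PostgreSQL visibility would hide.
        return False
    if hide_shards and relation in known_shards:
        # The relation is a shard and hiding is enabled, so filter it out.
        return False
    return True


# Hypothetical relation names: one distributed table and two of its shards.
known_shards = {"orders_102008", "orders_102009"}
relations = ["orders", "orders_102008", "orders_102009"]
visible = [
    name
    for name in relations
    if citus_table_is_visible_sketch(name, True, known_shards)
]
```

With `hide_shards=False` the predicate falls back to the plain visibility result, which is the opt-out behaviour the commit describes.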
mehmet furkan şahin c1f7631f98 failure tests on create_distributed_table nonempty 2018-08-03 12:41:25 -07:00
velioglu b21bd2d1a0 Add create_reference_table failure tests 2018-08-03 17:49:57 +03:00
velioglu bc27651dd9 Add failure test for copy on hash distributed table 2018-08-03 17:11:09 +03:00
Brian Cloutier 82fa85fa5b Add tests for 1PC COPY on append and hash-distributed tables
2018-07-31 15:17:59 -07:00
Brian Cloutier f0f7a691a3 Prevent failure tests from hanging by using a port outside the ephemeral port range
- mitmdump now listens on port 9060
- Add some logging to fluent.py, making issues like this easier to debug in the future
- Fail the tests if something is already running on the port mitmProxy tries to use
- check-failure now works with VPATH builds
2018-07-31 14:30:56 -07:00
mehmet furkan şahin dde86cb731 Copy to reference table failure tests are added 2018-07-30 11:48:12 +03:00
mehmet furkan şahin bc757845eb Citus versioning fix 2018-07-26 10:56:34 +03:00
Brian Cloutier ace248d13c Remove unnecessary calls to 'conn.allow()' 2018-07-25 17:45:00 -07:00
mehmet furkan şahin 6d0fbbace7 ALTER TABLE %s ADD COLUMN constraint check is added 2018-07-24 15:53:05 +03:00
Nils Dijk 2d13900230 error on unsupported changing of distribution column in ON CONFLICT for INSERT ... SELECT 2018-07-23 15:18:21 +02:00
Marco Slot 69a3ebea5f Ensure StartPlacementListConnection connects with username supplied by the caller 2018-07-19 20:10:11 +02:00
Murat Tuncer a837dde1a0 Add failure tests for master add/remove/disable/active node 2018-07-13 18:06:24 +03:00
mehmet furkan şahin f854420079 truncate failure tests are added 2018-07-13 13:20:50 +03:00
Murat Tuncer 2795494758 Added failure test for create index concurrently 2018-07-13 11:53:49 +03:00
Onder Kalaci a446e71ee7 Add failure testing for DDL commands
This commit adds extensive failure testing, which covers quite
a few things and their combinations:
   - 1PC vs 2PC
   - Replication factor 1 and Replication factor 2
   - Network failures and query cancellations
   - Sequential vs Parallel query execution mode
2018-07-12 13:05:29 +03:00
Jason Petersen 318119910b Add pg_dist_poolinfo table
For storing nodes' pool host/port overrides.
2018-07-10 09:30:22 -07:00
mehmet furkan şahin 3afa7f425d Topn aggregates are supported 2018-07-10 14:33:42 +03:00
Murat Tuncer a7277526fd Make citus_stat_statements_reset() super user function 2018-07-10 11:21:20 +03:00
Marco Slot 89870e76ce Add a select_opens_transaction_block GUC 2018-07-08 03:50:39 +02:00
Brian Cloutier a54f9a6d2c network proxy-based failure testing
- Lots of detail is in src/test/regress/mitmscripts/README
- Create a new target, make check-failure, which runs tests
- Tells travis how to install everything and run the tests
2018-07-06 12:38:53 -07:00
mehmet furkan şahin df11dda750 hll aggregates are tested 2018-07-05 08:19:01 +03:00
Onder Kalaci d83be3a33f Enforce foreign key restrictions inside transaction blocks
When a hash distributed table has a foreign key to a reference
table, there are a few restrictions we have to apply in order to
prevent distributed deadlocks or reading wrong results.

The necessity to apply the restrictions arises from the cascading
nature of foreign keys. When a foreign key on a reference table
cascades to a distributed table, a single operation over a single
connection can acquire locks on multiple shards of the distributed
table. Thus, any parallel operation on that distributed table, in the
same transaction should not open parallel connections to the shards.
Otherwise, we'd either end-up with a self-distributed deadlock or
read wrong results.

We enforce the restrictions by tracking the distributed/reference
relation accesses inside transaction blocks, and acting accordingly
when necessary.

The two main rules are as follows:
   - Whenever a parallel distributed relation access conflicts
     with a consecutive reference relation access, Citus errors
     out
   - Whenever a reference relation access is followed by a
     conflicting parallel relation access, the execution mode
     is switched to sequential mode.

There are also some other notes to mention:
   - If the user does SET LOCAL citus.multi_shard_modify_mode
     TO 'sequential';, all the queries should simply work by
     using one connection per worker and sequentially executing
     the commands. That's obviously a slower approach than Citus'
     usual parallel execution. However, we at least have a way
     to run all commands successfully.

   - If an unrelated parallel query is executed on any distributed
     table, we cannot switch to sequential mode, because the essence
     of sequential mode is using one connection per worker. However,
     in the presence of a parallel connection, the connection manager
     picks those connections to execute the commands. That contradicts
     our purpose, thus we error out.

   - COPY to a distributed table cannot be executed in sequential mode.
     Thus, if we switch to sequential mode and COPY is executed, the
     operation fails and there is currently no way of implementing that.
     Note that when the local table is not empty and create_distributed_table()
     is used, Citus uses COPY internally. Thus, in those cases,
     create_distributed_table() will also fail.

   - There is a GUC called citus.enforce_foreign_key_restrictions
     to disable all the checks. We added that GUC since the restrictions
     we apply are sometimes a bit more restrictive than necessary.
     The user might want to relax those. Similarly, if you don't have
     CASCADEing reference tables, you might consider disabling all the
     checks.
2018-07-03 17:05:55 +03:00
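The two main rules and the "unrelated parallel query" note in commit d83be3a33f above can be sketched as a small per-transaction tracker. This is a simplified Python model, not the Citus implementation: the class and method names are hypothetical, the foreign key map is supplied by hand, and every access is treated as conflicting, whereas the commit message only speaks of conflicting accesses.

```python
class AccessConflictError(Exception):
    """Raised when the access order cannot be executed safely."""


class RelationAccessTracker:
    """Toy model of relation access tracking inside one transaction block."""

    def __init__(self, references_of):
        # references_of: distributed table -> set of reference tables it has
        # a foreign key to (hypothetical input for this sketch).
        self.references_of = references_of
        self.mode = "parallel"
        self.parallel_accessed = set()
        self.reference_accessed = set()

    def access_reference(self, ref_table):
        # Rule 1: a parallel distributed access followed by an access to a
        # related reference table is an error.
        for dist_table in self.parallel_accessed:
            if ref_table in self.references_of.get(dist_table, set()):
                raise AccessConflictError(
                    f"{dist_table} was already accessed in parallel"
                )
        self.reference_accessed.add(ref_table)

    def access_distributed(self, dist_table):
        related = self.references_of.get(dist_table, set())
        if related & self.reference_accessed:
            # Rule 2: switch to sequential mode, unless an earlier parallel
            # access already opened parallel connections.
            if self.parallel_accessed:
                raise AccessConflictError(
                    "cannot switch to sequential mode after a parallel access"
                )
            self.mode = "sequential"
        if self.mode == "parallel":
            self.parallel_accessed.add(dist_table)
        return self.mode


# Hypothetical schema: orders has a foreign key to the countries reference table.
tracker = RelationAccessTracker({"orders": {"countries"}, "events": set()})
tracker.access_reference("countries")
mode_after_reference = tracker.access_distributed("orders")
```

In this model, setting the mode to sequential up front (the equivalent of the SET LOCAL note in the commit) would mean no access is ever recorded as parallel, so none of the error paths can trigger.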
velioglu 6be6911ed9 Create foreign key relation graph and functions to query on it 2018-07-03 17:05:55 +03:00
mehmet furkan şahin 89a8d6ab95 FK from dist to ref is tested for partitioning, MX 2018-07-03 17:05:55 +03:00
mehmet furkan şahin 4db72c99f6 Specific DDLs are sequentialized when there is FK
-[x] drop constraint
-[x] drop column
-[x] alter column type
-[x] truncate

are sequentialized if there is a foreign constraint from
a distributed table to a reference table on the affected relations
by the above commands.
2018-07-03 17:05:55 +03:00
mehmet furkan şahin e37f76c276 tests are added 2018-07-03 17:05:01 +03:00
mehmet furkan şahin 2fa4e38841 FK from dist to ref can be added with alter table 2018-07-03 17:05:01 +03:00
Murat Tuncer 23800f50f1 Update citus_stat_statements view and regression tests 2018-07-03 16:14:13 +03:00
Murat Tuncer e532755a6e Fix bug in partition column extraction
added strip_implicit_coercion prior to
checking if the expression is Const.
This is important to find values for types
like bigint.
2018-07-02 18:08:16 +03:00
Onder Kalaci 4ccabf9544 Increase timeout to keep appveyor happy 2018-06-25 18:40:40 +03:00
Onder Kalaci 8ccb8b679e Real-time executor marks multi shard relation accesses before opening connections 2018-06-25 18:40:31 +03:00
Onder Kalaci 2890154420 Make sure that TRUNCATE always opens a DDL access 2018-06-25 18:40:31 +03:00
Onder Kalaci 21038f0d0e Make sure that inter-shard DDL commands always cover both tables 2018-06-25 18:40:30 +03:00
Onder Kalaci 2f01894589 Track relation accesses using the connection management infrastructure 2018-06-25 18:40:30 +03:00
Onder Kalaci d5472614df Use non-data connection for intermediate results
Make sure that intermediate results use a connection that is
not associated with any placement. That is useful in two ways:
    - More complex queries can be executed with CTEs
    - Safely use the same connections when there is a foreign key
      to reference table from a distributed table, which needs to
      use the same connection for modifications since the reference
      table might cascade to the distributed table.
2018-06-21 13:26:13 +03:00
Jason Petersen 7a75c2ed31 Add connparam invalidation trigger creation logic
This needs to live in Community, since we haven't yet added the
complication of having divergent upgrade scripts in Enterprise.
2018-06-20 14:13:18 -06:00
mehmet furkan şahin 2b2ce036eb create_distributed_table honors sequential mode 2018-06-19 17:33:45 +03:00
Onder Kalaci 8f5821493a Implement C interface for setting GUC
We need the ability to switch to sequential mode (e.g.,
 SET LOCAL citus.multi_shard_modify_mode = 'sequential'). This
commit enables that.
2018-06-19 10:23:43 +03:00
Marco Slot f3f2805978 Fix use-after-free that may occur for INSERT..SELECT in prepared statements 2018-06-18 22:55:06 -06:00
velioglu 53b2e81d01 Adds SELECT ... FOR UPDATE support for router plannable queries 2018-06-18 13:55:17 +03:00
Marco Slot 28860b2469 Remove volatile explain plan from regression tests 2018-06-15 00:21:52 +02:00
Marco Slot 04da0cf9b1 Remove costs from explain plans in window_functions tests 2018-06-14 23:51:46 +02:00
Jason Petersen 5bf7bc64ba Add pg_dist_authinfo schema and validation
This table will be used by Citus Enterprise to populate
authentication-related fields in outbound connections; Citus Community lacks support
for this functionality.
2018-06-13 11:16:26 -06:00
Onder Kalaci a5370f5bb0 Realtime executor honours multi_shard_modify_mode
We're relying on multi_shard_modify_mode GUC for real-time SELECTs.
The name of the GUC is unfortunate, but, adding one more GUC
(or renaming the GUC) would make the UX even worse. Given that this
mode is mostly important for transaction blocks that involve modification
/DDL queries along with real-time SELECTs, we can live with the confusion.
2018-06-06 14:59:54 +03:00
Onder Kalaci d918556dca INSERT .. SELECT pushdown honors multi_shard_modification_mode 2018-06-06 12:42:23 +03:00
Onder Kalaci 336044f2a8 master_modify_multiple_shards() and TRUNCATE honors multi_shard_modification_mode 2018-06-06 12:29:05 +03:00
Onder Kalaci 51cb24b39c Increase timeout to make the appveyor tests happy 2018-06-05 17:52:18 +03:00
Onder Kalaci df44956dc3 Make sure that sequential DDL opens a single connection to each node
After this commit DDL commands honour `citus.multi_shard_modify_mode`.

We preferred using the code-path that executes single task router
queries (e.g., ExecuteSingleModifyTask()) in order not to invent
a new executor that is only applicable for DDL commands that require
sequential execution.
2018-06-05 17:52:17 +03:00
Murat Tuncer ba50e3f33e Add handling for grant/revoke all tables in schema 2018-05-31 13:47:02 +03:00
Brian Cloutier a7e09d777b Increase deadlock timeout so we get fewer signals 2018-05-16 17:07:24 -07:00
Marco Slot 61d2c0f618 Stabilise output of multi_shard_update_delete test 2018-05-11 08:33:23 +02:00
mehmet furkan şahin b8c3197399 enterprise test fixes 2018-05-10 13:06:54 +03:00
mehmet furkan şahin 785a86ed0a Tests are updated to use create_distributed_table 2018-05-10 11:18:59 +03:00
Marco Slot 9438e5bde9 Ensure single-shard modifying CTEs are part of distributed transaction 2018-05-06 12:49:40 +02:00
velioglu caa27161ca Check volatile functions in modify queries 2018-05-08 11:16:40 +03:00
Marco Slot 2f9c8c6af0 Allow DML commands with unreferenced SELECT CTEs 2018-05-03 14:53:26 +02:00
Marco Slot f8cfe07fd1 Support intermediate results in distributed INSERT..SELECT 2018-05-03 14:42:28 +02:00
Marco Slot 90cdfff602 Implement recursive planning for DML statements 2018-05-03 14:42:28 +02:00
mehmet furkan şahin ef90122cd3 shard count for some of the tests are increased 2018-05-03 10:44:43 +03:00
Onder Kalaci 317dd02a2f Implement single repartitioning on hash distributed tables
   * Change worker_hash_partition_table() so that its hashing
     no longer diverges from the Citus planner's hashing.

   * Rename single partitioning to single range partitioning.

   * Add single hash repartitioning. Basically, logical planner
     treats single hash and range partitioning almost equally.
     Physical planner, on the other hand, treats single hash and
     dual hash repartitioning almost equally (except for JoinPruning).

   * Add a new GUC to enable this feature
2018-05-02 18:50:55 +03:00
velioglu 32bcd610c1 Support modify queries with multiple tables
With this commit we begin to support modify queries with multiple
tables if these queries are pushdownable.
2018-05-02 16:22:26 +03:00
Brian Cloutier f8fb7a27fb Don't copyObject into the wrong memory context
utilityStmt sometimes (such as when it's inside of a plpgsql function)
comes from a cached plan, which is kept in a child of the
CacheMemoryContext. When we naively call copyObject we're copying it into
a statement-local context, which corrupts the cached plan when it's
thrown away.
2018-05-01 15:34:32 -07:00
Marco Slot 2559b84049 Drop shards as current user instead of super user 2018-05-01 09:57:20 +02:00
velioglu 121ff39b26 Removes large_table_shard_count GUC 2018-04-29 10:34:50 +02:00
mehmet furkan şahin a4153c6ab1 notice handler is implemented 2018-04-27 14:37:01 +03:00
Marco Slot 3d3c19a717 Improve messages for essential connection failures 2018-04-26 12:58:47 -06:00