PG 11 has changed the way that PARAM_EXTERN is processed.
This commit ensures that Citus follows the same pattern.
For details see the related Postgres commit:
6719b238e8
Assign the distributed transaction id before trying to acquire the
executor advisory locks. This is useful to show this backend in citus
lock graphs (e.g., dump_global_wait_edges() and citus_lock_waits).
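To see the effect, the lock graph views named above can be inspected while a
backend is blocked; a minimal sketch (output columns elided):

```sql
-- run on the coordinator; shows blocked and blocking backends across the cluster
SELECT * FROM citus_lock_waits;
-- lower-level wait edges used to build the lock graph
SELECT * FROM dump_global_wait_edges();
```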
I'm pretty sure a lot of this test functionality may already be covered in some
of our existing regression tests, but I've included these to ensure we
put all failure-based tests under our new testing method for that kind
of test.
Didn't include a lower replication factor, as (for a single-shard modification)
it's indistinguishable from modifying a reference table. So these all
test modifications which hit a single, replicated shard.
We made PG11 builds optional when we had an issue
with an MX isolation test that we could not solve back then.
This commit solves the issue with a workaround by running
start_metadata_sync_to_node outside the transaction block.
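A minimal sketch of the workaround (the worker host and port are placeholders);
the important part is that the call is a standalone statement, not wrapped in
BEGIN/COMMIT:

```sql
SELECT start_metadata_sync_to_node('worker-1', 5432);
```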
Both of these are a bit of a shot in the dark. In one case, we noticed
a stack trace where a caller received a null pointer and attempted to
dereference the memory context field (at 0x010). In the other, I saw
that any error thrown from within AdjustParseTree could keep the stack
from being cleaned up (presumably if we push we should always pop).
Both stack traces were collected during times of high memory pressure
and reproducing the problem locally or otherwise has been very
tricky (i.e. it hasn't been reproduced reliably at all).
* Keep track of cached entries in case of interruption.
Previously we set DistTableCacheEntry->sortedShardIntervalArray
and DistTableCacheEntry->shardIntervalArrayLength after we entered
all related shard entries into DistShardCacheHash. The drawback was
that if populating DistShardCacheHash was interrupted,
ResetDistTableCacheEntry() didn't see the shard hash entries that had been
created, so it was unable to clean them up.
This patch fixes that by setting sortedShardIntervalArray earlier,
and incrementing shardIntervalArrayLength as we enter shards into
the cache.
Fairly straightforward; verified that modifications fail atomically if
a worker is down or fails mid-transaction (i.e. all workers need to ack
modifications to reference tables in order to persist changes).
Including several examples from #1926. I couldn't understand why
recover_prepared_transactions "should be an error", and EXPLAIN has
changed since the original bug (so that it runs EXPLAINs in txns, I
think for EXPLAIN ANALYZE to not have side effects); other than that,
most of the reported bugs now error out rather than crash or return
an empty result set.
VACUUM runs outside of a transaction, so the failure modes for it are
somewhat straightforward, though ANALYZE runs in a 1pc transaction and
multi-table VACUUM can fail between statements (PG 11 and higher).
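For illustration (table names are hypothetical), the shapes being exercised are:

```sql
-- VACUUM runs outside a transaction; tables are processed one by one, so a
-- failure can leave earlier tables vacuumed and later ones untouched
VACUUM dist_table_a, dist_table_b;
-- ANALYZE, by contrast, runs in a 1pc transaction
ANALYZE dist_table_a;
```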
Tests various failure points during a multi-shard modification within
a transaction with multiple statements. Verifies three cases:
* Reference tables (single shard, many placements)
* Normal table with replication factor two
* Multi-shard table with no replication
In the replication-factor case, we expect shard health to be affected
in some transactions; most others fail the transaction entirely and
all we need to verify is that no effects of the transaction are visible.
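A rough sketch of the kind of transaction being exercised (table names are made up):

```sql
BEGIN;
UPDATE reference_table SET value = value + 1;          -- single shard, many placements
UPDATE replicated_table SET value = 0 WHERE key = 42;  -- replication factor two
UPDATE multi_shard_table SET value = 0;                -- multi-shard, no replication
-- depending on where the failure hits, either a placement is marked unhealthy
-- (replicated case) or the whole transaction rolls back with no visible effects
COMMIT;
```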
Had trouble testing the final PREPARE/COMMIT/ROLLBACK phase of the 2pc,
in particular because the error message produced includes the PID of
the backend, which is unpredictable.
The DROP SCHEMA command fails in MX mode if there
is a partitioned table with active partitions.
This is due to the fact that the sql_drop trigger receives
all the dropped objects, including partitions. When
we call DROP TABLE on the partition parent, it also drops
the partitions on the MX node. This causes the DROP
TABLE command on the partitions to fail on the MX node
because they were already dropped when the partition
parent was dropped.
With this change, worker_drop_distributed_table no longer
requires the table to exist.
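A sketch of the failing scenario (schema and table names are hypothetical):

```sql
CREATE SCHEMA app;
CREATE TABLE app.events (event_id bigint, created_at date)
    PARTITION BY RANGE (created_at);
CREATE TABLE app.events_2018 PARTITION OF app.events
    FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
SELECT create_distributed_table('app.events', 'event_id');

-- the sql_drop trigger reports both the parent and the partition; on the MX
-- node the partition is already gone once its parent is dropped, which is why
-- worker_drop_distributed_table now tolerates a missing table
DROP SCHEMA app CASCADE;
```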
PG now allows foreign keys on partitioned tables.
Each foreign key constraint on a partitioned table
is propagated down to its partitions.
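A hedged sketch of the shape this enables (table names and colocation setup are
illustrative; the foreign key is on the distribution column of colocated tables):

```sql
CREATE TABLE customers (customer_id bigint PRIMARY KEY);
CREATE TABLE orders (customer_id bigint, order_date date)
    PARTITION BY RANGE (order_date);
CREATE TABLE orders_2018 PARTITION OF orders
    FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');

SELECT create_distributed_table('customers', 'customer_id');
SELECT create_distributed_table('orders', 'customer_id');

-- PG 11 allows this on the partitioned parent; the constraint is propagated
-- down to orders_2018 and, in Citus, to the corresponding shards
ALTER TABLE orders ADD FOREIGN KEY (customer_id) REFERENCES customers (customer_id);
```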
We used to create all constraints on shards when creating
a new shard, or when simply moving a shard from one worker
to another. We also used the same logic when creating a copy of
a coordinator table on an MX node.
With this change, we create the constraint on the worker node only if
it is not an inherited constraint.
We used to set the execution mode in the truncate trigger. However,
when multiple tables are truncated with a single command, we could
set the execution mode very late. Instead, now set the execution mode
on the utility hook.
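The multi-table case that motivated the change, with made-up table names:

```sql
-- one command, several distributed tables; the execution mode is now chosen
-- in the utility hook before any per-table truncate trigger fires
TRUNCATE orders, order_items;
```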
By setting the CPU tuple cost so high, we were triggering JIT. Instead,
we should use parallel_tuple_cost.
See: rhaas.blogspot.com/2018/06/using-forceparallelmode-correctly.html
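A sketch of the adjusted test setup, assuming the goal is to make parallel plans
attractive without inflating the overall plan cost (which is what pushed the
test over the JIT cost thresholds):

```sql
-- make parallel plans cheap rather than making every plan expensive
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
```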
This reverts commit a2fb5a84f1.
JIT wasn't actually interfering with the operation of Citus, a test was
just written in a way which caused JIT to run for a function on every
row in a 150k-row table.
With this commit, we allow partitioned distributed tables with
replication factor > 1. However, we also have many restrictions.
In summary, we disallow all kinds of modifications (including DDLs)
on the partition tables. Instead, the user is allowed to run the
modifications on the parent table (see the sketch after the list below).
The necessity for such a restriction has two aspects:
- We need to acquire shard resource locks appropriately
- We need to handle marking partitions INVALID in case
of any failures. Note that, in theory, the parent table
should also become INVALID, which is too aggressive.
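A rough illustration of the restriction (names are invented):

```sql
-- allowed: route the modification through the parent table
UPDATE metrics SET payload = '{}' WHERE metric_id = 1;
-- rejected: a direct modification (or DDL) on a partition of a distributed
-- table with replication factor > 1
UPDATE metrics_2018 SET payload = '{}' WHERE metric_id = 1;
```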
We acquire a distributed lock on all MX nodes for the truncated
tables before actually doing the truncate operation.
This is needed for distributed serialization of the truncate
command without causing a deadlock.
The reason for the failure is that PG11 introduced a new relation kind
RELKIND_PARTITIONED_INDEX to be used for partitioned indices.
We expanded our check to cover that case.
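For context (table and index names are arbitrary), PG11 stores the parent index
of a partitioned table under this new relkind:

```sql
CREATE TABLE metrics (metric_id bigint, created_at date)
    PARTITION BY RANGE (created_at);
CREATE INDEX metrics_id_idx ON metrics (metric_id);
-- on PG 11 the parent index has relkind 'I' (RELKIND_PARTITIONED_INDEX)
SELECT relkind FROM pg_class WHERE relname = 'metrics_id_idx';
```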
This commit uses *_walker instead of *_mutator for performance reasons.
Given that we're only updating a functionId in the tree, the approach
seems fine.
When a transaction fails on PREPARE,
we used to hit an assertion ensuring there is no pending
activity on the connection. However, that's no longer true after the
changes in #2031. Thus, we've replaced the assertion with a more
generic function call that consumes any pending activity, if any exists.
In the distributed deadlock detection design, we concluded that prepared transactions
cannot be part of a distributed deadlock. The idea is that (a) when the transaction
is prepared, it has already acquired all of its locks, so it cannot be part of a deadlock, and
(b) even if some other processes are blocked on the prepared transaction, prepared transactions
would eventually be committed (or rolled back) and the system would continue operating.
With the above in mind, we probably had a mistake in terms of memory allocations. For each
backend initialized, we keep a `BackendData` struct. The bug we introduced is that we
assumed there would only be `MaxBackends` backends. However, `MaxBackends` doesn't
include prepared transactions and auxiliary processes. When you check Postgres'
`InitProcGlobal`, you'd see that `TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;`
This commit aligns the total number of procs we track with that.
PG11 introduced the PROCEDURE concept, similar to FUNCTION.
Procedures allow committing/rolling back inside the procedure body.
This commit adds regression tests for procedure calls.
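A minimal PG11 procedure with transaction control (table and procedure names are
placeholders):

```sql
CREATE TABLE proc_test (a int);

CREATE PROCEDURE insert_two()
LANGUAGE plpgsql
AS $$
BEGIN
    INSERT INTO proc_test VALUES (1);
    COMMIT;    -- keeps the first row
    INSERT INTO proc_test VALUES (2);
    ROLLBACK;  -- discards the second row
END;
$$;

CALL insert_two();
```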
With this commit, we implement two views that are very similar
to pg_stat_activity, but showing queries that are involved in
distributed queries:
- citus_dist_stat_activity: Shows all the distributed queries
- citus_worker_stat_activity: Shows all the queries on the shards
that are initiated by distributed queries.
Both views have the same output columns. In very basic terms, both views
are meant to provide some useful insights about the distributed
transactions within the cluster. As the names reveal, both views are similar to pg_stat_activity.
Also note that these views can be pretty useful on Citus MX clusters.
Note that when the views are queried from the worker nodes, they'd not show the distributed
transactions that are initiated from the coordinator node. The reason is that the worker
nodes do not know the host/port of the coordinator. Thus, it is advisable to query the
views from the coordinator.
If we bucket the columns that the views return, we'd end up with the following:
- Hostnames and ports:
- query_hostname, query_hostport: The node on which the query is running
- master_query_host_name, master_query_host_port: The node in the cluster
that initiated the query.
Note that for the citus_dist_stat_activity view, query_hostname-query_hostport
is always the same as master_query_host_name-master_query_host_port. The
distinction is mostly relevant for citus_worker_stat_activity. For example,
on Citus MX, a user starts a transaction on Node-A, which starts worker
transactions on Node-B and Node-C. In that case, the query hostnames would be
Node-B and Node-C whereas the master_query_host_name would be Node-A.
- Distributed transaction related things:
This is mostly the process_id, the distributed transaction id, and the distributed
transaction number.
- pg_stat_activity columns:
These two views get all the columns from pg_stat_activity. We're basically joining
pg_stat_activity with get_all_active_transactions on process_id.
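For example (run from the coordinator; only a handful of the columns are shown):

```sql
SELECT master_query_host_name, master_query_host_port,
       query_hostname, query_hostport, query
FROM citus_worker_stat_activity;
```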
This test's output changes depending on which worker is
picked for explain (e.g., worker port in the output changes).
Given that the test is only aiming to ensure that CTEs inside
CTEs work fine in DML queries, it should be fine to get rid of
the EXPLAIN. The output is verified to be correct as well.
We previously implemented the OTHER_WORKERS_WITH_METADATA tag. However,
that was wrong. See the related discussion:
https://github.com/citusdata/citus/issues/2320
Instead, we switched to using OTHER_WORKER_NODES and made the command
that we're running optional, so that even if the node is not a
metadata node, we won't be in trouble.