citus

Commit Graph

Author	SHA1	Message	Date
Nils Dijk	2d13900230	error on unsupported changing of distirbution column in ON CONFLICT for INSERT ... SELECT	2018-07-23 15:18:21 +02:00
Nils Dijk	6a15e1c9fc	extract ErrorIfOnConflictNotSupported function for reuse	2018-07-23 12:20:10 +02:00
Nils Dijk	df98900f80	fix missing space for tablein in error	2018-07-20 15:05:13 +02:00
Marco Slot	69a3ebea5f	Ensure StartPlacementListConnection connects with username supplied by the caller	2018-07-19 20:10:11 +02:00
Jason Petersen	318119910b	Add pg_dist_poolinfo table For storing nodes' pool host/port overrides.	2018-07-10 09:30:22 -07:00
mehmet furkan şahin	3afa7f425d	Topn aggregates are supported	2018-07-10 14:33:42 +03:00
Murat Tuncer	a7277526fd	Make citus_stat_statements_reset() super user function	2018-07-10 11:21:20 +03:00
Marco Slot	89870e76ce	Add a select_opens_transaction_block GUC	2018-07-08 03:50:39 +02:00
Murat Tuncer	f20258ef10	Expand count distinct support We can now support more complex count distinct operations by pulling necessary columns to coordinator and evalutating the aggreage at coordinator. It supports broad range of expression with the restriction that the expression must contain a column.	2018-07-06 09:44:20 +03:00
Onder Kalaci	7fb529aab9	Some stylistic improvements in the foreign keys to reference table changes.	2018-07-05 23:23:34 +03:00
Brian Cloutier	735218ee5d	Remove sslmode structs and add more helpful description	2018-07-05 14:12:36 +02:00
Nils Dijk	c1c8c38dc9	create placeholder for policy ddl	2018-07-05 11:07:01 +02:00
mehmet furkan şahin	06217be326	hll aggregate functions are supported natively	2018-07-04 16:41:09 +03:00
Murat Tuncer	901066a421	Move partition key logging related code from enterprise	2018-07-04 13:11:34 +03:00
mehmet furkan şahin	f7b901e3fd	CopyShardForeignConstraintCommandList API change for grouped constraints	2018-07-03 17:05:55 +03:00
mehmet furkan şahin	35eac2318d	lock referenced reference table metadata is added For certain operations in enterprise, we need to lock the referenced reference table shard distribution metadata	2018-07-03 17:05:55 +03:00
Onder Kalaci	d83be3a33f	Enforce foreign key restrictions inside transaction blocks When a hash distributed table have a foreign key to a reference table, there are few restrictions we have to apply in order to prevent distributed deadlocks or reading wrong results. The necessity to apply the restrictions arise from cascading nature of foreign keys. When a foreign key on a reference table cascades to a distributed table, a single operation over a single connection can acquire locks on multiple shards of the distributed table. Thus, any parallel operation on that distributed table, in the same transaction should not open parallel connections to the shards. Otherwise, we'd either end-up with a self-distributed deadlock or read wrong results. As briefly described above, the restrictions that we apply is done by tracking the distributed/reference relation accesses inside transaction blocks, and act accordingly when necessary. The two main rules are as follows: - Whenever a parallel distributed relation access conflicts with a consecutive reference relation access, Citus errors out - Whenever a reference relation access is followed by a conflicting parallel relation access, the execution mode is switched to sequential mode. There are also some other notes to mention: - If the user does SET LOCAL citus.multi_shard_modify_mode TO 'sequential';, all the queries should simply work with using one connection per worker and sequentially executing the commands. That's obviously a slower approach than Citus' usual parallel execution. However, we've at least have a way to run all commands successfully. - If an unrelated parallel query executed on any distributed table, we cannot switch to sequential mode. Because, the essense of sequential mode is using one connection per worker. However, in the presence of a parallel connection, the connection manager picks those connections to execute the commands. That contradicts with our purpose, thus we error out. - COPY to a distributed table cannot be executed in sequential mode. Thus, if we switch to sequential mode and COPY is executed, the operation fails and there is currently no way of implementing that. Note that, when the local table is not empty and create_distributed_table is used, citus uses COPY internally. Thus, in those cases, create_distributed_table() will also fail. - There is a GUC called citus.enforce_foreign_key_restrictions to disable all the checks. We added that GUC since the restrictions we apply is sometimes a bit more restrictive than its necessary. The user might want to relax those. Similarly, if you don't have CASCADEing reference tables, you might consider disabling all the checks.	2018-07-03 17:05:55 +03:00
velioglu	6be6911ed9	Create foreign key relation graph and functions to query on it	2018-07-03 17:05:55 +03:00
mehmet furkan şahin	4db72c99f6	Specific DDLs are sequentialized when there is FK -[x] drop constraint -[x] drop column -[x] alter column type -[x] truncate are sequentialized if there is a foreign constraint from a distributed table to a reference table on the affected relations by the above commands.	2018-07-03 17:05:55 +03:00
mehmet furkan şahin	2c5d59f3a8	create_distributed_table in transaction is fixed	2018-07-03 17:05:01 +03:00
mehmet furkan şahin	45f8017f42	create_distributed_table with fk to ref table is implemented	2018-07-03 17:05:01 +03:00
mehmet furkan şahin	2fa4e38841	FK from dist to ref can be added with alter table	2018-07-03 17:05:01 +03:00
Murat Tuncer	23800f50f1	Update citus_stat_statements view and regression tests	2018-07-03 16:14:13 +03:00
Murat Tuncer	e532755a6e	Fix bug in partition column extraction added strip_implicit_coercion prior to checking if the expression is Const. This is important to find values for types like bigint.	2018-07-02 18:08:16 +03:00
Murat Tuncer	3fc7cdfe6d	Apply master_stage_protocol refactoring changes	2018-06-28 11:24:57 +03:00
Murat Tuncer	4d35b92016	Add groundwork for citus_stat_statements api	2018-06-27 14:20:03 +03:00
Brian Cloutier	5ce18327a7	Don't spinloop when trying to cleanup a failed connection	2018-06-26 13:13:34 -07:00
Onder Kalaci	7d0f7835e7	Improve relation accesses association to do less job	2018-06-25 18:40:40 +03:00
Onder Kalaci	8ccb8b679e	Real-time executor marks multi shard relation accesses before opening connections	2018-06-25 18:40:31 +03:00
Onder Kalaci	2890154420	Make sure that TRUNCATE always opens a DDL access	2018-06-25 18:40:31 +03:00
Onder Kalaci	21038f0d0e	Make sure that inter-shard DDL commands are always covers both tables	2018-06-25 18:40:30 +03:00
Onder Kalaci	2f01894589	Track relation accesses using the connection management infrastructure	2018-06-25 18:40:30 +03:00
Onder Kalaci	d5472614df	Use non-data connection for intermediate results Make sure that intermediate results use a connection that is not associated with any placement. That is useful in two ways: - More complex queries can be executed with CTEs - Safely use the same connections when there is a foreign key to reference table from a distributed table, which needs to use the same connection for modifications since the reference table might cascade to the distributed table.	2018-06-21 13:26:13 +03:00
Onder Kalaci	7762d81cba	Move test UDF under test folder	2018-06-21 08:42:44 +03:00
Jason Petersen	7a75c2ed31	Add connparam invalidation trigger creation logic This needs to live in Community, since we haven't yet added the com- plication of having divergent upgrade scripts in Enterprise.	2018-06-20 14:13:18 -06:00
mehmet furkan şahin	2b2ce036eb	create_distributed_table honors sequential mode	2018-06-19 17:33:45 +03:00
Onder Kalaci	8f5821493a	Implement C interface for setting GUC We need the ability to switch to sequential mode (e.g., SET LOCAL citus.multi_shard_modify_mode = 'sequential'). This commit enables that.	2018-06-19 10:23:43 +03:00
Marco Slot	f3f2805978	Fix use-after-free that may occur for INSERT..SELECT in prepared statements	2018-06-18 22:55:06 -06:00
velioglu	53b2e81d01	Adds SELECT ... FOR UPDATE support for router plannable queries	2018-06-18 13:55:17 +03:00
Marco Slot	0bbe778760	Rename failOnError to alwaysThrowErrorOnFailure	2018-06-14 23:37:47 +02:00
Marco Slot	0feb1f2eb1	Do not call CheckRemoteTransactionsHealth from commit handler	2018-06-14 23:33:07 +02:00
Marco Slot	4ab8e87090	Always throw errors on failure on critical connection in router executor	2018-06-14 23:33:07 +02:00
Nils Dijk	73efcb22c4	Extract RoleSpecString and resolve role references	2018-06-14 11:38:42 +02:00
Jason Petersen	5bf7bc64ba	Add pg_dist_authinfo schema and validation This table will be used by Citus Enterprise to populate authentication- related fields in outbound connections; Citus Community lacks support for this functionality.	2018-06-13 11:16:26 -06:00
Jason Petersen	57b3f253c5	Add node_conninfo GUC and related logic To support more flexible (i.e. not at compile-time) specification of libpq connection parameters, this change adds a new GUC, node_conninfo, which must be a space-separated string of key-value pairs suitable for parsing by libpq's connection establishment methods. To avoid rebuilding and parsing these values at connection time, this change also adds a cache in front of the configuration params to permit immediate use of any previously-calculated parameters.	2018-06-12 20:23:47 -06:00
mehmet furkan şahin	d1a3b20115	foreign_constraint_utils is created	2018-06-07 18:19:24 +03:00
Onder Kalaci	a5370f5bb0	Realtime executor honours multi_shard_modify_mode We're relying on multi_shard_modify_mode GUC for real-time SELECTs. The name of the GUC is unfortunate, but, adding one more GUC (or renaming the GUC) would make the UX even worse. Given that this mode is mostly important for transaction blocks that involve modification /DDL queries along with real-time SELECTs, we can live with the confusion.	2018-06-06 14:59:54 +03:00
Onder Kalaci	d918556dca	INSERT .. SELECT pushdown honors multi_shard_modification_mode	2018-06-06 12:42:23 +03:00
Onder Kalaci	336044f2a8	master_modify_multiple_shards() and TRUNCATE honors multi_shard_modification_mode	2018-06-06 12:29:05 +03:00
Onder Kalaci	df44956dc3	Make sure that sequential DDL opens a single connection to each node After this commit DDL commands honour `citus.multi_shard_modify_mode`. We preferred using the code-path that executes single task router queries (e.g., ExecuteSingleModifyTask()) in order not to invent a new executor that is only applicable for DDL commands that require sequential execution.	2018-06-05 17:52:17 +03:00
Marco Slot	fd4ff29f2f	Add a debug message with distribution column value	2018-06-05 15:09:17 +03:00
Murat Tuncer	ba50e3f33e	Add handling for grant/revoke all tables in schema	2018-05-31 13:47:02 +03:00
velioglu	20acee2cd4	Bump citus version to 7.5devel	2018-05-28 17:25:21 -06:00
Brian Cloutier	9667ee5ac9	Alleviate OOM failures in COMMIT callback Previously those failures caused us to crash, postgres abort()s when it notices a failure in the COMMIT callback.	2018-05-15 16:39:33 -07:00
Brian Cloutier	4c2bf5d2d6	Move call to RemoveIntermediateResultsDirectory Errors thrown in the COMMIT handler will cause Postgres to segfault, there's nothing it can do it abort the transaction by the time that handler is called! RemoveIntermediateResultsDirectory is problematic for two reasons: - It has calls to ereport(ERROR which have been known to trigger - It makes memory allocations which raise ERRORs when they fail Once the COMMIT process has begun we don't use the intermediate results, so it's safe to remove them a little earlier in the process. A failure here will abort the transaction. That's pretty unnecessary, it's not that important that we remove the results, but it's still better than a crash.	2018-05-10 19:28:41 -07:00
mehmet furkan şahin	d35f2725bf	valgrind tests fix	2018-05-10 10:20:14 +03:00
Dimitri Fontaine	8b258cbdb0	Lock reads and writes only to the node being updated in master_update_node Rather than locking out all the writes in the cluster, the function now only locks out writes that target shards hosted by the node we're updating.	2018-05-09 15:14:20 +02:00
Marco Slot	5f5f7b4fe0	Throw an error if placements cannot be found in router executor	2018-05-08 22:39:18 -04:00
Marco Slot	9438e5bde9	Ensure single-shard modifying CTEs are part of distributed transaction	2018-05-06 12:49:40 +02:00
velioglu	caa27161ca	Check volatile functions in modify queries	2018-05-08 11:16:40 +03:00
Hadi Moshayedi	86b12bc2d0	Always prefix operators with their namespace. (#2147 ) Previously we checked if an operator is in pg_catalog, and if it wasn't we prefixed it with namespace in worker queries. This can have a huge impact on performance of physical planner when using custom data types. This happened regardless of current search_path config, because Citus overrides the search path in get_query_def_extended(). When we do so, the check for existence of the operator in current search path in generate_operator_name() fails for any operators outside pg_catalog. This means that nothing gets cached, and in the following calls we will again recheck the system tables for existence of the operators, which took an additional 40-50ms for some of the usecases we were seeing. In this change we skip the pg_catalog check, and always prefix the operator with its namespace.	2018-05-05 13:27:26 -04:00
Marco Slot	2f9c8c6af0	Allow DML commands with unreferenced SELECT CTEs	2018-05-03 14:53:26 +02:00
Marco Slot	f8cfe07fd1	Support intermediate results in distributed INSERT..SELECT	2018-05-03 14:42:28 +02:00
Marco Slot	90cdfff602	Implement recursive planning for DML statements	2018-05-03 14:42:28 +02:00
Murat Tuncer	42a8082721	PG11 compatibility refresh adds a shim for a changed function api	2018-05-03 13:21:15 -06:00
Onder Kalaci	317dd02a2f	Implement single repartitioning on hash distributed tables * Change worker_hash_partition_table() such that the divergence between Citus planner's hashing and worker_hash_partition_table() becomes the same. * Rename single partitioning to single range partitioning. * Add single hash repartitioning. Basically, logical planner treats single hash and range partitioning almost equally. Physical planner, on the other hand, treats single hash and dual hash repartitioning almost equally (except for JoinPruning). * Add a new GUC to enable this feature	2018-05-02 18:50:55 +03:00
velioglu	32bcd610c1	Support modify queries with multiple tables With this commit we begin to support modify queries with multiple tables if these queries are pushdownable.	2018-05-02 16:22:26 +03:00
velioglu	d9fa69c031	Refactor query pushdown related logic	2018-05-02 15:03:09 +03:00
Brian Cloutier	f8fb7a27fb	Don't copyObject into the wrong memory context utilityStmt sometimes (such as when it's inside of a plpgsql function) comes from a cached plan, which is kept in a child of the CacheMemoryContext. When we naively call copyObject we're copying it into a statement-local context, which corrupts the cached plan when it's thrown away.	2018-05-01 15:34:32 -07:00
Marco Slot	2559b84049	Drop shards as current user instead of super user	2018-05-01 09:57:20 +02:00
velioglu	121ff39b26	Removes large_table_shard_count GUC	2018-04-29 10:34:50 +02:00
Onder Kalaci	832c91e28c	Move processing each part of the query into its own functions This commit doesn't change any of the logic at all. Instead, the goal is to: * Get rid of any code duplication * Incremental changes to the optimizer made it slightly hard to follow the code, improve that and make it easier to implement new features * Simplify the code by moving each part of query processing (e.g., DISTINCT, LIMIT etc) into its own function * Make the interaction between each part of the query more obvious (e.g., How DISTINCT affects LIMIT etc)	2018-04-27 17:32:38 +03:00
mehmet furkan şahin	f2555317b6	ProcessVacuumStmt update on names	2018-04-27 14:37:01 +03:00
mehmet furkan şahin	a4153c6ab1	notice handler is implemented	2018-04-27 14:37:01 +03:00
Marco Slot	304b3a41ba	Cache the partition column Var	2018-04-26 14:58:16 -06:00
Marco Slot	3d3c19a717	Improve messages for essential connection failures	2018-04-26 12:58:47 -06:00
Marco Slot	88f64d22db	Prevent connection pointer is NULL details	2018-04-26 12:49:57 -06:00
Marco Slot	394732b6be	Add a connection failure error code	2018-04-26 12:49:57 -06:00
Önder Kalacı	ebb8f902c8	Relax assertion on transaction rollback failure (#2052 ) In case a failure happens when a transaction is rollbacked, we used to hit an assertion for ensuring there is no pending activity on the connection. However, that's not true after the changes in #2031. Thus, we've replaced the assertion with a more generic function call to consume any pending activity, if exists.	2018-04-26 13:39:03 -04:00
Hadi Moshayedi	24659a97dc	Fail task in real-time executor if no placements found. (#2133 )	2018-04-26 12:05:24 -04:00
Murat Tuncer	a6fe5ca183	PG11 compatibility update - changes in ruleutils_11.c is reflected - vacuum statement api change is handled. We now allow multi-table vacuum commands. - some other function header changes are reflected - api conflicts between PG11 and earlier versions are handled by adding shims in version_compat.h - various regression tests are fixed due output and functionality in PG1 - no change is made to support new features in PG11 they need to be handled by new commit	2018-04-26 11:29:43 +03:00
Brian Cloutier	49255213d4	Configure appveyor to run regression tests - Add install.pl to instal .sql files on Windows - Remove a hack to PGDLLIMPORT some variables - Add citus_version.o to the Makefile - Fix pg_regress_multi's PATH generation on Windows - Output regression.diffs when the tests fail - Fix permissions in data directory, make sure postgres can play with it	2018-04-25 18:02:07 -07:00
Onder Kalaci	ac8f2f1e6d	Eliminate code duplication in WorkerExtendedOpNode() Before this commit, we had code duplication in the WorkerExtendedOpNode(). The duplication was noticeable and any change is prone to bugs. The PR consists of 4 commits. Each commit incrementally fixes the problem by moving certain parts of the duplicated code into smaller, better-documented functions.	2018-04-25 08:54:59 +03:00
Brian Cloutier	8d4c4d5c58	Close all files before trying to remove them	2018-04-24 14:35:20 -07:00
Brian Cloutier	c5f1235090	Turn the crashes on Windows into WARNINGs	2018-04-24 14:35:20 -07:00
Onder Kalaci	ee748d9140	Unify extendedOpNode Processing Before this commit, we had a divergence among the creation of master/worker extended op nodes. This commit moves the related parts into a single place and allows the creation of master/extended op nodes to share a common data structure.	2018-04-24 11:56:38 +03:00
Hadi Moshayedi	966f01fad3	Fix write and copy functions for TaskExecution. (#2120 ) We were missing criticalErrorOccurred from CopyNodeTaskExecution() and OutTaskExecution(). This PR fixes it.	2018-04-23 09:07:52 -04:00
Onder Kalaci	814f0e3acc	Ensure Citus never try to access a not planned subquery PostgreSQL might remove some of the subqueries when they do not contribute to the query result at all. Citus should not try to access such subqueries during planning.	2018-04-20 13:52:00 +03:00
Brian Cloutier	b0b130f064	Fix Windows crash in multi_copy test Without this change we crash on Windows with COPYing into a table with 62 shards, and we ERROR when COPYing into a table with >62 shards: ERROR: WaitForMutipleObjects() failed: error code 87	2018-04-17 15:48:02 -07:00
Brian Cloutier	a59c1c634e	Fix cancellation of real time queries Without this change multi_real_time_transaction blocks forever (on Windows) in the block where it repeatedly calls pg_advisory_lock(15). This happens because the deadlock detector tries to cancel the backend but the backend never processes that signal.	2018-04-17 14:26:22 -07:00
mehmet furkan şahin	00e786af00	Capital named schema support is added	2018-04-17 17:17:42 +03:00
mehmet furkan şahin	e5a5502b16	Adds support for multiple ANDs in Having This PR adds support for multiple AND expressions in Having for pushdown planner. We simply make a call to make_ands_explicit from MultiLogicalPlanOptimize for the having qual in workerExtendedOpNode.	2018-04-16 14:14:48 +03:00
Brian Cloutier	42ddfa176d	Fix crash on Windows where there is no detail	2018-04-13 12:54:22 -07:00
velioglu	82b2d21b0c	Convert broadcast join to reference join After this commit large_table_shard_count wont be used to check whether broadcast join, which is renamed as reference join, can be applied. Reference join can only be applied over reference tables.	2018-04-13 12:58:14 +03:00
velioglu	1b92812be2	Add co-placement check to CoPartition function	2018-04-13 12:13:08 +03:00
Marco Slot	9318aeee6b	Allow multiple size function calls per query	2018-04-12 14:16:17 +02:00
Marco Slot	ee132c5ead	Prune shards once per relation in subquery pushdown	2018-04-10 20:33:07 +02:00
Burak Yucesoy	b33b282030	Fix bug while DROPping partitioned table from worker We recently added partitionin support to Citus MX. We should not execute DROP table commands from MX workers but at the moment we try to execute such commands for partitioned tables. This PR fixes that problem by adding check.	2018-04-09 13:50:21 +03:00
Burak Yucesoy	0c283fa8a3	Add partitioning support to MX tables Previously, we prevented creation of partitioned tables on Citus MX. We decided to not focus on this feature until there is a need. Since now there are requests for this feature, we are implementing support for partitioned tables on Citus MX.	2018-04-06 12:47:06 +03:00
velioglu	72dfe4a289	Adds colocation check to local join	2018-04-04 22:49:27 +03:00
velioglu	698d585fb5	Remove broadcast join logic After this change all the logic related to shard data fetch logic will be removed. Planner won't plan any ShardFetchTask anymore. Shard fetch related steps in real time executor and task-tracker executor have been removed.	2018-03-30 11:45:19 +03:00
Matthew Wozniczka	4582a4b398	Fixed a typo	2018-03-27 22:51:36 -06:00
Brian Cloutier	f8f0d4aedc	Add Windows replacement for uname	2018-03-21 20:35:56 -07:00
Brian Cloutier	98ffafe16e	Fix error handling in connection_management	2018-03-21 20:05:00 -07:00
Murat Tuncer	224b0a8c14	Replace poll with select/poll Windows does not have poll(), so fall back to select()	2018-03-21 20:05:00 -07:00
Metin Doslu	3b7b64a8b6	Remove skip_jsonb_validation_in_copy GUC	2018-03-13 10:33:27 +02:00
Murat Tuncer	1440caeef2	Fix incorrect limit pushdown when distinct clause is not superset of group by (#2035 ) Pushing down limit and order by into workers may produce wrong output when distinct on() clause has expressions, aggregates, or window functions. This checking allows pushing down of limits only if distinct clause is a superset of group by clause. i.e. it contains all clauses in group by.	2018-03-07 13:24:56 +03:00
Metin Doslu	e86d34256c	Change default to false for citus.skip_jsonb_validation_in_copy	2018-03-06 13:19:47 +02:00
Onder Kalaci	40b898b59f	Improve error messages for INSERT queries that have subqueries	2018-03-05 14:46:47 +02:00
Onder Kalaci	7dc9589b56	Handle failures during I/O This commit checks the connection status right after any IO happens on the socket. This is necessary since before this commit we didn't pass any information to the higher level functions whether we're done with the connection (e.g., no IO required anymore) or an errors happened during the IO.	2018-03-02 08:33:53 +02:00
Onder Kalaci	da0048e0b7	ForgetResults() becomes a wrapper for ClearResults() ClearResults() is able to handle failures properly by checking the result status. So, relying on it makes error handling more generic in Citus.	2018-03-02 08:33:53 +02:00
Murat Tuncer	76f6883d5d	Add support for window functions that can be pushed down to worker (#2008 ) This is the first of series of window function work. We can now support window functions that can be pushed down to workers. Window function must have distribution column in the partition clause to be pushed down.	2018-03-01 19:07:07 +03:00
Marco Slot	e79db17b91	Update comment in WorkerAggregateExpressionList	2018-02-27 23:48:25 +01:00
Murat Tuncer	e13c5beced	Fix worker query when order by avg aggregate is used (#2024 ) We push down order by to worker query when limit is specified (with some other additional checks). If the query has an expression on an aggregate or avg aggregate by itself, and there is an order by on this particular target we may send wrong order by to worker query with potential to affect query result. The fix creates a auxilary target entry in the worker query and uses that target entry for sorting.	2018-02-28 12:12:54 +03:00
Metin Doslu	bcf660475a	Add support for modifying CTEs	2018-02-27 15:08:32 +02:00
velioglu	78e6d990a2	Fix master plan of the query with distinct, aggregate and group by clauses. Before this PR, we were trusting on the columns of group by about guaranteeing the uniqueness of the results. However, this assumption is correct only if the columns in the group by is subset of columns in the distinct clause. It can be wrong if we have part of group by columns and some aggregation columns in the distinct clause. With this PR, we add distinct plan on top of aggregate plan when necessary.	2018-02-26 15:30:15 +03:00
Onder Kalaci	1c930c96a3	Support non-co-located joins between subqueries With #1804 (and related PRs), Citus gained the ability to plan subqueries that are not safe to pushdown. There are two high-level requirements for pushing down subqueries: * Individual subqueries that require a merge step (i.e., GROUP BY on non-distribution key, or LIMIT in the subquery etc). We've handled such subqueries via #1876. * Combination of subqueries that are not joined on distribution keys. This commit aims to recursively plan some of such subqueries to make the whole query safe to pushdown. The main logic behind non colocated subquery joins is that we pick an anchor range table entry and check for distribution key equality of any other subqueries in the given query. If for a given subquery, we cannot find distribution key equality with the anchor rte, we recursively plan that subquery. We also used a hacky solution for picking relations as the anchor range table entries. The hack is that we wrap them into a subquery. This is only necessary since some of the attribute equivalance checks are based on queries rather than range table entries.	2018-02-26 13:50:37 +02:00
Onder Kalaci	7b57e0562a	Add infrastructure for detecting non-colocated subqueries	2018-02-26 13:28:25 +02:00
Onder Kalaci	4d70c86645	Leaf level recursive planning for non colocated subqueries With this commit, we enable recursive planning for the subqueries that are not joined on the distribution keys.	2018-02-26 13:28:24 +02:00
Onder Kalaci	e998703ff8	Enable restriction eq. checks for top level set operations We used to only support pushdownable set operations inside a subquery, however, we could easily expand the restriction checks to cover top level set operations as well.	2018-02-26 13:28:24 +02:00
Onder Kalaci	e8aa532a90	Refactor checks for distribution key equality Change some function names, ensure we stick to Citus' function order rules etc.	2018-02-26 13:28:24 +02:00
Marco Slot	1e9186a3b5	Do not use new connection in table size functions	2018-02-23 07:07:55 +01:00
Markus Sintonen	6202e80d06	Implemented jsonb_agg, json_agg, jsonb_object_agg, json_object_agg	2018-02-18 00:19:18 +02:00
velioglu	195ac948d2	Recursively plan subqueries in WHERE clause when FROM recurs	2018-02-13 19:52:12 +03:00
Marco Slot	0cba4ab588	Refactor worker node hash initialisation	2018-02-12 23:36:43 +01:00
Marco Slot	40d715d494	Cache worker node array for faster iteration	2018-02-12 23:36:43 +01:00
Marco Slot	6e79a34c97	Do not check for cancellation in ClearResultsIfReady	2018-02-12 16:45:02 +01:00
Marco Slot	6051aae56e	Handle errors that are discovered during abort	2018-02-12 16:45:02 +01:00
Marco Slot	ee6a751798	Only copy distributed plan when modifying it	2018-02-12 16:30:55 +01:00
Onder Kalaci	94c5ac6ebb	Remove duplicate join restrictions We use PostgreSQL hooks to accumulate the join restrictions and PostgreSQL gives us all the join paths it tries while deciding on the join order. Thus, for queries that have many joins, this function is likely to remove lots of duplicate join restrictions. This becomes relevant for Citus on query pushdown check peformance.	2018-02-12 18:35:05 +02:00
Onder Kalaci	c228d8ff3d	Refactor equivalance generation related codes This commit changes the APIs for restriction generation to make future changes simpler.	2018-02-12 18:35:04 +02:00
Onder Kalaci	2f2d350924	Refactor relation restriction related codes This commit moves some of the functions to a more relevant source file.	2018-02-12 18:35:04 +02:00
Murat Tuncer	901b543e20	Fix count distinct using field select on top level query We were allowing count distict queries even if they were not directly on columns if the query is grouped on distribution column. When performing these checks we were skipping subqueries because they also perform this check in a more concise manner. We relied on oid SUBQUERY_RELATION_ID (10000) to decide if a given RTE relation id denotes a subquery, however, we also use SUBQUERY_PUSHDOWN_RELATION_ID (10001) for some subqueries. We skip both type of subqueries with this change.	2018-02-06 13:16:10 +03:00
metdos	35f864bcaf	Respect enable_hashagg in the master planner	2018-02-05 15:06:00 +02:00
metdos	3d540d961c	Fix typo in grouping_is_sortable()	2018-02-05 12:10:19 +02:00
Marco Slot	6f7c3bd73b	Skip JSON validation on coordinator during COPY	2018-02-02 15:33:27 +01:00
Brian Cloutier	15511f6ba1	Dynamically allocate connection metadata in WaitForAllConnections	2018-02-01 10:30:41 -08:00
Brian Cloutier	e6ebfc1f53	Remove VLA from UpdateNodeLocation	2018-02-01 10:30:41 -08:00
Brian Cloutier	a2ed45e206	Remove variable length arrays VLAs aren't supported by Visual Studio. - Remove all existing instances of VLAs. - Add a flag, -Werror=vla, which makes gcc refuse to compile if we add VLAs in the future.	2018-02-01 10:30:41 -08:00
Brian Cloutier	2efe80ce55	CheckForDistributedDeadlocks no longer uses a VLA - variable length arrays (VLAs) do not work with Visual Studio - fix an off-by-one error. We incorrectly assumed there would always at least as many edges as there were nodes. - refactor: reduce scope of transactionNodeStack by moving it into the function which uses it. - refactor: break up the distinct uses of currentStackDepth into separate variables.	2018-02-01 10:30:41 -08:00
Brian Cloutier	097fd15a89	small refactor, CheckDeadlockForTransactionNode builds it's own array	2018-02-01 10:30:41 -08:00
Brian Cloutier	457f570b77	Small refactor, we were using incompatible types	2018-01-31 11:05:59 -08:00
Brian Cloutier	b864d014ab	GetNextNodeId() incorrectly called PG_RETURN_DATUM - Also stabilize the output of a multi_router_planner test	2018-01-29 15:32:36 -08:00
Brian Cloutier	61a6b846b9	Refactor: use a temporary timestamp variable It's against our coding convention to call functions inside parameter lists; when single-stepping with a debugger it's difficult to determine what the function returned. That wouldn't be good enough reason to change this code but while porting Citus to Windows I ran into this line of code. assign_distributed_transaction_id was called with a weird timestamp and I wasn't able to find the problem without first making this change.	2018-01-29 11:20:13 -08:00
Marco Slot	bd0ebac865	Skip call to ActiveReadableNodeList when there are no subplans	2018-01-29 16:05:10 +01:00
Hadi Moshayedi	ff26bcd5a5	Include sys/stat.h for S_IRUSR and S_IWUSR. (#1977 )	2018-01-26 16:21:48 -05:00
Brian Cloutier	76d1edc3fd	Don't rely on gcc-specific features (#1963 ) * Don't use expressions inside compound statements * Don't depend on __builtin_constant_p * Remove reliance on S_ISLNK * Replace use of __func__: older mcvs doesn't support this builtin	2018-01-23 17:03:29 -08:00
Onder Kalaci	fbde87d2d0	Allocate enough space for transaction nodes This fix prevents any potential memory access that might occur while forming the deadlock path.	2018-01-22 08:45:48 +02:00
Onder Kalaci	9a89c0b425	Fix bug while traversing the distributed deadlock graph With this fix, we traverse the graph with DFS which was originally intended. Note that, before the fix, we traverse the graph with BFS which might lead to killing some unrelated backend that is not involved in the distributed deadlock.	2018-01-22 08:45:48 +02:00
Dimitri Fontaine	c9760fbb64	Fix CREATE INDEX with storage options on distributed tables. By sharing the implementation of the function AppendOptionListToString on three call sites, we would expand an extra OPTIONS keyword in a create index statement, and omit other bits of the specific syntax here. This patch introduces an AppendStorageParametersToString() function that is very similar to AppendOptionListToString() but handles WITH(a="foo",...) syntax that is used in reloptions (aka Storage Parameters). Fixes #1747.	2018-01-17 21:56:40 +01:00
Dimitri Fontaine	952da72c55	Implement ALTER TABLE\|INDEX ... SET\|RESET (). PostgreSQL implements support for several relation kinds in a single statement, such as in the AlterTableStmt case, which supports both tables and indexes and more (see ATExecSetRelOptions in PostgreSQL source code file src/backend/commands/tablecmds.c for an example of that). As a consequence, this patch implements support for setting and resetting storage parameters on both relation kinds.	2018-01-17 21:56:40 +01:00
Dimitri Fontaine	17266e3301	Implement ALTER INDEX ... RENAME TO ... The command is now distributed among the shards when the table is distributed. To that effect, we fill in the DDLJob's targetRelationId with the OID of the table for which the index is defined, rather than the OID of the index itself.	2018-01-17 21:56:40 +01:00
velioglu	d357d2fccd	Bump citus version to 7.3devel	2018-01-16 11:50:28 +03:00
Dimitri Fontaine	e010238280	Implement ALTER TABLE ... RENAME TO ... The implementation was already mostly in place, but the code was protected by a principled check against the operation. Turns out there's a nasty concurrency bug though with long identifier names, much as in #1664. To prevent deadlocks from happening, we could either review the DDL transaction management in shards and placements, or we can simply reject names with (NAMEDATALEN - 1) chars or more — that's because of the PostgreSQL array types being created with a one-char prefix: '_'.	2018-01-11 13:21:24 +01:00
Hadi Moshayedi	5d7c52ffa6	Don't return in PG_TRY() block when cancellations happen in WaitForConnections(). (#1923 ) We shouldn't return in middle of a PG_TRY() block because if we do, we won't reset PG_exception_stack, and later when a re-throw tries to jump to the jump-point which was active in this PG_TRY() block, it seg-faults. We used to return in middle of PG_TRY() block in WaitForConnections() where we checked for cancellations. Whenever cancellations were caught here, Citus crashed. And example was reported by @onderkalaci at #1903.	2018-01-03 09:54:03 -05:00
Marco Slot	8f69973411	Fix cancellation issues in the real-time executor (#1905 )	2018-01-01 23:10:29 -05:00
Marco Slot	3fd65cb91b	Do not raise errors in the real-time executor (#1903 )	2018-01-01 22:26:31 -05:00
Onder Kalaci	a1bbdf2d44	Outer joins should also use subquery pushdown planner if join clause is not supported This change allows unsupported clauses to go through query pushdown planner instead of erroring out as we already do for non-outer joins.	2017-12-29 16:40:47 +02:00
Marco Slot	09c09f650f	Recursively plan set operations when leaf nodes recur	2017-12-26 13:46:55 +02:00
mehmet furkan şahin	446893234a	unsupported subquery error messages are fixed	2017-12-25 15:10:59 +03:00
mehmet furkan şahin	57bc86e23d	new debug output for subplans	2017-12-25 09:50:51 +03:00
Marco Slot	fa7fa2734b	Log remote commands sent via MultiClientSendQuery	2017-12-22 16:18:40 +01:00
Murat Tuncer	87c6f306f1	Fix join clause eq restrictions (#1884 ) We used to error out if the join clause includes filters like t1.a < t2.a even if other filter like t1.key = t2.key exists. Recently we lifted that restriction in subquery planning by not lifting that restriction and focusing on equivalance classes provided by postgres. This checkin forwards previously erroring out real-time queries due to join clauses to subquery planner and let it handle the join even if the query does not have a subquery. We are now pushing down queries that do not have any subqueries in it. Error message looked misleading, changed to a more descriptive one.	2017-12-22 12:16:14 +03:00
metdos	32b7e152a3	Get shard resource locks for only DMLs	2017-12-22 10:30:41 +02:00
Murat Tuncer	a9cf0c3e66	Fix CTE column alias issue (#1893 ) We were creating intermediate query result's target names from subquery target list. Now we also check if cte re-defines its column name aliases, and create intermediate result query accordingly.	2017-12-22 09:39:40 +03:00
Brian Cloutier	377b31dcf7	Remove enable_deadlock_prevention prevention warning	2017-12-21 14:47:52 +01:00
Brian Cloutier	fb7b86fa14	Replace strtoull with pg_strtouint64 The macro we were using to detect strtoull isn't set on Windows, and just in case there are differences use a portable function from PG instead of calling strtoull directly.	2017-12-21 14:28:51 +01:00
mehmet furkan şahin	fd546cf322	Intermediate result size limitation This commit introduces a new GUC to limit the intermediate result size which we handle when we use read_intermediate_result function for CTEs and complex subqueries.	2017-12-21 14:26:56 +03:00
Onder Kalaci	0d5a4b9c72	Recursively plan subqueries that are not safe to pushdown With this commit, Citus recursively plans subqueries that are not safe to pushdown, in other words, requires a merge step. The algorithm is simple: Recursively traverse the query from bottom up (i.e., bottom meaning the leaf queries). On each level, check whether the query is safe to pushdown (or a single repartition subquery). If the answer is yes, do not touch that subquery. If the answer is no, plan the subquery seperately (i.e., create a subPlan for it) and replace the subquery with a call to `read_intermediate_results(planId, subPlanId)`. During the the execution, run the subPlans first, and make them avaliable to the next query executions. Some of the queries hat this change allows us: * Subqueries with LIMIT * Subqueries with GROUP BY/DISTINCT on non-partition keys * Subqueries involving re-partition joins, router queries * Mixed usage of subqueries and CTEs (i.e., use CTEs in subqueries as well). Nested subqueries as long as we support the subquery inside the nested subquery. * Subqueries with local tables (i.e., those subqueries has the limitation that they have to be leaf subqueries) * VIEWs on the distributed tables just works (i.e., the limitations mentioned below still applies to views) Some of the queries that is still NOT supported: * Corrolated subqueries that are not safe to pushdown * Window function on non-partition keys * Recursively planned subqueries or CTEs on the outer side of an outer join * Only recursively planned subqueries and CTEs in the FROM (i.e., not any distributed tables in the FROM) and subqueries in WHERE clause * Subquery joins that are not on the partition columns (i.e., each subquery is individually joined on partition keys but not the upper level subquery.) * Any limitation that logical planner applies such as aggregate distincts (except for count) when GROUP BY is on non-partition key, or array_agg with ORDER BY	2017-12-21 08:37:40 +02:00
Onder Kalaci	e12ea914b9	Refactor ErrorIfQueryNotSupported to defer errors	2017-12-20 09:03:49 +02:00
Onder Kalaci	71ce42b936	Refactor RecursivelyPlanSubqueriesAndCTEs() to make it ready to work with subqueries	2017-12-20 09:03:47 +02:00
Marco Slot	5e0539efa3	Plan CTEs when subquery pushdown is on	2017-12-19 16:34:56 +01:00
Marco Slot	44a1ea631a	Show distributed subplan ID in EXPLAIN output	2017-12-19 16:34:56 +01:00
Marco Slot	35dbacdb69	Do not reinitialise MyBackendData	2017-12-19 15:56:26 +01:00
Marco Slot	af201a2f6d	Allow intermediate results to be used in parallel workers	2017-12-18 19:05:08 +01:00
Marco Slot	7dab078e67	Set cost estimates for read_intermediate_result	2017-12-18 16:23:44 +01:00
Marco Slot	74bd33d0cc	Revert "Plan CTEs when subquery pushdown is on" This reverts commit `e3b953b8e3`.	2017-12-17 22:34:20 +01:00
Marco Slot	aca5f35ab9	Revert "Show distributed subplan ID in EXPLAIN output" This reverts commit `686b079272`.	2017-12-17 22:34:04 +01:00
Marco Slot	e3b953b8e3	Plan CTEs when subquery pushdown is on	2017-12-17 21:49:36 +01:00
Marco Slot	686b079272	Show distributed subplan ID in EXPLAIN output	2017-12-16 11:32:01 +01:00
Marco Slot	ea6b98fda4	Allow count(distinct) in queries with a subquery	2017-12-15 15:24:26 +01:00
Marco Slot	9ee0e68882	Do not take extra access exclusive lock partitioned tables	2017-12-15 13:02:31 +01:00
Marco Slot	5a69fc1b17	Relax checks on recurring tuples in FROM with sublinks	2017-12-15 11:56:06 +01:00
Marco Slot	a64f0060ba	Reduce the frequency of FinishConnectionIO calls during COPY (#1864 )	2017-12-14 13:21:59 -05:00
Marco Slot	2e2b4e81fa	Add support for CTEs in distributed queries	2017-12-14 09:32:55 +01:00
Marco Slot	d0335ec818	Send BEGIN for SELECTs in the router executor	2017-12-14 09:32:55 +01:00
Marco Slot	cbbd418af2	Add citus.copy_format OIDs to metadata cache	2017-12-14 09:32:55 +01:00
Marco Slot	66f9f1d6cd	Make some intermediate results functions public	2017-12-14 09:32:55 +01:00
Marco Slot	36ee21c323	Make CanUseBinaryCopyFormatForType public	2017-12-14 09:32:55 +01:00
Marco Slot	7d1191954d	Add DistributedSubPlan node	2017-12-14 09:32:55 +01:00
Onder Kalaci	86b2d9420c	Treat recurring tuples as reference table for GROUP BY checks read_intermediate_results() and immutable functions are implemented. Empty join trees seems not applicable here.	2017-12-13 14:55:42 +02:00
Marco Slot	d1a470a52e	Fix issue with multiple ANALYZE in transaction block	2017-12-12 10:28:48 +01:00
mehmet furkan şahin	3c941aedf1	adds citus.enable_repartition_joins GUC The new GUC allows Citus to switch between task executors when necessary	2017-12-11 09:36:37 +03:00
Marco Slot	60a1e31671	Allow queries with local tables in NeedsDistributedPlanning	2017-12-07 16:20:23 +01:00
Marco Slot	f8550b8c85	Fix issues with read_intermediate_result signature	2017-12-07 13:47:56 +01:00
Marco Slot	d8fea4efb8	Revert "Allow queries with local tables in NeedsDistributedPlanning" This reverts commit `d2bac081e8`.	2017-12-07 11:19:11 +01:00
Marco Slot	d2bac081e8	Allow queries with local tables in NeedsDistributedPlanning	2017-12-07 11:02:16 +01:00
Onder Kalaci	c42a92afd2	Fix bug related to incrementing an index not properly	2017-12-07 08:50:57 +02:00
Marco Slot	eab15aa035	Avoid deadlock in ColocatedTableId	2017-12-06 11:49:34 +01:00
Marco Slot	7279d42849	Treat read_intermediate_result as recurring tuples	2017-12-04 14:50:11 +01:00

... 2 3 4 5 6 ...

1145 Commits (c3296eb4ba128f233d26e00ebcf08d37a1343d7d)