citus

Commit Graph

Author	SHA1	Message	Date
Andres Freund	6ec34bed84	Remove remnants of commit_protocol.[ch].	2017-01-21 09:01:15 -08:00
Andres Freund	c390daed0f	Use interrupt checking libpq wrappers in router executor.	2017-01-09 14:02:45 -08:00
Andres Freund	7320c17f00	Convert router executor to placement connection management infrastructure. Remove the router specific transaction and shard management, and replace it with the new placement connection API. This mostly leaves behaviour alone, except that it is now, inside a transaction, legal to select from a shard to which no pre-existing connection exists. To simplify code the code handling task executions for select and modify has been split into two - the previous coding was starting to get confusing due to the amount of only conditionally applicable code. Modification connections & transactions are now always established in parallel, not just for reference tables.	2017-01-09 13:13:02 -08:00
Onder Kalaci	6d050fd677	Use 2PC for reference table modification With this commit, we ensure that router executor always uses 2PC for reference table modifications and never mark the placements of it as INVALID.	2017-01-04 12:46:35 +02:00
Marco Slot	59bc5972fa	Use MultiConnection in multi-shard transactions	2016-12-30 14:43:21 -07:00
Marco Slot	92c7567008	Convert worker_transactions to new connection API	2016-12-23 16:14:29 +01:00
Marco Slot	11031bcf55	Enable evaluation of stable functions in INSERT..SELECT	2016-12-23 12:47:21 +01:00
Marco Slot	d745d7bf70	Add explicit RelationShards mapping to tasks	2016-12-23 10:23:43 +01:00
Andres Freund	80b34a5d6b	Integrate router executor into transaction management framework. One less place managing remote transactions. It also makes it fairly easy to use 2PC for certain modifications (e.g. reference tables). Just issue a CoordinatedTransactionUse2PC(). If every placement failure should cause the whole transaction to abort, additionally mark the relevant transactions as critical.	2016-12-12 15:18:12 -08:00
Andres Freund	a77cf36778	Use connection_management.c from within connection_cache.c. This is a temporary step towards removing connection_cache.c.	2016-12-07 11:44:24 -08:00
Andres Freund	3223b3c92d	Centralized Connection Lifetime Management. Connections are tracked and released by integrating into postgres' transaction handling. That allows to to use connections without having to resort to having to disable interrupts or using PG_TRY/CATCH blocks to avoid leaking connections. This is intended to eventually replace multi_client_executor.c and connection_cache.c, and to provide the basis of a centralized transaction management. The newly introduced transaction hook should, in the future, be the only one in citus, to allow for proper ordering between operations. For now this central handler is responsible for releasing connections and resetting XactModificationLevel after a transaction.	2016-12-07 11:43:18 -08:00
Marco Slot	b566c4815c	Pass down the correct type for null parameters	2016-11-11 07:14:08 +01:00
Onder Kalaci	a43e3bad56	Improve error semantics for INSERT..SELECT With this commit, we error out if a worker query cannot be executed on all placements of a target insert shard interval.	2016-10-27 14:09:05 +03:00
Marco Slot	275378aa45	Re-acquire metadata locks in RouterExecutorStart	2016-10-26 14:34:59 +02:00
Onder Kalaci	1673ea937c	Feature: INSERT INTO ... SELECT This commit adds INSERT INTO ... SELECT feature for distributed tables. We implement INSERT INTO ... SELECT by pushing down the SELECT to each shard. To compute that we use the router planner, by adding an "uninstantiated" constraint that the partition column be equal to a certain value. standard_planner() distributes that constraint to all the tables where it knows how to push the restriction safely. An example is that the tables that are connected via equi joins. The router planner then iterates over the target table's shards, for each we replace the "uninstantiated" restriction, with one that PruneShardList() handles. Do so by replacing the partitioning qual parameter added in multi_planner() with the current shard's actual boundary values. Also, add the current shard's boundary values to the top level subquery to ensure that even if the partitioning qual is not distributed to all the tables, we never run the queries on the shards that don't match with the current shard boundaries. Finally, perform the normal shard pruning to decide on whether to push the query to the current shard or not. We do not support certain SQLs on the subquery, which are described/commented on ErrorIfInsertSelectQueryNotSupported(). We also added some locking on the router executor. When an INSERT/SELECT command runs on a distributed table with replication factor >1, we need to ensure that it sees the same result on each placement of a shard. So we added the ability such that router executor takes exclusive locks on shards from which the SELECT in an INSERT/SELECT reads in order to prevent concurrent changes. This is not a very optimal solution, but it's simple and correct. The citus.all_modifications_commutative can be used to avoid aggressive locking. An INSERT/SELECT whose filters are known to exclude any ongoing writes can be marked as commutative. See RequiresConsistentSnapshot() for the details. We also moved the decison of whether the multiPlan should be executed on the router executor or not to the planning phase. This allowed us to integrate multi task router executor tasks to the router executor smoothly.	2016-10-26 10:01:00 +03:00
Marco Slot	271b20a23e	Parallelise DDL commands	2016-10-24 12:39:08 +02:00
Marco Slot	65f6d7c02a	Follow consistent execution order in parallel commands	2016-10-19 08:33:08 +02:00
Marco Slot	a497e7178c	Parallelise master_modify_multiple_shards	2016-10-19 08:33:08 +02:00
Marco Slot	9d98acfb6d	Move requiresMasterEvaluation from Task to Job	2016-10-19 08:23:06 +02:00
Marco Slot	213d8419c6	Refactor and redocument executor shard lock code	2016-10-19 08:13:35 +02:00
Marco Slot	33b7723530	Use UpdateShardPlacementState where appropriate	2016-10-07 11:59:20 -07:00
Andres Freund	982ad66753	Introduce placement IDs. So far placements were assigned an Oid, but that was just used to track insertion order. It also did so incompletely, as it was not preserved across changes of the shard state. The behaviour around oid wraparound was also not entirely as intended. The newly introduced, explicitly assigned, IDs are preserved across shard-state changes. The prime goal of this change is not to improve ordering of task assignment policies, but to make it easier to reference shards. The newly introduced UpdateShardPlacementState() makes use of that, and so will the in-progress connection and transaction management changes.	2016-10-07 11:59:20 -07:00
Jason Petersen	5f6264105d	Directly register router xact callbacks in PG_init Not entirely sure why we went with the shared memory hook approach, but it causes problems (multiple registration) during crashes. Changing to a simple direct registration call from PG_init.	2016-09-29 11:43:18 -06:00
Jason Petersen	0caf0d95f1	Fix unique-violation-in-xact segfault An interaction between ReraiseRemoteError and DML transaction support causes segfaults: * ReraiseRemoteError calls PurgeConnection, freeing a connection... * That connection is still in the xactParticipantHash At transaction end, the memory in the freed connection might happen to pass the "is this connection OK?" check, causing us to try to send an ABORT over that connection. By removing it from the transaction hash before calling ReraiseRemoteError, we avoid this possibility.	2016-09-27 16:44:03 -06:00
Metin Doslu	c9dcad9b05	Pass text oid inteads of invalid oid for null values Passing invalid oids even for null values in PQsendQueryParams() causes worker nodes to fail. Therefore, we pass text oid for null values.	2016-09-27 08:15:46 +03:00
Andres Freund	776b3868b9	Support NoMovement direction in router executor This is mainly interesting because it allows to use RETURN QUERY/RETURN QUERY EXECUTE and FOR ... IN .. LOOPs in plpgsql.	2016-09-26 18:28:36 -06:00
Jason Petersen	850c51947a	Re-permit DDL in transactions, selectively Recent changes to DDL and transaction logic resulted in a "regression" from the viewpoint of users. Previously, DDL commands were allowed in multi-command transaction blocks, though they were not processed in any actual transactional manner. We improved the atomicity of our DDL code, but added a restriction that DDL commands themselves must not occur in any BEGIN/END transaction block. To give users back the original functionality (and improved atomicity) we now keep track of whether a multi-command transaction has modified data (DML) or schema (DDL). Interleaving the two modification types in a single transaction is disallowed. This first step simply permits a single DDL command in such a block, admittedly an incomplete solution, but one which will permit us to add full multi-DDL command support in a subsequent commit.	2016-08-30 20:37:19 -06:00
Andres Freund	7fdb5fbe29	Skip over unreferenced parameters when router executing prepared statement. When an unreferenced prepared statement parameter does not explicitly have a type assigned, we cannot deserialize it, to send to the remote side. That commonly happens inside plpgsql functions, where local variables are passed in as unused prepared statement parameters.	2016-08-05 14:12:06 -07:00
Jason Petersen	eba8396501	Avoid attempting to lock invalid shard identifier A recent change generates a "dummy" shard placement with its identifier set to INVALID_SHARD_ID for SELECT queries against distributed tables with no shards. Normally, no lock is acquired for SELECT statements, but if all_modifications_commutative is set to true, we will acquire a shared lock, triggering an assertion failure within LockShardResource in the above case. The "dummy" shard placement is actually necessary to ensure such empty queries have somewhere to execute, and INVALID_SHARD_ID seems the most appropriate value for the dummy's shard identifier field, so the most straightforward fix is to just avoid locking invalid shard identifiers.	2016-08-04 13:49:51 -07:00
Murat Tuncer	c20080992d	Remove PostgreSQL 9.4 support	2016-07-26 20:16:09 +03:00
Jason Petersen	5d525fba24	Permit "single-shard" transactions Allows the use of modification commands (INSERT/UPDATE/DELETE) within transaction blocks (delimited by BEGIN and ROLLBACK/COMMIT), so long as all modifications hit a subset of nodes involved in the first such com- mand in the transaction. This does not circumvent the requirement that each individual modification command must still target a single shard. For instance, after sending BEGIN, a user might INSERT some rows to a shard replicated on two nodes. Subsequent modifications can hit other shards, so long as they are on one or both of these nodes. SAVEPOINTs are supported, though if the user actually attempts to send a ROLLBACK command that specifies a SAVEPOINT they will receive an ERROR at the end of the topmost transaction. Placements are only marked inactive if at least one replica succeeds in a transaction where others fail. Non-atomic behavior is possible if the shard targeted by the initial modification within a transaction has a higher replication factor than another shard within the same block and a node with the latter shard has a failure during the COMMIT phase. Other methods of denoting transaction blocks (multi-statement commands sent all at once and functions written in e.g. PL/pgSQL or other such languages) are not presently supported; their treatment remains the same as before.	2016-07-21 15:57:22 -06:00
Metin Doslu	a811e09dd4	Add support for prepared statements with parameterized non-partition columns in router executor	2016-07-21 11:09:28 +03:00
Brian Cloutier	0cad3b22cc	Simplify code and fix include guards in citus_clauses	2016-07-13 11:45:51 -07:00
Brian Cloutier	af9515f669	Only reparse queries if the planner flags them for reparsing	2016-07-13 11:45:51 -07:00
Brian Cloutier	4820366a6f	citus_indent and some renaming	2016-07-13 11:45:51 -07:00
Brian Cloutier	ae91768c96	Evaluate functions on the master - Enables using VOLATILE functions (like nextval()) in INSERT queries - Enables using STABLE functions (like now()) targetLists and joinTrees UPDATE and INSERT can now contain non-immutable functions. INSERT can contain any kind of expression, while UPDATE can contain any STABLE function, so long as a Var is not passed into the STABLE function, even indirectly. UPDATE TagetEntry's can now also include Vars. There's an exception, CASE/COALESCE statements may not contain mutable functions. Functions calls in master_modify_multiple_shards are also evaluated.	2016-07-13 11:45:51 -07:00
Andres Freund	c38c1adce1	Combine router executor paths for select and modify commands. The upcoming RETURNING support would otherwise require too much duplication. This contains most of the pieces required for RETURNING support, except removing the planner checks and adjusting regression test output.	2016-07-01 13:07:12 -07:00
Jason Petersen	e064cacea9	Minor formatting fix Noticed that uncrustify doesn't like the array-of-struct literals, so omitting them from formatting (at least here).	2016-06-28 13:09:57 -06:00
Jason Petersen	8b788eb899	Use literal instead of constant to fix 9.4 build PG_UINT32_MAX doesn't exist before 9.5. Missed this because I removed my assert-enabled builds during packaging work. Fixes #619	2016-06-28 12:36:14 -06:00
Jason Petersen	d39508d8b3	Minor formatting/comment fixes	2016-06-08 10:34:07 -06:00
Amos Bird	fd1b088208	Add overflow checks.	2016-06-08 10:30:03 +08:00
Amos Bird	107b926262	Eliminates the possibilities of counter overflows. This patch uses scanint8 instead of pg_atoi to make sure the affected tuples counter never gets overflow.	2016-06-08 10:30:03 +08:00
Jason Petersen	9ba02928ac	Refactor ReportRemoteError to remove boolean arg Broke it into two explicitly-named functions instead: WarnRemoteError and ReraiseRemoteError.	2016-06-07 12:38:32 -06:00
Metin Doslu	7d0c90b398	Fail fast on constraint violations in router executor	2016-06-07 18:11:17 +03:00
Marco Slot	fc4f23065a	Add EXPLAIN for simple distributed queries	2016-04-30 00:11:02 +02:00
Murat Tuncer	a88d3ecd4e	Add dynamic executor selection - non-router plannable queries can be executed by router executor if they satisfy the criteria - router executor is removed from configuration, now task executor can not be set to router - removed some tests that error out for router executor	2016-04-21 09:15:33 +03:00
Murat Tuncer	938546b938	Add router plannable check and router planning logic for single shard select queries	2016-04-21 09:15:33 +03:00
Jason Petersen	423e6c8ea0	Update copyright dates Fixed configure variable and updated all end dates to 2016.	2016-03-23 17:14:37 -06:00
Jason Petersen	8ad5b09251	Merge pull request #344 from citusdata/fix_shard_lock_acquisition#342 Ensure router executor acquires proper shard lock cr: @onderkalaci	2016-02-16 16:43:39 -07:00
Jason Petersen	0d196d1bf4	Ensure router executor acquires proper shard lock Though Citus' Task struct has a shardId field, it doesn't have the same semantics as the one previously used in pg_shard code. The analogous field in the Citus Task is anchorShardId. I've also added an argument check to the relevant locking function to catch future locking attempts which pass an invalid argument.	2016-02-16 11:20:18 -07:00

1 2

53 Commits (6ec34bed84b42653e05cbe7249c0c81876d885a9)