Commit Graph

53 Commits (6ec34bed84b42653e05cbe7249c0c81876d885a9)

Author SHA1 Message Date
Andres Freund 6ec34bed84 Remove remnants of commit_protocol.[ch]. 2017-01-21 09:01:15 -08:00
Andres Freund c390daed0f Use interrupt checking libpq wrappers in router executor. 2017-01-09 14:02:45 -08:00
Andres Freund 7320c17f00 Convert router executor to placement connection management infrastructure.
Remove the router specific transaction and shard management, and
replace it with the new placement connection API.  This mostly leaves
behaviour alone, except that it is now, inside a transaction, legal to
select from a shard to which no pre-existing connection exists.

To simplify code the code handling task executions for select and
modify has been split into two - the previous coding was starting to
get confusing due to the amount of only conditionally applicable code.

Modification connections & transactions are now always established in
parallel, not just for reference tables.
2017-01-09 13:13:02 -08:00
Onder Kalaci 6d050fd677 Use 2PC for reference table modification
With this commit, we ensure that router executor always uses
2PC for reference table modifications and never mark the placements
of it as INVALID.
2017-01-04 12:46:35 +02:00
Marco Slot 59bc5972fa
Use MultiConnection in multi-shard transactions 2016-12-30 14:43:21 -07:00
Marco Slot 92c7567008 Convert worker_transactions to new connection API 2016-12-23 16:14:29 +01:00
Marco Slot 11031bcf55 Enable evaluation of stable functions in INSERT..SELECT 2016-12-23 12:47:21 +01:00
Marco Slot d745d7bf70 Add explicit RelationShards mapping to tasks 2016-12-23 10:23:43 +01:00
Andres Freund 80b34a5d6b Integrate router executor into transaction management framework.
One less place managing remote transactions. It also makes it fairly
easy to use 2PC for certain modifications (e.g. reference tables). Just
issue a CoordinatedTransactionUse2PC(). If every placement failure
should cause the whole transaction to abort, additionally mark the
relevant transactions as critical.
2016-12-12 15:18:12 -08:00
Andres Freund a77cf36778 Use connection_management.c from within connection_cache.c.
This is a temporary step towards removing connection_cache.c.
2016-12-07 11:44:24 -08:00
Andres Freund 3223b3c92d Centralized Connection Lifetime Management.
Connections are tracked and released by integrating into postgres'
transaction handling. That allows to to use connections without having
to resort to having to disable interrupts or using PG_TRY/CATCH blocks
to avoid leaking connections.

This is intended to eventually replace multi_client_executor.c and
connection_cache.c, and to provide the basis of a centralized
transaction management.

The newly introduced transaction hook should, in the future, be the only
one in citus, to allow for proper ordering between operations.  For now
this central handler is responsible for releasing connections and
resetting XactModificationLevel after a transaction.
2016-12-07 11:43:18 -08:00
Marco Slot b566c4815c Pass down the correct type for null parameters 2016-11-11 07:14:08 +01:00
Onder Kalaci a43e3bad56 Improve error semantics for INSERT..SELECT
With this commit, we error out if a worker query cannot be executed
on all placements of a target insert shard interval.
2016-10-27 14:09:05 +03:00
Marco Slot 275378aa45 Re-acquire metadata locks in RouterExecutorStart 2016-10-26 14:34:59 +02:00
Onder Kalaci 1673ea937c Feature: INSERT INTO ... SELECT
This commit adds INSERT INTO ... SELECT feature for distributed tables.

We implement INSERT INTO ... SELECT by pushing down the SELECT to
each shard. To compute that we use the router planner, by adding
an "uninstantiated" constraint that the partition column be equal to a
certain value. standard_planner() distributes that constraint to all
the tables where it knows how to push the restriction safely. An example
is that the tables that are connected via equi joins.

The router planner then iterates over the target table's shards,
for each we replace the "uninstantiated" restriction, with one that
PruneShardList() handles. Do so by replacing the partitioning qual
parameter added in multi_planner() with the current shard's
actual boundary values. Also, add the current shard's boundary values to the
top level subquery to ensure that even if the partitioning qual is
not distributed to all the tables, we never run the queries on the shards
that don't match with the current shard boundaries. Finally, perform the
normal shard pruning to decide on whether to push the query to the
current shard or not.

We do not support certain SQLs on the subquery, which are described/commented
on ErrorIfInsertSelectQueryNotSupported().

We also added some locking on the router executor. When an INSERT/SELECT command
runs on a distributed table with replication factor >1, we need to ensure that
it sees the same result on each placement of a shard. So we added the ability
such that router executor takes exclusive locks on shards from which the SELECT
in an INSERT/SELECT reads in order to prevent concurrent changes. This is not a
very optimal solution, but it's simple and correct. The
citus.all_modifications_commutative can be used to avoid aggressive locking.
An INSERT/SELECT whose filters are known to exclude any ongoing writes can be
marked as commutative. See RequiresConsistentSnapshot() for the details.

We also moved the decison of whether the multiPlan should be executed on
the router executor or not to the planning phase. This allowed us to
integrate multi task router executor tasks to the router executor smoothly.
2016-10-26 10:01:00 +03:00
Marco Slot 271b20a23e Parallelise DDL commands 2016-10-24 12:39:08 +02:00
Marco Slot 65f6d7c02a Follow consistent execution order in parallel commands 2016-10-19 08:33:08 +02:00
Marco Slot a497e7178c Parallelise master_modify_multiple_shards 2016-10-19 08:33:08 +02:00
Marco Slot 9d98acfb6d Move requiresMasterEvaluation from Task to Job 2016-10-19 08:23:06 +02:00
Marco Slot 213d8419c6 Refactor and redocument executor shard lock code 2016-10-19 08:13:35 +02:00
Marco Slot 33b7723530 Use UpdateShardPlacementState where appropriate 2016-10-07 11:59:20 -07:00
Andres Freund 982ad66753 Introduce placement IDs.
So far placements were assigned an Oid, but that was just used to track
insertion order. It also did so incompletely, as it was not preserved
across changes of the shard state. The behaviour around oid wraparound
was also not entirely as intended.

The newly introduced, explicitly assigned, IDs are preserved across
shard-state changes.

The prime goal of this change is not to improve ordering of task
assignment policies, but to make it easier to reference shards.  The
newly introduced UpdateShardPlacementState() makes use of that, and so
will the in-progress connection and transaction management changes.
2016-10-07 11:59:20 -07:00
Jason Petersen 5f6264105d
Directly register router xact callbacks in PG_init
Not entirely sure why we went with the shared memory hook approach, but
it causes problems (multiple registration) during crashes. Changing to
a simple direct registration call from PG_init.
2016-09-29 11:43:18 -06:00
Jason Petersen 0caf0d95f1
Fix unique-violation-in-xact segfault
An interaction between ReraiseRemoteError and DML transaction support
causes segfaults:

  * ReraiseRemoteError calls PurgeConnection, freeing a connection...
  * That connection is still in the xactParticipantHash

At transaction end, the memory in the freed connection might happen to
pass the "is this connection OK?" check, causing us to try to send an
ABORT over that connection. By removing it from the transaction hash
before calling ReraiseRemoteError, we avoid this possibility.
2016-09-27 16:44:03 -06:00
Metin Doslu c9dcad9b05 Pass text oid inteads of invalid oid for null values
Passing invalid oids even for null values in PQsendQueryParams() causes worker
nodes to fail. Therefore, we pass text oid for null values.
2016-09-27 08:15:46 +03:00
Andres Freund 776b3868b9
Support NoMovement direction in router executor
This is mainly interesting because it allows to use RETURN QUERY/RETURN
QUERY EXECUTE and FOR ... IN .. LOOPs in plpgsql.
2016-09-26 18:28:36 -06:00
Jason Petersen 850c51947a
Re-permit DDL in transactions, selectively
Recent changes to DDL and transaction logic resulted in a "regression"
from the viewpoint of users. Previously, DDL commands were allowed in
multi-command transaction blocks, though they were not processed in any
actual transactional manner. We improved the atomicity of our DDL code,
but added a restriction that DDL commands themselves must not occur in
any BEGIN/END transaction block.

To give users back the original functionality (and improved atomicity)
we now keep track of whether a multi-command transaction has modified
data (DML) or schema (DDL). Interleaving the two modification types in
a single transaction is disallowed.

This first step simply permits a single DDL command in such a block,
admittedly an incomplete solution, but one which will permit us to add
full multi-DDL command support in a subsequent commit.
2016-08-30 20:37:19 -06:00
Andres Freund 7fdb5fbe29 Skip over unreferenced parameters when router executing prepared statement.
When an unreferenced prepared statement parameter does not explicitly
have a type assigned, we cannot deserialize it, to send to the remote
side.  That commonly happens inside plpgsql functions, where local
variables are passed in as unused prepared statement parameters.
2016-08-05 14:12:06 -07:00
Jason Petersen eba8396501
Avoid attempting to lock invalid shard identifier
A recent change generates a "dummy" shard placement with its identifier
set to INVALID_SHARD_ID for SELECT queries against distributed tables
with no shards. Normally, no lock is acquired for SELECT statements,
but if all_modifications_commutative is set to true, we will acquire a
shared lock, triggering an assertion failure within LockShardResource
in the above case.

The "dummy" shard placement is actually necessary to ensure such empty
queries have somewhere to execute, and INVALID_SHARD_ID seems the most
appropriate value for the dummy's shard identifier field, so the most
straightforward fix is to just avoid locking invalid shard identifiers.
2016-08-04 13:49:51 -07:00
Murat Tuncer c20080992d Remove PostgreSQL 9.4 support 2016-07-26 20:16:09 +03:00
Jason Petersen 5d525fba24
Permit "single-shard" transactions
Allows the use of modification commands (INSERT/UPDATE/DELETE) within
transaction blocks (delimited by BEGIN and ROLLBACK/COMMIT), so long as
all modifications hit a subset of nodes involved in the first such com-
mand in the transaction. This does not circumvent the requirement that
each individual modification command must still target a single shard.

For instance, after sending BEGIN, a user might INSERT some rows to a
shard replicated on two nodes. Subsequent modifications can hit other
shards, so long as they are on one or both of these nodes.

SAVEPOINTs are supported, though if the user actually attempts to send
a ROLLBACK command that specifies a SAVEPOINT they will receive an
ERROR at the end of the topmost transaction.

Placements are only marked inactive if at least one replica succeeds
in a transaction where others fail. Non-atomic behavior is possible if
the shard targeted by the initial modification within a transaction has
a higher replication factor than another shard within the same block
and a node with the latter shard has a failure during the COMMIT phase.

Other methods of denoting transaction blocks (multi-statement commands
sent all at once and functions written in e.g. PL/pgSQL or other such
languages) are not presently supported; their treatment remains the
same as before.
2016-07-21 15:57:22 -06:00
Metin Doslu a811e09dd4 Add support for prepared statements with parameterized non-partition columns in router executor 2016-07-21 11:09:28 +03:00
Brian Cloutier 0cad3b22cc Simplify code and fix include guards in citus_clauses 2016-07-13 11:45:51 -07:00
Brian Cloutier af9515f669 Only reparse queries if the planner flags them for reparsing 2016-07-13 11:45:51 -07:00
Brian Cloutier 4820366a6f citus_indent and some renaming 2016-07-13 11:45:51 -07:00
Brian Cloutier ae91768c96 Evaluate functions on the master
- Enables using VOLATILE functions (like nextval()) in INSERT queries
- Enables using STABLE functions (like now()) targetLists and joinTrees

UPDATE and INSERT can now contain non-immutable functions. INSERT can contain any kind of
expression, while UPDATE can contain any STABLE function, so long as a Var is not passed
into the STABLE function, even indirectly. UPDATE TagetEntry's can now also include Vars.

There's an exception, CASE/COALESCE statements may not contain mutable functions.

Functions calls in master_modify_multiple_shards are also evaluated.
2016-07-13 11:45:51 -07:00
Andres Freund c38c1adce1 Combine router executor paths for select and modify commands.
The upcoming RETURNING support would otherwise require too much
duplication.  This contains most of the pieces required for RETURNING
support, except removing the planner checks and adjusting regression
test output.
2016-07-01 13:07:12 -07:00
Jason Petersen e064cacea9
Minor formatting fix
Noticed that uncrustify doesn't like the array-of-struct literals, so
omitting them from formatting (at least here).
2016-06-28 13:09:57 -06:00
Jason Petersen 8b788eb899
Use literal instead of constant to fix 9.4 build
PG_UINT32_MAX doesn't exist before 9.5. Missed this because I removed
my assert-enabled builds during packaging work.

Fixes #619
2016-06-28 12:36:14 -06:00
Jason Petersen d39508d8b3
Minor formatting/comment fixes 2016-06-08 10:34:07 -06:00
Amos Bird fd1b088208 Add overflow checks. 2016-06-08 10:30:03 +08:00
Amos Bird 107b926262 Eliminates the possibilities of counter overflows.
This patch uses scanint8 instead of pg_atoi to make sure the affected
tuples counter never gets overflow.
2016-06-08 10:30:03 +08:00
Jason Petersen 9ba02928ac
Refactor ReportRemoteError to remove boolean arg
Broke it into two explicitly-named functions instead: WarnRemoteError
and ReraiseRemoteError.
2016-06-07 12:38:32 -06:00
Metin Doslu 7d0c90b398 Fail fast on constraint violations in router executor 2016-06-07 18:11:17 +03:00
Marco Slot fc4f23065a Add EXPLAIN for simple distributed queries 2016-04-30 00:11:02 +02:00
Murat Tuncer a88d3ecd4e Add dynamic executor selection
- non-router plannable queries can be executed
  by router executor if they satisfy the criteria
- router executor is removed from configuration,
  now task executor can not be set to router
- removed some tests that error out for router executor
2016-04-21 09:15:33 +03:00
Murat Tuncer 938546b938 Add router plannable check and router planning logic
for single shard select queries
2016-04-21 09:15:33 +03:00
Jason Petersen 423e6c8ea0
Update copyright dates
Fixed configure variable and updated all end dates to 2016.
2016-03-23 17:14:37 -06:00
Jason Petersen 8ad5b09251 Merge pull request #344 from citusdata/fix_shard_lock_acquisition#342
Ensure router executor acquires proper shard lock

cr: @onderkalaci
2016-02-16 16:43:39 -07:00
Jason Petersen 0d196d1bf4
Ensure router executor acquires proper shard lock
Though Citus' Task struct has a shardId field, it doesn't have the same
semantics as the one previously used in pg_shard code. The analogous
field in the Citus Task is anchorShardId. I've also added an argument
check to the relevant locking function to catch future locking attempts
which pass an invalid argument.
2016-02-16 11:20:18 -07:00