Commit Graph

151 Commits (39c760ae7b05f3c0814f66048992ceb1b8b5b3f2)

Author SHA1 Message Date
Andres Freund 5e35f6a5dd Convert router executor to placement connection management infrastructure.
Remove the router specific transaction and shard management, and
replace it with the new placement connection API.  This mostly leaves
behaviour alone, except that it is now, inside a transaction, legal to
select from a shard to which no pre-existing connection exists.

To simplify code the code handling task executions for select and
modify has been split into two - the previous coding was starting to
get confusing due to the amount of only conditionally applicable code.

Modification connections & transactions are now always established in
parallel, not just for reference tables.
2017-01-09 13:13:02 -08:00
Andres Freund b8a1c0678c Centralized shard/placement connection and state management.
Currently there are several places in citus that map placements to
connections and that manage placement health. Centralize this
knowledge.  Because of the centralized knowledge about which
connection has previously been used for which shard/placement, this
also provides the basis for relaxing restrictions around combining
various forms of DDL/DML.

Connections for a placement can now be acquired using
GetPlacementConnection(). If the connection is used for DML or DDL the
FOR_DDL/DML flags should be used respectively.  If an individual
remote transaction fails (but the transaction on the master succeeds)
and FOR_DDL/DML have been specified, the placement is marked as
invalid, unless that'd mark all placements for a shard as invalid.
2017-01-09 13:13:02 -08:00
Andres Freund c6498cb04e Remove unused LogPreparedTransactions() function.
This is unused since 92c7567008.
2017-01-06 09:15:01 -08:00
Burak Yucesoy 9d756de3ae Replicate reference tables when new node is added
With this change, we start to replicate all reference tables to the new node when new node
is added to the cluster with master_add_node command. We also update replication factor
of reference table's colocation group.
2017-01-05 14:30:41 +03:00
Onder Kalaci 0a05f12475 Use 2PC for reference table modification
With this commit, we ensure that router executor always uses
2PC for reference table modifications and never mark the placements
of it as INVALID.
2017-01-04 12:46:35 +02:00
Burak Yucesoy 541e45c26e Add upgrade_to_reference_table
With this change we introduce new UDF, upgrade_to_reference_table, which can be used to
upgrade existing broadcast tables reference tables. For upgrading, we require that given
table contains only one shard.
2017-01-02 17:54:42 +02:00
Eren Basak da3ce88091 Error on Unsupported Features on Workers
This change makes the metadata workers error out on unsupported commands.
2017-01-02 16:03:45 +03:00
Marco Slot 6b7404f59c Use MultiConnection in multi-shard transactions 2016-12-30 14:43:21 -07:00
Metin Doslu 8282fe4af0 Add binary search capability to ShardIndex()
Renamed FindShardIntervalIndex() to ShardIndex() and added binary search
capability. It used to assume that hash partition tables are always
uniformly distributed which is not true if upcoming tenant isolation
feature is applied. This commit also reduces code duplication.
2016-12-30 18:55:34 +02:00
Marco Slot e55a27a487 Enable transaction recovery in connection API 2016-12-23 16:14:29 +01:00
Marco Slot 06e3eff3d2 Convert worker_transactions to new connection API 2016-12-23 16:14:29 +01:00
Marco Slot 6ea2cb7c8e Add a wrapper for PQsendQuery 2016-12-23 16:14:29 +01:00
Marco Slot b9cc1d4d2c Connectionapify SendCommandListToWorkerInSingleTransaction 2016-12-23 16:14:29 +01:00
Eren Basak 93bc2c6c12 Handle MX tables on workers during drop table commands 2016-12-23 15:43:32 +03:00
Eren Basak bdf732d115 Propagate `mark_tables_colocated` changes in `pg_dist_partition` table to metadata workers. 2016-12-23 15:43:32 +03:00
Eren Basak 9876e253b7 Propagate DDL commands to metadata workers for MX tables 2016-12-23 15:43:32 +03:00
Eren Basak 3d9540e500 Propagate MX table and shard metadata on `create_distributed_table` call 2016-12-23 15:43:32 +03:00
Marco Slot 9cdea04466 Enable evaluation of stable functions in INSERT..SELECT 2016-12-23 12:47:21 +01:00
Marco Slot f058ba3ec0 Add explicit RelationShards mapping to tasks 2016-12-23 10:23:43 +01:00
Onder Kalaci 807fc1cc28 Reference Table Support - Phase 1
With this commit, we implemented some basic features of reference tables.

To start with, a reference table is
  * a distributed table whithout a distribution column defined on it
  * the distributed table is single sharded
  * and the shard is replicated to all nodes

Reference tables follows the same code-path with a single sharded
tables. Thus, broadcast JOINs are applicable to reference tables.
But, since the table is replicated to all nodes, table fetching is
not required any more.

Reference tables support the uniqueness constraints for any column.

Reference tables can be used in INSERT INTO .. SELECT queries with
the following rules:
  * If a reference table is in the SELECT part of the query, it is
    safe join with another reference table and/or hash partitioned
    tables.
  * If a reference table is in the INSERT part of the query, all
    other participating tables should be reference tables.

Reference tables follow the regular co-location structure. Since
all reference tables are single sharded and replicated to all nodes,
they are always co-located with each other.

Queries involving only reference tables always follows router planner
and executor.

Reference tables can have composite typed columns and there is no need
to create/define the necessary support functions.

All modification queries, master_* UDFs, EXPLAIN, DDLs, TRUNCATE,
sequences, transactions, COPY, schema support works on reference
tables as expected. Plus, all the pre-requisites associated with
distribution columns are dismissed.
2016-12-20 14:09:35 +02:00
Eren Basak 5eb90d6d93 Add citus.node_connection_timeout GUC 2016-12-20 14:11:37 +03:00
Murat Tuncer 6c95f0352e Make router planner active at all times
We used to disable router planner and executor
when task executor is set to task-tracker.

This change enables router planning and execution
at all times regardless of task execution mode.

We are introducing a hidden flag enable_router_execution
to enable/disable router execution. Its default value is
true. User may disable router planning by setting it to false.
2016-12-20 11:24:01 +03:00
Jason Petersen 54c2efccc1 Add targeted VACUUM/ANALYZE support
Adds support for VACUUM and ANALYZE commands which target a specific
distributed table. After grabbing the appropriate locks, this imple-
mentation sends VACUUM commands to each placement (using one connec-
tion per placement). These commands are sent in parallel, so users
with large tables will benefit from sharding. Except for VERBOSE, all
VACUUM and ANALYZE options are supported, including the explicit
column list used by ANALYZE.

As with many of our utility commands, the local command also runs. In
the VACUUM/ANALYZE case, the local command is executed before any re-
mote propagation. Because error handling is managed after local proc-
essing, this can result in a VACUUM completing locally but erroring
out when distributed processing commences: a minor technicality in all
cases, as there isn't really much reason to ever roll back a VACUUM (an
impossibility in any case, as VACUUM cannot run within a transaction).

Remote propagation of targeted VACUUM/ANALYZE is controlled by the
enable_ddl_propagation setting; warnings are emitted if such a command
is attempted when DDL propagation is disabled. Unqualified VACUUM or
ANALYZE is not handled, but a warning message informs the user of this.

Implementation note: this commit adds a "BARE" value to MultiShard-
CommitProtocol. When active, no BEGIN command is ever sent to remote
nodes, useful for commands such as VACUUM/ANALYZE which must not run in
a transaction block. This value is not user-facing and is reset at
transaction end.
2016-12-16 16:59:06 -07:00
Metin Doslu fc908a3ab6 Refactor distribution column type check for colocation 2016-12-16 15:24:45 +02:00
Metin Doslu d43a01ebae Don't allow tables with different replication models to be colocated 2016-12-16 15:23:49 +02:00
Metin Doslu 4ff550429f Add colocate_with option to create_distributed_table()
With this commit, we support three versions of colocate_with: i.default, ii.none
and iii. a specific table name.
2016-12-16 14:53:35 +02:00
Metin Doslu 15aeac0502 Move colocation related functions to colocation_utils.c 2016-12-16 14:52:40 +02:00
Eren Basak 59b95a958d Propagate CREATE SCHEMA commands with the correct AUTHORIZATION clause in start_metadata_sync_to_node 2016-12-14 10:53:12 +03:00
Eren Basak 72e5e826b8 Make start_metadata_sync_to_node UDF to propagate foreign-key constraints 2016-12-14 10:53:12 +03:00
Eren Basak eea27619a0 Make truncate triggers propagated on start_metadata_sync_to_node call 2016-12-14 10:53:10 +03:00
Eren Basak 9c415d4f34 Add start_metadata_sync_to_node UDF
This change adds `start_metadata_sync_to_node` UDF which copies the metadata about nodes and MX tables
from master to the specified worker, sets its local group ID and marks its hasmetadata to true to
allow it receive future DDL changes.
2016-12-13 10:48:03 +03:00
Andres Freund 90ad02f5fb Integrate router executor into transaction management framework.
One less place managing remote transactions. It also makes it fairly
easy to use 2PC for certain modifications (e.g. reference tables). Just
issue a CoordinatedTransactionUse2PC(). If every placement failure
should cause the whole transaction to abort, additionally mark the
relevant transactions as critical.
2016-12-12 15:18:12 -08:00
Andres Freund 21449ed0a0 Convert multi_shard_transaction.[ch] to new framework. 2016-12-12 15:18:12 -08:00
Andres Freund 7ac4a2bc94 Coordinated remote transaction management. 2016-12-12 15:18:12 -08:00
Andres Freund ccb2a35b6b Add PQgetResult() wrapper handling interrupts.
This makes it possible to implement cancelling queries blocked on
communication with remote nodes.
2016-12-12 15:18:12 -08:00
Andres Freund 3754e74acb Move multi_client_executor.[ch] ontop of connection_management.[ch].
That way connections can be automatically closed after errors and such,
and the connection management infrastructure gets wider testing.  It
also fixes a few issues around connection string building.
2016-12-07 11:44:24 -08:00
Andres Freund c7f19f6c83 Use connection_management.c from within connection_cache.c.
This is a temporary step towards removing connection_cache.c.
2016-12-07 11:44:24 -08:00
Andres Freund 7bc7284b61 Add initial helpers to make interactions with MultiConnection et al. easier.
This includes basic infrastructure for logging of commands sent to
remote/worker nodes. Note that this has no effect as of yet, since no
callers are converted to the new infrastructure.
2016-12-07 11:44:24 -08:00
Andres Freund 8e32951eb9 Centralized Connection Lifetime Management.
Connections are tracked and released by integrating into postgres'
transaction handling. That allows to to use connections without having
to resort to having to disable interrupts or using PG_TRY/CATCH blocks
to avoid leaking connections.

This is intended to eventually replace multi_client_executor.c and
connection_cache.c, and to provide the basis of a centralized
transaction management.

The newly introduced transaction hook should, in the future, be the only
one in citus, to allow for proper ordering between operations.  For now
this central handler is responsible for releasing connections and
resetting XactModificationLevel after a transaction.
2016-12-07 11:43:18 -08:00
Andres Freund 9564e1e7fc Add some basic helpers to make use of dynahash hashtables easier. 2016-12-06 14:15:36 -08:00
Eren Basak 517db3648a Propagate node add/remove to the nodes with hasmetadata=true
This change propagates the changes done by `master_add_node` and `master_remove_node`
to the workers that contain metadata.
2016-12-02 14:43:32 +03:00
Brian Cloutier 14b47b769c Remove dead code: ResponsiveWorkerNodeList 2016-12-02 13:14:11 +03:00
Marco Slot 1d3caceda4 Use co-located shard ID in multi_shard_transaction 2016-11-02 11:01:19 +01:00
Önder Kalacı 34fa711b14 Always CASCADE while dropping a shard 2016-11-01 10:16:34 +01:00
Metin Doslu 520e7e3cb2 Add mark_tables_colocated() to update colocation groups
Added a new UDF, mark_tables_colocated(), to colocate tables with the same
configuration (shard count, shard replication count and distribution column type).
2016-10-26 17:29:03 +03:00
Marco Slot 7a606f1c1d Re-acquire metadata locks in RouterExecutorStart 2016-10-26 14:34:59 +02:00
Brian Cloutier c7e722e624 Fix segfault during EXPLAIN EXECUTE
Fix citusdata/citus#886

The way postgres' explain hook is designed means that our hook is never
called during EXPLAIN EXECUTE. So, we special-case EXPLAIN EXECUTE by
catching it in the utility hook.  We then replace the EXECUTE with the
original query and pass it back to Citus.
2016-10-26 15:18:42 +03:00
Andres Freund ebbef819f2 Invalidate relcache after pg_dist_shard_placement changes.
This forces prepared statements to be re-planned after changes of the
placement metadata. There's some locking issues remaining, but that's a
a separate task.

Also add regression tests verifying that invalidations take effect on
prepared statements.
2016-10-26 03:36:35 -07:00
Onder Kalaci 9e82cd6d2d Feature: INSERT INTO ... SELECT
This commit adds INSERT INTO ... SELECT feature for distributed tables.

We implement INSERT INTO ... SELECT by pushing down the SELECT to
each shard. To compute that we use the router planner, by adding
an "uninstantiated" constraint that the partition column be equal to a
certain value. standard_planner() distributes that constraint to all
the tables where it knows how to push the restriction safely. An example
is that the tables that are connected via equi joins.

The router planner then iterates over the target table's shards,
for each we replace the "uninstantiated" restriction, with one that
PruneShardList() handles. Do so by replacing the partitioning qual
parameter added in multi_planner() with the current shard's
actual boundary values. Also, add the current shard's boundary values to the
top level subquery to ensure that even if the partitioning qual is
not distributed to all the tables, we never run the queries on the shards
that don't match with the current shard boundaries. Finally, perform the
normal shard pruning to decide on whether to push the query to the
current shard or not.

We do not support certain SQLs on the subquery, which are described/commented
on ErrorIfInsertSelectQueryNotSupported().

We also added some locking on the router executor. When an INSERT/SELECT command
runs on a distributed table with replication factor >1, we need to ensure that
it sees the same result on each placement of a shard. So we added the ability
such that router executor takes exclusive locks on shards from which the SELECT
in an INSERT/SELECT reads in order to prevent concurrent changes. This is not a
very optimal solution, but it's simple and correct. The
citus.all_modifications_commutative can be used to avoid aggressive locking.
An INSERT/SELECT whose filters are known to exclude any ongoing writes can be
marked as commutative. See RequiresConsistentSnapshot() for the details.

We also moved the decison of whether the multiPlan should be executed on
the router executor or not to the planning phase. This allowed us to
integrate multi task router executor tasks to the router executor smoothly.
2016-10-26 10:01:00 +03:00
Onder Kalaci 75dd6ee3ab Add ability to reorder target list for INSERT/SELECT queries
The necessity for this functionality comes from the fact that ruleutils.c is not supposed to be
used on "rewritten" queries (i.e. ones that have been passed through QueryRewrite()).
Query rewriting is the process in which views and such are expanded,
and, INSERT/UPDATE targetlists are reordered to match the physical order,
defaults etc. For the details of reordeing, see transformInsertRow().
2016-10-26 10:00:03 +03:00