citus

Commit Graph

Author	SHA1	Message	Date
velioglu	b0efffae1c	Correct planner and add more tests	2017-08-11 10:16:13 +03:00
velioglu	7550b8ad52	Fix anchor shard id selection when reference table exists	2017-08-11 10:09:47 +03:00
velioglu	ceba81ce35	Move physical planner checks to logical planner	2017-08-11 10:09:47 +03:00
velioglu	0359d03530	Add set operation check for reference tables	2017-08-11 10:09:47 +03:00
velioglu	c4e3b8b5e1	Add planner changes and tests for subquery on reference tables	2017-08-11 10:09:47 +03:00
velioglu	45717dd013	Check equivalence on reference tables for subquery pushdown	2017-08-11 10:09:47 +03:00
Marco Slot	0ae265c436	Add citus_create_restore_point for distributed snapshots	2017-08-11 07:36:20 +02:00
Marco Slot	fdff210ef7	Wait for commit/abort/prepare results asynchronously	2017-08-11 00:03:06 +02:00
Marco Slot	fca986f214	Add API for waiting for multiple connections	2017-08-11 00:03:06 +02:00
Brian Cloutier	9d93fb5551	Create citus.use_secondary_nodes GUC This GUC has two settings, 'always' and 'never'. When it's set to 'never' all behavior stays exactly as it was prior to this commit. When it's set to 'always' only SELECT queries are allowed to run, and only secondary nodes are used when processing those queries. Add some helper functions: - WorkerNodeIsSecondary(), checks the noderole of the worker node - WorkerNodeIsReadable(), returns whether we're currently allowed to read from this node - ActiveReadableNodeList(), some functions (namely, the ones on the SELECT path) don't require working with Primary Nodes. They should call this function instead of ActivePrimaryNodeList(), because the latter will error out in contexts where we're not allowed to write to nodes. - ActiveReadableNodeCount(), like the above, replaces ActivePrimaryNodeCount(). - EnsureModificationsCanRun(), error out if we're not currently allowed to run queries which modify data. (Either we're in read-only mode or use_secondary_nodes is set) Some parts of the code were switched over to use readable nodes instead of primary nodes: - Deadlock detection - DistributedTableSize, - the router, real-time, and task tracker executors - ShardPlacement resolution	2017-08-10 17:37:17 +03:00
Brian Cloutier	3fc87a7a29	Metadata sync also syncs nodes in other clusters	2017-08-10 16:55:55 +03:00
Brian Cloutier	0dee4f8418	Metadata sync syncs all nodes, not just primaries	2017-08-10 16:55:55 +03:00
Eren Başak	f9470329e5	Remove test_helper_functions.h inclusions	2017-08-10 12:42:46 +03:00
Eren Başak	3061737712	Define Some Utility Functions This change declares two new functions: `master_update_table_statistics` updates the statistics of shards belong to the given table as well as its colocated tables. `get_colocated_shard_array` returns the ids of colocated shards of a given shard.	2017-08-10 12:42:46 +03:00
Brian Cloutier	1961add6f9	Improve error message when there are no nodes for a placement	2017-08-10 12:38:51 +03:00
Jason Petersen	dee66e3959	Final review feedback	2017-08-10 01:10:09 -07:00
Jason Petersen	6a35c2937c	Enable multi-row INSERTs This is a pretty substantial refactoring of the existing modify path within the router executor and planner. In particular, we now hunt for all VALUES range table entries in INSERT statements and group the rows contained therein by shard identifier. These rows are stashed away for later in "ModifyRoute" elements. During deparse, the appropriate RTE is extracted from the Query and its values list is replaced by these rows before any SQL is generated. In this way, we can create multiple Tasks, but only one per shard, to piecemeal execute a multi-row INSERT. The execution of jobs containing such tasks now exclusively go through the "multi-router executor" which was previously used for e.g. INSERT INTO ... SELECT. By piggybacking onto that executor, we participate in ongoing trans- actions, get rollback-ability, etc. In short order, the only remaining use of the "single modify" router executor will be for bare single- row INSERT statements (i.e. those not in a transaction). This change appropriately handles deferred pruning as well as master- evaluated functions.	2017-08-10 00:32:46 -07:00
velioglu	7e436c0277	Add bool expression to pruning instance with a function	2017-08-10 08:56:36 +03:00
Andres Freund	e8b793c454	Support for IN (const, list) and = ANY(const, b, c) pruning.	2017-08-10 08:56:36 +03:00
Onder Kalaci	b5ea3ab6a3	Improve locking semantics for backend management We use the backend shared memory lock for preventing new backends to be part of a new distributed transaction or an existing backend to leave a distributed transaction while we're reading the all backends' data. The primary goal is to provide consistent view of the current distributed transactions while doing the deadlock detection.	2017-08-09 17:17:12 +03:00
Brian Cloutier	2e0916e15a	Add master_add_secondary_node() UDF	2017-08-09 17:10:48 +03:00
Marco Slot	08ed6d8269	Prevent pg_dist_node changes during master_create_empty_shard	2017-08-09 14:22:09 +02:00
Marco Slot	3a0571e69b	Remove LockMetadataSnapshot	2017-08-09 14:09:54 +02:00
Marco Slot	c2f8bafa05	Fix shard creation vs. pg_dist_node change locking	2017-08-09 14:09:54 +02:00
Marco Slot	868ee6be83	Fix and simplify pg_dist_node locking	2017-08-09 14:09:54 +02:00
Burak Yucesoy	8455d1a4ef	Ensure we are allowing partitioned tables at all appropriate places	2017-08-09 10:01:35 +03:00
Burak Yucesoy	2eee556738	Add distributed partitioned table support for COPY For partitioned tables, PostgreSQL opens partition and its partitions in BeginCopyFrom and it expects its caller to close those relations. However, we do not have quick access to opened relations and performing special operations for partitioned tables isn't necessary in coordinator node. Therefore before calling BeginCopyFrom, we change relkind of those partitioned tables to RELKIND_RELATION. This prevents PostgreSQL to open its partitions as well.	2017-08-09 10:01:35 +03:00
Burak Yucesoy	31f3221342	Add distributed partitioned table support to router plannable queries In standart_planner, PostgreSQL expands partitioned tables to their partitions and call our restriction hook for each partition. It also, for some queries, skips the partitioned table itself completely. This behaviour makes it difficult to prune shards and decide whether query is router plannable or not. To prevent this behaviour, we change inh flag of partitioned tables to false in the query tree. In this case, PostgreSQL treats those partitioned tables as regular relations and does not expand them. This behaviour is inline with our expectations, because we do not want to treat partitioned tables differently on coordinator. Although we are not entirely comfortable with modifying query tree, other solutions to this problem is overly complicated.	2017-08-09 10:01:35 +03:00
Burak Yucesoy	fddf9b3fcc	Add distributed partitioned table support distributed table creation With this PR, Citus starts to support all possible ways to create distributed partitioned tables. These are; - Distributing already created partitioning hierarchy - CREATE TABLE ... PARTITION OF a distributed_table - ALTER TABLE distributed_table ATTACH PARTITION non_distributed_table - ALTER TABLE distributed_table ATTACH PARTITION distributed_table We also support DETACHing partitions from partitioned tables and propogating TRUNCATE and DDL commands to distributed partitioned tables. This PR also refactors some parts of distributed table creation logic.	2017-08-09 10:01:35 +03:00
Metin Doslu	b8a9e7c1bf	Add support for UPDATE/DELETE with subqueries	2017-08-08 21:35:08 +03:00
Marco Slot	d3e9746236	Avoid connections that accessed non-colocated placements in multi-shard commands	2017-08-08 18:32:34 +02:00
Brian Cloutier	7060ade6fe	GetNodeTuple returns NULL it node does not exist It never throws an error.	2017-08-08 13:12:06 +03:00
Brian Cloutier	a3e9bef685	All users of WorkerNodeHash take an AccessShareLock The metadata cache simulates a SELECT on pg_dist_node. Now the locks it takes also simulate that SELECT.	2017-08-08 13:12:06 +03:00
Brian Cloutier	5914c992e6	cluster management UDFs see nodes in different clusters - master_activate_node and master_disable_node correctly toggle isActive, without crashing - master_add_node rejects duplicate nodes, even if they're in different clusters - master_remove_node allows removing nodes in different clusters	2017-08-08 13:12:06 +03:00
Brian Cloutier	3151b52a0b	Add citus.cluster_name GUC - Nodes with a nodecluster which does not match citus.cluster_name are excluded from the metadata cache and never seen by another part of Citus.	2017-08-08 13:12:06 +03:00
Brian Cloutier	94947c0d54	Refactor: ReplicateShardToAllWorkers more explicitly locks pg_dist_node	2017-08-08 13:12:06 +03:00
Brian Cloutier	f87fefa323	Refactor: DistributedTableSize more explicitly only locks pg_dist_node	2017-08-08 13:12:06 +03:00
Brian Cloutier	3769381366	Fix inaccurate comment on SetNodeState	2017-08-08 13:12:06 +03:00
Brian Cloutier	fbecf48a03	Disallow adding primary nodes to non-default clusters	2017-08-08 11:18:31 +03:00
Brian Cloutier	5618e69386	Add pg_dist_node.nodecluster	2017-08-08 11:18:31 +03:00
Brian Cloutier	e7846ba7d1	Allow metadata sync functions on secondaries {start,stop}_metadata_sync_to_node now toggle the hasMetadata flag when run on secondaries but don't attempt to actually sync any metadata.	2017-08-07 18:46:51 +03:00
Marco Slot	4cc7c36596	Simplify metadata lock acquisition for DML	2017-08-07 15:36:58 +02:00
Marco Slot	aa7ca81548	Execute UPDATE/DELETE statements with 0 shards	2017-08-07 15:36:58 +02:00
Marco Slot	bac60bb64f	Function evaluation descends into expression trees	2017-08-06 19:53:05 +02:00
Brian Cloutier	37985de85e	master_disable_node no longer crashes when given a non-existant node	2017-08-04 11:14:54 +03:00
Hadi Moshayedi	8229a64fe8	Remove distributed tables' dependency on distribution key columns. (#1527 ) This change removes distributed tables' dependency on distribution key columns. We already check that we cannot drop distribution key columns in ErrorIfUnsupportedAlterTableStmt() at multi_utility.c, so we don't need to have distributed table to distribution key column dependency to avoid dropping of distribution key column. Furthermore, having this dependency causes some warnings in pg_dump --schema-only (See #866), which are not desirable. This change also adds check to disallow drop of distribution keys when citus.enable_ddl_propagation is set to false. Regression tests are updated accordingly.	2017-08-03 10:07:04 -04:00
Murat Tuncer	fa18899cf9	Remove serialization/deserialization of multiplan node (#1477 ) introduces copy functions for Citus MultiPlan nodes. uses ExtensibleNode mechanism to store MultiPlan data drops serialiazation of MultiPlans	2017-08-02 08:24:00 +03:00
Burak Yucesoy	7769f1d012	Refactor distributed table creation logic This commit is preperation for introducing distributed partitioned table support. We want to clean and refactor some code in distributed table creation logic so that we can handle partitioned tables in more robust way.	2017-07-31 11:11:23 +03:00
Brian Cloutier	b20a086a8f	master_activate_node UDF also returns noderole	2017-07-28 16:02:43 +03:00
Murat Tuncer	26f020dc6e	Make maxTaskStringSize configurable (#1501 ) maxTaskStringSize determines the size of worker query string. It was originally hard coded to a specific value. This has caused issues at some users. Since it determines initial shared memory allocation, we did not want to set it to an arbitrary higher number. Instead made it configurable. This commit introduces a new GUC variable max_task_string_size Changes in this variable requires restart to be in effect.	2017-07-27 11:39:12 -07:00

1 2 3 4 5 ...

630 Commits (c4eb6c5153d2387cd1124bdda18f4ee12b762e27)