citus

Commit Graph

Author	SHA1	Message	Date
Murat Tuncer	6f3262546f	Enable top level subquery join queries This work enables - Top level subquery joins - Joins between subqueries and relations - Joins involving more than 2 range table entries A new regression test file is added to reflect enabled test cases	2017-04-19 11:46:54 +03:00
Onder Kalaci	96d781e0e9	Subquery pushdown planner uses original query With this commit, we change the input to the logical planner for subquery pushdown. Before this commit, the planner was relying on the query tree that is transformed by the postgresql planner. After this commit, the planner uses the original query. The main motivation behind this change is the simplify deparsing of subqueries.	2017-04-19 11:46:54 +03:00
Onder Kalaci	f5d7ab60ce	Enabling physical planner for subquery pushdown changes This commit applies the logic that exists in INSERT .. SELECT planning to the subquery pushdown changes. The main algorithm is followed as : - pick an anchor relation (i.e., target relation) - per each target shard interval - add the target shard interval's shard range as a restriction to the relations (if all relations joined on the partition keys) - Check whether the query is router plannable per target shard interval. - If router plannable, create a task	2017-04-19 11:46:52 +03:00
Marco Slot	dfd7d86948	Stop using a sequence to generate unique job IDs	2017-04-18 11:31:51 +02:00
Marco Slot	5e58804d44	Support query parameters in combination with function evaluation	2017-04-17 15:40:55 +02:00
Marco Slot	0bcc227a62	Create indexes after worker_append_table_to_shard during shard repair	2017-04-17 15:17:21 +02:00
Burak Yucesoy	e9095e62ec	Decouple reference table replication With this change we add an option to add a node without replicating all reference tables to that node. If a node is added with this option, we mark the node as inactive and no queries will sent to that node. We also added two new UDFs; - master_activate_node(host, port): - marks node as active and replicates all reference tables to that node - master_add_inactive_node(host, port): - only adds node to pg_dist_node	2017-04-17 13:33:31 +03:00
Onder Kalaci	1cb6a34ba8	Remove uninstantiated qual logic, use attribute equivalences In this PR, we aim to deduce whether each of the RTE_RELATION is joined with at least on another RTE_RELATION on their partition keys. If each RTE_RELATION follows the above rule, we can conclude that all RTE_RELATIONs are joined on their partition keys. In order to do that, we invented a new equivalence class namely: AttributeEquivalenceClass. In very simple words, a AttributeEquivalenceClass is identified by an unique id and consists of a list of AttributeEquivalenceMembers. Each AttributeEquivalenceMember is designed to identify attributes uniquely within the whole query. The necessity of this arise since varno attributes are defined within a single level of a query. Instead, here we want to identify each RTE_RELATION uniquely and try to find equality among each RTE_RELATION's partition key. Whenever we find an equality clause A = B, where both A and B originates from relation attributes (i.e., not random expressions), we create an AttributeEquivalenceClass to record this knowledge. If we later find another equivalence B = C, we create another AttributeEquivalenceClass. Finally, we can apply transitity rules and generate a new AttributeEquivalenceClass which includes A, B and C. Note that equality among the members are identified by the varattno and rteIdentity. Each equality among RTE_RELATION is saved using an AttributeEquivalenceClass where each member attribute is identified by a AttributeEquivalenceMember. In the final step, we try generate a common attribute equivalence class that holds as much as AttributeEquivalenceMembers whose attributes are a partition keys.	2017-04-13 11:51:26 +03:00
Burak Yucesoy	a09614553f	Add enable_version_checks GUC and address feedback	2017-04-04 19:11:13 +03:00
Jason Petersen	1c2056ec74	Self-implemented review feedback The use of a bare src/ rather than $srcdir caused configure to fail during VPATH builds. With our additional dependency upon AWK, we need to call AC_PROG_AWK, otherwise environments may not have $AWK set. Finally, citus_version.h should be in .gitignore.	2017-04-03 22:55:12 -06:00
Burak Yucesoy	087d8427e3	Error out if binary citus version does not match installed extension With this change, we start to error out if loaded citus binaries does not match the available major version or installed citus extension version. In this case we force user to restart the server or run ALTER EXTENSION depending on the situation	2017-04-03 17:36:13 -06:00
Jason Petersen	4cdfc3a10f	Address review feedback Should just about do it.	2017-04-03 11:44:57 -06:00
Jason Petersen	cf775c4773	Improve CONCURRENTLY-related error messages Thought this looked slightly nicer than the default behavior. Changed preventTransaction to concurrent to be clearer that this code path presently affects CONCURRENTLY code only.	2017-04-03 11:19:15 -06:00
Jason Petersen	dd9365433e	Update documentation Ensure all functions have comments, etc.	2017-04-03 11:19:15 -06:00
Jason Petersen	d904e96c59	Address MX CONCURRENTLY problems Adds a non-transactional multi-command method to propagate DDLs to all MX/metadata-synced nodes.	2017-04-03 11:19:15 -06:00
Jason Petersen	dea6c44f75	Remove CONCURRENTLY checks, fix tests Still pending failure testing, which broke with my recent changes.	2017-04-03 11:19:15 -06:00
Jason Petersen	95d8d27c4f	Change IndexStmt to generate worker DDL on master Because we can't execute CREATE INDEX CONCURRENTLY during transactions, worker_apply_shard_ddl_command is insufficient.	2017-04-03 11:19:14 -06:00
Marco Slot	0f355a4a48	Batch task_tracker_status calls to reduce task-tracker query times	2017-03-31 11:54:11 +02:00
Jason Petersen	34a62abb7d	Address code review comments	2017-03-22 17:29:17 -06:00
Jason Petersen	d95b5bbad3	Rework ReplicateGrantStmt to use new flow This was the impetus for the previous commit that changed from using a DDLJob * to a List * of them.	2017-03-22 17:29:16 -06:00
Jason Petersen	a02a2a90c7	Refactor ExecuteDistDDLCommand to expect struct Will let us separate out the determination of what to execute from its actual execution.	2017-03-22 17:21:49 -06:00
velioglu	e32aff1a26	Size UDFs implemented citus_table_size, citus_relation_size and citus_total_relation_size UDFs are implemented.	2017-03-16 13:50:30 +03:00
Metin Doslu	1f838199f8	Use CustomScan API for query execution Custom Scan is a node in the planned statement which helps external providers to abstract data scan not just for foreign data wrappers but also for regular relations so you can benefit your version of caching or hardware optimizations. This sounds like only an abstraction on the data scan layer, but we can use it as an abstraction for our distributed queries. The only thing we need to do is to find distributable parts of the query, plan for them and replace them with a Citus Custom Scan. Then, whenever PostgreSQL hits this custom scan node in its Vulcano style execution, it will call our callback functions which run distributed plan and provides tuples to the upper node as it scans a regular relation. This means fewer code changes, fewer bugs and more supported features for us! First, in the distributed query planner phase, we create a Custom Scan which wraps the distributed plan. For real-time and task-tracker executors, we add this custom plan under the master query plan. For router executor, we directly pass the custom plan because there is not any master query. Then, we simply let the PostgreSQL executor run this plan. When it hits the custom scan node, we call the related executor parts for distributed plan, fill the tuple store in the custom scan and return results to PostgreSQL executor in Vulcano style, a tuple per XXX_ExecScan() call. * Modify planner to utilize Custom Scan node. * Create different scan methods for different executors. * Use native PostgreSQL Explain for master part of queries.	2017-03-14 12:17:51 +02:00
Andres Freund	52358fe891	Initial temp table removal implementation	2017-03-14 12:09:49 +02:00
Jason Petersen	6f4886cd11	Revert "Remove unused SendCommandToWorker" This reverts commit `c8c308c109`.	2017-03-13 15:48:51 -06:00
Brian Cloutier	c8c308c109	Remove unused SendCommandToWorker	2017-03-08 16:30:23 +03:00
Brian Cloutier	95936ff481	Remove unused master_get_round_robin_candidate_nodes	2017-03-07 11:51:24 +03:00
Brian Cloutier	807beb7bc0	Remove master_get_local_first_candidate_nodes	2017-03-07 11:50:59 +03:00
Murat Tuncer	72027f2eba	Remove default clause from shard DDL when sequences are used	2017-03-01 17:32:48 +03:00
Marco Slot	db98c28354	Address review feedback in COPY refactoring	2017-02-28 17:39:45 +01:00
Marco Slot	bf3541cb24	Add CitusCopyDestReceiver infrastructure	2017-02-28 17:24:45 +01:00
Eren Basak	df9cf346ee	Enforce statement based replication on old APIs and non-hash tables This change ignores `citus.replication_model` setting and uses the statement based replication in - Tables distributed via the old `master_create_distributed_table` function - Append and range partitioned tables, even if created via `create_distributed_table` function This seems like the easiest solution to #1191, without changing the existing behavior and harming existing users with custom scripts. This change also prevents RF>1 on streaming replicated tables on `master_create_worker_shards` Prior to this change, `master_create_worker_shards` command was not checking the replication model of the target table, thus allowing RF>1 with streaming replicated tables. With this change, `master_create_worker_shards` errors out on the case.	2017-02-16 10:37:53 -08:00
Brian Cloutier	1173f3f225	Refactor CheckShardPlacements - Break CheckShardPlacements into multiple functions (The most important is MarkFailedShardPlacements), so that we can get rid of the global CoordinatedTransactionUses2PC. - Call MarkFailedShardPlacements in the router executor, so we mark shards as invalid and stop using them while inside transaction blocks.	2017-01-26 13:20:45 +02:00
Marco Slot	ba940a1de9	Use coordinator instead of schema node in terminology	2017-01-25 11:07:23 +01:00
Burak Yucesoy	484cb12cd0	Add LoadShardPlacement UDF This UDF returns a shard placement from cache given shard id and placement id. At the moment it iterates over all shard placements of given shard by ShardPlacementList and searches given placement id in that list, which is not a good solution performance-wise. However, currently, this function will be used only when there is a failed transaction. If a need arises we can optimize this function in the future.	2017-01-23 21:04:57 +03:00
Marco Slot	1585c02322	Use placement connection API for multi-shard transactions	2017-01-23 18:34:50 +01:00
Andres Freund	6939cb8c56	Hack up PREPARE/EXECUTE for nearly all distributed queries. All router, real-time, task-tracker plannable queries should now have full prepared statement support (and even use router when possible), unless they don't go through the custom plan interface (which basically just affects LANGUAGE SQL (not plpgsql) functions). This is achieved by forcing postgres' planner to always choose a custom plan, by assigning very low costs to plans with bound parameters (i.e. ones were the postgres planner replanned the query upon EXECUTE with all parameter values provided), instead of the generic one. This requires some trickery, because for custom plans to work the costs for a non-custom plan have to be known, which means we can't error out when planning the generic plan. Instead we have to return a "faux" plan, that'd trigger an error message if executed. But due to the custom plan logic that plan will likely (unless called by an SQL function, or because we can't support that query for some reason) not be executed; instead the custom plan will be chosen.	2017-01-23 09:23:50 -08:00
Andres Freund	c244b8ef4a	Make router planner error handling more flexible. So far router planner had encapsulated different functionality in MultiRouterPlanCreate. Modifications always go through router, selects sometimes. Modifications always error out if the query is unsupported, selects return NULL. Especially the error handling is a problem for the upcoming extension of prepared statement support. Split MultiRouterPlanCreate into CreateRouterPlan and CreateModifyPlan, and change them to not throw errors. Instead errors are now reported by setting the new MultiPlan->plannigError. Callers of router planner functionality now have to throw errors themselves if desired, but also can skip doing so. This is a pre-requisite for expanding prepared statement support. While touching all those lines, improve a number of error messages by getting them closer to the postgres error message guidelines.	2017-01-23 09:23:50 -08:00
Andres Freund	557ccc6fda	Support for deferred error messages. It can be useful, e.g. in the upcoming prepared statement support, to be able to return an error from a function that is not raised immediately, but can later be thrown. That allows e.g. to attempt to plan a statment using different methods and to create good error messages in each planner, but to only error out after all planners have been run. To enable that create support for deferred error messages that can be created (supporting errorcode, message, detail, hint) in one function, and then thrown in different place.	2017-01-23 09:23:50 -08:00
Jason Petersen	56197dbdba	Add replication_model GUC This adds a replication_model GUC which is used as the replication model for any new distributed table that is not a reference table. With this change, tables with replication factor 1 are no longer implicitly MX tables. The GUC is similarly respected during empty shard creation for e.g. existing append-partitioned tables. If the model is set to streaming while replication factor is greater than one, table and shard creation routines will error until this invalid combination is corrected. Changing this parameter requires superuser permissions.	2017-01-23 09:05:14 -07:00
Brian Cloutier	fe5465aa4e	Port master_append_table_to_shard to new connection API (#1149 ) If any placements fail it doesn't update shard statistics on those placements. A minor enabling refactor: Make CoordinatedTransactionUses2PC public (it used to be CoordinatedTransactionUse2PC but that symbol already existed, so renamed it as well)	2017-01-23 15:57:44 +02:00
Marco Slot	ea855ddf86	Add an enable_deadlock_prevention flag to allow router transactions to expand to multiple nodes	2017-01-22 17:31:24 +01:00
Andres Freund	78b085106a	Remove connection_cache.[ch].	2017-01-21 09:01:15 -08:00
Andres Freund	6ec34bed84	Remove remnants of commit_protocol.[ch].	2017-01-21 09:01:15 -08:00
Andres Freund	fd717d6da9	Consistently libpq forward declaration in remote_commands.h.	2017-01-21 09:01:14 -08:00
Murat Tuncer	d76f781ae4	Convert multi copy to use new connection api This enables proper transactional behaviour for copy and relaxes some restrictions like combining COPY with single-row modifications. It also provides the basis for relaxing restrictions further, and for optionally allowing connection caching.	2017-01-20 19:15:19 -08:00
Andres Freund	3a36d32c43	Mark some now unnecessarily exposed multi_planner.c functions static.	2017-01-20 12:31:56 -08:00
Andres Freund	0f28a11970	Remove citus.explain_multi_logical/physical_plan. They make fixing explain for prepared statement harder, and they don't really fit into EXPLAIN in the first place. Additionally they're currently not exercised in any tests.	2017-01-20 12:31:19 -08:00
Metin Doslu	2bd8f8f12e	Add a function to delete shard metadata from MX nodes	2017-01-20 14:38:01 +02:00
Metin Doslu	93e626c896	Refactor get_shard_id_for_distribution_column() and other minor changes	2017-01-20 14:38:01 +02:00

1 2 3 4 5

211 Commits (1a0678bc7f410871904b9094a0cfeb776651837d)