This forces prepared statements to be re-planned after changes to the
placement metadata. There are some locking issues remaining, but those are
a separate task.
Also add regression tests verifying that invalidations take effect on
prepared statements.
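As a rough illustration of the intended behaviour (the table is hypothetical),
a prepared statement should now pick up placement metadata changes made
between executions:

    PREPARE count_orders AS SELECT count(*) FROM orders;
    EXECUTE count_orders;
    -- placement metadata changes here (e.g. a placement is marked inactive)
    EXECUTE count_orders;  -- re-planned against the updated metadata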
This commit adds the INSERT INTO ... SELECT feature for distributed tables.
We implement INSERT INTO ... SELECT by pushing down the SELECT to
each shard. To compute the push-down we use the router planner, adding
an "uninstantiated" constraint that the partition column be equal to a
certain value. standard_planner() distributes that constraint to all
the tables to which it knows how to push the restriction safely; an example
is tables that are connected via equi-joins.
The router planner then iterates over the target table's shards; for each
shard it replaces the "uninstantiated" restriction with one that
PruneShardList() can handle. It does so by replacing the partitioning qual
parameter added in multi_planner() with the current shard's
actual boundary values. It also adds the current shard's boundary values to the
top-level subquery to ensure that even if the partitioning qual is
not distributed to all the tables, we never run the query on shards
that don't match the current shard's boundaries. Finally, it performs the
normal shard pruning to decide whether or not to push the query to the
current shard.
We do not support certain SQL constructs in the subquery; these are described
in the comments of ErrorIfInsertSelectQueryNotSupported().
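A sketch of the kind of statement this enables; the table and column names are
hypothetical, and both tables are assumed to be hash-distributed on tenant_id:

    INSERT INTO daily_totals (tenant_id, total)
    SELECT tenant_id, sum(amount)
    FROM orders
    WHERE order_date = '2016-10-01'
    GROUP BY tenant_id;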
We also added some locking in the router executor. When an INSERT/SELECT command
runs on a distributed table with replication factor >1, we need to ensure that
it sees the same result on each placement of a shard. The router executor
therefore takes exclusive locks on the shards from which the SELECT
in an INSERT/SELECT reads, in order to prevent concurrent changes. This is not
an optimal solution, but it is simple and correct. The
citus.all_modifications_commutative setting can be used to avoid this aggressive locking.
An INSERT/SELECT whose filters are known to exclude any ongoing writes can be
marked as commutative. See RequiresConsistentSnapshot() for the details.
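For example, a session that knows its INSERT/SELECT cannot overlap any
concurrent writes could do something like the following (a sketch; the table
names are hypothetical):

    SET citus.all_modifications_commutative TO on;
    INSERT INTO daily_totals
    SELECT tenant_id, sum(amount) FROM orders GROUP BY tenant_id;
    RESET citus.all_modifications_commutative;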
We also moved the decision of whether the multiPlan should be executed on
the router executor or not to the planning phase. This allowed us to
integrate multi-task router executor tasks into the router executor smoothly.
The necessity for this functionality comes from the fact that ruleutils.c is not supposed to be
used on "rewritten" queries (i.e. ones that have been passed through QueryRewrite()).
Query rewriting is the process in which views and the like are expanded,
INSERT/UPDATE target lists are reordered to match the physical column order,
defaults are filled in, and so on. For the details of the reordering, see
transformInsertRow().
We'd been relying on a single SET search_path command in an earlier
script, but a subsequent script issued RESET search_path, causing any further
bare functions to be created in the first schema on the search path.
However, starting with an older extension version and executing the ALTER
scripts one at a time DOES avoid putting any functions in the public
namespace, so I wrote an upgrade script resilient to that, especially
because PostgreSQL 9.5 errors out if a function is already in the
schema it's being moved to.
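A minimal sketch of the guard idea, using a hypothetical function name and
signature; the real upgrade script differs:

    DO $$
    BEGIN
        -- only move the function if it still lives in public; PostgreSQL 9.5
        -- errors out if it is already in the target schema
        IF EXISTS (SELECT 1
                   FROM pg_proc p
                   JOIN pg_namespace n ON p.pronamespace = n.oid
                   WHERE n.nspname = 'public'
                     AND p.proname = 'some_citus_function')
        THEN
            ALTER FUNCTION public.some_citus_function() SET SCHEMA pg_catalog;
        END IF;
    END;
    $$;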
Between a restart (running the new code) and ALTER EXTENSION citus
UPDATE there was an inconsistency where we assumed that
pg_dist_partition had the repmodel column set. Now we give it a default
value if the column doesn't exist yet.
With this change, we now push down foreign key constraints created during CREATE TABLE
statements. We also start to send foreign key constraints during shard moves along with
the other DDL statements.
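A hypothetical example of a constraint that would now be propagated to the
shards when the tables are distributed (subject to the usual distribution-key
requirements):

    CREATE TABLE customers (customer_id bigint PRIMARY KEY, name text);
    CREATE TABLE orders (
        order_id bigint,
        customer_id bigint REFERENCES customers (customer_id)
    );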
create_reference_table() creates a hash-distributed table with a shard count
of 1 and a replication factor equal to the shard_replication_factor
configuration value.
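A minimal usage sketch (the table is hypothetical):

    CREATE TABLE currency_rates (currency char(3), rate numeric);
    SELECT create_reference_table('currency_rates');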
Adds support for PostgreSQL 9.6 by copying in the requisite ruleutils
file and refactoring the out/readfuncs code to flexibly support the
old-style copy/pasted out/readfuncs (prior to 9.6) or use extensible
node APIs (in 9.6 and higher).
Most version-specific code within this change is only needed to set new
fields in the AggRef nodes we build for aggregations. Version-specific
test output files were added in certain cases, though in most cases they were
not necessary. Each such file begins by e.g. printing the major version
in order to clarify its purpose.
The comment atop citus_nodes.h details how to add support for new nodes
for when that becomes necessary.
This change adds the pg_dist_local_group metadata table, which indicates
the group id of the current node. It is expected that this table contains
one and only one row, holding the group id of the node as an integer.
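For example, the local group id can be read on any node with a simple query
(assuming the column is named groupid):

    SELECT groupid FROM pg_dist_local_group;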
With this change, master_copy_shard_placement and master_move_shard_placement functions
start to copy/move the given shard along with its co-located shards.
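A usage sketch; the shard ID and node names are placeholders, and the argument
list is assumed to match the existing UDF signature:

    SELECT master_copy_shard_placement(102008,
                                       'source-host', 5432,
                                       'target-host', 5432);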
This commit completes HAVING support in Citus by adding HAVING support for the
real-time and task-tracker executors. Multiple tests are added to the regression
tests to cover the newly supported queries.
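A sketch of a newly supported query shape (table and columns are hypothetical):

    SELECT customer_id, sum(amount) AS total
    FROM orders
    GROUP BY customer_id
    HAVING sum(amount) > 1000;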
This change adds the required metadata snapshot infrastructure from the MX
codebase into Citus, mainly the metadata_sync.c file and the master_metadata_snapshot UDF.
Two sets of tests are fixed by this change:
* multi_agg_approximate_distinct
* those in multi_task_tracker_extra_schedule
The first broke when we renamed stage to load in many files and was
never being run because the HyperLogLog extension wasn't easily
available in Debian. Now it's in our repo, so we install it and run
the test. I removed the distinct HLL target in favor of just always
running it and providing an output variant to handle when the extension
is absent. Basically, if PostgreSQL thinks HLL is available, the test
installs it and runs normally, otherwise the absent variant is used.
The second broke when I removed a test variant, erroneously believing
it to be related to an older Citus version. I've added a line in that
test to clarify why the variant is necessary (a practice we should
widely adopt).
So far placements were assigned an Oid, but that was just used to track
insertion order. It also did so incompletely, as it was not preserved
across changes of the shard state. The behaviour around oid wraparound
was also not entirely as intended.
The newly introduced, explicitly assigned, IDs are preserved across
shard-state changes.
The prime goal of this change is not to improve ordering of task
assignment policies, but to make it easier to reference shards. The
newly introduced UpdateShardPlacementState() makes use of that, and so
will the in-progress connection and transaction management changes.
Related to #786
This change adds the `pg_dist_node` table that contains the information
about the workers in the cluster, replacing the previously used
`pg_worker_list.conf` file (or the one specified with `citus.worker_list_file`).
Upon update, the `pg_worker_list.conf` file is read and the `pg_dist_node` table is
populated with the file's content. After that, the `pg_worker_list.conf` file
is renamed to `pg_worker_list.conf.obsolete`.
For adding and removing nodes, the change also includes two new UDFs:
`master_add_node` and `master_remove_node`, which require superuser
permissions.
The `citus.worker_list_file` GUC is kept for upgrade purposes but is not used after
the update is finished.
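A usage sketch of the new UDFs (host name and port are placeholders):

    SELECT master_add_node('worker-1', 5432);
    SELECT * FROM pg_dist_node;
    SELECT master_remove_node('worker-1', 5432);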
Adds logic to shard-extend any object name related to a table that might be
distributed, allowing any name that is within regular PostgreSQL length limits
to be extended with a shard ID for use in shards on workers. Handles multi-byte
character boundaries in identifiers when making prefixes for
shard-extended names. Includes tests.
Uses hash_any from PostgreSQL's access/hashfunc.c.
Removes AppendShardIdToStringInfo() as it's used only once
and arguably is best replaced there with a call to AppendShardIdToName().
Adds a UDF shard_name(object_name, shard_id) to expose the shard-extended
name logic to other PL/pgSQL functions, UDFs, and scripts.
Bumps version to 6.0-2 to allow the UDF to be created in the migration script.
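A usage sketch of the new UDF (table name and shard ID are hypothetical):

    SELECT shard_name('orders'::regclass, 102008);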
Fixes citusdata/citus#781 and citusdata/citus#179.
hash_create(), called by TaskHashCreate(), doesn't work correctly for a
zero-sized hash table. This triggers valgrind errors, and could
potentially cause crashes even without valgrind.
This currently happens for Jobs with 0 tasks. These probably should be
optimized away before reaching TaskHashCreate(), but that's a bigger
change.
count_agg_clause *adds* the cost of the aggregates to the state
variable; it doesn't reinitialize it. That is intentional, as it is used
to incrementally add costs in some places.
The column type argument in the generated partition commands is now a
`::regtype` using the qualified name of the column type, not the column type
OID, which may differ between master and worker nodes.
Adds test coverage of a hash repartition using a UDT as the join column.
Note that the UDFs `worker_hash_partition_table` and `worker_range_partition_table`
are unchanged, and rightly expect an OID for the column type; but the
planner code building the commands now allows for `::regtype` casting
to do its magic.
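Roughly, the generated command text now embeds something like the following
instead of a raw OID, so each node resolves the type locally (the type name is
hypothetical):

    SELECT 'public.composite_key_type'::regtype::oid;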
Fixes citusdata/citus#111.
Fixes citusdata/citus#714
On `InsertShardRow`, we previously called `CommandCounterIncrement()` before
`CitusInvalidateRelcacheByRelid(relationId)`. This might cause the invalidation
of the distributed table to be skipped on the next access within the same session.