citus

Commit Graph

Author	SHA1	Message	Date
Marco Slot	172bb457e6	Take shard metadata lock in master_append_table_to_shard	2016-12-02 15:56:30 +01:00
Eren Basak	fb88b167a7	Propagate node add/remove to the nodes with hasmetadata=true This change propagates the changes done by `master_add_node` and `master_remove_node` to the workers that contain metadata.	2016-12-02 14:43:32 +03:00
Brian Cloutier	a4096c9f45	Remove dead code: ResponsiveWorkerNodeList	2016-12-02 13:14:11 +03:00
Andres Freund	0a4889d0af	Use system psql if available, to fix travis build errors. On some systems a newer libpq is available than what we're compiling against, but until now we used psql in the version we're compiling against. That's a problem, because (quoting Jason): With 9.6, libpq's default handling of CONTEXT changed: it is hidden unless the level is ERROR or higher. We addressed this ourselves using the SHOW_CONTEXT variable (by setting "always" in pg_regress_multi): in 9.5, this is ignored (and unneeded), in 9.6, it ensures old behavior is preserved. For 9.6 we'd already worked around the problem by specifying that context should always be shown, but < 9.6 psql doesn't know how to do that. As there's no csql anymore, which strictly tied us to a specific version of psql/csql, we can now just use the system's psql if available. We still fall back to the psql of the installation we're compiling against, if there's no other psql in PATH.	2016-12-01 15:58:23 -08:00
Onder Kalaci	df974e15b8	Bugfix for deparsing INSERT..SELECT queries which involve constant values This commit fixes a bug when the SELECT target list includes a constant value. Previous behaviour of target list re-ordering: * Iterate over the INSERT target list * If it includes a Var, find the corresponding SELECT entry and update its resno accordingly * If it does not include a Var (which we only considered to be DEFAULTs), generate a new SELECT target entry * If the processed target entry count in SELECT target list is less than the original SELECT target list (GROUP BY elements not included in the SELECT target entry), add them in the SELECT target list and update the resnos accordingly. * However, this step was leading to add the CONST SELECT target entries twice. The reason is that when CONST target list entries appear in the SELECT target list, the INSERT target list doesn't include a Var. Instead, it includes CONST as it does for DEFAULTs. New behaviour of target list re-ordering: * Iterate over the INSERT target list * If it includes a Var, find the corresponding SELECT entry and update its resno accordingly * If it does not include a Var (which we consider to be DEFAULTs and CONSTs on the SELECT), generate a new SELECT target entry * If any target entry remains on the SELECT target list which are resjunk, (GROUP BY elements not included in the SELECT target entry), keep them in the SELECT target list by updating the resnos.	2016-12-01 10:41:56 +02:00
Murat Tuncer	45762006f3	Add support for filters Ensures filter clauses are stripped from master query, and pushed down to worker queries.	2016-12-01 08:53:46 +03:00
Sumedh Pathak	0a0d4784b9	Change DDL error message to say "unsupported" instead of "supported"	2016-11-26 10:30:09 +01:00
Murat Tuncer	b5c1ecb684	Fix failures during pg_upgrade - fix error in CitusHasBeenLoaded() - allow creation of pg_catalog tables during upgrade	2016-11-11 17:22:45 -08:00
Marco Slot	b566c4815c	Pass down the correct type for null parameters	2016-11-11 07:14:08 +01:00
Metin Doslu	a0c92b38cb	Use AccessShareLock on the source table while creating a colocated table While creating a colocated table, we don't want the source table to be dropped. However, using a ShareLock blocks DML statements on the source table, and using AccessShareLock is enough to prevent DROP. Therefore, we just loosened the lock to AccessShareLock.	2016-11-10 09:17:05 -08:00
Eren Basak	444f14d546	Add Column Definition List for Output Columns for master_add_node This change allows seeing the names of columns of `master_add_node`, using `SELECT * FROM master_add_node(...)` by specifying output columns in UDF definition.	2016-11-07 14:08:58 -08:00
Marco Slot	c157c3b419	Disallow SendCommandListToWorkerInSingleTransaction when modifications have occurred	2016-11-02 12:26:56 +01:00
Marco Slot	f6b3af7a49	Use co-located shard ID in multi_shard_transaction	2016-11-02 11:01:19 +01:00
Samay Sharma	82e5faa190	Avoid error during CREATE INDEX IF NOT EXISTS Previously, we threw an error when we ran CREATE INDEX IF NOT EXISTS with an already existing index. This change enables expected behavior by checking if the statement has IF NOT EXISTS before throwing the error. We also ensure that we don't execute the command on the workers, if an index already exists on the master.	2016-11-01 14:51:19 -07:00
Burak Yucesoy	b30b339f91	Fix typo in error message	2016-11-01 16:58:27 +02:00
Burak Yucesoy	6246702a4c	Change error message we displayed for foreign constraints if RF > 1 At the moment, we do not support foreign constraints if replication factor is greater than 1. However foreign constraints can be used in cloud with high availability option. Therefore we do not want to create an impression such that foreign constraints with high availability is not supported at all. We call users to action with this error message.	2016-11-01 15:47:19 +02:00
Önder Kalacı	83e1719541	Always CASCADE while dropping a shard	2016-11-01 10:16:34 +01:00
Brian Cloutier	50805f1e5c	Copy raw_parse_tree before using it Address citusdata/citus#922. Fixes a segfault in PG's installcheck caused by our reuse of raw_parse_tree when handling EXPLAIN EXECUTE.	2016-10-27 18:25:49 +03:00
Onder Kalaci	a43e3bad56	Improve error semantics for INSERT..SELECT With this commit, we error out if a worker query cannot be executed on all placements of a target insert shard interval.	2016-10-27 14:09:05 +03:00
Andres Freund	dfe7b357c5	Simple isolationtester dml vs. repair tests.	2016-10-27 00:31:41 -07:00
Andres Freund	121b868da5	Add very basic isolationtester infrastructure including a trivial test.	2016-10-27 00:31:41 -07:00
Andres Freund	c3e1d49e34	Don't try to shutdown servers that have not been started in regression tests. This avoids spurious output from failing shutdowns and uninitialized variable warnings if pg_regress_multi.pl fails before starting servers.	2016-10-27 00:31:41 -07:00
Metin Doslu	c6f5cabbe3	Error on different shard placement count In ErrorIfShardPlacementsNotColocated(), while checking if shards are colocated, error out if matching shard intervals have different number of shard placements.	2016-10-26 18:46:05 +03:00
Onder Kalaci	9cd549f21f	Add stub for Copy shard placement This commit does not change the current behaviour, but, helps to implement enterprise feature without any version changes.	2016-10-26 17:57:55 +03:00
Metin Doslu	4e555880b7	Add mark_tables_colocated() to update colocation groups Added a new UDF, mark_tables_colocated(), to colocate tables with the same configuration (shard count, shard replication count and distribution column type).	2016-10-26 17:29:03 +03:00
Marco Slot	275378aa45	Re-acquire metadata locks in RouterExecutorStart	2016-10-26 14:34:59 +02:00
Brian Cloutier	1e6d1ef67e	Fix segfault during EXPLAIN EXECUTE Fix citusdata/citus#886 The way postgres' explain hook is designed means that our hook is never called during EXPLAIN EXECUTE. So, we special-case EXPLAIN EXECUTE by catching it in the utility hook. We then replace the EXECUTE with the original query and pass it back to Citus.	2016-10-26 15:18:42 +03:00
Burak Yucesoy	fc2fea839b	Only repair given shard Previously, when a repair is requested on a shard, we also repair all co-located shards of given shard, which may cause repairing already healthy shards. With this change, we only repair given shard.	2016-10-26 14:36:37 +03:00
Brian Cloutier	80c8cfeabe	Don't add a raw 32-bit int to tuples in create_distributed_table	2016-10-26 14:02:42 +03:00
Andres Freund	fcd150c7c8	Invalidate relcache after pg_dist_shard_placement changes. This forces prepared statements to be re-planned after changes of the placement metadata. There are some locking issues remaining, but that's a separate task. Also add regression tests verifying that invalidations take effect on prepared statements.	2016-10-26 03:36:35 -07:00
Onder Kalaci	1673ea937c	Feature: INSERT INTO ... SELECT This commit adds INSERT INTO ... SELECT feature for distributed tables. We implement INSERT INTO ... SELECT by pushing down the SELECT to each shard. To compute that we use the router planner, by adding an "uninstantiated" constraint that the partition column be equal to a certain value. standard_planner() distributes that constraint to all the tables where it knows how to push the restriction safely. An example is that the tables that are connected via equi joins. The router planner then iterates over the target table's shards, for each we replace the "uninstantiated" restriction, with one that PruneShardList() handles. Do so by replacing the partitioning qual parameter added in multi_planner() with the current shard's actual boundary values. Also, add the current shard's boundary values to the top level subquery to ensure that even if the partitioning qual is not distributed to all the tables, we never run the queries on the shards that don't match with the current shard boundaries. Finally, perform the normal shard pruning to decide on whether to push the query to the current shard or not. We do not support certain SQLs on the subquery, which are described/commented on ErrorIfInsertSelectQueryNotSupported(). We also added some locking on the router executor. When an INSERT/SELECT command runs on a distributed table with replication factor >1, we need to ensure that it sees the same result on each placement of a shard. So we added the ability such that router executor takes exclusive locks on shards from which the SELECT in an INSERT/SELECT reads in order to prevent concurrent changes. This is not a very optimal solution, but it's simple and correct. The citus.all_modifications_commutative can be used to avoid aggressive locking. An INSERT/SELECT whose filters are known to exclude any ongoing writes can be marked as commutative. See RequiresConsistentSnapshot() for the details. 
We also moved the decision of whether the multiPlan should be executed on the router executor or not to the planning phase. This allowed us to integrate multi task router executor tasks to the router executor smoothly.	2016-10-26 10:01:00 +03:00
Onder Kalaci	e0d83d65af	Add ability to reorder target list for INSERT/SELECT queries The necessity for this functionality comes from the fact that ruleutils.c is not supposed to be used on "rewritten" queries (i.e. ones that have been passed through QueryRewrite()). Query rewriting is the process in which views and such are expanded, and, INSERT/UPDATE targetlists are reordered to match the physical order, defaults etc. For the details of reordering, see transformInsertRow().	2016-10-26 10:00:03 +03:00
Jason Petersen	73f5b8b05f	Move all funcs to pg_catalog, add test to verify We'd been relying on a single SET search_path command in an earlier script, but a subsequent script RESET search_path, causing any further bare functions to be created in the first schema on the search path. However, starting with an older extension version and executing ALTER scripts one at a time DOES avoid putting any functions in the public namespace, so I wrote an upgrade script resilient to that, especially because PostgreSQL 9.5 will error out if a function is already in the schema it's being moved to.	2016-10-25 12:45:53 -06:00
Brian Cloutier	c6b74b023f	Treat nodePort as the 8byte number it is	2016-10-25 16:31:48 +03:00
Brian Cloutier	2e96f6ab27	Fix crash when upgrading to Citus 6 Between restart (running the new code) and ALTER EXTENSION citus UPGRADE there was an inconsistency where we assumed that pg_dist_partition had the repmodel column set. Now we give it a default value if the column doesn't exist yet.	2016-10-24 15:18:29 +03:00
Marco Slot	271b20a23e	Parallelise DDL commands	2016-10-24 12:39:08 +02:00
Burak Yucesoy	5a03acf2bf	Foreign Constraint Support for create_distributed_table and shard move With this change, we now push down foreign key constraints created during CREATE TABLE statements. We also start to send foreign constraints during shard move along with other DDL statements	2016-10-21 15:38:55 +03:00
Marco Slot	02d2b86e68	Re-disable master evaluation for SELECT	2016-10-21 10:51:47 +02:00
Metin Doslu	405335fcee	Add create_reference_table() create_reference_table() creates a hash distributed table with shard count equals to 1 and replication factor equals to shard_replication_factor configuration value.	2016-10-20 15:29:30 +03:00
Metin Doslu	d3e7d9dc8d	Final refactoring	2016-10-20 11:29:11 +03:00
Metin Doslu	58ac477ffb	Change return type of BuildDistributionKeyFromColumnName() to Var * BuildDistributionKeyFromColumnName() always returns a Var pointer, so there is no reason to return a Node pointer instead of a Var pointer.	2016-10-20 10:59:31 +03:00
Metin Doslu	161093908e	Convert colocationid to uint32	2016-10-20 10:59:31 +03:00
Metin Doslu	8334d853c0	Add local function GetNextShardId()	2016-10-20 10:59:31 +03:00
Metin Doslu	40bdafa8d1	Add create_distributed_table() create_distributed_table() creates a hash distributed table with default values of shard count and shard replication factor.	2016-10-20 10:58:25 +03:00
Metin Doslu	d04f4f5935	Add guc variable for shard count	2016-10-19 10:44:50 +03:00
Marco Slot	65f6d7c02a	Follow consistent execution order in parallel commands	2016-10-19 08:33:08 +02:00
Marco Slot	a497e7178c	Parallelise master_modify_multiple_shards	2016-10-19 08:33:08 +02:00
Marco Slot	9d98acfb6d	Move requiresMasterEvaluation from Task to Job	2016-10-19 08:23:06 +02:00
Marco Slot	213d8419c6	Refactor and redocument executor shard lock code	2016-10-19 08:13:35 +02:00
Andres Freund	ac14b2edbc	Support PostgreSQL 9.6 Adds support for PostgreSQL 9.6 by copying in the requisite ruleutils file and refactoring the out/readfuncs code to flexibly support the old-style copy/pasted out/readfuncs (prior to 9.6) or use extensible node APIs (in 9.6 and higher). Most version-specific code within this change is only needed to set new fields in the AggRef nodes we build for aggregations. Version-specific test output files were added in certain cases, though in most they were not necessary. Each such file begins by e.g. printing the major version in order to clarify its purpose. The comment atop citus_nodes.h details how to add support for new nodes for when that becomes necessary.	2016-10-18 16:23:55 -06:00
Murat Tuncer	b453f6c7ab	Add master_run_on_worker UDF	2016-10-18 17:59:54 +03:00
Eren Basak	cee7b54e7c	Add worker transaction and transaction recovery infrastructure	2016-10-18 14:18:14 +03:00
Metin Doslu	27616cca52	Add regression tests for parameterized queries	2016-10-18 14:02:50 +03:00
Eren Basak	f3ede37c9f	Add hasmetadata column to pg_dist_node	2016-10-17 11:52:18 +03:00
Eren Basak	c7bf2021fa	Add metadata infrastructure for pg_dist_local_group table	2016-10-17 11:52:18 +03:00
Eren Basak	8f477d18f1	Add pg_dist_local_group Metadata Table This change adds the pg_dist_local_group metadata table, which indicates the group id of the current node. It is expected that this table contains one and only one row, which only contains the group id of the node as an integer.	2016-10-14 11:41:14 +03:00
Eren Basak	630f199d3c	Fix changing placement ids in metadata snapshot test	2016-10-14 11:13:16 +03:00
Brian Cloutier	6c3d79b4e7	Drop shardalias	2016-10-14 11:03:26 +03:00
Burak Yucesoy	6668d19a3b	Make shard transfer functions co-location aware With this change, master_copy_shard_placement and master_move_shard_placement functions start to copy/move given shard along with its co-located shards.	2016-10-13 18:16:40 +03:00
Metin Doslu	d03a2af778	Add HAVING support This commit completes having support in Citus by adding having support for real-time and task-tracker executors. Multiple tests are added to regression tests to cover new supported queries with having support.	2016-10-13 15:47:53 +03:00
Eren Basak	ed3af403fd	Add Metadata Snapshot Infrastructure This change adds the required infrastructure about metadata snapshot from MX codebase into Citus, mainly metadata_sync.c file and master_metadata_snapshot UDF.	2016-10-13 10:40:14 +03:00
Jason Petersen	d140d1c934	Use single-quote interpolation in partition test Noticed an old issue and this outdated comment. Figured I'd fix it.	2016-10-10 13:03:43 -06:00
Jason Petersen	bcfc58a7c7	Fix tests and tell Travis to run them all Two sets of tests are fixed by this change: * multi_agg_approximate_distinct * those in multi_task_tracker_extra_schedule The first broke when we renamed stage to load in many files and was never being run because the HyperLogLog extension wasn't easily available in Debian. Now it's in our repo, so we install it and run the test. I removed the distinct HLL target in favor of just always running it and providing an output variant to handle when the extension is absent. Basically, if PostgreSQL thinks HLL is available, the test installs it and runs normally, otherwise the absent variant is used. The second broke when I removed a test variant, erroneously believing it to be related to an older Citus version. I've added a line in that test to clarify why the variant is necessary (a practice we should widely adopt).	2016-10-07 17:32:54 -06:00
Marco Slot	33b7723530	Use UpdateShardPlacementState where appropriate	2016-10-07 11:59:20 -07:00
Andres Freund	982ad66753	Introduce placement IDs. So far placements were assigned an Oid, but that was just used to track insertion order. It also did so incompletely, as it was not preserved across changes of the shard state. The behaviour around oid wraparound was also not entirely as intended. The newly introduced, explicitly assigned, IDs are preserved across shard-state changes. The prime goal of this change is not to improve ordering of task assignment policies, but to make it easier to reference shards. The newly introduced UpdateShardPlacementState() makes use of that, and so will the in-progress connection and transaction management changes.	2016-10-07 11:59:20 -07:00
Metin Doslu	d94a65e0e9	Reduce minimum value of task_tracker_delay to 1ms	2016-10-07 09:55:56 +03:00
Brian Cloutier	9d6699b07c	Switch from pg_worker_list.conf file to pg_dist_node metadata table. Related to #786 This change adds the `pg_dist_node` table that contains the information about the workers in the cluster, replacing the previously used `pg_worker_list.conf` file (or the one specified with `citus.worker_list_file`). Upon update, `pg_worker_list.conf` file is read and `pg_dist_node` table is populated with the file's content. After that, `pg_worker_list.conf` file is renamed to `pg_worker_list.conf.obsolete` For adding and removing nodes, the change also includes two new UDFs: `master_add_node` and `master_remove_node`, which require superuser permissions. 'citus.worker_list_file' guc is kept for update purposes but not used after the update is finished.	2016-10-05 13:01:35 +03:00
Marco Slot	32b2bd4ed8	Add replication model column to pg_dist_partition	2016-10-05 01:14:28 +02:00
Onder Kalaci	0993f2fb2c	Update ColocatedShardPlacementList() function name to ColocatedShardIntervalList(), which was the intended name.	2016-10-04 09:51:42 +03:00
Marco Slot	fe3ffdb013	Avoid use of pnstrdup	2016-10-04 00:31:53 +02:00
Robin Thomas	f677fadbe6	Provides safe, idempotent shard-extended names to any object name related to a table that might be distributed, allowing any name that is within regular PostgreSQL length limits to be extended with a shard ID for use in shards on workers. Handles multi-byte character boundaries in identifiers when making prefixes for shard-extended names. Includes tests. Uses hash_any from PostgreSQL's access/hashfunc.c. Removes AppendShardIdToStringInfo() as it's used only once and arguably is best replaced there with a call to AppendShardIdToName(). Adds UDF shard_name(object_name, shard_id) to expose the shard-extended name logic to other PL/PGSQL, UDFs and scripts. Bumps version to 6.0-2 to allow for UDF to be created in migration script. Fixes citusdata/citus#781 and citusdata/citus#179.	2016-10-03 17:02:34 -04:00
Andres Freund	de32b7bbad	Don't create hash-table of zero size in TaskHashCreate(). hash_create(), called by TaskHashCreate(), doesn't work correctly for a zero sized hash table. This triggers valgrind errors, and could potentially cause crashes even without valgrind. This currently happens for Jobs with 0 tasks. These probably should be optimized away before reaching TaskHashCreate(), but that's a bigger change.	2016-10-03 13:07:43 -07:00
Andres Freund	6d050bc9f8	Initialize count_agg_clauses argument to 0. count_agg_clause adds the cost of the aggregates to the state variable, it doesn't reinitialize it. That is intentional, as it is used to incrementally add costs in some places.	2016-10-03 13:07:43 -07:00
Andres Freund	a6150c2916	Lower "waiting for activity on tasks took longer than" log level. It's perfectly normal to wait longer in several circumstances, and the output can lead to spurious regression output changes.	2016-10-03 13:07:43 -07:00
Marco Slot	a4efb60b54	Change logicalrelid type in pg_dist_partition and pg_dist_shard to regclass	2016-10-03 20:27:16 +02:00
Marco Slot	fc93974238	Remove EventInvokeTrigger from regression test output	2016-10-03 20:21:15 +02:00
Robin Thomas	c507a0df1c	During repartitions, the partitionColumnType argument sent to workers is now a `::regtype` using the qualified name of the column type, not the column type OID which may differ between master/worker nodes. Test coverage of a hash repartition using a UDT as the join column. Note that the UDFs `worker_hash_partition_table` and `worker_range_partition_table` are unchanged, and rightly expect an OID for the column type; but the planner code building the commands now allows for `::regtype` casting to do its magic. Fixes citusdata/citus#111.	2016-10-03 13:41:20 -04:00
Robin Thomas	b1493e299e	Added test coverage for partial unique indexes and exclude constraints.	2016-10-03 10:47:30 -04:00
Eren Basak	ac3a4eee21	Fix command counter increment bug Fixes citusdata/citus#714 On `InsertShardRow`, we previously called `CommandCounterIncrement()` before `CitusInvalidateRelcacheByRelid(relationId);`. This could cause the invalidation of the distributed table to be skipped on the next access within the same session.	2016-10-03 17:00:27 +03:00
Onder Kalaci	a533b8e7c1	Differentiate worker and master job temporary folders This commit makes it possible to create different worker and master temporary folders. This change is important for citus-mx on task-tracker execution. In simple words, on citus-mx, the worker could actually be responsible for the master tasks as well. Prior to this change, both master and worker logic on task-tracker executor was accessing and using the same files for different purposes, which was dangerous in certain cases (i.e., when task_tracker_delay is low).	2016-10-03 14:24:08 +03:00
Andres Freund	77efe7fcd4	Move task tracker lwlocks into their own tranche. RequestAddinLWLocks()/LWLockAssign() are gone in 9.6. Luckily all citus supported postgres versions support tranches, so use those.	2016-09-30 16:06:49 -06:00
Jason Petersen	f59cf2b818	Remove references to 9.4 Some still lingered.	2016-09-29 17:35:19 -06:00
Jason Petersen	37631cd132	Remove alternate multi_hash test file This was made irrelevant by Citus v5.1.0.	2016-09-29 16:43:19 -06:00
Jason Petersen	6671cf5171	Remove unused dumputils.h header Believe this was used by csql, which is now gone.	2016-09-29 15:54:38 -06:00
Jason Petersen	1c560dfa9c	Update ruleutils_95 with latest PostgreSQL changes Hand-applied changes from a diff I generated between 9.5.0 and 9.5.4.	2016-09-29 15:54:38 -06:00
Marco Slot	c4bc0742a7	Make count return 0 if all shards are pruned away Before this change, count on a distributed table returned NULL if all shards were pruned away, because on the master we replace the count(..) call with a sum(..) call to sum the counts from the shards. However, sum returns NULL when there are no rows, whereas count is expected to return 0.	2016-09-29 20:27:26 +02:00
Jason Petersen	5b80d4e8dd	Directly register multi-shard callbacks in PG_init I had changed these callbacks to use the same method I chose for the router executor (for consistency), but as that method is flawed, we now want to ensure we directly register them from PG_init as well.	2016-09-29 11:43:19 -06:00
Jason Petersen	5f6264105d	Directly register router xact callbacks in PG_init Not entirely sure why we went with the shared memory hook approach, but it causes problems (multiple registration) during crashes. Changing to a simple direct registration call from PG_init.	2016-09-29 11:43:18 -06:00
Burak Yucesoy	1ee39eb098	Internal co-location API With this commit we introduce internal API for co-location related operations.	2016-09-29 11:56:53 +03:00
Marco Slot	5cdbe2b86c	Remove copy_to_distributed_table	2016-09-28 11:27:54 -06:00
Murat Tuncer	5b42318ac4	Make where false queries router plannable	2016-09-28 18:49:26 +03:00
Murat Tuncer	c16dec88c3	Add UDF master_expire_table_cache	2016-09-28 12:08:37 +03:00
Jason Petersen	0caf0d95f1	Fix unique-violation-in-xact segfault An interaction between ReraiseRemoteError and DML transaction support causes segfaults: * ReraiseRemoteError calls PurgeConnection, freeing a connection... * That connection is still in the xactParticipantHash At transaction end, the memory in the freed connection might happen to pass the "is this connection OK?" check, causing us to try to send an ABORT over that connection. By removing it from the transaction hash before calling ReraiseRemoteError, we avoid this possibility.	2016-09-27 16:44:03 -06:00
Metin Doslu	c9dcad9b05	Pass text oid instead of invalid oid for null values Passing invalid oids even for null values in PQsendQueryParams() causes worker nodes to fail. Therefore, we pass text oid for null values.	2016-09-27 08:15:46 +03:00
Andres Freund	776b3868b9	Support NoMovement direction in router executor This is mainly interesting because it allows to use RETURN QUERY/RETURN QUERY EXECUTE and FOR ... IN .. LOOPs in plpgsql.	2016-09-26 18:28:36 -06:00
Murat Tuncer	32003c4aa1	Add tests with spaces in table names	2016-09-26 18:23:43 -06:00
Murat Tuncer	2f78fb8f1b	Remove extra space	2016-09-26 18:23:43 -06:00
Murat Tuncer	902e68c9ef	Refactor SendQueryToPlacements api	2016-09-26 18:23:43 -06:00
Murat Tuncer	6317bbe9a8	Address feedback	2016-09-26 18:23:42 -06:00
Murat Tuncer	877694296f	Fix regression test failures after rebase	2016-09-26 18:23:42 -06:00
Murat Tuncer	2eec0167be	Add support for truncate statement	2016-09-26 18:23:42 -06:00
Marco Slot	3318288d75	Fix segmentation fault in case of joins with WHERE 1=0	2016-09-26 15:12:29 +02:00
Robin Thomas	614c858375	Forbid EXCLUDE constraints on distributed tables just as we forbid UNIQUE or PRIMARY KEY constraints. Also, properly propagate valid EXCLUDE constraints to worker shard tables. If an EXCLUDE constraint includes the distribution column, the operator must be an equality operator. Tests in regression suite for exclusion constraints that include the partition column, omit it, and include it but with non-equality operator. Regression tests also verify that valid exclusion constraints are propagated to the shard tables. And the tests work in different timezones now. Fixes citusdata/citus#748 and citusdata/citus#778.	2016-09-21 14:02:42 -04:00
Metin Doslu	35eceb6cca	Remove pg_toast_* references from regression tests pg_toast_* oids are constantly changing, and this causes regression tests to fail time to time. With this commit, we remove all of the pg_toast_* references from regression test outputs.	2016-09-09 11:31:51 +03:00
Jason Petersen	74f4e0003b	Permit multiple DDL commands in a transaction Three changes here to get to true multi-statement, multi-relation DDL transactions (same functionality pre-5.2, with benefits of atomicity): 1. Changed the multi-shard utility hook to always run (consistency with router executor hook, removes ad-hoc "installed" boolean) 2. Change the global connection list in multi_shard_transaction to instead be a hash; update related functions to operate on global hash instead of local hash/global list 3. Remove check within DDL code to prevent subsequent DDL commands; place unset/reset guard around call to ConnectToNode to permit connecting to additional nodes after DDL transaction has begun In addition, code has been added to raise an error if a ROLLBACK TO SAVEPOINT is attempted (similar to router executor), and comprehensive tests execute all multi-DDL scenarios (full success, user ROLLBACK, any actual errors (say, duplicate index), partial failure (duplicate index on one node but not others), partial COMMIT (one node fails), and 2PC partial PREPARE (one node fails)). Interleavings with other commands (DML, \copy) are similarly all covered.	2016-09-08 22:35:55 -05:00
Eric B. Ridge	e80f1612a6	Add syscols in queries; extend relnames in indexes To permit use with ZomboDB (https://github.com/zombodb/zombodb), two changes were necessary: 1. Permit use of `tableoid` system column in queries 2. Extend relation names appearing in index expressions The first is accomplished by simply changing the deparse logic to allow system columns in queries destined for distributed tables. The latter was slightly more complex, given that DDL extension currently occurs on workers. But since indexes cannot reference tables other than the one being indexed, it is safe to look for any relation reference ending in a '*' character and extend their penultimate segments with a shard id. This change also adds an error to prevent users from distributing any relations using the WITH (OIDS) feature, which is unsupported.	2016-09-07 11:54:55 -05:00
Marco Slot	6f6cb1a0d6	Allow noop updates of the partition column	2016-09-07 14:22:41 +02:00
Jason Petersen	ed027f060e	Add sort call to shard placement test The comparator is kind of broken, but I think this is better than the current state of random failures.	2016-09-06 11:07:27 -05:00
Jason Petersen	b3684074f3	Fix CreateShardConnectionHash memory leak The call to hash_create specified HASH_CONTEXT without actually setting one using the provided HASHCTL. The hashes returned by this function are used locally, so simply using CurrentMemoryContext is sufficient.	2016-09-06 10:17:18 -05:00
Metin Doslu	5b50f2c333	Add complex subquery pushdown regression tests	2016-09-02 14:21:51 +03:00
Metin Doslu	7d212b847f	Add outer join clause list extraction for subquery pushdown logic In subquery pushdown, we allow outer joins if the join condition is on the partition columns. WhereClauseList() used to return all join conditions including outer joins. However, this has been changed with a commit related to outer join support on regular queries. With this commit, we refactored ExtractFromExpressionWalker() to return two lists of qualifiers. The first list is for inner join and filter clauses and the second list is for outer join clauses. Therefore, we can also use outer join clauses to check subquery pushdown prerequisites.	2016-09-02 11:54:44 +03:00
Burak Yucesoy	12d1aba1fc	Error out at master_create_distributed_table if the table has any rows Before this change, we did not check whether the given table already contains data in the master_create_distributed_table command. If that table contains any data, making it distributed hides that data from the user. With this change, we now raise an error if the table contains data.	2016-09-01 17:42:47 +03:00
Jason Petersen	850c51947a	Re-permit DDL in transactions, selectively Recent changes to DDL and transaction logic resulted in a "regression" from the viewpoint of users. Previously, DDL commands were allowed in multi-command transaction blocks, though they were not processed in any actual transactional manner. We improved the atomicity of our DDL code, but added a restriction that DDL commands themselves must not occur in any BEGIN/END transaction block. To give users back the original functionality (and improved atomicity) we now keep track of whether a multi-command transaction has modified data (DML) or schema (DDL). Interleaving the two modification types in a single transaction is disallowed. This first step simply permits a single DDL command in such a block, admittedly an incomplete solution, but one which will permit us to add full multi-DDL command support in a subsequent commit.	2016-08-30 20:37:19 -06:00
Metin Doslu	75618fc3fb	Return false in MultiClientQueryResult() on failing query	2016-08-29 17:05:35 +03:00
Brian Cloutier	4ecd6b58fb	Remove csql, \stage is no longer needed	2016-08-26 10:41:59 +03:00
Brian Cloutier	640bb8863b	Remove check-multi-fdw tests, nobody uses Citus with fdws	2016-08-26 10:41:33 +03:00
Jason Petersen	e54d3f6d32	Rename test files with 'stage' in name Ignored FDW files as those tests are being removed entirely, I believe.	2016-08-22 13:32:53 -06:00
Jason Petersen	b391abda3d	Replace verb 'stage' with 'load' in test comments "Staging table" will be the only valid use of 'stage' from now on, we will now say "load" when talking about data ingestion. If creation of shards is its own step, we'll just say "shard creation".	2016-08-22 13:24:18 -06:00
Jason Petersen	35e9f51348	Replace verb 'stage' with 'load' in schedules "Staging table" will be the only valid use of 'stage' from now on.	2016-08-22 11:48:41 -06:00
Eren Başak	0322916700	Lowercase \copy to match PostgreSQL's style for local/psql-level functions	2016-08-22 11:31:26 -06:00
Eren Basak	b513f1c911	Replace \stage With \copy on Regression Tests Fixes #547 This change removes all references to \stage in the regression tests and puts \COPY instead. Doing so changed shard counts, min/max values on some test tables (lineitem, orders, etc.).	2016-08-22 11:31:26 -06:00
Robin Thomas	010cbf16fc	Remove all usage of pg_dist_shard.shardalias in extension code. (#739 ) Remove regression test of non-null shardalias.	2016-08-19 17:06:22 +03:00
Jason Petersen	91578ff149	Remove HAVE_INTTYPES_H ifdefs I've been seeing warnings on OS X/clang for a while about these lines and finally got tired of it. The main problem is that PRIu64 expects a uint64_t but we were passing a uint64 (a PostgreSQL-defined type). In PostgreSQL 9.5, we now have INT64_MODIFIER, so can build our own zero-padded unsigned 64-bit int format modifier that expects a PostgreSQL-provided uint64 type. This simplifies the code slightly (no more ifdefs) and gets rid of the warning that's been annoying me since April (my TODO creation time).	2016-08-18 15:19:53 -06:00
Jason Petersen	900f7590ab	Fix Travis local_first_candidate_nodes failures A recent change to the image used in Travis causes some problems for the code we use here to ensure the local replica is first. Since this code is essentially dead in a post-stage world anyhow, we're OK with ripping out the tests to placate Travis.	2016-08-14 23:12:10 -06:00
Murat Tuncer	3a49cf830e	Remove a router planner test for materialized view PostgreSQL 9.5.4 stopped calling planner for materialized view create command when NO DATA option is provided. This causes our test to behave differently between pre-9.5.4 and 9.5.4.	2016-08-14 22:57:09 -06:00
Andres Freund	7fdb5fbe29	Skip over unreferenced parameters when router executing prepared statement. When an unreferenced prepared statement parameter does not explicitly have a type assigned, we cannot deserialize it, to send to the remote side. That commonly happens inside plpgsql functions, where local variables are passed in as unused prepared statement parameters.	2016-08-05 14:12:06 -07:00
Jason Petersen	eba8396501	Avoid attempting to lock invalid shard identifier A recent change generates a "dummy" shard placement with its identifier set to INVALID_SHARD_ID for SELECT queries against distributed tables with no shards. Normally, no lock is acquired for SELECT statements, but if all_modifications_commutative is set to true, we will acquire a shared lock, triggering an assertion failure within LockShardResource in the above case. The "dummy" shard placement is actually necessary to ensure such empty queries have somewhere to execute, and INVALID_SHARD_ID seems the most appropriate value for the dummy's shard identifier field, so the most straightforward fix is to just avoid locking invalid shard identifiers.	2016-08-04 13:49:51 -07:00
Metin Doslu	3ff1877108	Bump version numbers for 5.2 release	2016-08-01 13:48:24 -07:00
Marco Slot	9705cbcdf8	Rewrite WorkerShardStats to avoid invalid value bugs	2016-07-29 20:11:18 +02:00
Marco Slot	5e432449ba	Add MultiClientExecute and MultiClientValueIsNull for simple remote query execution	2016-07-29 20:07:18 +02:00
Andres Freund	63fb8311cb	Don't access pg_dist_partition->partkey directly, use heap_getattr(). Text datums can't be directly accessed via the struct equivalence trick used to access catalogs. That's because, as an optimization, they're sometimes aligned to 1 byte ("text"'s alignment), and sometimes to 4 bytes. That depends on it being a short varlena (cf. VARATT_NOT_PAD_BYTE) or not. In the case at hand here, partkey became longer than 127 characters - the boundary for short varlenas (cf. VARATT_CAN_MAKE_SHORT()). Thus it became 4 byte/int aligned, which led to the direct struct access reading the wrong data. The fix is simply to never access partkey that way - to enforce that, hide partkey behind the usual ifdef. Fixes: #674	2016-07-29 10:02:36 -07:00
Eren Başak	bb3893d0d8	Set 1PC as the Default Commit Protocol for DDL Commands Fixes #679 This change sets the default commit protocol for distributed DDL commands to '1pc'. If the user issues a distributed DDL command with this default setting, then once in a session, a NOTICE message is shown about using '2pc' being extra safe.	2016-07-29 16:42:55 +03:00
Jason Petersen	bedf53d566	Quick fix for possible segfault in PurgeConnection Now that connections can be acquired without going through the cache, we have to handle cases where functions assume the cache has been initialized.	2016-07-29 00:12:56 -06:00
Jason Petersen	abe7304898	Support SERIAL/BIGSERIAL non-partition columns This adds support for SERIAL/BIGSERIAL column types. Because we now can evaluate functions on the master (during execution), adding this is a matter of ensuring the table creation step works properly. To accomplish this, I've added some logic to detect sequences owned by a table (i.e. those related to its columns). Simply creating a sequence and using it in a default value is insufficient; users who do so must ensure the sequence is owned by the column using it. Fortunately, this is exactly what SERIAL and BIGSERIAL do, which is the use case we're targeting with this feature. While testing this, I found that worker_apply_shard_ddl_command actually adds shard identifiers to sequence names, though I found no places that use or test this path. I removed that code so that sequence names are not mutated and will match those used by a SERIAL default value expression. Our use of the new-to-9.5 CREATE SEQUENCE IF NOT EXISTS syntax means we are dropping support for 9.4 (which is being done regardless, but makes this change simpler). I've removed 9.4 from the Travis build matrix. Some edge cases are possible in ALTER SEQUENCE, COPY FROM (on workers), and CREATE SEQUENCE OWNED BY. I've added errors for each so that users understand when and why certain operations are prohibited.	2016-07-28 23:55:40 -06:00
Burak Yucesoy	6f20af9e38	Remove schema name parameter from API functions We remove the schema name parameter from the worker_fetch_foreign_file and worker_fetch_regular_table functions. We now send the schema name concatenated with the table name.	2016-07-28 20:41:05 +03:00
Burak Yucesoy	a649b47bac	Add old version(without schema name parameter) of api functions back Fixes #676 We added old versions (i.e. without schema name) of worker_apply_shard_ddl_command, worker_fetch_foreign_file and worker_fetch_regular_table back. During function call of one of these functions, we set schema name as public schema and call the newer version of the functions.	2016-07-28 20:40:38 +03:00
Eren Başak	8a590c9e5b	Remove AllFinalizedPlacementsAccessible Function This change removes the AllFinalizedPlacementsAccessible function, since we open connections to all shard placements before any command is sent and therefore error out immediately if a shard placement is not accessible.	2016-07-28 17:24:37 +03:00
Eren Başak	40f8149320	Allow Cancellation During Distributed DDL Commands This change allows users to interrupt long running DDL commands. Interrupt requests are handled after each DDL command being propagated to a shard placement, which means that generally the cancel request will be processed right after the execution of the DDL is finished in the current placement.	2016-07-28 17:12:07 +03:00
Murat Tuncer	cc33a450c4	Expand router planner coverage We can now support a richer set of queries in the router planner. This allows us to support CTEs, joins, window functions, and subqueries if they are known to be executed at a single worker with a single task (all tables are filtered down to a single shard and a single worker contains all table shards referenced in the query). Fixes : #501	2016-07-27 23:35:38 +03:00
Murat Tuncer	c20080992d	Remove PostgreSQL 9.4 support	2016-07-26 20:16:09 +03:00
Onder Kalaci	56c0b0825f	Fix bug related to poll timeout This commit fixes a bug in setting the polling timeout. The code is updated to conform to the comment that is already in place.	2016-07-26 09:55:47 +03:00
Burak Yucesoy	c1a3478c3b	Remove warnings on schema creation Since now we support schema related operations, there is no need to warn user about schema usage.	2016-07-22 18:24:23 +03:00
Burak Yucesoy	bdff72ed75	Fix ALTER TABLE SET SCHEMA Fixes #132 We hook into ALTER ... SET SCHEMA and warn out if user tries to change schema of a distributed table. We also hook into ALTER TABLE ALL IN TABLE SPACE statements and warn out if citus has been loaded.	2016-07-22 17:52:40 +03:00
Murat Tuncer	5d996a6891	Fix outer join crash when subquery is flatten	2016-07-22 17:01:19 +03:00
Burak Yucesoy	b58872b441	Fix worker_fetch_regular_table with schema Fixes #504 Fixes #646 We changed signature of worker_fetch_regular_table to accept schema name as parameter to make it work with schemas.	2016-07-22 00:44:02 -06:00
Jason Petersen	5d525fba24	Permit "single-shard" transactions Allows the use of modification commands (INSERT/UPDATE/DELETE) within transaction blocks (delimited by BEGIN and ROLLBACK/COMMIT), so long as all modifications hit a subset of nodes involved in the first such command in the transaction. This does not circumvent the requirement that each individual modification command must still target a single shard. For instance, after sending BEGIN, a user might INSERT some rows to a shard replicated on two nodes. Subsequent modifications can hit other shards, so long as they are on one or both of these nodes. SAVEPOINTs are supported, though if the user actually attempts to send a ROLLBACK command that specifies a SAVEPOINT they will receive an ERROR at the end of the topmost transaction. Placements are only marked inactive if at least one replica succeeds in a transaction where others fail. Non-atomic behavior is possible if the shard targeted by the initial modification within a transaction has a higher replication factor than another shard within the same block and a node with the latter shard has a failure during the COMMIT phase. Other methods of denoting transaction blocks (multi-statement commands sent all at once and functions written in e.g. PL/pgSQL or other such languages) are not presently supported; their treatment remains the same as before.	2016-07-21 15:57:22 -06:00
Burak Yucesoy	20debfc0ee	Fix COUNT DISTINCT approximation with schema Fixes #555 Before this change, we were resolving HLL function and type Oid without qualified name. Now we find the schema name where HLL objects are stored and generate qualified names for each objects. Similar fix is also applied for cstore_table_size function call.	2016-07-21 17:29:18 +03:00
Burak Yucesoy	bca672e0a4	Fix master_apply_delete_command with schema Fixes #73	2016-07-21 15:09:20 +03:00
Burak Yucesoy	2f0158dde1	Change worker_apply_shard_ddl_command to accept schema name as parameter Fixes #565 Fixes #626 To add schema support to citus, we need to schema-prefix all table names, object names etc. in the queries sent to worker nodes. However, query deparsing is not available for most DDL commands, so it is not easy to generate the worker query on the master node. As a solution, we send the schema name along with the shard id and the query to run to worker nodes with worker_apply_shard_ddl_command. To avoid breaking the \STAGE command, we pass the public schema as a parameter when calling worker_apply_shard_ddl_command from there. This will not cause a problem if the user uses \STAGE in a different schema, because the passed schema name is used only if no schema name is given in the query.	2016-07-21 14:17:26 +03:00
Metin Doslu	a811e09dd4	Add support for prepared statements with parameterized non-partition columns in router executor	2016-07-21 11:09:28 +03:00
Marco Slot	2388968b62	Move CompleteShardPlacementTransactions to multi_shard_transaction.c	2016-07-20 12:10:46 +02:00
Burak Yucesoy	a0e8f9eb64	Always schema-prefix worker queries Fixes #215 Fixes #267 Fixes #502 Fixes #556 Fixes #557 Fixes #560 Fixes #568 Fixes #623 Fixes #624 With this change we schema-prefix table names, operator names and composite types.	2016-07-20 10:42:24 +03:00
Eren Başak	1063fbc80a	Fix Unused Parameter isTopLevel in ExecuteDistributedDDLCommand This change fixes the unused variable problem in `ExecuteDistributedDDLCommand` function (multi_utility.c). The parameter is meant to be used in PreventTransactionChain call.	2016-07-19 14:14:02 +03:00
Eren	3eaff48114	Propagate DDL Commands with 2PC Fixes #513 This change modifies the DDL Propagation logic so that DDL queries are propagated via 2-Phase Commit protocol. This way, failures during the execution of distributed DDL commands will not leave the table in an intermediate state and the pending prepared transactions can be committed manually. DDL commands are not allowed inside other transaction blocks or functions. DDL commands are performed with 2PC regardless of the value of `citus.multi_shard_commit_protocol` parameter. The workflow of the successful case is this: 1. Open individual connections to all shard placements and send `BEGIN` 2. Send `SELECT worker_apply_shard_ddl_command(<shardId>, <DDL Command>)` to all connections, one by one, in a serial manner. 3. Send `PREPARE TRANSACTION <transaction_id>` to all connections. 4. Send `COMMIT` to all connections. Failure cases: - If a worker problem occurs before sending of all DDL commands is finished, then all changes are rolled back. - If a worker problem occurs after all DDL commands are sent but before `PREPARE TRANSACTION` commands are finished, then all changes are rolled back. However, if a worker node has failed, then the prepared transactions in that worker should be rolled back manually. - If a worker problem occurs while `COMMIT PREPARED` statements are being sent, then the prepared transactions on the failed workers should be committed manually. - If master fails before the first 'PREPARE TRANSACTION' is sent, then nothing is changed on workers. - If master fails while `PREPARE TRANSACTION` commands are being sent, then the prepared transactions on workers should be rolled back manually. - If master fails while `COMMIT PREPARED` or `ROLLBACK PREPARED` commands are being sent, then the remaining prepared transactions on the workers should be handled manually. This change also helps with #480, since failed DDL changes no longer mark failed placements as inactive.	2016-07-19 10:44:11 +03:00
Murat Tuncer	4d992c8143	Make router planner use original query	2016-07-18 18:23:04 +03:00
Eren	5b54e28f93	Add LIMIT/OFFSET Support Fixes #394 This change adds LIMIT/OFFSET support for non router-plannable distributed queries. In cases that we can push the LIMIT down, we add the OFFSET value to that LIMIT in the worker queries. When a query with LIMIT x OFFSET y is issued, the query is propagated to the workers as LIMIT (x+y) OFFSET 0, and on the master table, the original LIMIT and OFFSET values are used. With this change, we can use OFFSET wherever we can use LIMIT.	2016-07-18 12:00:24 +03:00
Andres Freund	4cf0a4e48e	citus_indent fixups	2016-07-13 11:45:51 -07:00
Brian Cloutier	0cad3b22cc	Simplify code and fix include guards in citus_clauses	2016-07-13 11:45:51 -07:00
Brian Cloutier	08384ddc71	cosmetic changes	2016-07-13 11:45:51 -07:00
Brian Cloutier	af9515f669	Only reparse queries if the planner flags them for reparsing	2016-07-13 11:45:51 -07:00
Brian Cloutier	4820366a6f	citus_indent and some renaming	2016-07-13 11:45:51 -07:00
Brian Cloutier	ae91768c96	Evaluate functions on the master - Enables using VOLATILE functions (like nextval()) in INSERT queries - Enables using STABLE functions (like now()) in targetLists and joinTrees UPDATE and INSERT can now contain non-immutable functions. INSERT can contain any kind of expression, while UPDATE can contain any STABLE function, so long as a Var is not passed into the STABLE function, even indirectly. UPDATE TargetEntry's can now also include Vars. There's an exception: CASE/COALESCE statements may not contain mutable functions. Function calls in master_modify_multiple_shards are also evaluated.	2016-07-13 11:45:51 -07:00
Burak Yucesoy	cab03a6274	Fix COPY produces error when using array of user-defined types Fixes #463 OID of user-defined types may be different in master and worker nodes. This causes errors while sending data between nodes with binary nodes. Because binary copy format adds OID of the element if it is in an array. The code adding OID is in PostgreSQL code, therefore we cannot change it. Instead we decided to use text format if we try to send array of user-defined type.	2016-07-13 11:12:24 +03:00
Jason Petersen	41ed433b0e	Remove hash-pruning logic for NULL values It turns out some tests exercised this behavior, but removing it should have no ill effects. Besides, both copy and INSERT disallow NULLs in a table's partition column. Fixes a bug where anti-joins on hash-partitioned distributed tables would incorrectly prune shards early, resulting in incorrect results (test included).	2016-07-06 17:04:21 -06:00
Andres Freund	4549e06884	Add regression tests for RETURNING.	2016-07-01 13:07:12 -07:00
Andres Freund	cccba66f24	Support RETURNING for modification commands. Fixes: #242	2016-07-01 13:07:12 -07:00
Andres Freund	c38c1adce1	Combine router executor paths for select and modify commands. The upcoming RETURNING support would otherwise require too much duplication. This contains most of the pieces required for RETURNING support, except removing the planner checks and adjusting regression test output.	2016-07-01 13:07:12 -07:00
Andres Freund	e1282b6d70	Remember original targetlist in MultiQueryContainerNode(). The old targetlist wasn't used so far, but the upcoming RETURNING support relies on it. This also allows to get rid of some crufty code in multi_executor.c:multi_ExecutorStart(), which used the worker query's targetlist instead of the main statement's (which didn't have one up to now).	2016-07-01 12:50:12 -07:00
Andres Freund	f78c135e63	Fix definition of faux targetlist element inserted to prevent backward scans. The targetlist contains TargetEntrys containing expressions, not expressions directly. That didn't matter so far, but with the upcoming RETURNING support, the targetlist is inspected to build a TupleDesc. ExecCleanTypeFromTL hits an assert when looking at something that's not a TargetEntry. Mark the entry as resjunk, so it's not actually used.	2016-07-01 12:50:12 -07:00
Andres Freund	d5ad8d7db9	Add tests verifying that updates return correct tuple counts. This unfortunately requires adding a new table, triggering renumbering of a number of shard ids.	2016-07-01 12:50:12 -07:00
Metin Doslu	e5ecf92328	Add null check to SqlStateMatchesCategory() Fixes #634	2016-07-01 12:28:46 -07:00
Jason Petersen	e064cacea9	Minor formatting fix Noticed that uncrustify doesn't like the array-of-struct literals, so omitting them from formatting (at least here).	2016-06-28 13:09:57 -06:00
Jason Petersen	8b788eb899	Use literal instead of constant to fix 9.4 build PG_UINT32_MAX doesn't exist before 9.5. Missed this because I removed my assert-enabled builds during packaging work. Fixes #619	2016-06-28 12:36:14 -06:00
Andres Freund	700c076629	Provide our own psqlscan.l->psqlscan.l rule. As postgres's generic .l -> .c Makefile rule uses ifdef - which is evaluated early, not during rule evaluation - we have to override the rule, in addition to the detection of FLEX in the previous commit. Fixes: #439	2016-06-22 11:03:23 -07:00
Jason Petersen	16fc92bf6b	Purge connection if re-raising error The only way we re-raise an error is if the raiseError flag is true, so might as well purge connection in that block rather than independently checking errorLevel.	2016-06-21 09:51:12 -06:00
Murat Tuncer	fb99585ca5	Refactor multi_planner to create router plan directly If router plan creation fails, it falls back to normal planner	2016-06-21 12:50:21 +03:00
Burak Yucesoy	78aaad2738	Fix master_append_table_to_shard to work with schemas Fixes #78 With this change, it is possible to append a table in any schema to shard. The function master_append_table_to_shard now supports schema names.	2016-06-17 04:35:00 +03:00
Andres Freund	2e8e8d377e	Store ShardInterval instead of shardId in RangeTableFragments. For CITUS_RTE_RELATION type fragments, reloading shardIntervals from the database is rather expensive. So store a pointer to the full shard interval, instead of just the shard id. There are no new memory lifetime hazards here, because we already passed a pointer to the shardInterval's ->shardId field around. The plan time for the query in issue #607 goes from 2889 ms to 106 ms with this change.	2016-06-16 17:31:35 -07:00
Andres Freund	211a9721a9	Use cached comparator in ShardIntervalsOverlap(). By far the most expensive part of ShardIntervalsOverlap() is computing the function to use to determine overlap. Luckily we already have that computed and cached. The plan time for the query in issue #607 goes from 8764 ms to 2889 ms with this change.	2016-06-16 17:21:19 -07:00
Andres Freund	38f4722f6f	Add tests for LEFT JOIN ON clauses preventing matches left/right.	2016-06-16 16:53:02 -07:00
Marco Slot	52bc209c37	Do not copy outer join clauses into WHERE	2016-06-16 16:42:32 -07:00
Metin Doslu	7d18488676	Drop function from public and create in pg_catalog Fixes #600	2016-06-16 14:08:40 -07:00
Murat Tuncer	9cb6a022a5	Reduce regression test runtime -Added 2 more schedules for task-tracker and multi-binary instead of running multi_schedule 3 times -set task-tracker-delay for each long running schedule	2016-06-15 16:35:07 +03:00
Burak Yucesoy	4a718d293b	Append shardId before escaping the table name Fixes #550, fixes #545 If table name contains special characters, it needs to be escaped. However in some cases, we escape table name before appending shardId, which causes syntax error in the queries sent to worker nodes. With this change we now append shardId before escaping table names.	2016-06-15 04:15:40 +03:00
Murat Tuncer	31df82ba7a	Remove variant files This checkin removes variant files we needed due to differences in outputs of pg94 and pg95 runs. However, variant file for test multi_upsert stays since this file tests for a feature that does not exist in pg94, and outputs are drastically different.	2016-06-13 12:12:06 +03:00
Eren	57256b3476	Eliminate compile time warnings in multi_logical_optimizer.c This change removes some issues about mixed declarations and code in TablePartitioningSupportsDistinct() and WorkerExtendedOpNode() functions.	2016-06-10 12:27:12 +03:00
Murat Tuncer	bb3eee63e7	Refactor task tracker cleanup to enable workers receive cleanup jobs Long sleep is replaced by multiple small sleeps. Maximum timeout is also increased since we do not have to wait for that long most of the cases.	2016-06-09 17:03:54 +03:00
Murat Tuncer	0db413491c	Fix crash in count distinct with filters in repartition subqueries Now copies all column references in the count distinct aggregate to the worker target list and group by. Master target list is also updated to reflect changes in attribute order. Fixes 569	2016-06-09 11:47:24 +03:00
Jason Petersen	d39508d8b3	Minor formatting/comment fixes	2016-06-08 10:34:07 -06:00
Amos Bird	fd1b088208	Add overflow checks.	2016-06-08 10:30:03 +08:00
Amos Bird	107b926262	Eliminates the possibilities of counter overflows. This patch uses scanint8 instead of pg_atoi to make sure the affected tuples counter never gets overflow.	2016-06-08 10:30:03 +08:00
Burak Yücesoy	323f1151e0	Fix wrong storage type for foreign tables Fixes #496 Previously we did not check whether the table was foreign when creating empty shards, and set the storage type to 't' (standard table) or 'c' (columnar table). Now, if the table is a foreign table (but not a CStore foreign table), we set the storage type to 'f' (foreign table). If it is a CStore foreign table, we set its storage type to 'c', i.e. columnar table has priority over foreign table. Please note that 'c' is only used for CStore tables, not for other possible columnar stores at the moment. A possible improvement could be checking for other columnar stores, though I am not sure if there is a way to check it for all other columnar stores.	2016-06-08 04:12:01 +03:00
Jason Petersen	a19520b9bd	Add back test for INSERT where all placements fail Since we now short-circuit on certain remote errors, we want to ensure we preserve the old behavior of not modifying any placement states if a non-short-circuiting error occurs on all placements.	2016-06-07 13:21:23 -06:00
Jason Petersen	48f4e5d1a5	Make ReportRemoteError's CONTEXT style-compliant There's not a ton of documentation about what CONTEXT lines should look like, but this seems like the most dominant pattern. Similarly, users should expect lowercase, non-period strings.	2016-06-07 12:47:16 -06:00
Jason Petersen	9ba02928ac	Refactor ReportRemoteError to remove boolean arg Broke it into two explicitly-named functions instead: WarnRemoteError and ReraiseRemoteError.	2016-06-07 12:38:32 -06:00
Metin Doslu	7d0c90b398	Fail fast on constraint violations in router executor	2016-06-07 18:11:17 +03:00
Metin Doslu	15eed396b3	Update ereport format	2016-06-07 15:58:32 +03:00
Metin Doslu	28a16beba7	Update only shard length on statistics update for hash-partitioned Update only the shard length on master_update_shard_statistics() call for hash-partitioned tables. Fixes #519.	2016-06-07 15:04:29 +03:00
Eren	5512bb359a	Set Explicit ShardId/JobId In Regression Tests Fixes #271 This change sets ShardIds and JobIds for each test case. Before this change, when a new test that somehow increments Job or Shard IDs is added, then the tests after the new test should be updated. ShardID and JobID sequences are set at the beginning of each file with the following commands: ``` ALTER SEQUENCE pg_catalog.pg_dist_shardid_seq RESTART 290000; ALTER SEQUENCE pg_catalog.pg_dist_jobid_seq RESTART 290000; ``` ShardIds and JobIds are multiples of 10000. Exceptions are: - multi_large_shardid: shardid and jobid sequences are set to much larger values - multi_fdw_large_shardid: same as above - multi_join_pruning: Causes a race condition with multi_hash_pruning since they are run in parallel.	2016-06-07 14:32:44 +03:00
Murat Tuncer	360e884de1	Add enable_ddl_propagation flag to control automatic ddl propagation	2016-06-06 13:42:46 +03:00
Murat Tuncer	20ba0f72a6	Change equality operator check for operator expressions	2016-06-06 12:34:16 +03:00
Burak Yücesoy	2f096cad74	Update regression tests where metadata edited manually Fixes #302 Since our previous syntax did not allow creating hash partitioned tables, some of the previous tests manually changed partition method to hash to be able to test it. With this change we remove unnecessary workaround and create hash distributed tables instead. Also in some tests metadata was created manually. With this change we also fixed this issue.	2016-06-04 13:50:42 +00:00
Burak Yucesoy	5db357eb1a	Remove ONLY clause from worker queries Fixes #475 With this change we prevent addition of ONLY clause to queries prepared for worker nodes. When we add ONLY clause we may miss the inherited tables in worker nodes created by users manually.	2016-06-03 11:42:43 +03:00
Andres Freund	3dac0a4d14	Rely less on remote_task_check_interval. When executing queries with citus.task_executor = 'real-time', query execution could, so far, spend a significant amount of time sleeping. That's because we were a) sleeping after several phases of query execution, even if we're not waiting for network IO b) sleeping for a fixed amount of time when waiting for network IO; often a lot longer than actually required. Just reducing the amount of time slept isn't a real solution, because that just increases CPU usage. Instead have the real-time executor's ManageTaskExecution return whether a task is currently being processed, waiting for reads or writes, or failed. When all tasks are waiting for IO use poll() to wait for IO readiness. That requires slightly redefining how connection timeouts are handled: before we counted the number of times ManageTaskExecution() was called, and compared that with the timeout divided by the task check interval. That, if processing of tasks took a while, could significantly increase the time till a timeout occurred. Because it was based on ManageTaskExecution() being called on a constant interval, this approach isn't feasible anymore. Instead measure the actual time since connection establishment was started. That could in theory, if task processing takes a very long time, lead to few passes over PQconnectPoll(). The problem of sleeping too much also exists for the 'task-tracker' executor, but is generally less problematic there, as processing the individual tasks usually will take longer. That said, for e.g. the regression tests it'd be helpful to use a similar approach.	2016-06-02 12:11:16 -06:00
Metin Doslu	d4c4eaa9ff	Move master_update_shard_statistics() to pg_catalog Fixes #546	2016-06-02 10:52:47 +03:00
Jason Petersen	e774f22ed4	Fix formatting Checking in citus_indent output.	2016-05-27 15:13:28 -06:00
Amos Bird	92788a0d9c	Remove redundant implementations of error funcs. This patch does some basic cleaning jobs. It removes duplicated implementations of ReportRemoteError() and related ones and adjusts regression tests.	2016-05-27 15:12:59 -06:00
Jason Petersen	c0c71cb8c5	Merge branch credativ:reproducible cr: @jasonmp85	2016-05-27 12:45:55 -06:00
Matthew Seaman	332c322b4f	Add inet includes for htonl and htons functions Needed to fix FreeBSD builds.	2016-05-27 12:36:12 -06:00
Murat Tuncer	2b0d6473b9	Add complex distinct count support for repartitioned subqueries Single table repartition subqueries now support count(distinct column) and count(distinct (case when ...)) expressions. Repartition query extracts column used in aggregate expression and adds them to target list and group by list, master query stays the same (count (distinct ...)) but attribute numbers inside the aggregate expression is modified to reflect changes in repartition query.	2016-05-27 15:43:05 +03:00
Metin Doslu	afa74ce5ca	Make master_create_empty_shard() aware of the shard placement policy Now, master_create_empty_shard() will create shards according to the value of citus.shard_placement_policy which also makes default round-robin instead of random.	2016-05-27 15:05:53 +03:00
eren	132d9212d0	ADD master_modify_multiple_shards UDF Fixes #10 This change creates a new UDF: master_modify_multiple_shards Parameters: modify_query: A simple DELETE or UPDATE query as a string. The UDF is similar to the existing master_apply_delete_command UDF. Basically, given the modify query, it prunes the shard list, re-constructs the query for each shard and sends the query to the placements. Depending on the value of citus.multi_shard_commit_protocol, the commit can be done in one-phase or two-phase manner. Limitations: * It cannot be called inside a transaction block * It can only be called with simple operator expressions (like Single Shard Modify) Sample Usage: ``` SELECT master_modify_multiple_shards( 'DELETE FROM customer_delete_protocol WHERE c_custkey > 500 AND c_custkey < 500'); ```	2016-05-26 17:30:35 +03:00
Burak Yucesoy	0e71ffd937	Fix #469 This change renames one of the ReceiveRegularFile functions with more descriptive name.	2016-05-26 12:03:36 +03:00
Christoph Berg	7df82baf46	Sort list of objects in src/backend/distributed/Makefile Make's $(wildcard) does not sort the glob result, but returns filenames in filesystem ordering. This makes the build result vary and hence unreproducible on the binary level. Fix by adding $(sort). Spotted by Debian's reproducible builds project.	2016-05-18 10:42:20 +02:00
Jason Petersen	4ca4f10966	Add multi_copy test outputs to gitignore	2016-05-10 13:36:56 -06:00
Jason Petersen	61b6394e4b	Add gitignore rules for latest install files Got tired of dirty git tree.	2016-05-10 11:57:11 -06:00
Marco Slot	1b4fbc76e2	Add JSON/XML validation to EXPLAIN regression tests and fix issues	2016-05-06 11:30:07 +02:00
Lukas Fittl	2f694f7af3	Distributed EXPLAIN: Generate valid JSON output. This modifies the EXPLAIN output functions to actually generate valid JSON output when (FORMAT JSON) is being used. Fixes #494.	2016-05-05 12:48:01 +02:00
Onder Kalaci	d7fd56df89	Fix check-full failures This commit fixes failures that happen during check-full. The change cleanly separates executor types in certain places to keep the outputs stable.	2016-05-05 12:28:22 +03:00
Andres Freund	5f282dd241	Stamp 5.1 release.	2016-05-04 18:05:41 -07:00
Andres Freund	4d7bcfdd35	Generate extension versions from the previous one.	2016-05-04 18:05:41 -07:00
Onder Kalaci	38da3c826b	Fix compile time warning This change fixes a compile time warning related to definition/declaration order of the code.	2016-05-04 09:42:10 +03:00
Marco Slot	845aebfe19	Remove costs from explain regression tests	2016-05-03 22:11:23 +02:00
Metin Doslu	866271b765	Add COPY support on worker nodes for append partitioned relations Now, we can copy to an append-partitioned distributed relation from any worker node by providing master options such as; COPY relation_name FROM file_path WITH (delimiter '|', master_host 'localhost', master_port 5432); where master_port is optional and default is 5432.	2016-05-03 16:00:00 +03:00
Marco Slot	0c140cf333	Add deprecation warning to copy_to_distributed_table	2016-05-03 14:08:42 +02:00
Brian Cloutier	58535eb337	Query Planning Performance Improvements (#474 ) - Only look at pruned shards when determining AnchorTable - Use cached shardIntervalCompareFunction during copartition check	2016-05-03 10:48:46 +03:00
Marco Slot	24a74fb0ae	Remove spurious intermediate regression test files	2016-05-02 12:30:15 +02:00
Jason Petersen	510783f84f	Force bad connections in tests by closing sockets Based on Andres' suggestion, I removed SetConnectionStatus, moving its functionality directly into set_connection_status_bad, which now simply shuts down the socket underlying a particular connection. This keeps the functionality as-is while removing our questionable use of internal libpq headers.	2016-04-29 15:56:04 -07:00
Marco Slot	fc4f23065a	Add EXPLAIN for simple distributed queries	2016-04-30 00:11:02 +02:00
eren	7e19ebe679	FIX "mixed declarations and code" Warning in multi_physical_planner.c Fixes #477 This change fixes the compile time warning message in BuildMapMergeJob in multi_physical_planner.c about mixed declarations and code. Basically, the problematic declaration is moved up so that no expression is before it.	2016-04-29 11:18:04 +03:00
Brian Cloutier	0036eb3253	Allow references to columns in UPDATE statements (#472 ) Allow references to columns in UPDATE statements Queries like "UPDATE tbl SET column = column + 1" are now allowed, so long as you don't use any IMMUTABLE functions.	2016-04-28 05:45:16 -07:00
eren	ab240a7d4c	Rename copy_transaction_manager This change renames the distributed transaction manager parameter from citus.copy_transaction_manager to citus.multi_shard_commit_protocol. Distributed transaction manager has been used only by the COPY on hash partitioned tables but it can be used by upcoming features, so we needed to rename it so that its name does not contain a reference to COPY. The change also includes renames like transaction_manager_options to commit_protocol_options and TRANSACTION_MANAGER_1PC to COMMIT_PROTOCOL_1PC. With this change, declaration of MultiShardCommitProtocol (was CopyTransactionManager) is moved from multi_copy.c to multi_transaction.c.	2016-04-28 15:12:50 +03:00
Andres Freund	0ce1e3ddaf	Perform permission checks on operations re-implemented by citus. Currently that's just COPY FROM. There's other places where we could check for permissions earlier (to fail less verbosely), but since there's other pending changes in the whole DDL area, which is affected by this, I'm just adding a note to those places.	2016-04-27 10:28:36 -07:00
Andres Freund	758a70a8ff	Create new shards as owned by the distributed table's owner. That's important because ownership of relations implies special privileges. Without this change, a distributed table can be accessible by a table's owner, but a shard created by another user might not.	2016-04-27 10:28:33 -07:00
Andres Freund	3a264db2fe	Add ReplicateGrantStmt(). This is the basis for coordinating GRANT/REVOKE across nodes.	2016-04-27 10:28:25 -07:00
Andres Freund	7c281fbe07	Add pg_get_table_grants() function and support extending GRANTs.	2016-04-27 10:28:25 -07:00
Andres Freund	eae65404d0	Grant SELECT for pg_catalog.pg_dist* to PUBLIC. Given pg_class et al. are readable by everyone there's little point in restricting read only access to citus catalogs.	2016-04-27 10:28:25 -07:00
Andres Freund	a5b3dcddb3	Run some commands as superuser to allow normal users to execute queries. Some small parts of citus currently require superuser privileges; which is obviously not desirable for production scenarios. Run these small parts under superuser privileges (we use the extension owner) to avoid that. This does not yet coordinate grants between master and workers. Thus it allows to create shards, load data, and run queries as a non-superuser, but it is not easily possible to allow differentiated accesses to several users.	2016-04-27 10:28:22 -07:00
Andres Freund	25615ee9d7	Add CitusExtensionOwner(), to execute some privileged operations under. There exist some operations we have to execute with elevated privileges. The most expedient user for that is the user owning the citusdb extension.	2016-04-27 10:26:08 -07:00
Andres Freund	bf87e08331	Replace direct inserts in csql's \stage by serverside functions. \stage so far directly inserted into pg_dist_shard and pg_dist_shard_placement. That makes it hard to do effective permission checks. Thus move the inserts into two C functions. These two new functions aren't the nicest abstraction. But as we are planning to obsolete \stage, it doesn't seem worthwhile to refactor the client-side code of \stage to allow the use of master_create_empty_shard() et al.	2016-04-27 10:23:35 -07:00
Andres Freund	12a246de37	Perform permission checks in functions manipulating distributed tables. Previously several commands, amongst them commands like master_create_distributed_table(), were allowed for everyone. That's not good: Even though citus currently requires superuser permissions, we shouldn't allow non-superusers to perform actions as sensitive as making a table distributed. There's no checks on the worker_* functions, as these usually just punt the action to underlying postgres functionality, which then perform the necessary checks.	2016-04-27 10:22:20 -07:00
Andres Freund	25f919576f	Add very basic infrastructure for schema upgrade scripts. Citus' extension version now has a -$schemaversion appendix. When the schema is changed, a new schema version has to be added; changes to the same schema version several commits inside a single pull request are ok. Schema migration scripts between each schema version have to be added. To ensure upgrade scripts work correctly a new regression test ensures that all steps work. The extension scripts to-be-used for CREATE EXTENSION (i.e. not extension updates) are generated by concatenating citus.sql and the relevant migration scripts.	2016-04-27 10:00:08 -07:00
Andres Freund	5ffce3393a	Always create database for regression tests with a fixed username. Otherwise the owner of relations and such will depend on the username of the user running the regression tests. As "postgres" is the most common username for that purpose, hardcode that in pg_regress_multi.pl.	2016-04-27 10:00:08 -07:00
Andres Freund	42d232c0e8	Use the current session's username when connecting to worker nodes. So far we've always used libpq defaults when connecting to workers; bar special environment variables being set that'll always be the user that started the server. That's not desirable because it prevents using users with fewer privileges. Thus change the various APIs creating connections to workers to always use usernames. That means: 1) MultiClientConnect() needs to, optionally, accept a username 2) GetOrEstablishConnection(), including the underlying cache, need to use the current user as part of the connection cache key. That way connections for separate users are distinct, and we always use one with the correct authorization. 3) The task tracker needs to keep track of the username associated with a task, so it can use it when establishing connections outside the originating session.	2016-04-27 10:00:08 -07:00
Onder Kalaci	108114ab99	Apply final code review feedback - Fix o(n^2) loop to o(n) - Collapse two if statements into a single one - Some coding conventions feedback	2016-04-27 10:36:03 +03:00
Onder Kalaci	c4b783b70b	Fix Merge Conflict This commit fixes merge conflicts.	2016-04-26 11:18:47 +03:00
Onder Kalaci	6c7abc2ba5	Add fast shard pruning path for INSERTs on hash partitioned tables This commit adds a fast shard pruning path for INSERTs on hash-partitioned tables. The rationale behind this change is that if there exists a sorted shard interval array, a single index lookup on the array allows us to find the corresponding shard interval. As mentioned above, we need a sorted (wrt shardminvalue) shard interval array. Thus, this commit updates shardIntervalArray to sortedShardIntervalArray in the metadata cache. Then uses the low-level API that is defined in multi_copy to handle the fast shard pruning. The performance impact of this change is more apparent as more shards exist for a distributed table. Previous implementation was relying on linear search through the shard intervals. However, this commit relies on constant lookup time on shard interval array. Thus, the shard pruning becomes less dependent on the shard count.	2016-04-26 11:16:00 +03:00
Brian Cloutier	7a6d689259	Clear metadata_cache upon DROP EXTENSION When we notice that pg_dist_partition is being invalidated we assume that the citus extension is being dropped and drop state such as extensionLoaded and the cached oids of all the metadata tables. This frees the user from needing to reconnect after running DROP EXTENSION, so we also no longer send a warning message.	2016-04-22 07:25:49 -07:00
Murat Tuncer	a88d3ecd4e	Add dynamic executor selection - non-router plannable queries can be executed by router executor if they satisfy the criteria - router executor is removed from configuration, now task executor can not be set to router - removed some tests that error out for router executor	2016-04-21 09:15:33 +03:00
Murat Tuncer	938546b938	Add router plannable check and router planning logic for single shard select queries	2016-04-21 09:15:33 +03:00
Brian Cloutier	7b1dc0d511	Support count(distinct) on hash partitioned tables Also add test to ensure we get the same results when running count(distinct) on range and hash partitioned tables.	2016-04-20 04:54:07 -07:00
eren	53186b4e67	FIX Warning Message in multi_logical_optimizer.c With #426, some new warning messages started to arise, because of cross assignment of Node and Expr pointers. This change fixes the warnings with type casts.	2016-04-20 11:33:29 +03:00
eren	448527c3af	Fix JOINs on varchar columns with subquery pushdown Fixes #379 Varchar VAR struct is wrapped in RELABELTYPE struct inside PostgreSQL code and IsPartitionColumnRecursive function considers only VAR types so returning false for varchar. This change adds strip_implicit_coercions() call to the columnExpression in IsPartitionColumnRecursive function so that implicit coercions like RELABELTYPE are stripped to VAR.	2016-04-19 21:55:50 -06:00
eren	399b5738b0	Fix Join Problem With VARCHAR Partition Columns This change fixes the problem with joins with VARCHAR columns. Prior to this change, when we tried to do large table joins on varchar columns, we got an error of the form: ERROR: cannot perform local joins that involve expressions DETAIL: local joins can be performed between columns only. This is because we have a check in CheckJoinBetweenColumns() which requires the join clause to have only 'Var' nodes (i.e. columns). Postgres adds a relabel type cast to cast the varchar to text; hence the type of the node is not T_Var and the join fails. The fix involves calling strip_implicit_coercions() to the left and right arguments so that RELABELTYPE is stripped to VAR. Fixes #76.	2016-04-19 21:55:50 -06:00
eren	1ffc30d7f5	Fix Shard Pruning Problem With Subqueries on VARCHAR Partition Columns Fixes #375 Prior to this change, shard pruning couldn't be done if: - Table is hash-distributed - Partition column is VARCHAR - Query to be pruned is a subquery There were two problems: - A bug in left-side/right-side checks for the partition column - We were not considering relabeled types (VARCHAR was relabeled as TEXT)	2016-04-19 21:55:50 -06:00
Metin Doslu	132a77f992	Add COPY support on master node for append partitioned relations	2016-04-19 21:57:59 +03:00
Andres Freund	39233c54ac	Remove wholly unused variable. This avoids a -Wunused warning.	2016-04-19 12:31:13 -06:00
Andres Freund	29b8576a33	Annotate variables only used for asserts with PG_USED_FOR_ASSERTS_ONLY. This avoids '-Wunused-but-set-variable' type warnings when compiling without assertions, e.g. against a system postgres.	2016-04-19 12:31:12 -06:00
Jason Petersen	30fdb59a80	Add clarifying comment in HashableClauseMutator While reading this code last week, it appeared as though there was no place we ensured that the partition clause actually used equality ops. As such, I was worried that we might transform a clause such as id < 5 into a constraint like hash(id) = hash(5) when doing shard pruning. The relevant code seemed to just ensure: 1. The node is an OpExpr 2. With a related hash function 3. It compares the partition column 4. Against a constant A superficial reading implied we didn't actually make sure the original op was equality-related, but it turns out the hash lookup function DOES ensure that for us. So I added a comment.	2016-04-19 12:21:11 -06:00
Murat Tuncer	0b35c47932	Merge pull request #410 from citusdata/350-error-during-duplicate-index-creation Error out earlier when creating an index with a name collision.	2016-04-19 07:26:31 +03:00
Brian Cloutier	1df4b8ba32	Better error on "CREATE INDEX already_exists ..." Previously (if you're creating the index with the same name on different tables) we successfully ran the command on the workers before failing it on the master and leaving no record of the index. Now we check whether the index exists on the master before sending commands to the workers. -- Also make the error better when user attempts to create an index without a name. Previously those statements returned: brian=# create index on c (b); WARNING: could not receive query results from localhost:9700 DETAIL: Client error: cannot extend name for null index name ERROR: could not execute DDL command on worker node shards They now return brian=# create index on c (b); ERROR: creating index without a name on a distributed table is currently unsupported	2016-04-18 13:33:53 -07:00
Jason Petersen	c0b6505720	Merge branch 'master' into 422-incorrect-node-port-type	2016-04-15 12:30:50 -06:00
Brian Cloutier	35bbc1dbfd	Treat nodePort as the 64bit integer it is	2016-04-15 11:29:36 -07:00
Jason Petersen	fb4f84207c	Fix use of INT64CONST macro This macro is intended to receive a bare integer literal (no suffix). It adds a suffix as necessary, depending upon available features. On e.g. 32-bit platforms, the existing code failed to compile because a suffix was added to the existing suffix. This fixes that problem.	2016-04-15 12:13:56 -06:00
Matthew Seaman	b1a3801e58	Regularize include paths for some postgresql headers. Addresses #411	2016-04-15 09:37:22 -07:00
Matthew Seaman	897a1126e7	Include appropriate headers for htons() and htonl().	2016-04-15 09:37:08 -07:00
Matthew Seaman	55e81ce23d	Add sys/stat.h include to files using S_IRUSR and S_IWUSR macros.	2016-04-15 09:34:22 -07:00
eren	3c8f275aa9	Clarify Error Message Related to shared_preload_libraries Fixes #363 This change modifies the error message given when Citus is loaded by any means other than shared_preload_libraries. The explanation now states that the shared_preload_libraries parameter is in postgresql.conf and that citus should be at the beginning.	2016-04-13 12:12:21 +03:00
eren	64aefed46f	Fix SELECT problem with no target list Prior to this change, performing a SELECT query without a target list caused backend to crash. Sample Query: SELECT FROM github_events; (without any * before FROM) PostgreSQL: ``` -- (39599 rows) ``` Citus: ``` server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !> ``` The problem was an unnecessary Assert on column list in SetRangeTblExtraData(citus_nodefuncs.c)	2016-04-13 11:08:14 +03:00
Metin Doslu	1150ce6414	Send COPY rows in binary format	2016-04-12 20:22:31 +02:00
Marco Slot	d25ee8fbd8	Support for COPY FROM, based on pg_shard PR by Postgres Pro	2016-04-12 20:22:31 +02:00
Onder Kalaci	d917d9a615	Allow all types of nodes in the WHERE clauses This change removes the whitelisting check on the WHERE clauses. Note that, before this change, citus was already allowing all types of nodes with the following format (i.e., wrap with a boolean test): * SELECT col FROM table WHERE (ANY EXPRESSION) is TRUE; Thus, this change is mostly useful for allowing the expressions in the WHERE clause directly and avoiding "unsupport clause type" errors.	2016-03-30 16:39:58 +03:00
eren	8a284aab92	Fixes issue #313 Prior to this change, it was not possible to use UDFs in repartitioned subqueries. The reason is that we were setting the search path explicitly and omitting public schema from that path. This change adds the public schema to the explicitly set search path.	2016-03-30 15:39:12 +03:00
eren	ef6d5c7571	Fix spurious NOTICE messages with ANY/ALL Fixes issue #258 Prior to this change, Citus gives a deceptive NOTICE message when a query including ANY or ALL on a non-partition column is issued on a hash partitioned table. Let the github_events table be hash-distributed on repo_id column. Then, issuing this query: SELECT count(*) FROM github_events WHERE event_id = ANY ('{1,2,3}') Gives this message: NOTICE: cannot use shard pruning with ANY (array expression) HINT: Consider rewriting the expression with OR clauses. Note that since event_id is not the partition column, shard pruning would not be applied in any case. However, the NOTICE message would be valid and be given if the ANY clause would have been applied on repo_id column. Reviewer: Murat Tuncer	2016-03-25 14:30:02 +02:00
Murat Tuncer	e74decf8b4	Allow users to create unique indexes Users can now create unique indexes on partition columns for hash and range distributed tables.	2016-03-24 09:31:06 +02:00
Jason Petersen	697d3ea09b	Merge latest 5.0 release fixes	2016-03-23 17:43:34 -06:00
Jason Petersen	1eebb9a6c3	Fix strlcpy off-by-one error WORKER_LENGTH + 1 is too large. Fixing this has no impact on the string that is ultimately copied, as it's impossible for the source string to be any larger to begin with.	2016-03-23 17:34:34 -06:00
Jason Petersen	423e6c8ea0	Update copyright dates Fixed configure variable and updated all end dates to 2016.	2016-03-23 17:14:37 -06:00
Andres Freund	53309461cb	Improve DDL replication related regression tests. The previous form of the test, utilizing DEBUG2, included too much output dependent on the specific system and version. Reformulate it to explicitly connect to workers and show the schema there, when necessary. The only remaining difference in some of the remaining alternate regression test files was due to an older minor version release change. Remove those as well.	2016-03-17 16:05:54 -07:00
Andres Freund	5311960200	Make :master_port and :worker_$n_port available to all regression tests. There already exist tests that locally embed knowledge about port numbers, and there's more tests requiring that. Instead of copying \set's to several tests, make these port number variables available to all tests.	2016-03-17 16:05:54 -07:00
Andres Freund	b60f5da774	Copy toplevel queryId to citus' master statement. multi_ExecutorStart() replaces the original planned statement with the master select statement. As that hasn't gone through the parse analysis hooks, it'll not have an associated queryId. This prevents extensions like pg_stat_statements from showing useful data associated with the query.	2016-03-14 17:27:52 -07:00
Jason Petersen	a53fb90ef9	Fix various build issues I came across several places we weren't as flexible or resilient as we should have been in our build logic. They include: * Not using `DESTDIR` in the install-header destination * Allowing callers to specify `VPATH` or `srcdir` (which breaks) * Using absolute path for SCRIPTS (9.5 prepends srcdir) * Including libpq-int in a confusing way (extracted this function) * Having server includes come first during csql build (client must) In particular, I hit all of these attempting to build with pg_buildext in Debian. It passes in an explicit VPATH, as well as srcdir (breaking all recursive make invocations), and also uses DESTDIR during install. In addition, a PGDG-enabled Debian box will have the latest libpq-dev headers (e.g. 9.5) even when building against an older server version (e.g. 9.4). This leads to problems when including e.g. `c.h`, which is ambiguous. While compiling more client-side code (csql), we need to ensure the newer libpq headers are included _first_, so I fixed that.	2016-03-11 13:38:47 -07:00
Jason Petersen	297bd5768d	Final formatting fixes	2016-02-17 17:20:14 -07:00
Andres Freund	974a121d50	Make 'all' default src/backend/distributed target Otherwise typing 'make' will just build citusdb--5.0.sql, not particularly helpful.	2016-02-17 16:51:55 -07:00
Andres Freund	0a1c8723c4	Fix make install for VPATH builds. copy_to_distributed_table is in the source, not the build directory. As there might be scripts in either at some point, install scripts from both.	2016-02-17 16:48:06 -07:00
Marco Slot	71af0e496e	Rename citusdb to citus in regression test output	2016-02-17 23:33:30 +01:00
Marco Slot	75a141a7c6	Merge remote-tracking branch 'origin/master' into feature/drop_shards_on_drop_table	2016-02-17 22:52:58 +01:00
Marco Slot	9aa1f1e1e7	Rename topLevel variable to isTopLevel	2016-02-17 22:52:35 +01:00
Jason Petersen	ec1e74e7f9	Change tests to use default staging policy The default staging policy is now round-robin, though tests were still configured to use local-first. Testing with the shipping default seems like the best option, correctness-wise, and since local-first has some issues with OSes where connecting from localhost doesn't always resolve to 'localhost', just going with the default is a win-win.	2016-02-17 11:03:17 -07:00
Murat Tuncer	df5851366c	Fixed merge leftovers	2016-02-17 15:44:24 +02:00
Murat Tuncer	3528d7ce85	Merge from master branch into feature/citusdb-to-citus	2016-02-17 14:49:01 +02:00
Metin Doslu	6123022ca7	Add check for count distinct on single table subqueries Fixes #314	2016-02-17 14:24:07 +02:00
Murat Tuncer	2160a2951b	Merge pull request #334 from citusdata/feature/append_table_to_shard Add support for appending to cstore table shards	2016-02-17 09:19:33 +02:00
Jason Petersen	8ad5b09251	Merge pull request #344 from citusdata/fix_shard_lock_acquisition#342 Ensure router executor acquires proper shard lock cr: @onderkalaci	2016-02-16 16:43:39 -07:00
Jason Petersen	0d196d1bf4	Ensure router executor acquires proper shard lock Though Citus' Task struct has a shardId field, it doesn't have the same semantics as the one previously used in pg_shard code. The analogous field in the Citus Task is anchorShardId. I've also added an argument check to the relevant locking function to catch future locking attempts which pass an invalid argument.	2016-02-16 11:20:18 -07:00
Marco Slot	37f580f9c7	Trim comment about invalidating dropped relations	2016-02-16 14:04:12 +01:00
Marco Slot	2af6797c04	Perform relcache invalidation in CitusInvalidateRelcacheByRelid	2016-02-16 12:59:38 +01:00
Murat Tuncer	444f305165	Add support for appending to cstore table shards - Relaxed the check which prevented append operations on cstore tables since its storage type is not SHARD_STORAGE_TABLE. - Used process utility function to perform copy operation in worker_append_table_to_shard() instead of directly calling postgresql DoCopy(). - Removed the additional check in master_create_empty_shard() function. This check was redundant and erroneous since it was called after CheckDistributedTable() call. - Modified WorkerTableSize() function to retrieve cstore table shard size correctly.	2016-02-16 13:58:39 +02:00
Marco Slot	52f11223e5	Drop shards when a distributed table is dropped After this change, shards and associated metadata are automatically dropped when running DROP TABLE on a distributed table, which fixes #230. It also adds schema support for master_apply_delete_command, which fixes #73. Dropping the shards happens in the master_drop_all_shards UDF, which is called from the SQL_DROP trigger. Inside the trigger, the table is no longer visible and calling master_apply_delete_command directly wouldn't work and oid <-> name mappings are not available. The master_drop_all_shards function therefore takes the relation id, schema name, and table name as parameters, which can be obtained from pg_event_trigger_dropped_objects() in the SQL_DROP trigger. If the user calls master_drop_all_shards while the table still exists, the schema name and table name are ignored. Author: Marco Slot Reviewed-By: Andres Freund	2016-02-16 10:54:29 +01:00
Jason Petersen	920e0c406d	Format csql's stage files These are entirely Citus-produced, so need full formatting.	2016-02-15 23:37:37 -07:00
Jason Petersen	19c529f311	Omit most of copy_options from formatting Only a small portion is Citus style.	2016-02-15 23:37:37 -07:00
Jason Petersen	2b5ae847d4	Make copy_options.ch similar to PostgreSQL copy.c We reorganized these functions in our copy; not sure why (makes diffing harder). I'm moving it back.	2016-02-15 23:37:37 -07:00
Jason Petersen	bc23113732	Omit backend/copy.c-inspired parts from formatting I think we need to assess whether this function is still as in-sync with upstream as we believe, but for now I'm omitting it from formatting.	2016-02-15 23:29:33 -07:00
Jason Petersen	74372f70e0	Omit get_extension_schema from formatting It exactly matches the implementation in extension.c.	2016-02-15 23:29:33 -07:00
Jason Petersen	f874a56e24	Omit RangeVarCallbackForDropIndex from formatting I removed two braces to have this function remain more similar to the original PostgreSQL function and added uncrustify commands to disable formatting of its contents.	2016-02-15 23:29:33 -07:00
Jason Petersen	fdb37682b2	First formatting attempt Skipped csql, ruleutils, readfuncs, and functions obviously copied from PostgreSQL. Seeing how this looks, then continuing.	2016-02-15 23:29:32 -07:00
Murat Tuncer	55c44b48dd	Changed product name to citus All citusdb references in - extension, binary names - file headers - all configuration name prefixes - error/warning messages - some functions names - regression tests are changed to be citus.	2016-02-15 16:04:31 +02:00
Jason Petersen	334f800016	Merge pull request #329 from citusdata/feature-fix_naming_conflicts#236 Rename GetConnection to address name conflict cr: @onderkalaci	2016-02-12 16:58:33 -07:00
Jason Petersen	4494e57bbd	Rename GetConnection to address name conflict The postgres_fdw extension has an extern function with an identical signature, which can cause problems when both extensions are loaded. A simple rename can fix this for now (this is the only function with such a conflict).	2016-02-12 13:35:02 -07:00
Önder Kalacı	a55287411b	Merge pull request #332 from citusdata/bugfix/memory_context_leak Remove unnecessary memory context switch on the planner	2016-02-12 11:13:12 -08:00
Onder Kalaci	0a6839e544	Perform distributed planning in the calling memory context Previously we used, for historical reasons, MessageContext. That is problematic if a single message from the client causes a lot of statements to be planned. E.g. for the copy_to_distributed_table script one insert statement is planned for each row inserted via COPY, and only freed when COPY has finished.	2016-02-12 20:50:40 +02:00
Jason Petersen	b1ef2e59a2	Merge pull request #331 from citusdata/feature-permit_dml_to_append_tables#321 Allow DML commands on append-partitioned tables cr: @lithp	2016-02-12 11:24:06 -07:00
Jason Petersen	6f308c5e2d	Allow DML commands on append-partitioned tables This entirely removes any restriction on the type of partitioning during DML planning and execution. Though there aren't actually any technical limitations preventing DML commands against append- (or even range-) partitioned tables, we had initially forbidden this, as any future stage operation could cause shards to overlap, banning all subsequent DML operations to partition values contained within more than one shard. This ended up mostly restricting us, so we're now removing that restriction.	2016-02-11 16:09:35 -07:00
Jason Petersen	d164305929	Handle hash-partitioned aliased data types When two data types have the same binary representation, PostgreSQL may add an implicit coercion between them by wrapping a node in a relabel type. This wrapper signals that the wrapped value is completely binary compatible with the designated "final type" of the relabel node. As an example, the varchar type is often relabeled to text, since functions provided for use with text (comparisons, hashes, etc.) are completely compatible with varchar as well. The hash-partitioned codepath contains functions that verify queries actually contain an equality constraint on the partition column, but those functions expect such constraints to be comparison operations between a Var and Const. The RelabelType wrapper node causes these functions to always return false, which bypasses shard pruning.	2016-02-11 13:50:43 -07:00
Onder Kalaci	136306a1fe	Initial commit of Citus 5.0	2016-02-11 04:05:32 +02:00

815 Commits (temp_tables)