citus

Commit Graph

Author	SHA1	Message	Date
metdos	35f864bcaf	Respect enable_hashagg in the master planner	2018-02-05 15:06:00 +02:00
Marco Slot	6a6e986c2b	Add EXPLAIN regression test with subplans	2017-12-19 16:34:56 +01:00
Marco Slot	e49254f876	Revert "Add EXPLAIN regression test with subplans" This reverts commit `8b6d641227`.	2017-12-17 22:34:31 +01:00
Marco Slot	8b6d641227	Add EXPLAIN regression test with subplans	2017-12-17 22:00:25 +01:00
Marco Slot	f4ceea5a3d	Enable 2PC by default	2017-11-22 11:26:58 +01:00
mehmet furkan şahin	34709c2a16	Regression tests parallelization PART-1	2017-11-20 18:03:37 +03:00
mehmet furkan şahin	314fc09d90	regression test shard_count is changed from 32 to 4	2017-11-20 12:47:49 +03:00
Marco Slot	89eb833375	Use citus.next_shard_id where practical in regression tests	2017-11-15 10:12:05 +01:00
Brian Cloutier	7be1545843	Support implicit casts during INSERT/SELECT It's possible to build INSERT SELECT queries which include implicit casts. Currently we attempt to support these by adding explicit casts to the SELECT query, but this sometimes crashes because we don't update all nodes with the new types (SortClauses, for instance). This commit removes those explicit casts and passes an unmodified SELECT query to the COPY executor (how we implement INSERT SELECT behind the scenes). In lieu of those casts, COPY has been given some extra logic to inspect queries, notice that the types don't line up with the table it's supposed to be inserting into, and "manually" cast every tuple before sending it to workers.	2017-11-03 22:27:15 -07:00
velioglu	0b5db5d826	Support multi shard update/delete queries	2017-10-25 15:52:38 +03:00
Jason Petersen	b4474fc0b0	Modify version-output tests for PostgreSQL 11 Basically we just care whether the running version is before or after PostgreSQL 10, so testing the major version against 9 and printing a boolean is sufficient.	2017-09-25 17:20:24 -07:00
Jason Petersen	8cb69e3a14	Add alias for target in multi-row INSERTs This is necessary for multi-row INSERTs for the same reasons we use it in e.g. UPSERTs: if the range table list has more than one entry, then PostgreSQL's deparse logic requires that vars be prefixed by the name of their corresponding range table entry. This of course doesn't affect single-row INSERTs, but since multi-row INSERTs have a VALUE RTE, they were affected. The piece of ruleutils which builds range table names wasn't modified to handle shard extension; instead UPSERT/INSERT INTO ... SELECT added an alias to the RTE. When present, this alias is favored. Doing the same in the multi-row INSERT case fixes RETURNING for such commands.	2017-08-23 10:24:00 +02:00
velioglu	b0efffae1c	Correct planner and add more tests	2017-08-11 10:16:13 +03:00
Jason Petersen	addde54464	Add some tests	2017-08-10 00:32:46 -07:00
Metin Doslu	b8a9e7c1bf	Add support for UPDATE/DELETE with subqueries	2017-08-08 21:35:08 +03:00
Marco Slot	aa7ca81548	Execute UPDATE/DELETE statements with 0 shards	2017-08-07 15:36:58 +02:00
velioglu	6ea15fbb25	Make create_distributed_table transactional	2017-07-18 12:35:40 +03:00
Jason Petersen	2204da19f0	Support PostgreSQL 10 (#1379) Adds support for PostgreSQL 10 by copying in the requisite ruleutils and updating all API usages to conform with changes in PostgreSQL 10. Most changes are fairly minor but they are numerous. One particular obstacle was the change in \d behavior in PostgreSQL 10's psql; I had to add SQL implementations (views, mostly) to mimic the pre-10 output.	2017-06-26 02:35:46 -06:00
Marco Slot	2f8ac82660	Execute INSERT..SELECT via coordinator if it cannot be pushed down Add a second implementation of INSERT INTO distributed_table SELECT ... that is used if the query cannot be pushed down. The basic idea is to execute the SELECT query separately and pass the results into the distributed table using a CopyDestReceiver, which is also used for COPY and create_distributed_table. When planning the SELECT, we go through planner hooks again, which means the SELECT can also be a distributed query. EXPLAIN is supported, but EXPLAIN ANALYZE is not because preventing double execution was a lot more complicated in this case.	2017-06-22 15:46:30 +02:00
Onder Kalaci	df494c0403	Improve subquery pushdown regression tests - Use native postgres function for composite key btree functions - Move explain tests to multi_explain.sql (get rid of .out _0.out files) - Get rid of input/output files for multi_subquery.sql by moving table creations - Update some comments	2017-05-30 14:05:15 +03:00
Jason Petersen	97f8302c9c	Change version-sensitive tests to handle '10' Previously assumed period in version; this makes tests future-proof.	2017-05-16 11:05:34 -06:00
Metin Doslu	b6659bec22	Send explain queries with savepoints With this commit, we started to send explain queries within a savepoint. After running the explain query, we roll back to the savepoint. This saves us from side effects of EXPLAIN ANALYZE on DML queries.	2017-04-28 12:13:48 -07:00
Burak Yucesoy	d6cb88a73a	Stabilize test outputs	2017-04-21 16:08:52 +03:00
Marco Slot	40829c2ba9	Set citus.enable_unique_job_ids in tests with job ID in output	2017-04-18 11:42:32 +02:00
Metin Doslu	bcff6aa96c	Update regression tests for changing explain output	2017-03-22 15:25:00 -06:00
Metin Doslu	1f838199f8	Use CustomScan API for query execution Custom Scan is a node in the planned statement which helps external providers to abstract data scans not just for foreign data wrappers but also for regular relations, so you can benefit from your own version of caching or hardware optimizations. This sounds like only an abstraction on the data scan layer, but we can use it as an abstraction for our distributed queries. The only thing we need to do is to find distributable parts of the query, plan for them and replace them with a Citus Custom Scan. Then, whenever PostgreSQL hits this custom scan node in its Volcano-style execution, it will call our callback functions, which run the distributed plan and provide tuples to the upper node as if it were scanning a regular relation. This means fewer code changes, fewer bugs and more supported features for us! First, in the distributed query planner phase, we create a Custom Scan which wraps the distributed plan. For real-time and task-tracker executors, we add this custom plan under the master query plan. For the router executor, we directly pass the custom plan because there is no master query. Then, we simply let the PostgreSQL executor run this plan. When it hits the custom scan node, we call the related executor parts for the distributed plan, fill the tuple store in the custom scan and return results to the PostgreSQL executor in Volcano style, a tuple per XXX_ExecScan() call. * Modify planner to utilize Custom Scan node. * Create different scan methods for different executors. * Use native PostgreSQL Explain for master part of queries.	2017-03-14 12:17:51 +02:00
Andres Freund	52358fe891	Initial temp table removal implementation	2017-03-14 12:09:49 +02:00
Andres Freund	6939cb8c56	Hack up PREPARE/EXECUTE for nearly all distributed queries. All router, real-time, task-tracker plannable queries should now have full prepared statement support (and even use router when possible), unless they don't go through the custom plan interface (which basically just affects LANGUAGE SQL (not plpgsql) functions). This is achieved by forcing postgres' planner to always choose a custom plan, by assigning very low costs to plans with bound parameters (i.e. ones where the postgres planner replanned the query upon EXECUTE with all parameter values provided), instead of the generic one. This requires some trickery, because for custom plans to work the costs for a non-custom plan have to be known, which means we can't error out when planning the generic plan. Instead we have to return a "faux" plan, that'd trigger an error message if executed. But due to the custom plan logic that plan will likely (unless called by an SQL function, or because we can't support that query for some reason) not be executed; instead the custom plan will be chosen.	2017-01-23 09:23:50 -08:00
Onder Kalaci	9f0bd4cb36	Reference Table Support - Phase 1 With this commit, we implemented some basic features of reference tables. To start with, a reference table is * a distributed table without a distribution column defined on it * the distributed table is single sharded * and the shard is replicated to all nodes Reference tables follow the same code-path as single sharded tables. Thus, broadcast JOINs are applicable to reference tables. But, since the table is replicated to all nodes, table fetching is not required any more. Reference tables support the uniqueness constraints for any column. Reference tables can be used in INSERT INTO .. SELECT queries with the following rules: * If a reference table is in the SELECT part of the query, it is safe to join with another reference table and/or hash partitioned tables. * If a reference table is in the INSERT part of the query, all other participating tables should be reference tables. Reference tables follow the regular co-location structure. Since all reference tables are single sharded and replicated to all nodes, they are always co-located with each other. Queries involving only reference tables always follow the router planner and executor. Reference tables can have composite typed columns and there is no need to create/define the necessary support functions. All modification queries, master_* UDFs, EXPLAIN, DDLs, TRUNCATE, sequences, transactions, COPY, and schema support work on reference tables as expected. Plus, all the pre-requisites associated with distribution columns are dismissed.	2016-12-20 14:09:35 +02:00
Brian Cloutier	1e6d1ef67e	Fix segfault during EXPLAIN EXECUTE Fix citusdata/citus#886 The way postgres' explain hook is designed means that our hook is never called during EXPLAIN EXECUTE. So, we special-case EXPLAIN EXECUTE by catching it in the utility hook. We then replace the EXECUTE with the original query and pass it back to Citus.	2016-10-26 15:18:42 +03:00
Andres Freund	ac14b2edbc	Support PostgreSQL 9.6 Adds support for PostgreSQL 9.6 by copying in the requisite ruleutils file and refactoring the out/readfuncs code to flexibly support the old-style copy/pasted out/readfuncs (prior to 9.6) or use extensible node APIs (in 9.6 and higher). Most version-specific code within this change is only needed to set new fields in the AggRef nodes we build for aggregations. Version-specific test output files were added in certain cases, though in most they were not necessary. Each such file begins by e.g. printing the major version in order to clarify its purpose. The comment atop citus_nodes.h details how to add support for new nodes for when that becomes necessary.	2016-10-18 16:23:55 -06:00
Metin Doslu	d03a2af778	Add HAVING support This commit completes having support in Citus by adding having support for real-time and task-tracker executors. Multiple tests are added to regression tests to cover new supported queries with having support.	2016-10-13 15:47:53 +03:00
Andres Freund	982ad66753	Introduce placement IDs. So far placements were assigned an Oid, but that was just used to track insertion order. It also did so incompletely, as it was not preserved across changes of the shard state. The behaviour around oid wraparound was also not entirely as intended. The newly introduced, explicitly assigned, IDs are preserved across shard-state changes. The prime goal of this change is not to improve ordering of task assignment policies, but to make it easier to reference shards. The newly introduced UpdateShardPlacementState() makes use of that, and so will the in-progress connection and transaction management changes.	2016-10-07 11:59:20 -07:00
Marco Slot	c4bc0742a7	Make count return 0 if all shards are pruned away Before this change, count on a distributed table returned NULL if all shards were pruned away, because on the master we replace the count(..) call with a sum(..) call to sum the counts from the shards. However, sum returns NULL when there are no rows, whereas count is expected to return 0.	2016-09-29 20:27:26 +02:00
Eren Basak	b513f1c911	Replace \stage With \copy on Regression Tests Fixes #547 This change removes all references to \stage in the regression tests and puts \COPY instead. Doing so changed shard counts, min/max values on some test tables (lineitem, orders, etc.).	2016-08-22 11:31:26 -06:00
Murat Tuncer	31df82ba7a	Remove variant files This checkin removes variant files we needed due to differences in outputs of pg94 and pg95 runs. However, variant file for test multi_upsert stays since this file tests for a feature that does not exist in pg94, and outputs are drastically different.	2016-06-13 12:12:06 +03:00
Eren	5512bb359a	Set Explicit ShardId/JobId In Regression Tests Fixes #271 This change sets ShardIds and JobIds for each test case. Before this change, when a new test that somehow increments Job or Shard IDs is added, then the tests after the new test should be updated. ShardID and JobID sequences are set at the beginning of each file with the following commands: ``` ALTER SEQUENCE pg_catalog.pg_dist_shardid_seq RESTART 290000; ALTER SEQUENCE pg_catalog.pg_dist_jobid_seq RESTART 290000; ``` ShardIds and JobIds are multiples of 10000. Exceptions are: - multi_large_shardid: shardid and jobid sequences are set to much larger values - multi_fdw_large_shardid: same as above - multi_join_pruning: Causes a race condition with multi_hash_pruning since they are run in parallel.	2016-06-07 14:32:44 +03:00
Marco Slot	1b4fbc76e2	Add JSON/XML validation to EXPLAIN regression tests and fix issues	2016-05-06 11:30:07 +02:00
Lukas Fittl	2f694f7af3	Distributed EXPLAIN: Generate valid JSON output. This modifies the EXPLAIN output functions to actually generate valid JSON output when (FORMAT JSON) is being used. Fixes #494.	2016-05-05 12:48:01 +02:00
Onder Kalaci	d7fd56df89	Fix check-full failures This commit fixes failures that happen during check-full. The change makes a clean separation of executor types in certain places to keep the outputs stable.	2016-05-05 12:28:22 +03:00
Marco Slot	845aebfe19	Remove costs from explain regression tests	2016-05-03 22:11:23 +02:00
Marco Slot	fc4f23065a	Add EXPLAIN for simple distributed queries	2016-04-30 00:11:02 +02:00

42 Commits (aba2f47cdf875ab06c3286c465d34de75c514d99)