Commit Graph

76 Commits (4d992c81432bbdbb789a1bbc9f2bdf3df4fbf596)

Author SHA1 Message Date
Murat Tuncer 4d992c8143 Make router planner use original query 2016-07-18 18:23:04 +03:00
Eren 5b54e28f93 Add LIMIT/OFFSET Support
Fixes #394

This change adds LIMIT/OFFSET support for non router-plannable
distributed queries.

In cases that we can push the LIMIT down, we add the OFFSET value to
that LIMIT in the worker queries. When a query with LIMIT x OFFSET y is issued,
the query is propagated to the workers as LIMIT (x+y) OFFSET 0, and on the
master table, the original LIMIT and OFFSET values are used. With this change,
we can use OFFSET wherever we can use LIMIT.
2016-07-18 12:00:24 +03:00
Brian Cloutier ae91768c96 Evaluate functions on the master
- Enables using VOLATILE functions (like nextval()) in INSERT queries
- Enables using STABLE functions (like now()) targetLists and joinTrees

UPDATE and INSERT can now contain non-immutable functions. INSERT can contain any kind of
expression, while UPDATE can contain any STABLE function, so long as a Var is not passed
into the STABLE function, even indirectly. UPDATE TagetEntry's can now also include Vars.

There's an exception, CASE/COALESCE statements may not contain mutable functions.

Functions calls in master_modify_multiple_shards are also evaluated.
2016-07-13 11:45:51 -07:00
Burak Yucesoy cab03a6274 Fix COPY produces error when using array of user-defined types
Fixes #463

OID of user-defined types may be different in master and worker nodes. This causes errors
while sending data between nodes with binary nodes. Because binary copy format adds OID
of the element if it is in an array. The code adding OID is in PostgreSQL code, therefore
we cannot change it. Instead we decided to use text format if we try to send array of
user-defined type.
2016-07-13 11:12:24 +03:00
Jason Petersen 41ed433b0e
Remove hash-pruning logic for NULL values
It turns out some tests exercised this behavior, but removing it should
have no ill effects. Besides, both copy and INSERT disallow NULLs in a
table's partition column.

Fixes a bug where anti-joins on hash-partitioned distributed tables
would incorrectly prune shards early, result in incorrect results (test
included).
2016-07-06 17:04:21 -06:00
Andres Freund 4549e06884 Add regression tests for RETURNING. 2016-07-01 13:07:12 -07:00
Andres Freund cccba66f24 Support RETURNING for modification commands.
Fixes: #242
2016-07-01 13:07:12 -07:00
Andres Freund d5ad8d7db9 Add tests verifying that updates return correct tuple counts.
This unfortunately requires adding a new table, triggering renumbering
of a number of shard ids.
2016-07-01 12:50:12 -07:00
Burak Yucesoy 78aaad2738 Fix master_append_table_to_shard to work with schemas
Fixes #78

With this change, it is possible to append a table in any schema to shard. The function
master_append_table_to_shard now supports schema names.
2016-06-17 04:35:00 +03:00
Andres Freund 38f4722f6f Add tests for LEFT JOIN ON clauses preventing matches left/right. 2016-06-16 16:53:02 -07:00
Marco Slot 52bc209c37 Do not copy outer join clauses into WHERE 2016-06-16 16:42:32 -07:00
Murat Tuncer 9cb6a022a5 Reduce regression test runtime
-Added 2 more schedules for task-tracker and multi-binary
 instead of running multi_schedule 3 times
-set task-tracker-delay for each long running schedule
2016-06-15 16:35:07 +03:00
Burak Yucesoy 4a718d293b Append shardId before escaping the table name
Fixes #550, fixes #545

If table name contains special characters, it needs to be escaped. However in some cases,
we escape table name before appending shardId, which causes syntax error in the queries
sent to worker nodes. With this change we now append shardId before escaping table names.
2016-06-15 04:15:40 +03:00
Murat Tuncer 31df82ba7a Remove variant files
This checkin removes variant files we needed
due to differences in outputs of pg94 and pg95 runs.
However, variant file for test multi_upsert stays
since this file tests for a feature that does not
exist in pg94, and outputs are drastically different.
2016-06-13 12:12:06 +03:00
Murat Tuncer bb3eee63e7 Refactor task tracker cleanup to enable workers receive cleanup jobs
Long sleep is replaced by multiple small sleeps. Maximum timeout
is also increased since we do not have to wait for that long most
of the cases.
2016-06-09 17:03:54 +03:00
Murat Tuncer 0db413491c Fix crash in count distinct with filters in repartition subqueries
now copies all column references in count distinct aggreagete
to worker target list and group by. Master target list is
also updated to reflect changes in attribute order.

Fixes 569
2016-06-09 11:47:24 +03:00
Burak Yücesoy 323f1151e0 Fix wrong storage type for foreign tables
Fixes #496

Previously we do not check whether table is foreign or not while creating empty
shards, and set storage type to 't'(Standard table) or 'c'(Columnar table). Now
if the table is foreign table(but not CStore foreign table) we set storage
type to 'f'(Foreign table). If it is CStore foreign table, we set its storage
type to 'c', i.e. columnar table have priority over foreign table.

Please note that 'c' is only used for CStore tables not for other possible
columnar stores at the moment. Possible improvement could be checking for other
columnar stores, though I am not sure if there is a way to check it for all
other columnar stores.
2016-06-08 04:12:01 +03:00
Jason Petersen a19520b9bd
Add back test for INSERT where all placements fail
Since we now short-circuit on certain remote errors, we want to ensure
we preserve the old behavior of not modifying any placement states if
a non-short-circuiting error occurs on all placements.
2016-06-07 13:21:23 -06:00
Jason Petersen 48f4e5d1a5
Make ReportRemoteError's CONTEXT style-compliant
There's not a ton of documentation about what CONTEXT lines should look
like, but this seems like the most dominant pattern. Similarly, users
should expect lowercase, non-period strings.
2016-06-07 12:47:16 -06:00
Metin Doslu 7d0c90b398 Fail fast on constraint violations in router executor 2016-06-07 18:11:17 +03:00
Metin Doslu 15eed396b3 Update ereport format 2016-06-07 15:58:32 +03:00
Metin Doslu 28a16beba7 Update only shard length on statistics update for hash-partitioned
Update only the shard length on master_update_shard_statistics() call for
hash-partitioned tables.

Fixes #519.
2016-06-07 15:04:29 +03:00
Eren 5512bb359a Set Explicit ShardId/JobId In Regression Tests
Fixes #271

This change sets ShardIds and JobIds for each test case. Before this change,
when a new test that somehow increments Job or Shard IDs is added, then
the tests after the new test should be updated.

ShardID and JobID sequences are set at the beginning of each file with the
following commands:

```
ALTER SEQUENCE pg_catalog.pg_dist_shardid_seq RESTART 290000;
ALTER SEQUENCE pg_catalog.pg_dist_jobid_seq RESTART 290000;
```

ShardIds and JobIds are multiples of 10000. Exceptions are:
- multi_large_shardid: shardid and jobid sequences are set to much larger values
- multi_fdw_large_shardid: same as above
- multi_join_pruning: Causes a race condition with multi_hash_pruning since
they are run in parallel.
2016-06-07 14:32:44 +03:00
Murat Tuncer 360e884de1 Add enable_ddl_propagation flag to control automatic ddl propagation 2016-06-06 13:42:46 +03:00
Burak Yücesoy 2f096cad74 Update regression tests where metadata edited manually
Fixes #302

Since our previous syntax did not allow creating hash partitioned tables,
some of the previous tests manually changed partition method to hash to
be able to test it. With this change we remove unnecessary workaround and
create hash distributed tables instead. Also in some tests metadata was
created manually. With this change we also fixed this issue.
2016-06-04 13:50:42 +00:00
Metin Doslu d4c4eaa9ff Move master_update_shard_statistics() to pg_catalog
Fixes #546
2016-06-02 10:52:47 +03:00
Amos Bird 92788a0d9c
Remove redundant implementations of error funcs.
This patch does some basic cleaning jobs. It removes duplicated
implementations of ReportRemoteError() and related ones and adjusts
regression tests.
2016-05-27 15:12:59 -06:00
Murat Tuncer 2b0d6473b9 Add complex distinct count support for repartitioned subqueries
Single table repartition subqueries now support count(distinct column)
and count(distinct (case when ...)) expressions. Repartition query
extracts column used in aggregate expression and adds them to target
list and group by list, master query stays the same (count (distinct ...))
but attribute numbers inside the aggregate expression is modified to
reflect changes in repartition query.
2016-05-27 15:43:05 +03:00
Metin Doslu afa74ce5ca Make master_create_empty_shard() aware of the shard placement policy
Now, master_create_empty_shard() will create shards according to the
value of citus.shard_placement_policy which also makes default round-robin
instead of random.
2016-05-27 15:05:53 +03:00
eren 132d9212d0 ADD master_modify_multiple_shards UDF
Fixes #10

This change creates a new UDF: master_modify_multiple_shards
Parameters:
  modify_query: A simple DELETE or UPDATE query as a string.

The UDF is similar to the existing master_apply_delete_command UDF.
Basically, given the modify query, it prunes the shard list, re-constructs
the query for each shard and sends the query to the placements.

Depending on the value of citus.multi_shard_commit_protocol, the commit
can be done in one-phase or two-phase manner.

Limitations:
* It cannot be called inside a transaction block
* It only be called with simple operator expressions (like Single Shard Modify)

Sample Usage:
```
SELECT master_modify_multiple_shards(
  'DELETE FROM customer_delete_protocol WHERE c_custkey > 500 AND c_custkey < 500');
```
2016-05-26 17:30:35 +03:00
Jason Petersen 4ca4f10966
Add multi_copy test outputs to gitignore 2016-05-10 13:36:56 -06:00
Marco Slot 1b4fbc76e2 Add JSON/XML validation to EXPLAIN regression tests and fix issues 2016-05-06 11:30:07 +02:00
Lukas Fittl 2f694f7af3 Distributed EXPLAIN: Generate valid JSON output.
This modifies the EXPLAIN output functions to actually generate
valid JSON output when (FORMAT JSON) is being used.

Fixes #494.
2016-05-05 12:48:01 +02:00
Onder Kalaci d7fd56df89 Fix check-full failures
This commit fixes failures happen during check-full. The change does make
clean seperation of executor types in certain places to keep the outputs
stable.
2016-05-05 12:28:22 +03:00
Andres Freund 5f282dd241 Stamp 5.1 release. 2016-05-04 18:05:41 -07:00
Marco Slot 845aebfe19 Remove costs from explain regression tests 2016-05-03 22:11:23 +02:00
Metin Doslu 866271b765 Add COPY support on worker nodes for append partitioned relations
Now, we can copy to an append-partitioned distributed relation from
any worker node by providing master options such as;

COPY relation_name FROM file_path WITH (delimiter '|', master_host 'localhost', master_port 5432);

where master_port is optional and default is 5432.
2016-05-03 16:00:00 +03:00
Marco Slot 24a74fb0ae Remove spurious intermediate regression test files 2016-05-02 12:30:15 +02:00
Marco Slot fc4f23065a Add EXPLAIN for simple distributed queries 2016-04-30 00:11:02 +02:00
Brian Cloutier 0036eb3253 Allow references to columns in UPDATE statements (#472)
Allow references to columns in UPDATE statements

Queries like "UPDATE tbl SET column = column + 1" are now allowed, so long as you don't use any IMMUTABLE functions.
2016-04-28 05:45:16 -07:00
Andres Freund a5b3dcddb3 Run some commands as superuser to allow normal users to execute queries.
Some small parts of citus currently require superuser privileges; which
is obviously not desirable for production scenarios. Run these small
parts under superuser privileges (we use the extension owner) to avoid
that.

This does not yet coordinate grants between master and workers. Thus it
allows to create shards, load data, and run queries as a non-superuser,
but it is not easily possible to allow differentiated accesses to
several users.
2016-04-27 10:28:22 -07:00
Andres Freund 25f919576f Add very basic infrastructure for schema upgrade scripts.
Citus' extension version now has a -$schemaversion appendix.  When the
schema is changed, a new schema version has to be added; changes to the
same schema version several commits inside a single pull request are ok.

Schema migration scripts between each schema version have to be
added. To ensure upgrade scripts work correctly a new regression test
ensures that all steps work.

The extension scripts to-be-used for CREATE EXTENSION (i.e. not
extension updates) are generated by concatenating citus.sql and the
relevant migration scripts.
2016-04-27 10:00:08 -07:00
Andres Freund 5ffce3393a Always create database for regression tests with a fixed username.
Otherwise the owner of relations and such will depend on the username of
the user running the regression tests. As "postgres" is the most common
username for that purpose, hardcode that in pg_regress_multi.pl.
2016-04-27 10:00:08 -07:00
Onder Kalaci c4b783b70b Fix Merge Conflict
This commit fixes merge conflicts.
2016-04-26 11:18:47 +03:00
Onder Kalaci 6c7abc2ba5 Add fast shard pruning path for INSERTs on hash partitioned tables
This commit adds a fast shard pruning path for INSERTs on
hash-partitioned tables. The rationale behind this change is
that if there exists a sorted shard interval array, a single
index lookup on the array allows us to find the corresponding
shard interval. As mentioned above, we need a sorted
(wrt shardminvalue) shard interval array. Thus, this commit
updates shardIntervalArray to sortedShardIntervalArray in the
metadata cache. Then uses the low-level API that is defined in
multi_copy to handle the fast shard pruning.

The performance impact of this change is more apparent as more
shards exist for a distributed table. Previous implementation
was relying on linear search through the shard intervals. However,
this commit relies on constant lookup time on shard interval
array. Thus, the shard pruning becomes less dependent on the
shard count.
2016-04-26 11:16:00 +03:00
Brian Cloutier 7a6d689259 Clear metadata_cache upon DROP EXTENSION
When we notice that pg_dist_partition is being invalidated we assume
that the citus extension is being dropped and drop state such as
extensionLoaded and the cached oids of all the metadata tables.

This frees the user from needing to reconnect after running DROP
EXTENSION, so we also no longer send a warning message.
2016-04-22 07:25:49 -07:00
Murat Tuncer a88d3ecd4e Add dynamic executor selection
- non-router plannable queries can be executed
  by router executor if they satisfy the criteria
- router executor is removed from configuration,
  now task executor can not be set to router
- removed some tests that error out for router executor
2016-04-21 09:15:33 +03:00
Murat Tuncer 938546b938 Add router plannable check and router planning logic
for single shard select queries
2016-04-21 09:15:33 +03:00
Brian Cloutier 7b1dc0d511 Support count(distinct) on hash partitioned tables
Also add test to ensure we get the same results when running
count(distinct) on range and hash partitioned tables.
2016-04-20 04:54:07 -07:00
eren 448527c3af
Fix JOINs on varchar columns with subquery pushdown
Fixes #379

Varchar VAR struct is wrapped in RELABELTYPE struct inside PostgreSQL code and
IsPartitionColumnRecursive function considers only VAR types so returning false
for varchar.

This change adds strip_implicit_coercions() call to the columnExpression in
IsPartitionColumnRecursive function so that we get rid of implicit coercions like
RELABELTYPE are stripped to VAR.
2016-04-19 21:55:50 -06:00