Commit Graph

618 Commits (0049f7334af31027cc17a205f5812f477fd6bd4b)

Author SHA1 Message Date
Andres Freund 4bf6b8cdfa Perform range based pruning if equality pruning has survivor.
We previously dismissed this as unimportant, but it turns out to be
very useful for the upcoming subquery pushdown, where a user might
specify an equality constraint in a subquery, and the subquery
pushdown machinery adds >= and <= restrictions on the shard boundary.
Previously the latter restriction was ignored.
2017-04-28 17:35:18 -07:00
Andres Freund 042020eabf Use stricter qual for pruning if both >/< and >=/<= are present.
Previously, if both =< and < (>= and < respectively) were specified,
we always used the latter restriction.  Instead use the stricter one.
2017-04-28 17:35:18 -07:00
Marco Slot 053f10d91c Fix list length lookup in WorkerGetLiveNodeCount 2017-04-29 02:13:20 +02:00
Burak Yucesoy edd69310fd Fix check-vanilla tests
It semms that GEQO optimizations, when it is set to on, create their own memory context
and free it after when it is no longer necessary. In join multi_join_restriction_hook
we allocate our variables in the CurrentMemoryContext, which is GEQO's memory context
if it is active. To prevent deallocation of our variables when GEQO's memory context is
freed, we started to allocate memory fo these variables in separate MemoryContext.
2017-04-29 01:55:18 +02:00
Marco Slot 97d36b7dfe Check whether relation ID exists in citus_relation_size 2017-04-29 01:39:39 +02:00
Andres Freund f6ef7f2c03 Faster shard pruning.
So far citus used postgres' predicate proofing logic for shard
pruning, except for INSERT and COPY which were already optimized for
speed.  That turns out to be too slow:
* Shard pruning for SELECTs is currently O(#shards), because
  PruneShardList calls predicate_refuted_by() for every
  shard. Obviously using an O(N) type algorithm for general pruning
  isn't good.
* predicate_refuted_by() is quite expensive on its own right. That's
  primarily because it's optimized for doing a single refutation
  proof, rather than performing the same proof over and over.
* predicate_refuted_by() does not keep persistent state (see 2.) for
  function calls, which means that a lot of syscache lookups will be
  performed. That's particularly bad if the partitioning key is a
  composite key, because without a persistent FunctionCallInfo
  record_cmp() has to repeatedly look-up the type definition of the
  composite key. That's quite expensive.

Thus replace this with custom-code that works in two phases:
1) Search restrictions for constraints that can be pruned upon
2) Use those restrictions to search for matching shards in the most
   efficient manner available:
   a) Binary search / Hash Lookup in case of hash partitioned tables
   b) Binary search for equal clauses in case of range or append
      tables without overlapping shards.
   c) Binary search for inequality clauses, searching for both lower
      and upper boundaries, again in case of range or append
      tables without overlapping shards.
   d) exhaustive search testing each ShardInterval

My measurements suggest that we are considerably, often orders of
magnitude, faster than the previous solution, even if we have to fall
back to exhaustive pruning.
2017-04-28 14:40:41 -07:00
Andres Freund 2013090a77 Add DistTableCacheEntry->hasOverlappingShardInterval.
This determines whether it's possible to perform binary search on
sortedShardIntervalArray or not.  If e.g. two shards have overlapping
ranges, that'd be prohibitive.

That'll be useful in later commit introducing faster shard pruning.
2017-04-28 14:40:38 -07:00
Andres Freund 15d427f931 Add DistTableCacheEntry->shardValueCompareFunction.
That's useful when comparing values a hash-partitioned table is
filtered by.  The existing shardIntervalCompareFunction is about
comparing hashed values, not unhashed ones.

The added btree opclass function is so we can get a comparator
back. This should be changed much more widely, but is not necessary so
far.
2017-04-28 14:40:38 -07:00
Andres Freund f3172e9719 Build DistTableCacheEntry->shardIntervalCompareFunction even for 0 shards.
Previously we, unnecessarily, used a the first shard's type
information to to look up the comparison function.  But that
information is already available, so use it.  That's helpful because
we sometimes want to access the comparator function even if there's no
shards.
2017-04-28 14:40:38 -07:00
Andres Freund 99642306ed Fix: Make FindShardIntervalIndex robust against 0 shards. 2017-04-28 14:40:38 -07:00
Metin Doslu d411892fe6 Send explain queries with savepoints
With this commit, we started to send explain queries within a savepoint. After
running explain query, we rollback to savepoint. This saves us from side effects
of EXPLAIN ANALYZE on DML queries.
2017-04-28 12:13:48 -07:00
Jason Petersen 1c353e68aa Remove FastShardPruning method
With the other simplifications, it doesn't make sense to keep around.
2017-04-27 13:32:36 -06:00
Jason Petersen 06497d74f5 Refactor FindShardInterval to use cacheEntry
All callers fetch a cache entry and extract/compute arguments for the
eventual FindShardInterval call, so it makes more sense to refactor
into that function itself; this solves the use-after-free bug, too.
2017-04-27 13:32:36 -06:00
Andres Freund 4fe14bdeda Some cleanup in multi_subquery test.
Remove trailing whitespace and use of EXPLAIN instead of
EXPLAIN (COSTS OFF).
2017-04-26 11:33:56 -07:00
Andres Freund f064c33d5c Add back pruning coverage lost in last commit.
Because we can't rely on the debuggin message anymore, add a bunch of
explain statements that roughly fulfill the same purpose.
2017-04-26 11:33:56 -07:00
Andres Freund 5b389eb6d7 Boring regression test output adjustments.
Soon shard pruning will be optimized not to generally work linearly
anymore.  Thus we can't print the pruned shard intervals as currently
done anymore.

The current printing of shard ids also prevents us from running tests
in parallel, as otherwise shard ids aren't linearly numbered.
2017-04-26 11:33:56 -07:00
Andres Freund 9e4ec991d8 Skip exhaustive test in CoPartitionedTables() if declared colocated.
That's considerably cheaper.
2017-04-26 11:19:17 -07:00
Marco Slot 1b4ebd490d Only process error if not NULL in StoreErrorMessage 2017-04-21 17:01:01 +02:00
Marco Slot 326f8d9d61 Use right sizeof in UpdateRelationColocationGroup 2017-04-21 16:37:09 +02:00
Burak Yucesoy a35d0cd8af Configure valgrind command line arguments 2017-04-21 16:30:12 +03:00
Burak Yucesoy 9312ef8bcf Stabilize test outputs 2017-04-21 16:08:52 +03:00
Eren Basak 71d99b72ce Add support for proper valgrind tests
This change allows valgrind tests (`make check-multi-vg`) to be
run seamlessly without test output errors and timeout problems.
2017-04-21 16:08:52 +03:00
Marco Slot 7d1f7b8923 Support expressions in the partition column in INSERTs 2017-04-21 14:05:52 +02:00
velioglu a26edd2249 Implement ALTER TABLE ADD CONSTRAINT command 2017-04-20 15:02:33 +03:00
velioglu 5b3e47de7a Log message of across shard queries according to the log level 2017-04-20 12:24:46 +03:00
velioglu be3cdb14ea Change native hash function with worker_hash 2017-04-19 22:16:55 +03:00
Jason Petersen f999bcd7ca Enable distributed ALTER TABLE ... RENAME COLUMN
Pretty straightforward. Had some concerns about locking, but due to the
fact that all distributed operations use either some level of deparsing
or need to enumerate column names, they all block during any concurrent
column renames (due to the AccessExclusive lock).

In addition, I had some misgivings about permitting renames of the dis-
tribution column, but nothing bad comes from just allowing them.

Finally, I tried to trigger any sort of error using prepared statements
and could not trigger any errors not also exhibited by plain PostgreSQL
tables.
2017-04-18 22:47:48 -06:00
Marco Slot 0f63edc5b4 Add basic read-only transaction tests 2017-04-18 11:42:33 +02:00
Marco Slot 53899946e7 Remove redundant pg_dist_jobid_seq restarts in tests 2017-04-18 11:42:32 +02:00
Marco Slot d7a5f6997c Set citus.enable_unique_job_ids in tests with job ID in output 2017-04-18 11:42:32 +02:00
Marco Slot c7603215dd Stop using a sequence to generate unique job IDs 2017-04-18 11:31:51 +02:00
Burak Yucesoy 58a809b0e8 Set default value of isactive to true
With this change, we set to default value of isactive column to true so that
upgrading users all nodes will be marked as active to not break their environment.
2017-04-18 09:40:44 +03:00
Burak Yucesoy 5aefe20725 Fix node copy error
Instead of directly returning heap tuple obtained from heap scan
we return copied version of it.
2017-04-17 19:38:18 +03:00
Metin Doslu f45c2c43b5 Fix table in name in prepared statement regression tests 2017-04-17 16:17:30 +02:00
Marco Slot a4c98727be Support UPDATE/DELETE with parameterised partition column qual 2017-04-17 16:17:30 +02:00
Marco Slot 2fbe546ddd Support query parameters in combination with function evaluation 2017-04-17 15:40:55 +02:00
Marco Slot ccc796cf66 Create indexes after worker_append_table_to_shard during shard repair 2017-04-17 15:17:21 +02:00
Burak Yucesoy d58cb416a4 Decouple reference table replication
With this change we add an option to add a node without replicating all reference
tables to that node. If a node is added with this option, we mark the node as
inactive and no queries will sent to that node.

We also added two new UDFs;
 - master_activate_node(host, port):
    - marks node as active and replicates all reference tables to that node
 - master_add_inactive_node(host, port):
    - only adds node to pg_dist_node
2017-04-17 13:33:31 +03:00
Burak Yucesoy cd5dc2693d Error out on parameterized SQL functions
Before this commit, we were erroring out for queries containing parameterized SQL functions
like 'SELECT parameterized_sql_query(value)' as we should, however we were returning wrong
results for queries like 'SELECT * FROM parameterized_sql_query(value)'. With this commit
we started to error out on such queries too.
2017-04-13 16:36:24 +03:00
Onder Kalaci 6c9296aca0 Remove uninstantiated qual logic, use attribute equivalences
In this PR, we aim to deduce whether each of the RTE_RELATION
is joined with at least on another RTE_RELATION on their partition keys. If each
RTE_RELATION follows the above rule, we can conclude that all RTE_RELATIONs are
joined on their partition keys.

In order to do that, we invented a new equivalence class namely:
AttributeEquivalenceClass. In very simple words, a AttributeEquivalenceClass is
identified by an unique id and consists of a list of AttributeEquivalenceMembers.

Each AttributeEquivalenceMember is designed to identify attributes uniquely within the
whole query. The necessity of this arise since varno attributes are defined within
a single level of a query. Instead, here we want to identify each RTE_RELATION uniquely
and try to find equality among each RTE_RELATION's partition key.

Whenever we find an equality clause A = B, where both A and B originates from
relation attributes (i.e., not random expressions), we create an
AttributeEquivalenceClass to record this knowledge. If we later find another
equivalence B = C, we create another AttributeEquivalenceClass. Finally, we can
apply transitity rules and generate a new AttributeEquivalenceClass which includes
A, B and C.

Note that equality among the members are identified by the varattno and rteIdentity.

Each equality among RTE_RELATION is saved using an AttributeEquivalenceClass where
each member attribute is identified by a AttributeEquivalenceMember. In the final
step, we try generate a common attribute equivalence class that holds as much as
AttributeEquivalenceMembers whose attributes are a partition keys.
2017-04-13 11:51:26 +03:00
velioglu 584c0c34a3 Change checks with built-in type 2017-04-11 14:41:37 +03:00
velioglu 5ba77d8abb Check binary output function of type. 2017-04-10 16:28:09 +03:00
Jason Petersen fc2c23f15a Use RESET for GUC test, not reconnect
More limited in what it does, better test.
2017-04-04 16:40:17 -06:00
Jason Petersen b0a8d9da34 Add comments, use strncmp, clean up GUC desc.
Good to go!
2017-04-04 16:16:49 -06:00
Jason Petersen 2ce82abb04 Clean up remaining error messages
Added details and hints, based off of similar PostgreSQL scenarios.
2017-04-04 16:11:59 -06:00
Jason Petersen 41612177be Clean up ErrorIfUnstableCreateOrAlterExtensionStmt
Swaps an Assert in for an ereport, and adds details and hints to the
error message to help users with a possibly confusing scenario.
2017-04-04 15:58:57 -06:00
Jason Petersen 0e6e42c59a Refactor utility-skip/extn-check code
This was getting pretty long and complex in the context of the main
utility hook. Moved out the checks for what should skip Citus process-
ing and what should have version checks performed.
2017-04-04 15:07:22 -06:00
Burak Yucesoy 66a801dd4e Add enable_version_checks GUC and address feedback 2017-04-04 19:11:13 +03:00
Jason Petersen 0707f262b6 Self-implemented review feedback
The use of a bare src/ rather than $srcdir caused configure to fail
during VPATH builds. With our additional dependency upon AWK, we need
to call AC_PROG_AWK, otherwise environments may not have $AWK set.
Finally, citus_version.h should be in .gitignore.
2017-04-03 22:55:12 -06:00
Burak Yucesoy 63b232e4ba Error out if binary citus version does not match installed extension
With this change, we start to error out if loaded citus binaries does not match
the available major version or installed citus extension version. In this case
we force user to restart the server or run ALTER EXTENSION depending on the
situation
2017-04-03 17:36:13 -06:00
Jason Petersen 963090fe05 Address review feedback
Should just about do it.
2017-04-03 11:44:57 -06:00
Jason Petersen afe6908e26 Improve CONCURRENTLY-related error messages
Thought this looked slightly nicer than the default behavior.

Changed preventTransaction to concurrent to be clearer that this code
path presently affects CONCURRENTLY code only.
2017-04-03 11:19:15 -06:00
Jason Petersen ddc8d7111b Update documentation
Ensure all functions have comments, etc.
2017-04-03 11:19:15 -06:00
Jason Petersen d128ad723a Address MX CONCURRENTLY problems
Adds a non-transactional multi-command method to propagate DDLs to all
MX/metadata-synced nodes.
2017-04-03 11:19:15 -06:00
Jason Petersen afa9bd4840 Add code to set index validity on failure
Coordinator code marks index as invalid as a base, set it as valid in a
transactional layer atop that base, then proceeds with worker commands.
If a worker command has problems, the rollback results in an index with
isvalid = false. If everything succeeds, the user sees a valid index.
2017-04-03 11:19:15 -06:00
Jason Petersen 236c6900ff Remove CONCURRENTLY checks, fix tests
Still pending failure testing, which broke with my recent changes.
2017-04-03 11:19:15 -06:00
Jason Petersen c7f31ee90a Change DropStmt to generate worker DDL on master
Because we can't execute DROP INDEX CONCURRENTLY during transactions,
worker_apply_shard_ddl_command is insufficient.
2017-04-03 11:19:15 -06:00
Jason Petersen 7173d85071 Change IndexStmt to generate worker DDL on master
Because we can't execute CREATE INDEX CONCURRENTLY during transactions,
worker_apply_shard_ddl_command is insufficient.
2017-04-03 11:19:14 -06:00
Marco Slot a339c7bbd6 Batch task_tracker_status calls to reduce task-tracker query times 2017-03-31 11:54:11 +02:00
Metin Doslu 5670389bec Add disable/enable trigger all support 2017-03-29 22:00:14 +03:00
Onder Kalaci 6b66a023aa Fix pushing down wrong queries for INSERT ... SELECT queries
Before this commit, in certain cases router planner allowed pushing
down JOINs that are not on the partition keys.

With @anarazel's suggestion, we change the logic to use uninstantiated
parameter. Previously, the planner was traversing on the restriction
information and once it finds the parameter, it was replacing it with
the shard range. With this commit, instead of traversing the restrict
infos, the planner explicitly checks for the equivalence of the relation
partition key with the uninstantiated parameter. If finds an equivalence,
it adds the restrictions. In this way, we have more control over the
queries that are pushed down.
2017-03-24 11:37:35 +02:00
Jason Petersen ef1a42c4dc Address code review comments 2017-03-22 17:29:17 -06:00
Jason Petersen 48db2f1fc8 Rework ReplicateGrantStmt to use new flow
This was the impetus for the previous commit that changed from using a
DDLJob * to a List * of them.
2017-03-22 17:29:16 -06:00
Jason Petersen 41b2317457 Change DDLJob usage to be wrapped in lists
To prepare for GRANT fixes.
2017-03-22 17:29:16 -06:00
Jason Petersen dd3f2f6fbb Fix MX tests
Missed some of these. One had a bad DDL statement to begin with (mixed
up column type and column name) and other was just master/worker order.
2017-03-22 17:21:49 -06:00
Jason Petersen a5d32a0c22 Move worker execution to after master, fix tests
Some tests relied on worker errors though local commands were invalid.
Fixed those by ensuring preconditions were met to have command work
correctly. Otherwise most test changes are related to slight changes
in local/remote error ordering.
2017-03-22 17:21:49 -06:00
Jason Petersen c04ecae919 Remove execution from stmt-specific util functions
Now have a single Execute call in the main body.
2017-03-22 17:21:49 -06:00
Jason Petersen 55910d4851 Rename Process*Stmt functions to Plan*Stmt
To reflect their new purpose planning a DDLJob rather than fully
processing a distributed DDL statement.
2017-03-22 17:21:49 -06:00
Jason Petersen 041aff8eed Refactor ExecuteDistDDLCommand to expect struct
Will let us separate out the determination of what to execute from its
actual execution.
2017-03-22 17:21:49 -06:00
Jason Petersen 5838581854 Minor permissions test fix
When running under Enterprise, some of the GRANT commands and whatnot
are propagated. Guarding that section with a call to disable DDL prop.
fixes everything.
2017-03-22 17:07:05 -06:00
Metin Doslu 2260adf163 Add basic permission checking tests 2017-03-22 15:25:00 -06:00
Metin Doslu 1268a2553d Update regression tests for changing explain output 2017-03-22 15:25:00 -06:00
Metin Doslu 16a014e50d Fix access permission checks for distributed relations
With this commit, we add the range table list of the original query to our
custom plan. Therefore, PostgreSQL can check relations in the original query
for access permissions and error out if the proper access is not granted.
2017-03-22 15:25:00 -06:00
Murat Tuncer 86e938ab96 Rephrase router modify errors
generic "distributed modifications must target exactly one shard"
message is replaced by more context aware error messages.
2017-03-16 15:09:10 +03:00
velioglu d7e244792f Size UDFs implemented
citus_table_size, citus_relation_size and citus_total_relation_size UDFs are implemented.
2017-03-16 13:50:30 +03:00
Metin Doslu 76ab7040cb Use CustomScan API for query execution
Custom Scan is a node in the planned statement which helps external providers
to abstract data scan not just for foreign data wrappers but also for regular
relations so you can benefit your version of caching or hardware optimizations.
This sounds like only an abstraction on the data scan layer, but we can use it
as an abstraction for our distributed queries. The only thing we need to do is
to find distributable parts of the query, plan for them and replace them with
a Citus Custom Scan. Then, whenever PostgreSQL hits this custom scan node in
its Vulcano style execution, it will call our callback functions which run
distributed plan and provides tuples to the upper node as it scans a regular
relation. This means fewer code changes, fewer bugs and more supported features
for us!

First, in the distributed query planner phase, we create a Custom Scan which
wraps the distributed plan. For real-time and task-tracker executors, we add
this custom plan under the master query plan. For router executor, we directly
pass the custom plan because there is not any master query. Then, we simply let
the PostgreSQL executor run this plan. When it hits the custom scan node, we
call the related executor parts for distributed plan, fill the tuple store in
the custom scan and return results to PostgreSQL executor in Vulcano style,
a tuple per XXX_ExecScan() call.

* Modify planner to utilize Custom Scan node.
* Create different scan methods for different executors.
* Use native PostgreSQL Explain for master part of queries.
2017-03-14 12:17:51 +02:00
Andres Freund 2a6188d8a1 Initial temp table removal implementation 2017-03-14 12:09:49 +02:00
Jason Petersen 73e0e2a79a Revert "Remove unused SendCommandToWorker"
This reverts commit c8c308c109.
2017-03-13 15:48:51 -06:00
Murat Tuncer 7abc7080f2 Enable router planner for queries on range partitioned tables
Router planner now supports queries using range partitioned
tables. Queries on append partitioned tables are still not
supported.
2017-03-09 16:39:15 +03:00
Brian Cloutier ebc7779457 Remove unused SendCommandToWorker 2017-03-08 16:30:23 +03:00
Brian Cloutier aed36acfeb Remove unused master_stage_shard_{placement_,}row 2017-03-07 11:59:26 +03:00
Brian Cloutier 9f876986e2 Remove unused master_get_round_robin_candidate_nodes 2017-03-07 11:51:24 +03:00
Brian Cloutier c3e9bb880b Remove master_get_local_first_candidate_nodes 2017-03-07 11:50:59 +03:00
Andres Freund 99d660c45f Fix SendRemoteCommandParams() handling of a NULL MultiConnection->pgConn. (#1271)
Previously we'd segfault in PQisnonblocking() which, contrary to other
libpq calls, doesn't handle a NULL PQconn (because there'd be no
appropriate return value for that).

cr: @jasonmp85
2017-03-03 12:02:15 -07:00
Murat Tuncer e718b10ce9 Remove default clause from shard DDL when sequences are used 2017-03-01 17:32:48 +03:00
Marco Slot 0a31d33cf9 Fix spelling in master_initialize_node_metadata comment 2017-03-01 12:27:50 +01:00
Jason Petersen d3653051ab Rename misleading allowEmpty parameter
Last bit of PR feedback.
2017-02-28 22:48:00 -07:00
Marco Slot ba764be3bb Address review feedback in create_distributed_table data loading 2017-02-28 17:39:45 +01:00
Marco Slot 29b1fb97c5 Address review feedback in COPY refactoring 2017-02-28 17:39:45 +01:00
Marco Slot 92c8d6cf54 Use CitusCopyDestReceiver for regular COPY 2017-02-28 17:24:45 +01:00
Marco Slot 10e1131516 Load data into distributed table on creation 2017-02-28 17:24:45 +01:00
Marco Slot ae9d2be84e Add CitusCopyDestReceiver infrastructure 2017-02-28 17:24:45 +01:00
Burak Velioglu 291f6f3bd2 Merge branch 'master' into disallow_master_appy_delete_on_hash 2017-02-24 10:40:23 +02:00
velioglu a19770c6c8 Fix error message of start_metadata_sync_to_node
Single quotation mark is added around nodename to make the
error code consistent with master_add_node usage.
2017-02-22 18:03:58 +03:00
Metin Doslu f73c0c2ab5 Get reproducible costs between different PostgreSQL versions 2017-02-22 15:40:02 +02:00
Burak Velioglu fa112e9c99 Disallow master_apply_delete_command on hash distributed table
Delete operation is blocked for any table distributed by hash using master_apply_delete_command. Suggested master_modify_multiple_shards command as a hint.
2017-02-22 11:54:46 +03:00
Andres Freund a4f2bf1266 Use DEBUG2 instead of DEBUG4 in INSERT SELECT tests & debug message.
During later work the transaction debug output will change (as it will
in postgres 10), which makes it hard to see actual changes in the
INSERT ... SELECT ... test.  Reduce to DEBUG2 after changing a debug
message to that log level.
2017-02-20 12:56:16 +02:00
Eren Basak 99ebe06af5 Enforce statement based replication on old APIs and non-hash tables
This change ignores `citus.replication_model` setting and uses the
statement based replication in

- Tables distributed via the old `master_create_distributed_table` function
- Append and range partitioned tables, even if created via
`create_distributed_table` function

This seems like the easiest solution to #1191, without changing the existing
behavior and harming existing users with custom scripts.

This change also prevents RF>1 on streaming replicated tables on `master_create_worker_shards`

Prior to this change, `master_create_worker_shards` command was not checking
the replication model of the target table, thus allowing RF>1 with streaming
replicated tables. With this change, `master_create_worker_shards` errors
out on the case.
2017-02-16 10:37:53 -08:00
Jason Petersen 10afe08cd9 Fix tests broken by new PostgreSQL patch releases (#1220)
PostgreSQL 9.5.6 and 9.6.2 were released today and broke several tests
by adding TABLESPACE pg_default output to some DDL commands. Fixed all
occurrences.

cr: @anarazel
2017-02-09 16:53:02 -07:00
Onder Kalaci 49ed391b3e Bugfix for creating foreign key
This commit fixes crash for adding foreign keys without
specifying the referenced column crashes the backend.
2017-02-07 09:34:24 +02:00