Commit Graph

580 Commits (111f739610550944048ea8d6c16cca8621c36f31)

Author SHA1 Message Date
Burak Yucesoy cd5dc2693d Error out on parameterized SQL functions
Before this commit, we were erroring out, as we should, for queries containing parameterized
SQL functions like 'SELECT parameterized_sql_query(value)'. However, we were returning wrong
results for queries like 'SELECT * FROM parameterized_sql_query(value)'. With this commit,
we error out on such queries too.
2017-04-13 16:36:24 +03:00
Onder Kalaci 6c9296aca0 Remove uninstantiated qual logic, use attribute equivalences
In this PR, we aim to deduce whether each of the RTE_RELATIONs
is joined with at least one other RTE_RELATION on their partition keys. If each
RTE_RELATION follows the above rule, we can conclude that all RTE_RELATIONs are
joined on their partition keys.

In order to do that, we invented a new equivalence class namely:
AttributeEquivalenceClass. In very simple words, an AttributeEquivalenceClass is
identified by a unique id and consists of a list of AttributeEquivalenceMembers.

Each AttributeEquivalenceMember is designed to identify attributes uniquely within the
whole query. This is necessary because varno attributes are defined only within
a single level of a query. Instead, here we want to identify each RTE_RELATION uniquely
and try to find equality among each RTE_RELATION's partition key.

Whenever we find an equality clause A = B, where both A and B originate from
relation attributes (i.e., not random expressions), we create an
AttributeEquivalenceClass to record this knowledge. If we later find another
equivalence B = C, we create another AttributeEquivalenceClass. Finally, we can
apply transitivity rules and generate a new AttributeEquivalenceClass which includes
A, B and C.

Note that equality among the members is identified by the varattno and rteIdentity.

Each equality among RTE_RELATIONs is saved using an AttributeEquivalenceClass where
each member attribute is identified by an AttributeEquivalenceMember. In the final
step, we try to generate a common attribute equivalence class that holds as many
AttributeEquivalenceMembers as possible whose attributes are partition keys.
2017-04-13 11:51:26 +03:00
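The transitivity step described in the commit message above can be illustrated with a small union-find structure. This is only a sketch in Python, not the Citus C implementation: the class name `AttributeEquivalences`, the function `joined_on_partition_keys`, and the representation of a member as a `(rte_identity, varattno)` tuple are assumptions made for illustration.

```python
class AttributeEquivalences:
    """Hypothetical union-find over (rte_identity, varattno) members."""

    def __init__(self):
        self._parent = {}

    def find(self, member):
        # Register unseen members as their own class, then walk to the root.
        self._parent.setdefault(member, member)
        root = member
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited member directly at the root.
        while self._parent[member] != root:
            next_member = self._parent[member]
            self._parent[member] = root
            member = next_member
        return root

    def add_equality(self, left, right):
        # Record "left = right"; merging classes applies transitivity.
        left_root = self.find(left)
        right_root = self.find(right)
        if left_root != right_root:
            self._parent[right_root] = left_root


def joined_on_partition_keys(equivalences, partition_keys):
    """partition_keys maps rte_identity -> varattno of the partition column.

    Returns True when all partition keys fall into one equivalence class.
    """
    members = list(partition_keys.items())
    if not members:
        return True
    first_root = equivalences.find(members[0])
    return all(equivalences.find(m) == first_root for m in members)


equivalences = AttributeEquivalences()
equivalences.add_equality((1, 1), (2, 1))  # A = B
equivalences.add_equality((2, 1), (3, 2))  # B = C, so A, B, C share a class
print(joined_on_partition_keys(equivalences, {1: 1, 2: 1, 3: 2}))
```

In this sketch, a relation whose partition column does not appear in the common class makes the check return False, which corresponds to a query that is not joined on partition keys throughout.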
velioglu 584c0c34a3 Change checks with built-in type 2017-04-11 14:41:37 +03:00
velioglu 5ba77d8abb Check binary output function of type. 2017-04-10 16:28:09 +03:00
Jason Petersen fc2c23f15a Use RESET for GUC test, not reconnect
More limited in what it does, better test.
2017-04-04 16:40:17 -06:00
Jason Petersen b0a8d9da34 Add comments, use strncmp, clean up GUC desc.
Good to go!
2017-04-04 16:16:49 -06:00
Jason Petersen 2ce82abb04 Clean up remaining error messages
Added details and hints, based off of similar PostgreSQL scenarios.
2017-04-04 16:11:59 -06:00
Jason Petersen 41612177be Clean up ErrorIfUnstableCreateOrAlterExtensionStmt
Swaps an Assert in for an ereport, and adds details and hints to the
error message to help users with a possibly confusing scenario.
2017-04-04 15:58:57 -06:00
Jason Petersen 0e6e42c59a Refactor utility-skip/extn-check code
This was getting pretty long and complex in the context of the main
utility hook. Moved out the checks for what should skip Citus process-
ing and what should have version checks performed.
2017-04-04 15:07:22 -06:00
Burak Yucesoy 66a801dd4e Add enable_version_checks GUC and address feedback 2017-04-04 19:11:13 +03:00
Jason Petersen 0707f262b6 Self-implemented review feedback
The use of a bare src/ rather than $srcdir caused configure to fail
during VPATH builds. With our additional dependency upon AWK, we need
to call AC_PROG_AWK, otherwise environments may not have $AWK set.
Finally, citus_version.h should be in .gitignore.
2017-04-03 22:55:12 -06:00
Burak Yucesoy 63b232e4ba Error out if binary citus version does not match installed extension
With this change, we start to error out if the loaded citus binaries do not match
the available major version or the installed citus extension version. In this case
we force the user to restart the server or run ALTER EXTENSION, depending on the
situation.
2017-04-03 17:36:13 -06:00
Jason Petersen 963090fe05 Address review feedback
Should just about do it.
2017-04-03 11:44:57 -06:00
Jason Petersen afe6908e26 Improve CONCURRENTLY-related error messages
Thought this looked slightly nicer than the default behavior.

Changed preventTransaction to concurrent to be clearer that this code
path presently affects CONCURRENTLY code only.
2017-04-03 11:19:15 -06:00
Jason Petersen ddc8d7111b Update documentation
Ensure all functions have comments, etc.
2017-04-03 11:19:15 -06:00
Jason Petersen d128ad723a Address MX CONCURRENTLY problems
Adds a non-transactional multi-command method to propagate DDLs to all
MX/metadata-synced nodes.
2017-04-03 11:19:15 -06:00
Jason Petersen afa9bd4840 Add code to set index validity on failure
Coordinator code marks the index as invalid as a base, sets it as valid in a
transactional layer atop that base, then proceeds with worker commands.
If a worker command has problems, the rollback results in an index with
isvalid = false. If everything succeeds, the user sees a valid index.
2017-04-03 11:19:15 -06:00
Jason Petersen 236c6900ff Remove CONCURRENTLY checks, fix tests
Still pending failure testing, which broke with my recent changes.
2017-04-03 11:19:15 -06:00
Jason Petersen c7f31ee90a Change DropStmt to generate worker DDL on master
Because we can't execute DROP INDEX CONCURRENTLY during transactions,
worker_apply_shard_ddl_command is insufficient.
2017-04-03 11:19:15 -06:00
Jason Petersen 7173d85071 Change IndexStmt to generate worker DDL on master
Because we can't execute CREATE INDEX CONCURRENTLY during transactions,
worker_apply_shard_ddl_command is insufficient.
2017-04-03 11:19:14 -06:00
Marco Slot a339c7bbd6 Batch task_tracker_status calls to reduce task-tracker query times 2017-03-31 11:54:11 +02:00
Metin Doslu 5670389bec Add disable/enable trigger all support 2017-03-29 22:00:14 +03:00
Onder Kalaci 6b66a023aa Fix pushing down wrong queries for INSERT ... SELECT queries
Before this commit, in certain cases router planner allowed pushing
down JOINs that are not on the partition keys.

With @anarazel's suggestion, we change the logic to use an uninstantiated
parameter. Previously, the planner was traversing the restriction
information and, once it found the parameter, replaced it with
the shard range. With this commit, instead of traversing the restrict
infos, the planner explicitly checks for the equivalence of the relation
partition key with the uninstantiated parameter. If it finds an equivalence,
it adds the restrictions. In this way, we have more control over the
queries that are pushed down.
2017-03-24 11:37:35 +02:00
Jason Petersen ef1a42c4dc Address code review comments 2017-03-22 17:29:17 -06:00
Jason Petersen 48db2f1fc8 Rework ReplicateGrantStmt to use new flow
This was the impetus for the previous commit that changed from using a
DDLJob * to a List * of them.
2017-03-22 17:29:16 -06:00
Jason Petersen 41b2317457 Change DDLJob usage to be wrapped in lists
To prepare for GRANT fixes.
2017-03-22 17:29:16 -06:00
Jason Petersen dd3f2f6fbb Fix MX tests
Missed some of these. One had a bad DDL statement to begin with (mixed
up column type and column name) and the other was just master/worker order.
2017-03-22 17:21:49 -06:00
Jason Petersen a5d32a0c22 Move worker execution to after master, fix tests
Some tests relied on worker errors though local commands were invalid.
Fixed those by ensuring preconditions were met to have command work
correctly. Otherwise most test changes are related to slight changes
in local/remote error ordering.
2017-03-22 17:21:49 -06:00
Jason Petersen c04ecae919 Remove execution from stmt-specific util functions
Now have a single Execute call in the main body.
2017-03-22 17:21:49 -06:00
Jason Petersen 55910d4851 Rename Process*Stmt functions to Plan*Stmt
To reflect their new purpose planning a DDLJob rather than fully
processing a distributed DDL statement.
2017-03-22 17:21:49 -06:00
Jason Petersen 041aff8eed Refactor ExecuteDistDDLCommand to expect struct
Will let us separate out the determination of what to execute from its
actual execution.
2017-03-22 17:21:49 -06:00
Jason Petersen 5838581854 Minor permissions test fix
When running under Enterprise, some of the GRANT commands and whatnot
are propagated. Guarding that section with a call to disable DDL prop.
fixes everything.
2017-03-22 17:07:05 -06:00
Metin Doslu 2260adf163 Add basic permission checking tests 2017-03-22 15:25:00 -06:00
Metin Doslu 1268a2553d Update regression tests for changing explain output 2017-03-22 15:25:00 -06:00
Metin Doslu 16a014e50d Fix access permission checks for distributed relations
With this commit, we add the range table list of the original query to our
custom plan. Therefore, PostgreSQL can check relations in the original query
for access permissions and error out if the proper access is not granted.
2017-03-22 15:25:00 -06:00
Murat Tuncer 86e938ab96 Rephrase router modify errors
The generic "distributed modifications must target exactly one shard"
message is replaced by more context-aware error messages.
2017-03-16 15:09:10 +03:00
velioglu d7e244792f Size UDFs implemented
citus_table_size, citus_relation_size and citus_total_relation_size UDFs are implemented.
2017-03-16 13:50:30 +03:00
Metin Doslu 76ab7040cb Use CustomScan API for query execution
Custom Scan is a node in the planned statement which helps external providers
to abstract data scan not just for foreign data wrappers but also for regular
relations, so you can benefit from your own version of caching or hardware optimizations.
This sounds like only an abstraction on the data scan layer, but we can use it
as an abstraction for our distributed queries. The only thing we need to do is
to find distributable parts of the query, plan for them and replace them with
a Citus Custom Scan. Then, whenever PostgreSQL hits this custom scan node in
its Volcano-style execution, it will call our callback functions, which run the
distributed plan and provide tuples to the upper node as if it were scanning a regular
relation. This means fewer code changes, fewer bugs and more supported features
for us!

First, in the distributed query planner phase, we create a Custom Scan which
wraps the distributed plan. For real-time and task-tracker executors, we add
this custom plan under the master query plan. For router executor, we directly
pass the custom plan because there is no master query. Then, we simply let
the PostgreSQL executor run this plan. When it hits the custom scan node, we
call the related executor parts for the distributed plan, fill the tuple store in
the custom scan and return results to the PostgreSQL executor in Volcano style,
a tuple per XXX_ExecScan() call.

* Modify planner to utilize Custom Scan node.
* Create different scan methods for different executors.
* Use native PostgreSQL Explain for master part of queries.
2017-03-14 12:17:51 +02:00
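The pull-based flow described in the commit message above (fill the tuple store when the scan node is first hit, then hand back one tuple per call) can be sketched as follows. This is an illustrative Python model, not the PostgreSQL CustomScan API: `CustomScanSketch`, `exec_scan`, `drain` and `run_distributed_plan` are hypothetical names, and `None` is assumed to signal the end of the scan.

```python
class CustomScanSketch:
    """Hypothetical stand-in for a scan node that wraps a distributed plan."""

    def __init__(self, run_distributed_plan):
        # Callable that executes the distributed plan and yields result rows.
        self._run_distributed_plan = run_distributed_plan
        self._tuple_store = None
        self._position = 0

    def exec_scan(self):
        # On the first call, run the distributed plan and fill the tuple store.
        if self._tuple_store is None:
            self._tuple_store = list(self._run_distributed_plan())
        # Afterwards, return exactly one tuple per call until exhausted.
        if self._position >= len(self._tuple_store):
            return None
        row = self._tuple_store[self._position]
        self._position += 1
        return row


def drain(scan):
    """Plays the role of the upper node, pulling tuples one at a time."""
    rows = []
    while True:
        row = scan.exec_scan()
        if row is None:
            break
        rows.append(row)
    return rows


scan = CustomScanSketch(lambda: iter([(1, "a"), (2, "b")]))
print(drain(scan))
```

The upper node never needs to know that the rows came from remote workers; it only sees the same one-tuple-per-call interface it would get from a regular relation scan.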
Andres Freund 2a6188d8a1 Initial temp table removal implementation 2017-03-14 12:09:49 +02:00
Jason Petersen 73e0e2a79a Revert "Remove unused SendCommandToWorker"
This reverts commit c8c308c109.
2017-03-13 15:48:51 -06:00
Murat Tuncer 7abc7080f2 Enable router planner for queries on range partitioned tables
Router planner now supports queries using range partitioned
tables. Queries on append partitioned tables are still not
supported.
2017-03-09 16:39:15 +03:00
Brian Cloutier ebc7779457 Remove unused SendCommandToWorker 2017-03-08 16:30:23 +03:00
Brian Cloutier aed36acfeb Remove unused master_stage_shard_{placement_,}row 2017-03-07 11:59:26 +03:00
Brian Cloutier 9f876986e2 Remove unused master_get_round_robin_candidate_nodes 2017-03-07 11:51:24 +03:00
Brian Cloutier c3e9bb880b Remove master_get_local_first_candidate_nodes 2017-03-07 11:50:59 +03:00
Andres Freund 99d660c45f Fix SendRemoteCommandParams() handling of a NULL MultiConnection->pgConn. (#1271)
Previously we'd segfault in PQisnonblocking() which, contrary to other
libpq calls, doesn't handle a NULL PQconn (because there'd be no
appropriate return value for that).

cr: @jasonmp85
2017-03-03 12:02:15 -07:00
Murat Tuncer e718b10ce9 Remove default clause from shard DDL when sequences are used 2017-03-01 17:32:48 +03:00
Marco Slot 0a31d33cf9 Fix spelling in master_initialize_node_metadata comment 2017-03-01 12:27:50 +01:00
Jason Petersen d3653051ab Rename misleading allowEmpty parameter
Last bit of PR feedback.
2017-02-28 22:48:00 -07:00
Marco Slot ba764be3bb Address review feedback in create_distributed_table data loading 2017-02-28 17:39:45 +01:00
Marco Slot 29b1fb97c5 Address review feedback in COPY refactoring 2017-02-28 17:39:45 +01:00
Marco Slot 92c8d6cf54 Use CitusCopyDestReceiver for regular COPY 2017-02-28 17:24:45 +01:00
Marco Slot 10e1131516 Load data into distributed table on creation 2017-02-28 17:24:45 +01:00
Marco Slot ae9d2be84e Add CitusCopyDestReceiver infrastructure 2017-02-28 17:24:45 +01:00
Burak Velioglu 291f6f3bd2 Merge branch 'master' into disallow_master_appy_delete_on_hash 2017-02-24 10:40:23 +02:00
velioglu a19770c6c8 Fix error message of start_metadata_sync_to_node
Single quotation marks are added around nodename to make the
error message consistent with master_add_node usage.
2017-02-22 18:03:58 +03:00
Metin Doslu f73c0c2ab5 Get reproducible costs between different PostgreSQL versions 2017-02-22 15:40:02 +02:00
Burak Velioglu fa112e9c99 Disallow master_apply_delete_command on hash distributed table
Delete operation is blocked for any table distributed by hash using master_apply_delete_command. Suggested master_modify_multiple_shards command as a hint.
2017-02-22 11:54:46 +03:00
Andres Freund a4f2bf1266 Use DEBUG2 instead of DEBUG4 in INSERT SELECT tests & debug message.
During later work the transaction debug output will change (as it will
in postgres 10), which makes it hard to see actual changes in the
INSERT ... SELECT ... test.  Reduce to DEBUG2 after changing a debug
message to that log level.
2017-02-20 12:56:16 +02:00
Eren Basak 99ebe06af5 Enforce statement based replication on old APIs and non-hash tables
This change ignores `citus.replication_model` setting and uses the
statement based replication in

- Tables distributed via the old `master_create_distributed_table` function
- Append and range partitioned tables, even if created via
`create_distributed_table` function

This seems like the easiest solution to #1191, without changing the existing
behavior and harming existing users with custom scripts.

This change also prevents RF>1 on streaming replicated tables on `master_create_worker_shards`.

Prior to this change, the `master_create_worker_shards` command was not checking
the replication model of the target table, thus allowing RF>1 with streaming
replicated tables. With this change, `master_create_worker_shards` errors
out in that case.
2017-02-16 10:37:53 -08:00
Jason Petersen 10afe08cd9 Fix tests broken by new PostgreSQL patch releases (#1220)
PostgreSQL 9.5.6 and 9.6.2 were released today and broke several tests
by adding TABLESPACE pg_default output to some DDL commands. Fixed all
occurrences.

cr: @anarazel
2017-02-09 16:53:02 -07:00
Onder Kalaci 49ed391b3e Bugfix for creating foreign key
This commit fixes a backend crash when adding a foreign key without
specifying the referenced column.
2017-02-07 09:34:24 +02:00
Brian Cloutier 0c5373c28e Utility hook does nothing if the extension is not loaded 2017-02-02 17:48:31 +02:00
Brian Cloutier bd6c39215b Set a memory context when throwing deferred errors 2017-02-02 15:14:21 +02:00
Brian Cloutier 911137ce66 Start remote transactions in master_append_table_to_shard
Add a call to RemoteTransactionBeginIfNecessary so that BEGIN is
actually sent to the remote connections. This means that ROLLBACK and
Ctrl-C are respected and don't leave the table in a partial state.
2017-02-01 18:12:19 +02:00
Eren Basak 8efb00768e Fix Random Fails on Travis
This change fixes the random failures on Travis, which is a bug introduced
with citus/#1124. Before this fix, Travis was failing randomly on the `check_multi_mx`
test schedule, specifically in the parallel group of the `multi_mx_metadata`,
`multi_mx_modifications` and `multi_mx_modifying_xacts` tests. This change fixes this
by serializing these three test cases.
2017-01-31 15:23:06 -08:00
Eren Basak b458832416 Allow dropping sequences on mx workers
This change allows users to drop sequences on MX workers. Previously, Citus didn't allow dropping
sequences on MX workers because it could cause shards to be dropped if `DROP SEQUENCE ... CASCADE`
is used. We now allow that since allowing sequence creation but not dropping hurts user experience
and also may cause problems with custom Citus solutions.
2017-01-31 14:51:44 -08:00
Brian Cloutier 0e135dfd72 Fix bug where router executor sends query to failed connections 2017-01-27 09:40:30 +02:00
Brian Cloutier 24654ac7e1 Refactor CheckShardPlacements
- Break CheckShardPlacements into multiple functions (The most important
  is MarkFailedShardPlacements), so that we can get rid of the global
  CoordinatedTransactionUses2PC.
- Call MarkFailedShardPlacements in the router executor, so we mark
  shards as invalid and stop using them while inside transaction blocks.
2017-01-26 13:20:45 +02:00
Murat Tuncer 59ca49a826 Add copy failure tests inside transactions 2017-01-26 11:54:40 +03:00
Murat Tuncer 0500c31a30 Fix dependent tests 2017-01-25 19:19:39 +03:00
Murat Tuncer 5a8bc76912 Add failure case for regression tests 2017-01-25 19:19:39 +03:00
Marco Slot e55faf10f0 Mark failed placements as inactive immediately after COPY 2017-01-25 19:19:39 +03:00
Marco Slot b4c5a0781b Don't mark placements inactive in COPY after successful connection 2017-01-25 19:19:38 +03:00
Marco Slot 83cb58bf40 Set placement to inactive on connection failure in COPY 2017-01-25 19:19:38 +03:00
Marco Slot 8971b3ed75 Short circuit in multi_ProcessUtility on ABORT/COMMIT 2017-01-25 11:57:00 +01:00
Marco Slot 5f61d8ea5a Always skip foreign key validation when enable_ddl_propagation is off 2017-01-25 11:56:59 +01:00
Marco Slot 8adb9c3ec1 Use coordinator instead of schema node in terminology 2017-01-25 11:07:23 +01:00
Marco Slot 46487dd583 Use bigserial instead of BIGINT in sequence error 2017-01-25 11:07:23 +01:00
Burak Yucesoy 240520d063 Add ORDER BY to some tests to have consistent output 2017-01-25 11:43:25 +02:00
Eren Basak 2536c77ef0 Add Regression Tests For Querying MX Tables from Workers 2017-01-24 10:36:59 +03:00
Burak Yucesoy 46547e67cc Convert DropShards to use new connection API
With this change, the DropShards function uses the new connection API. DropShards
is used by DROP TABLE, master_drop_all_shards and master_apply_delete_command,
therefore all of these functions now support transactional operations. In DropShards,
if we cannot reach a node, we mark the shard state of related placements as
FILE_TO_DELETE and continue to drop the remaining shards; however, if any error occurs after
establishing the connection, we ROLLBACK the whole operation.
2017-01-23 21:08:41 +03:00
Burak Yucesoy 4efeeb50c3 In case of failed transactions update shard state only if it is FILE_FINALIZED
Before this change, when a transaction failed, we updated the shard states of related
placements to FILE_INACTIVE during XACT_EVENT_PRE_COMMIT. However, that meant that if another
code block changed the shard state to something else (e.g. FILE_TO_DELETE) before
XACT_EVENT_PRE_COMMIT, we overwrote that. To prevent this problem, in case of failure we now
change the shard state only if its current shard state is FILE_FINALIZED.
2017-01-23 21:04:57 +03:00
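The guard described in the commit message above amounts to a conditional update. A minimal sketch, assuming a plain dictionary that maps placement ids to state names; the function `mark_failed_placement` and the string values are hypothetical and do not reflect how Citus stores shard states:

```python
# State names taken from the commit message; the string values are illustrative.
FILE_FINALIZED = "FILE_FINALIZED"
FILE_INACTIVE = "FILE_INACTIVE"
FILE_TO_DELETE = "FILE_TO_DELETE"


def mark_failed_placement(placement_states, placement_id):
    """Move a placement to FILE_INACTIVE only if it is still FILE_FINALIZED.

    Returns True when the state was changed, False when it was left alone.
    """
    if placement_states.get(placement_id) == FILE_FINALIZED:
        placement_states[placement_id] = FILE_INACTIVE
        return True
    # Another code path already chose a different state; do not overwrite it.
    return False


states = {101: FILE_FINALIZED, 102: FILE_TO_DELETE}
mark_failed_placement(states, 101)
mark_failed_placement(states, 102)
print(states)
```

Here placement 102 keeps FILE_TO_DELETE, which is the case the old unconditional update would have clobbered.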
Burak Yucesoy 02880a717b Add LoadShardPlacement UDF
This UDF returns a shard placement from cache given shard id and placement id. At the
moment it iterates over all shard placements of given shard by ShardPlacementList and
searches given placement id in that list, which is not a good solution performance-wise.
However, currently, this function will be used only when there is a failed transaction.
If a need arises we can optimize this function in the future.
2017-01-23 21:04:57 +03:00
Marco Slot cff95c310e Use placement connection API for multi-shard transactions 2017-01-23 18:34:50 +01:00
Andres Freund 970c81f589 Hack up PREPARE/EXECUTE for nearly all distributed queries.
All router, real-time, task-tracker plannable queries should now have
full prepared statement support (and even use router when possible),
unless they don't go through the custom plan interface (which
basically just affects LANGUAGE SQL (not plpgsql) functions).

This is achieved by forcing postgres' planner to always choose a
custom plan, by assigning very low costs to plans with bound
parameters (i.e. ones where the postgres planner replanned the query
upon EXECUTE with all parameter values provided), instead of the
generic one.

This requires some trickery, because for custom plans to work the
costs for a non-custom plan have to be known, which means we can't
error out when planning the generic plan.  Instead we have to return a
"faux" plan, that'd trigger an error message if executed.  But due to
the custom plan logic that plan will likely (unless called by an SQL
function, or because we can't support that query for some reason) not
be executed; instead the custom plan will be chosen.
2017-01-23 09:23:50 -08:00
Andres Freund 67da5611f7 Make router planner error handling more flexible.
So far router planner had encapsulated different functionality in
MultiRouterPlanCreate. Modifications always go through router, selects
sometimes. Modifications always error out if the query is unsupported,
selects return NULL.  Especially the error handling is a problem for
the upcoming extension of prepared statement support.

Split MultiRouterPlanCreate into CreateRouterPlan and
CreateModifyPlan, and change them to not throw errors.

Instead errors are now reported by setting the new
MultiPlan->planningError.

Callers of router planner functionality now have to throw errors
themselves if desired, but also can skip doing so.

This is a pre-requisite for expanding prepared statement support.

While touching all those lines, improve a number of error messages by
getting them closer to the postgres error message guidelines.
2017-01-23 09:23:50 -08:00
Andres Freund e531d69e92 Centralize more of distributed planning into CreateDistributedPlan().
The name CreatePhysicalPlan() hasn't been accurate for a while, and
the split of work between multi_planner() and CreatePhysicalPlan()
doesn't seem perfect.  So rename to CreateDistributedPlan() and move a
bit more logic in there.
2017-01-23 09:23:50 -08:00
Andres Freund 83fe9bf489 Support for deferred error messages.
It can be useful, e.g. in the upcoming prepared statement support, to
be able to return an error from a function that is not raised
immediately, but can later be thrown.  That allows e.g. to attempt to
plan a statement using different methods and to create good error
messages in each planner, but to only error out after all planners
have been run.

To enable that, create support for deferred error messages that can be
created (supporting errorcode, message, detail, hint) in one function,
and then thrown in a different place.
2017-01-23 09:23:50 -08:00
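The deferred error pattern described in the commit message above can be sketched as follows. This is a Python illustration of the idea rather than the Citus C code: `DeferredError`, `PlanningError`, `raise_deferred_error`, `plan_with_fallback` and the error code strings are all hypothetical names chosen for this example. Each planner returns either a plan or an error value, and the caller decides whether and when to throw.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeferredError:
    """An error that is created in one place and possibly thrown in another."""
    errorcode: str
    message: str
    detail: Optional[str] = None
    hint: Optional[str] = None


class PlanningError(Exception):
    """Raised only when a caller decides to throw a deferred error."""


def raise_deferred_error(error):
    parts = [f"{error.errorcode}: {error.message}"]
    if error.detail:
        parts.append(f"DETAIL: {error.detail}")
    if error.hint:
        parts.append(f"HINT: {error.hint}")
    raise PlanningError("\n".join(parts))


def plan_with_fallback(query, planners):
    """Try each planner in turn; error out only after all of them have run.

    Each planner is a callable returning a (plan, error) pair, where exactly
    one of the two is None.
    """
    if not planners:
        raise ValueError("at least one planner is required")
    last_error = None
    for planner in planners:
        plan, error = planner(query)
        if error is None:
            return plan
        last_error = error
    raise_deferred_error(last_error)


def strict_planner(query):
    return None, DeferredError("feature_not_supported", "cannot plan query",
                               hint="try a simpler query")


def lenient_planner(query):
    return f"plan({query})", None


print(plan_with_fallback("SELECT 1", [strict_planner, lenient_planner]))
```

Because the first planner only returns its error instead of raising it, the second planner still gets a chance to produce a plan.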
Andres Freund 52249a3a19 Make usage of static a bit more consistent in multi_planner.c. 2017-01-23 09:23:50 -08:00
Jason Petersen b5734eb11f Add replication_model GUC
This adds a replication_model GUC which is used as the replication
model for any new distributed table that is not a reference table.
With this change, tables with replication factor 1 are no longer
implicitly MX tables.

The GUC is similarly respected during empty shard creation for e.g.
existing append-partitioned tables. If the model is set to streaming
while replication factor is greater than one, table and shard creation
routines will error until this invalid combination is corrected.

Changing this parameter requires superuser permissions.
2017-01-23 09:05:14 -07:00
Brian Cloutier e4b65d03a2 Port master_append_table_to_shard to new connection API (#1149)
If any placements fail it doesn't update shard statistics on those placements.

A minor enabling refactor: Make CoordinatedTransactionUses2PC public (it used to be CoordinatedTransactionUse2PC but that symbol already existed, so renamed it as well)
2017-01-23 15:57:44 +02:00
Burak Yucesoy 4bb9842660 Reword error message for outer joins requiring repartition
We changed the error message which appears when a user tries to execute an outer join
command that requires repartitioning. The old error message mentioned 1-to-1 shard
partitioning, which may not be clear to the user.
2017-01-23 10:42:36 +03:00
Marco Slot ac919337f1 Add an enable_deadlock_prevention flag to allow router transactions to expand to multiple nodes 2017-01-22 17:31:24 +01:00
Marco Slot 190fce2f70 Ensure job IDs are unique across workers 2017-01-22 16:55:14 +01:00
Andres Freund a596858463 Remove connection_cache.[ch]. 2017-01-21 09:01:15 -08:00
Andres Freund 0bdb22268f Remove remnants of commit_protocol.[ch]. 2017-01-21 09:01:15 -08:00
Andres Freund 9d3d6a2c22 Consistently libpq forward declaration in remote_commands.h. 2017-01-21 09:01:14 -08:00
Andres Freund f3cbe57c60 Minimal citus tools conversion to new connection API. 2017-01-21 09:01:14 -08:00
Önder Kalacı 5a7ba99abf Merge branch 'master' into fix_command_counter_increment 2017-01-21 09:21:19 +02:00