Commit Graph

635 Commits (490765a7542aa4243b12981aa79bc3121220e009)

Author SHA1 Message Date
Amos Bird 107b926262 Eliminates the possibilities of counter overflows.
This patch uses scanint8 instead of pg_atoi to make sure the affected
tuples counter never gets overflow.
2016-06-08 10:30:03 +08:00
Jason Petersen 9ba02928ac
Refactor ReportRemoteError to remove boolean arg
Broke it into two explicitly-named functions instead: WarnRemoteError
and ReraiseRemoteError.
2016-06-07 12:38:32 -06:00
Metin Doslu 7d0c90b398 Fail fast on constraint violations in router executor 2016-06-07 18:11:17 +03:00
Murat Tuncer 360e884de1 Add enable_ddl_propagation flag to control automatic ddl propagation 2016-06-06 13:42:46 +03:00
Andres Freund 3dac0a4d14
Rely less on remote_task_check_interval.
When executing queries with citus.task_executor = 'real-time', query
execution could, so far, spend a significant amount of time
sleeping. That's because we were
a) sleeping after several phases of query execution, even if we're not
   waiting for network IO
b) sleeping for a fixed amount of time when waiting for network IO;
   often a lot longer than actually required.
Just reducing the amount of time slept isn't a real solution, because
that just increases CPU usage.

Instead have the real-time executor's ManageTaskExecution return whether
a task is currently being processed, waiting for reads or writes, or
failed. When all tasks are waiting for IO use poll() to wait for IO
readyness.

That requires to slightly redefine how connection timeouts are handled:
before we counted the number of times ManageTaskExecution() was called,
and compared that with the timeout divided by the task check
interval. That, if processing of tasks took a while, could significantly
increase the time till a timeout occurred. Because it was based on the
ManageTaskExecution() being called on a constant interval, this approach
isn't feasible anymore.  Instead measure the actual time since
connection establishment was started. That could in theory, if task
processing takes a very long time, lead to few passes over
PQconnectPoll().

The problem of sleeping too much also exists for the 'task-tracker'
executor, but is generally less problematic there, as processing the
individual tasks usually will take longer. That said, for e.g. the
regression tests it'd be helpful to use a similar approach.
2016-06-02 12:11:16 -06:00
Jason Petersen e774f22ed4
Fix formatting
Checking in citus_indent output.
2016-05-27 15:13:28 -06:00
Amos Bird 92788a0d9c
Remove redundant implementations of error funcs.
This patch does some basic cleaning jobs. It removes duplicated
implementations of ReportRemoteError() and related ones and adjusts
regression tests.
2016-05-27 15:12:59 -06:00
Burak Yucesoy 0e71ffd937 Fix #469
This change renames one of the ReceiveRegularFile functions with
more descriptive name.
2016-05-26 12:03:36 +03:00
Metin Doslu 866271b765 Add COPY support on worker nodes for append partitioned relations
Now, we can copy to an append-partitioned distributed relation from
any worker node by providing master options such as;

COPY relation_name FROM file_path WITH (delimiter '|', master_host 'localhost', master_port 5432);

where master_port is optional and default is 5432.
2016-05-03 16:00:00 +03:00
Marco Slot fc4f23065a Add EXPLAIN for simple distributed queries 2016-04-30 00:11:02 +02:00
Andres Freund 0ce1e3ddaf Perform permission checks on operations re-implemented by citus.
Currently that's just COPY FROM.  There's other places where we could
check for permissions earlier (to fail less verbosely), but since
there's other pending changes in the whole DDL area, which is affected
by this, I'm just adding a note to those places.
2016-04-27 10:28:36 -07:00
Andres Freund 758a70a8ff Create new shards as owned the distributed table's owner.
That's important because ownership of relations implies special
privileges. Without this change, a distributed table can be accessible
by a table's owner, but a shard created by another user might not.
2016-04-27 10:28:33 -07:00
Andres Freund 3a264db2fe Add ReplicateGrantStmt().
This is the basis for coordinating GRANT/REVOKE across nodes.
2016-04-27 10:28:25 -07:00
Andres Freund a5b3dcddb3 Run some commands as superuser to allow normal users to execute queries.
Some small parts of citus currently require superuser privileges; which
is obviously not desirable for production scenarios. Run these small
parts under superuser privileges (we use the extension owner) to avoid
that.

This does not yet coordinate grants between master and workers. Thus it
allows to create shards, load data, and run queries as a non-superuser,
but it is not easily possible to allow differentiated accesses to
several users.
2016-04-27 10:28:22 -07:00
Andres Freund 42d232c0e8 Use the current session's username when connecting to worker nodes.
So far we've always used libpq defaults when connecting to workers; bar
special environment variables being set that'll always be the user that
started the server.  That's not desirable because it prevents using
users with fewer privileges.

Thus change the various APIs creating connections to workers to always
use usernames. That means:
1) MultiClientConnect() needs to, optionally, accept a username
2) GetOrEstablishConnection(), including the underlying cache, need to
   use the current user as part of the connection cache key. That way
   connections for separate users are distinct, and we always use one
   with the correct authorization.
3) The task tracker needs to keep track of the username associated with
   a task, so it can use it when establishing connections outside the
   originating session.
2016-04-27 10:00:08 -07:00
Brian Cloutier 7a6d689259 Clear metadata_cache upon DROP EXTENSION
When we notice that pg_dist_partition is being invalidated we assume
that the citus extension is being dropped and drop state such as
extensionLoaded and the cached oids of all the metadata tables.

This frees the user from needing to reconnect after running DROP
EXTENSION, so we also no longer send a warning message.
2016-04-22 07:25:49 -07:00
Murat Tuncer a88d3ecd4e Add dynamic executor selection
- non-router plannable queries can be executed
  by router executor if they satisfy the criteria
- router executor is removed from configuration,
  now task executor can not be set to router
- removed some tests that error out for router executor
2016-04-21 09:15:33 +03:00
Murat Tuncer 938546b938 Add router plannable check and router planning logic
for single shard select queries
2016-04-21 09:15:33 +03:00
Andres Freund 29b8576a33
Annotate variables only used for asserts with PG_USED_FOR_ASSERTS_ONLY.
This avoids '-Wunused-but-set-variable' type warnings when compiling
without assertions, e.g. against a system postgres.
2016-04-19 12:31:12 -06:00
Brian Cloutier 1df4b8ba32 Better error on "CREATE INDEX already_exists ..."
Previously (if you're creating the index with the same name on different
tables) we successfully ran the command on the workers before failing it
on the master and leaving no record of the index.

Now we check whether the index exists on the master before sending
commands to the workers.

--

Also make the error better when user attampts to create an index without
a name. Previously those statements returned:

brian=# create index on c (b);
WARNING:  could not receive query results from localhost:9700
DETAIL:  Client error: cannot extend name for null index name
ERROR:  could not execute DDL command on worker node shards

They now return

brian=# create index on c (b);
ERROR:  creating index without a name on a distributed table is
currently unsupported
2016-04-18 13:33:53 -07:00
Matthew Seaman 55e81ce23d Add sys/stat.h include to files using S_IRUSR and S_IWUSR macros. 2016-04-15 09:34:22 -07:00
Metin Doslu 1150ce6414 Send COPY rows in binary format 2016-04-12 20:22:31 +02:00
Marco Slot d25ee8fbd8 Support for COPY FROM, based on pg_shard PR by Postres Pro 2016-04-12 20:22:31 +02:00
Murat Tuncer e74decf8b4 Allow users to create unique indexes
Users can now create unique indexes on partition columns
for hash and range distributed tables.
2016-03-24 09:31:06 +02:00
Jason Petersen 697d3ea09b
Merge latest 5.0 release fixes 2016-03-23 17:43:34 -06:00
Jason Petersen 423e6c8ea0
Update copyright dates
Fixed configure variable and updated all end dates to 2016.
2016-03-23 17:14:37 -06:00
Andres Freund b60f5da774 Copy toplevel queryId to citus' master statement.
multi_ExecutorStart() replaces the original planned statement with the
master select statement. As that hasn't gone through the parse analysis
hooks, it'll not have a associated queryId.  This prevents extensions
pg_stat_statements to show useful data associated with the query.
2016-03-14 17:27:52 -07:00
Murat Tuncer 3528d7ce85 Merge from master branch into feature/citusdb-to-citus 2016-02-17 14:49:01 +02:00
Jason Petersen 8ad5b09251 Merge pull request #344 from citusdata/fix_shard_lock_acquisition#342
Ensure router executor acquires proper shard lock

cr: @onderkalaci
2016-02-16 16:43:39 -07:00
Jason Petersen 0d196d1bf4
Ensure router executor acquires proper shard lock
Though Citus' Task struct has a shardId field, it doesn't have the same
semantics as the one previously used in pg_shard code. The analogous
field in the Citus Task is anchorShardId. I've also added an argument
check to the relevant locking function to catch future locking attempts
which pass an invalid argument.
2016-02-16 11:20:18 -07:00
Jason Petersen f874a56e24
Omit RangeVarCallbackForDropIndex from formatting
I removed two braces to have this function remain more similar to the
original PostgreSQL function and added uncrustify commands to disable
formatting of its contents.
2016-02-15 23:29:33 -07:00
Jason Petersen fdb37682b2
First formatting attempt
Skipped csql, ruleutils, readfuncs, and functions obviously copied from
PostgreSQL. Seeing how this looks, then continuing.
2016-02-15 23:29:32 -07:00
Murat Tuncer 55c44b48dd Changed product name to citus
All citusdb references in
- extension, binary names
- file headers
- all configuration name prefixes
- error/warning messages
- some functions names
- regression tests

are changed to be citus.
2016-02-15 16:04:31 +02:00
Jason Petersen 4494e57bbd
Rename GetConnection to address name conflict
The postgres_fdw extension has an extern function with an identical
signature, which can cause problems when both extensions are loaded.
A simple rename can fix this for now (this is the only function with)
such a conflict.
2016-02-12 13:35:02 -07:00
Onder Kalaci 136306a1fe Initial commit of Citus 5.0 2016-02-11 04:05:32 +02:00