Commit Graph

6 Commits (e31830f3fb5dfc46067477b11937649e740b271a)

Author SHA1 Message Date
Marco Slot 061662a51f Add MultiClientExecute and MultiClientValueIsNull for simple remote query execution 2016-07-29 20:07:18 +02:00
Andres Freund ee5bb2297b Rely less on remote_task_check_interval.
When executing queries with citus.task_executor = 'real-time', query
execution could, so far, spend a significant amount of time
sleeping. That's because we were
a) sleeping after several phases of query execution, even if we're not
   waiting for network IO
b) sleeping for a fixed amount of time when waiting for network IO;
   often a lot longer than actually required.
Just reducing the amount of time slept isn't a real solution, because
that just increases CPU usage.

Instead have the real-time executor's ManageTaskExecution return whether
a task is currently being processed, waiting for reads or writes, or
failed. When all tasks are waiting for IO use poll() to wait for IO
readyness.

That requires to slightly redefine how connection timeouts are handled:
before we counted the number of times ManageTaskExecution() was called,
and compared that with the timeout divided by the task check
interval. That, if processing of tasks took a while, could significantly
increase the time till a timeout occurred. Because it was based on the
ManageTaskExecution() being called on a constant interval, this approach
isn't feasible anymore.  Instead measure the actual time since
connection establishment was started. That could in theory, if task
processing takes a very long time, lead to few passes over
PQconnectPoll().

The problem of sleeping too much also exists for the 'task-tracker'
executor, but is generally less problematic there, as processing the
individual tasks usually will take longer. That said, for e.g. the
regression tests it'd be helpful to use a similar approach.
2016-06-02 12:11:16 -06:00
Andres Freund 3dae284bbe Use the current session's username when connecting to worker nodes.
So far we've always used libpq defaults when connecting to workers; bar
special environment variables being set that'll always be the user that
started the server.  That's not desirable because it prevents using
users with fewer privileges.

Thus change the various APIs creating connections to workers to always
use usernames. That means:
1) MultiClientConnect() needs to, optionally, accept a username
2) GetOrEstablishConnection(), including the underlying cache, need to
   use the current user as part of the connection cache key. That way
   connections for separate users are distinct, and we always use one
   with the correct authorization.
3) The task tracker needs to keep track of the username associated with
   a task, so it can use it when establishing connections outside the
   originating session.
2016-04-27 10:00:08 -07:00
Jason Petersen a95c9da472 Update copyright dates
Fixed configure variable and updated all end dates to 2016.
2016-03-23 17:14:37 -06:00
Jason Petersen 166f96bb83 First formatting attempt
Skipped csql, ruleutils, readfuncs, and functions obviously copied from
PostgreSQL. Seeing how this looks, then continuing.
2016-02-15 23:29:32 -07:00
Onder Kalaci 136306a1fe Initial commit of Citus 5.0 2016-02-11 04:05:32 +02:00