citus

Commit Graph

Author	SHA1	Message	Date
Andres Freund	3dac0a4d14	Rely less on remote_task_check_interval. When executing queries with citus.task_executor = 'real-time', query execution could, so far, spend a significant amount of time sleeping. That's because we were a) sleeping after several phases of query execution, even if we're not waiting for network IO, and b) sleeping for a fixed amount of time when waiting for network IO; often a lot longer than actually required. Just reducing the amount of time slept isn't a real solution, because that just increases CPU usage. Instead have the real-time executor's ManageTaskExecution return whether a task is currently being processed, waiting for reads or writes, or failed. When all tasks are waiting for IO use poll() to wait for IO readiness. That requires slightly redefining how connection timeouts are handled: before we counted the number of times ManageTaskExecution() was called, and compared that with the timeout divided by the task check interval. That, if processing of tasks took a while, could significantly increase the time till a timeout occurred. Because it was based on ManageTaskExecution() being called at a constant interval, this approach isn't feasible anymore. Instead measure the actual time since connection establishment was started. That could in theory, if task processing takes a very long time, lead to few passes over PQconnectPoll(). The problem of sleeping too much also exists for the 'task-tracker' executor, but is generally less problematic there, as processing the individual tasks usually will take longer. That said, for e.g. the regression tests it'd be helpful to use a similar approach.	2016-06-02 12:11:16 -06:00
Murat Tuncer	2b0d6473b9	Add complex distinct count support for repartitioned subqueries. Single table repartition subqueries now support count(distinct column) and count(distinct (case when ...)) expressions. The repartition query extracts the columns used in the aggregate expression and adds them to the target list and group by list. The master query stays the same (count (distinct ...)), but the attribute numbers inside the aggregate expression are modified to reflect the changes in the repartition query.	2016-05-27 15:43:05 +03:00
Onder Kalaci	6c7abc2ba5	Add fast shard pruning path for INSERTs on hash partitioned tables. This commit adds a fast shard pruning path for INSERTs on hash-partitioned tables. The rationale behind this change is that if there exists a sorted shard interval array, a single index lookup on the array allows us to find the corresponding shard interval. As mentioned above, we need a shard interval array sorted by shardminvalue. Thus, this commit updates shardIntervalArray to sortedShardIntervalArray in the metadata cache. It then uses the low-level API that is defined in multi_copy to handle the fast shard pruning. The performance impact of this change is more apparent as more shards exist for a distributed table. The previous implementation relied on a linear search through the shard intervals, whereas this commit relies on a constant-time lookup on the shard interval array. Thus, shard pruning becomes less dependent on the shard count.	2016-04-26 11:16:00 +03:00
Murat Tuncer	938546b938	Add router plannable check and router planning logic for single shard select queries	2016-04-21 09:15:33 +03:00
Jason Petersen	423e6c8ea0	Update copyright dates Fixed configure variable and updated all end dates to 2016.	2016-03-23 17:14:37 -06:00
Jason Petersen	fdb37682b2	First formatting attempt Skipped csql, ruleutils, readfuncs, and functions obviously copied from PostgreSQL. Seeing how this looks, then continuing.	2016-02-15 23:29:32 -07:00
Onder Kalaci	136306a1fe	Initial commit of Citus 5.0	2016-02-11 04:05:32 +02:00
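
Illustration for 6c7abc2ba5: the idea of replacing a linear scan with a single index lookup on a sorted shard interval array can be sketched as follows. This is a minimal Python sketch, not the Citus code (which is C). It assumes the shards split the signed 32-bit hash space into equal-width intervals, which the commit message does not state, and all names here (make_uniform_intervals, find_shard_linear, find_shard_fast) are hypothetical.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1
HASH_TOKEN_COUNT = 2**32  # number of distinct signed 32-bit hash values


def make_uniform_intervals(shard_count):
    """Build (min, max) hash intervals sorted by min value.

    Assumption: every shard covers an equal-width slice of the hash
    space, and the last shard absorbs any remainder.
    """
    increment = HASH_TOKEN_COUNT // shard_count
    intervals = []
    for i in range(shard_count):
        low = INT32_MIN + i * increment
        is_last = i == shard_count - 1
        high = INT32_MAX if is_last else low + increment - 1
        intervals.append((low, high))
    return intervals


def find_shard_linear(intervals, hashed_value):
    """Baseline: scan every interval until one contains the value."""
    for index, (low, high) in enumerate(intervals):
        if low <= hashed_value <= high:
            return index
    raise ValueError("hashed value outside all shard intervals")


def find_shard_fast(intervals, hashed_value):
    """Compute the array index directly from the hashed value.

    Only valid under the uniform-interval assumption above; the cost
    does not grow with the number of shards.
    """
    increment = HASH_TOKEN_COUNT // len(intervals)
    index = (hashed_value - INT32_MIN) // increment
    # Values in the remainder at the top of the range belong to the last shard.
    return min(index, len(intervals) - 1)
```

Both functions return the same index for any signed 32-bit hash value. The second one only works because the array is sorted by minimum value and the intervals are uniform, which is why the sketch builds the array that way.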

7 Commits (027a7a717d336b078e03d8a7b84ea9fa596a3c55)
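
Illustration for 3dac0a4d14: the two changes described there (block until IO is ready instead of sleeping for a fixed interval, and measure connection timeouts by elapsed time instead of by counting calls) can be sketched as follows. This is a minimal Python sketch, not the Citus implementation. It uses Python's selectors module as a stand-in for the poll() call named in the commit, and the function names and the 'read' / 'write' direction strings are hypothetical.

```python
import selectors
import socket
import time


def wait_for_io(waiting, timeout_seconds):
    """Block until a socket is ready or the timeout expires.

    `waiting` is a list of (socket, direction) pairs, where direction is
    'read' or 'write'. Each socket may appear only once. Returns the
    sockets that are ready; an empty list means the timeout expired.
    """
    selector = selectors.DefaultSelector()
    try:
        for sock, direction in waiting:
            if direction == "read":
                events = selectors.EVENT_READ
            else:
                events = selectors.EVENT_WRITE
            selector.register(sock, events)
        return [key.fileobj for key, _ in selector.select(timeout_seconds)]
    finally:
        selector.close()


def connection_timed_out(started_at, timeout_seconds, now=None):
    """Compare elapsed time against the timeout.

    Unlike counting executor passes, this does not assume the caller
    runs at a constant interval. `started_at` and `now` are readings of
    a monotonic clock in seconds.
    """
    if now is None:
        now = time.monotonic()
    return now - started_at >= timeout_seconds
```

A caller would record time.monotonic() when connection establishment starts, call wait_for_io only when every task reports that it is waiting for reads or writes, and check connection_timed_out on each pass.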