Fixes#475
With this change we prevent addition of ONLY clause to queries prepared for
worker nodes. When we add ONLY clause we may miss the inherited tables in
worker nodes created by users manually.
When executing queries with citus.task_executor = 'real-time', query
execution could, so far, spend a significant amount of time
sleeping. That's because we were
a) sleeping after several phases of query execution, even if we're not
waiting for network IO
b) sleeping for a fixed amount of time when waiting for network IO;
often a lot longer than actually required.
Just reducing the amount of time slept isn't a real solution, because
that just increases CPU usage.
Instead have the real-time executor's ManageTaskExecution return whether
a task is currently being processed, waiting for reads or writes, or
failed. When all tasks are waiting for IO use poll() to wait for IO
readyness.
That requires to slightly redefine how connection timeouts are handled:
before we counted the number of times ManageTaskExecution() was called,
and compared that with the timeout divided by the task check
interval. That, if processing of tasks took a while, could significantly
increase the time till a timeout occurred. Because it was based on the
ManageTaskExecution() being called on a constant interval, this approach
isn't feasible anymore. Instead measure the actual time since
connection establishment was started. That could in theory, if task
processing takes a very long time, lead to few passes over
PQconnectPoll().
The problem of sleeping too much also exists for the 'task-tracker'
executor, but is generally less problematic there, as processing the
individual tasks usually will take longer. That said, for e.g. the
regression tests it'd be helpful to use a similar approach.
Single table repartition subqueries now support count(distinct column)
and count(distinct (case when ...)) expressions. Repartition query
extracts column used in aggregate expression and adds them to target
list and group by list, master query stays the same (count (distinct ...))
but attribute numbers inside the aggregate expression is modified to
reflect changes in repartition query.
Now, master_create_empty_shard() will create shards according to the
value of citus.shard_placement_policy which also makes default round-robin
instead of random.
Fixes#10
This change creates a new UDF: master_modify_multiple_shards
Parameters:
modify_query: A simple DELETE or UPDATE query as a string.
The UDF is similar to the existing master_apply_delete_command UDF.
Basically, given the modify query, it prunes the shard list, re-constructs
the query for each shard and sends the query to the placements.
Depending on the value of citus.multi_shard_commit_protocol, the commit
can be done in one-phase or two-phase manner.
Limitations:
* It cannot be called inside a transaction block
* It only be called with simple operator expressions (like Single Shard Modify)
Sample Usage:
```
SELECT master_modify_multiple_shards(
'DELETE FROM customer_delete_protocol WHERE c_custkey > 500 AND c_custkey < 500');
```
Make's $(wildcard) does not sort the glob result, but returns filenames
in filesystem ordering. This makes the build result vary and hence
unreproducible on the binary level. Fix by adding $(sort).
Spotted by Debian's reproducible builds project.
This commit fixes failures happen during check-full. The change does make
clean seperation of executor types in certain places to keep the outputs
stable.
Now, we can copy to an append-partitioned distributed relation from
any worker node by providing master options such as;
COPY relation_name FROM file_path WITH (delimiter '|', master_host 'localhost', master_port 5432);
where master_port is optional and default is 5432.
Based on Andres' suggestion, I removed SetConnectionStatus, moving its
functionality directly into set_connection_status_bad, which now simply
shuts down the socket underlying a particular connection.
This keeps the functionality as-is while removing our questionable use
of internal libpq headers.