Commit Graph

454 Commits (bc1a800f7095c8fe29f0c95e21c97dc008eca225)

Author SHA1 Message Date
Marco Slot 2868e02a3d Implement SELECT function call delegation.
When a function is marked as colocated with a distributed table,
we try delegating queries of kind "SELECT func(...)" to workers.

We currently only support this simple form, and don't delegate
forms like "SELECT f1(...), f2(...)", "SELECT f1(...) FROM ...",
or function calls inside transactions.

As a side effect, we also fix the transactional semantics of DO blocks.
Previously we didn't consider a DO block a multi-statement transaction.
Now we do.

Co-authored-by: Marco Slot <marco@citusdata.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
2019-09-27 09:13:25 -07:00
Philip Dubé 06faba91c0 Include ifdefs for pg12 API changes, update local_shard_executiuon test to avoid CTE inlining 2019-09-23 20:22:35 +00:00
Marco Slot d85d77634d Handle anonymous composite types on the target list 2019-09-23 14:53:02 +02:00
Philip Dubé 2aa6852dea Begin searching AggregateNames from 1, not 0 2019-09-12 16:55:05 +00:00
Philip Dubé e5cd298a98 pg12 revised layout of FunctionCallInfoData
See a9c35cf85c

clang raises a warning due to FunctionCall2InfoData technically being variable sized
This is fine, as the struct is the size we want it to be. So silence the warning
2019-08-22 19:02:35 +00:00
Philip Dubé bee779e7d4 planner/distributed_planner.c: get_func_cost replaced with add_function_cost in pg12 2019-08-22 19:02:10 +00:00
Philip Dubé be3285828f Collations matter for hashing strings in pg12
See https://www.postgresql.org/docs/12/collation.html#COLLATION-NONDETERMINISTIC
2019-08-22 18:58:37 +00:00
Philip Dubé 018ad1c58e pg12: version_compat.h, tuples, oids, misc 2019-08-22 18:57:23 +00:00
Philip Dubé 68c4b71f93 Fix up includes with pg12 changes 2019-08-22 18:56:21 +00:00
Philip Dubé b77c52f95b PlanRouterQuery: don't store list of list of shard intervals in relationShardList 2019-08-02 14:08:57 +00:00
Philip Dubé 064bd66a20 Avoid segfault in logging queries 2019-07-31 15:28:46 +00:00
Philip Dubé 0915027389 DistributedPlan: replace operation with modLevel
This causes no behaviorial changes, only organizes better to implement modifying CTEs

Also rename ExtactInsertRangeTableEntry to ExtractResultRelationRTE,
as the source of this function didn't match the documentation

Remove Task's upsertQuery in favor of ROW_MODIFY_NONCOMMUTATIVE

Split up AcquireExecutorShardLock into more internal functions

Tests: Normalize multi_reference_table multi_create_table_constraints
2019-07-16 13:58:18 -07:00
Önder Kalacı 40da78c6fd
Introduce the adaptive executor (#2798)
With this commit, we're introducing the Adaptive Executor. 


The commit message consists of two distinct sections. The first part explains
how the executor works. The second part consists of the commit messages of
the individual smaller commits that resulted in this commit. The readers
can search for the each of the smaller commit messages on 
https://github.com/citusdata/citus and can learn more about the history
of the change.

/*-------------------------------------------------------------------------
 *
 * adaptive_executor.c
 *
 * The adaptive executor executes a list of tasks (queries on shards) over
 * a connection pool per worker node. The results of the queries, if any,
 * are written to a tuple store.
 *
 * The concepts in the executor are modelled in a set of structs:
 *
 * - DistributedExecution:
 *     Execution of a Task list over a set of WorkerPools.
 * - WorkerPool
 *     Pool of WorkerSessions for the same worker which opportunistically
 *     executes "unassigned" tasks from a queue.
 * - WorkerSession:
 *     Connection to a worker that is used to execute "assigned" tasks
 *     from a queue and may execute unasssigned tasks from the WorkerPool.
 * - ShardCommandExecution:
 *     Execution of a Task across a list of placements.
 * - TaskPlacementExecution:
 *     Execution of a Task on a specific placement.
 *     Used in the WorkerPool and WorkerSession queues.
 *
 * Every connection pool (WorkerPool) and every connection (WorkerSession)
 * have a queue of tasks that are ready to execute (readyTaskQueue) and a
 * queue/set of pending tasks that may become ready later in the execution
 * (pendingTaskQueue). The tasks are wrapped in a ShardCommandExecution,
 * which keeps track of the state of execution and is referenced from a
 * TaskPlacementExecution, which is the data structure that is actually
 * added to the queues and describes the state of the execution of a task
 * on a particular worker node.
 *
 * When the task list is part of a bigger distributed transaction, the
 * shards that are accessed or modified by the task may have already been
 * accessed earlier in the transaction. We need to make sure we use the
 * same connection since it may hold relevant locks or have uncommitted
 * writes. In that case we "assign" the task to a connection by adding
 * it to the task queue of specific connection (in
 * AssignTasksToConnections). Otherwise we consider the task unassigned
 * and add it to the task queue of a worker pool, which means that it
 * can be executed over any connection in the pool.
 *
 * A task may be executed on multiple placements in case of a reference
 * table or a replicated distributed table. Depending on the type of
 * task, it may not be ready to be executed on a worker node immediately.
 * For instance, INSERTs on a reference table are executed serially across
 * placements to avoid deadlocks when concurrent INSERTs take conflicting
 * locks. At the beginning, only the "first" placement is ready to execute
 * and therefore added to the readyTaskQueue in the pool or connection.
 * The remaining placements are added to the pendingTaskQueue. Once
 * execution on the first placement is done the second placement moves
 * from pendingTaskQueue to readyTaskQueue. The same approach is used to
 * fail over read-only tasks to another placement.
 *
 * Once all the tasks are added to a queue, the main loop in
 * RunDistributedExecution repeatedly does the following:
 *
 * For each pool:
 * - ManageWorkPool evaluates whether to open additional connections
 *   based on the number unassigned tasks that are ready to execute
 *   and the targetPoolSize of the execution.
 *
 * Poll all connections:
 * - We use a WaitEventSet that contains all (non-failed) connections
 *   and is rebuilt whenever the set of active connections or any of
 *   their wait flags change.
 *
 *   We almost always check for WL_SOCKET_READABLE because a session
 *   can emit notices at any time during execution, but it will only
 *   wake up WaitEventSetWait when there are actual bytes to read.
 *
 *   We check for WL_SOCKET_WRITEABLE just after sending bytes in case
 *   there is not enough space in the TCP buffer. Since a socket is
 *   almost always writable we also use WL_SOCKET_WRITEABLE as a
 *   mechanism to wake up WaitEventSetWait for non-I/O events, e.g.
 *   when a task moves from pending to ready.
 *
 * For each connection that is ready:
 * - ConnectionStateMachine handles connection establishment and failure
 *   as well as command execution via TransactionStateMachine.
 *
 * When a connection is ready to execute a new task, it first checks its
 * own readyTaskQueue and otherwise takes a task from the worker pool's
 * readyTaskQueue (on a first-come-first-serve basis).
 *
 * In cases where the tasks finish quickly (e.g. <1ms), a single
 * connection will often be sufficient to finish all tasks. It is
 * therefore not necessary that all connections are established
 * successfully or open a transaction (which may be blocked by an
 * intermediate pgbouncer in transaction pooling mode). It is therefore
 * essential that we take a task from the queue only after opening a
 * transaction block.
 *
 * When a command on a worker finishes or the connection is lost, we call
 * PlacementExecutionDone, which then updates the state of the task
 * based on whether we need to run it on other placements. When a
 * connection fails or all connections to a worker fail, we also call
 * PlacementExecutionDone for all queued tasks to try the next placement
 * and, if necessary, mark shard placements as inactive. If a task fails
 * to execute on all placements, the execution fails and the distributed
 * transaction rolls back.
 *
 * For multi-row INSERTs, tasks are executed sequentially by
 * SequentialRunDistributedExecution instead of in parallel, which allows
 * a high degree of concurrency without high risk of deadlocks.
 * Conversely, multi-row UPDATE/DELETE/DDL commands take aggressive locks
 * which forbids concurrency, but allows parallelism without high risk
 * of deadlocks. Note that this is unrelated to SEQUENTIAL_CONNECTION,
 * which indicates that we should use at most one connection per node, but
 * can run tasks in parallel across nodes. This is used when there are
 * writes to a reference table that has foreign keys from a distributed
 * table.
 *
 * Execution finishes when all tasks are done, the query errors out, or
 * the user cancels the query.
 *
 *-------------------------------------------------------------------------
 */



All the commits involved here:
* Initial unified executor prototype

* Latest changes

* Fix rebase conflicts to master branch

* Add missing variable for assertion

* Ensure that master_modify_multiple_shards() returns the affectedTupleCount

* Adjust intermediate result sizes

The real-time executor uses COPY command to get the results
from the worker nodes. Unified executor avoids that which
results in less data transfer. Simply adjust the tests to lower
sizes.

* Force one connection per placement (or co-located placements) when requested

The existing executors (real-time and router) always open 1 connection per
placement when parallel execution is requested.

That might be useful under certain circumstances:

(a) User wants to utilize as much as CPUs on the workers per
distributed query
(b) User has a transaction block which involves COPY command

Also, lots of regression tests rely on this execution semantics.
So, we'd enable few of the tests with this change as well.

* For parameters to be resolved before using them

For the details, see PostgreSQL's copyParamList()

* Unified executor sorts the returning output

* Ensure that unified executor doesn't ignore sequential execution of DDLJob's

Certain DDL commands, mainly creating foreign keys to reference tables,
should be executed sequentially. Otherwise, we'd end up with a self
distributed deadlock.

To overcome this situaiton, we set a flag `DDLJob->executeSequentially`
and execute it sequentially. Note that we have to do this because
the command might not be called within a transaction block, and
we cannot call `SetLocalMultiShardModifyModeToSequential()`.

This fixes at least two test: multi_insert_select_on_conflit.sql and
multi_foreign_key.sql

Also, I wouldn't mind scattering local `targetPoolSize` variables within
the code. The reason is that we'll soon have a GUC (or a global
variable based on a GUC) that'd set the pool size. In that case, we'd
simply replace `targetPoolSize` with the global variables.

* Fix 2PC conditions for DDL tasks

* Improve closing connections that are not fully established in unified execution

* Support foreign keys to reference tables in unified executor

The idea for supporting foreign keys to reference tables is simple:
Keep track of the relation accesses within a transaction block.
    - If a parallel access happens on a distributed table which
      has a foreign key to a reference table, one cannot modify
      the reference table in the same transaction. Otherwise,
      we're very likely to end-up with a self-distributed deadlock.
    - If an access to a reference table happens, and then a parallel
      access to a distributed table (which has a fkey to the reference
      table) happens, we switch to sequential mode.

Unified executor misses the function calls that marks the relation
accesses during the execution. Thus, simply add the necessary calls
and let the logic kick in.

* Make sure to close the failed connections after the execution

* Improve comments

* Fix savepoints in unified executor.

* Rebuild the WaitEventSet only when necessary

* Unclaim connections on all errors.

* Improve failure handling for unified executor

   - Implement the notion of errorOnAnyFailure. This is similar to
     Critical Connections that the connection managament APIs provide
   - If the nodes inside a modifying transaction expand, activate 2PC
   - Fix few bugs related to wait event sets
   - Mark placement INACTIVE during the execution as much as possible
     as opposed to we do in the COMMIT handler
   - Fix few bugs related to scheduling next placement executions
   - Improve decision on when to use 2PC

Improve the logic to start a transaction block for distributed transactions

- Make sure that only reference table modifications are always
  executed with distributed transactions
- Make sure that stored procedures and functions are executed
  with distributed transactions

* Move waitEventSet to DistributedExecution

This could also be local to RunDistributedExecution(), but in that case
we had to mark it as "volatile" to avoid PG_TRY()/PG_CATCH() issues, and
cast it to non-volatile when doing WaitEventSetFree(). We thought that
would make code a bit harder to read than making this non-local, so we
move it here. See comments for PG_TRY() in postgres/src/include/elog.h
and "man 3 siglongjmp" for more context.

* Fix multi_insert_select test outputs

Two things:
   1) One complex transaction block is now supported. Simply update
      the test output
   2) Due to dynamic nature of the unified executor, the orders of
      the errors coming from the shards might change (e.g., all of
      the queries on the shards would fail, but which one appears
      on the error message?). To fix that, we simply added it to
      our shardId normalization tool which happens just before diff.

* Fix subeury_and_cte test

The error message is updated from:
	failed to execute task
To:
        more than one row returned by a subquery or an expression

which is a lot clearer to the user.

* Fix intermediate_results test outputs

Simply update the error message from:
	could not receive query results
to
	result "squares" does not exist

which makes a lot more sense.

* Fix multi_function_in_join test

The error messages update from:
     Failed to execute task XXX
To:
     function f(..) does not exist

* Fix multi_query_directory_cleanup test

The unified executor does not create any intermediate files.

* Fix with_transactions test

A test case that just started to work fine

* Fix multi_router_planner test outputs

The error message is update from:
	Could not receive query results
To:
	Relation does not exists

which is a lot more clearer for the users

* Fix multi_router_planner_fast_path test

The error message is update from:
	Could not receive query results
To:
	Relation does not exists

which is a lot more clearer for the users

* Fix isolation_copy_placement_vs_modification by disabling select_opens_transaction_block

* Fix ordering in isolation_multi_shard_modify_vs_all

* Add executor locks to unified executor

* Make sure to allocate enought WaitEvents

The previous code was missing the waitEvents for the latch and
postmaster death.

* Fix rebase conflicts for master rebase

* Make sure that TRUNCATE relies on unified executor

* Implement true sequential execution for multi-row INSERTS

Execute the individual tasks executed one by one. Note that this is different than
MultiShardConnectionType == SEQUENTIAL_CONNECTION case (e.g., sequential execution
mode). In that case, running the tasks across the nodes in parallel is acceptable
and implemented in that way.

However, the executions that are qualified here would perform poorly if the
tasks across the workers are executed in parallel. We currently qualify only
one class of distributed queries here, multi-row INSERTs. If we do not enforce
true sequential execution, concurrent multi-row upserts could easily form
a distributed deadlock when the upserts touch the same rows.

* Remove SESSION_LIFESPAN flag in unified_executor

* Apply failure test updates

We've changed the failure behaviour a bit, and also the error messages
that show up to the user. This PR covers majority of the updates.

* Unified executor honors citus.node_connection_timeout

With this commit, unified executor errors out if even
a single connection cannot be established within
citus.node_connection_timeout.

And, as a side effect this fixes failure_connection_establishment
test.

* Properly increment/decrement pool size variables

Before this commit, the idle and active connection
counts were not properly calculated.

* insert_select_executor goes through unified executor.

* Add missing file for task tracker

* Modify ExecuteTaskListExtended()'s signature

* Sort output of INSERT ... SELECT ... RETURNING

* Take partition locks correctly in unified executor

* Alternative implementation for force_max_query_parallelization

* Fix compile warnings in unified executor

* Fix style issues

* Decrement idleConnectionCount when idle connection is lost

* Always rebuild the wait event sets

In the previous implementation, on waitFlag changes, we were only
modifying the wait events. However, we've realized that it might
be an over optimization since (a) we couldn't see any performance
benefits (b) we see some errors on failures and because of (a)
we prefer to disable it now.

* Make sure to allocate enough sized waitEventSet

With multi-row INSERTs, we might have more sessions than
task*workerCount after few calls of RunDistributedExecution()
because the previous sessions would also be alive.

Instead, re-allocate events when the connectino set changes.

* Implement SELECT FOR UPDATE on reference tables

On master branch, we do two extra things on SELECT FOR UPDATE
queries on reference tables:
   - Acquire executor locks
   - Execute the query on all replicas

With this commit, we're implementing the same logic on the
new executor.

* SELECT FOR UPDATE opens transaction block even if SelectOpensTransactionBlock disabled

Otherwise, users would be very confused and their logic is very likely
to break.

* Fix build error

* Fix the newConnectionCount calculation in ManageWorkerPool

* Fix rebase conflicts

* Fix minor test output differences

* Fix citus indent

* Remove duplicate sorts that is added with rebase

* Create distributed table via executor

* Fix wait flags in CheckConnectionReady

* failure_savepoints output for unified executor.

* failure_vacuum output (pg 10) for unified executor.

* Fix WaitEventSetWait timeout in unified executor

* Stabilize failure_truncate test output

* Add an ORDER BY to multi_upsert

* Fix regression test outputs after rebase to master

* Add executor.c comment

* Rename executor.c to adaptive_executor.c

* Do not schedule tasks if the failed placement is not ready to execute

Before the commit, we were blindly scheduling the next placement executions
even if the failed placement is not on the ready queue. Now, we're ensuring
that if failed placement execution is on a failed pool or session where the
execution is on the pendingQueue, we do not schedule the next task. Because
the other placement execution should be already running.

* Implement a proper custom scan node for adaptive executor

- Switch between the executors, add GUC to set the pool size
- Add non-adaptive regression test suites
- Enable CIRCLE CI for non-adaptive tests
- Adjust test output files

* Add slow start interval to the executor

* Expose max_cached_connection_per_worker to user

* Do not start slow when there are cached connections

* Consider ExecutorSlowStartInterval in NextEventTimeout

* Fix memory issues with ReceiveResults().

* Disable executor via TaskExecutorType

* Make sure to execute the tests with the other executor

* Use task_executor_type to enable-disable adaptive executor

* Remove useless code

* Adjust the regression tests

* Add slow start regression test

* Rebase to master

* Fix test failures in adaptive executor.

* Rebase to master - 2

* Improve comments & debug messages

* Set force_max_query_parallelization in isolation_citus_dist_activity

* Force max parallelization for creating shards when asked to use exclusive connection.

* Adjust the default pool size

* Expand description of max_adaptive_executor_pool_size GUC

* Update warnings in FinishRemoteTransactionCommit()

* Improve session clean up at the end of execution

Explicitly list all the states that the execution might end,
otherwise warn.

* Remove MULTI_CONNECTION_WAIT_RETRY which is not used at all

* Add more ORDER BYs to multi_mx_partitioning
2019-06-28 14:04:40 +02:00
Philip Dubé db7fdb1854 Router planner: bail on volatile functions in CTEs 2019-06-26 10:32:01 +02:00
Philip Dubé 5c62f9935a Router planner: reject SELECT FOR UPDATE ctes 2019-06-26 10:32:01 +02:00
Philip Dubé 77efec04a0 Router Planner: accept SELECT_CMD ctes in modification queries 2019-06-26 10:32:01 +02:00
Philip Dubé 84fe626378 multi_router_planner: refactor error propagation 2019-06-26 10:32:01 +02:00
Hadi Moshayedi 8e2d328530 Search all outer node levels for lateral join params. 2019-06-04 10:14:05 -07:00
Philip Dubé b5ced403d8 Also check rewrittenQuery jointree for outer join 2019-06-04 07:47:35 -07:00
Hadi Moshayedi 23207a43e0 Fix a typo: WITH CARDINALITY -> WITH ORDINALITY 2019-05-24 15:49:17 -07:00
exialin 59e54de54d Minor code clean-up 2019-05-24 14:26:26 +02:00
Hanefi Onaldi 4d737177e6
Remove redundant active placement filters and unneded sort operations
If a query is router executable, it hits a single shard and therefore has a
single task associated with it. Therefore there is no need to sort the task list
that has a single element.

Also we already have a list of active shard placements, sending it in param
and reuse it.
2019-05-24 14:16:50 +03:00
Philip Dubé 16886b3c63 Fix misc typos 2019-05-23 17:23:27 -07:00
Hadi Moshayedi dce9260c0e Fix an include in recusive_planning.c 2019-05-20 18:57:03 -07:00
Hanefi Onaldi 4030d603eb
Merge pull request #2691 from citusdata/update_changelog
Add 8.1.2 and 8.2.1 changelog entries
2019-05-15 09:18:58 +03:00
Jason Petersen 71d5d1c865 Enable variable shadowing warnings; fix all
Rather than wait for another place like the previous commit to bite us,
I think we should turn on this warning.
2019-04-30 13:24:25 -06:00
Hadi Moshayedi c9b1d9c2d1 Check all placements aren't inactive 2019-04-26 10:04:55 -07:00
Hadi Moshayedi 7b1d03772d Don't schedule tasks on inactive nodes. 2019-04-26 10:04:54 -07:00
Murat Tuncer 1424f75ec9 Support columns referencing an aliased joins
We used to rely on PG function flatten_join_alias_vars
to resolve actual columns referenced in target entry list.

The function goes deep and finds the actual relation. This logic
usually works fine. However, when joins are given an alias, inner
relation names are not visible to target entry entry. Thus relation
resolving should stop when we the target entry column refers an
rte of an aliased join.

We stopped using PG function and provided our own flatten function.
2019-03-26 09:46:22 +03:00
Jason Petersen 4c7f78bd7e Code review feedback 2019-03-25 22:07:27 -05:00
Jason Petersen 6a0dc7756e Formatting fixes
Noticed a lot of weird lines wrapped at 80; our standard is 90.
2019-03-22 20:32:19 -06:00
Jason Petersen 6acf52660c Always coerce RHS of pruning op to part. key type
Our assumption that strip_implicit_coercions would leave us with a bi-
nary-compatible type to that of the partition key was wrong. Instead,
we should ensure the RHS of the comparison we perform is proactively
coerced into a compatible type (at least binary compatible).
2019-03-22 20:32:19 -06:00
Jason Petersen 5baa257c91 Add second assert to guard against future changes
This isn't entirely necessary but I feel safer with it here.
2019-03-22 20:32:19 -06:00
Jason Petersen 69adb627c3 Add Assert that will crash before coercion fix is in 2019-03-22 20:32:19 -06:00
Marco Slot e8152d9b6d Only look in top-level rtable in ExtractFirstDistributedTableId 2019-03-20 12:14:46 +03:00
Marco Slot ee6a0b6943 Speed up RTE walkers
Do it in two ways (a) re-use the rte list as much as possible instead of
re-calculating over and over again (b) Limit the recursion to the relevant
parts of the query tree
2019-03-20 12:14:46 +03:00
Marco Slot 5ff1821411 Cache the current database name
Purely for performance reasons.
2019-03-20 12:14:46 +03:00
Marco Slot 0ea4e52df5 Add nodeId to shardPlacements and use it for shard placement comparisons
Before this commit, shardPlacements were identified with shardId, nodeName
and nodeport. Instead of using nodeName and nodePort, we now use nodeId
since it apparently has performance benefits in several places in the
code.
2019-03-20 12:14:46 +03:00
Onder Kalaci ad5ff1d01a Some queries lead to infinite recursion with recurisve planning
The rule for infinite recursion is the following:

    - If the query contains a subquery which is recursively planned, and
      no other subqueries can be recursively planned due to correlation
      (e.g., LATERAL joins), the planner keeps recursing again and again.

One interesting thing here is that even if a subquery contains only intermediate
result(s), we re-recursively plan that. In the end, the logic in the code does the following:

  - Try recursive planning any of the subqueries in the query tree
     - If any subquery is recursively planned, call the planner again
        where the subquery is replaced with the intermediate result.
        - Try recursively planning any of the queries
          - If any subquery is recursively planned, call the planner again
            where the subquery (in this case it is already intermediate result)
            is replaced with the intermediate result.
              - Try recursively planning any of the queries
                - If any subquery is recursively planned, call the planner again
                  where the subquery (in this case it is already intermediate result)
                  is replaced with the intermediate result.
                  - Try recursively planning any of the queries
                    - If any subquery is recursively planned, call the planner again
                      where the subquery (in this case it is already intermediate result)
                      is replaced with the intermediate result.
                      ......
2019-03-18 10:35:00 +03:00
velioglu faf50849d7 Enhance pushdown planning logic to handle full outer joins with using clause
Since flattening query may flatten outer joins' columns into coalesce expr that is
in the USING part, and that was not expected before this commit, these queries were
erroring out. It is fixed by this commit with considering coalesce expression as well.
2019-03-05 11:49:30 +03:00
Onder Kalaci f706772b2f Round-robin task assignment policy relies on local transaction id
Before this commit, round-robin task assignment policy was relying
on the taskId. Thus, even inside a transaction, the tasks were
assigned to different nodes. This was especially problematic
while reading from reference tables within transaction blocks.
Because, we had to expand the distributed transaction to many
nodes that are not necessarily already in the distributed transaction.
2019-02-22 19:26:38 +03:00
Onder Kalaci e521e7e39c Apply feedback 2019-02-22 18:14:30 +03:00
Onder Kalaci 407d0e30f5 Fix selectForUpdate bug 2019-02-21 18:21:41 +03:00
Onder Kalaci f144bb4911 Introduce fast path router planning
In this context, we define "Fast Path Planning for SELECT" as trivial
queries where Citus can skip relying on the standard_planner() and
handle all the planning.

For router planner, standard_planner() is mostly important to generate
the necessary restriction information. Later, the restriction information
generated by the standard_planner is used to decide whether all the shards
that a distributed query touches reside on a single worker node. However,
standard_planner() does a lot of extra things such as cost estimation and
execution path generations which are completely unnecessary in the context
of distributed planning.

There are certain types of queries where Citus could skip relying on
standard_planner() to generate the restriction information. For queries
in the following format, Citus does not need any information that the
standard_planner() generates:

  SELECT ... FROM single_table WHERE distribution_key = X;  or
  DELETE FROM single_table WHERE distribution_key = X; or
  UPDATE single_table SET value_1 = value_2 + 1 WHERE distribution_key = X;

Note that the queries might not be as simple as the above such that
GROUP BY, WINDOW FUNCIONS, ORDER BY or HAVING etc. are all acceptable. The
only rule is that the query is on a single distributed (or reference) table
and there is a "distribution_key = X;" in the WHERE clause. With that, we
could use to decide the shard that a distributed query touches reside on
a worker node.
2019-02-21 13:27:01 +03:00
Hanefi Onaldi 148dcad0bb
More documentation and stale comments rewritten 2019-02-04 20:21:51 +03:00
Hanefi Onaldi 825666f912
Query samples in docs and better errors 2019-02-04 19:20:02 +03:00
Hanefi Onaldi 574b071113
Add wrapper function introduced in PG11 for compatibility 2019-02-04 19:20:02 +03:00
Hanefi Onaldi 1106e14385
Wrap functions in subqueries
remove debug logs to fix travis tests

Support RowType functions in joins

Regression tests for a custom type function in join
2019-02-04 19:19:29 +03:00
Murat Tuncer b36b59dd4f Relax reference table restrictions in subquery union pushdowns
We used to error out if there is a reference table
in the query participating a union. This has caused
pushdownable queries to be evaluated in coordinator.

Now we let reference tables inside union queries as long
as there is a distributed table in from clause.

Existing join checks (reference table on the outer part)
sufficient enought that we do not need check the join relation
of reference tables.
2019-01-31 15:34:29 +03:00
Onder Kalaci ec67381ba2 Queries with only intermediate results do not rely on task assignment policy
Previously we allowed task assignment policy to have affect on router queries
with only intermediate results. However, that is erroneous since the code-path
that assigns placements relies on shardIds and placements, which doesn't exists
for intermediate results.

With this commit, we do not apply task assignment policies when a router query
hits only intermediate results.
2019-01-28 17:59:17 +03:00
velioglu 1bb0ec316a Reset planner restriction context instead of popping with recursive planning 2019-01-17 14:35:16 +03:00
Jason Petersen 339e6e661e
Remove 9.6 (#2554)
Removes support and code for PostgreSQL 9.6

cr: @velioglu
2019-01-16 13:11:24 -07:00
Marco Slot 1656b519c4 Plan outer joins through pushdown planning 2019-01-05 20:55:27 +01:00
Murat Tuncer b389bebda1 Move repeated code to a function 2019-01-03 17:19:01 +03:00
Murat Tuncer 2ed7d24591 Fix having clause bug for complex joins
We update column attributes of various clauses for a query
inluding target columns, select clauses when we introduce
new range table entries in the query.

It seems having clause column attributes were not updated.

This fix resolves the issue
2019-01-03 17:07:26 +03:00
velioglu 90704d9a52 Fix getting function oid to get hll_add_agg id 2018-12-10 14:16:19 +03:00
velioglu 8764a19464 Adds support for disabling hash agg with hll functions on coordinator query 2018-12-07 18:49:25 +03:00
Marco Slot 0388324fbe Expand planner readme 2018-12-04 09:55:19 +01:00
Onder Kalaci b6ebd791a6 Sort task list for multi-task explain outputs
This is purely for ensuring that regression tests do not randomly fail.
2018-11-30 11:19:37 -07:00
Marco Slot 8893cc141d Support INSERT...SELECT with ON CONFLICT or RETURNING via coordinator
Before this commit, Citus supported INSERT...SELECT queries with
ON CONFLICT or RETURNING clauses only for pushdownable ones, since
queries supported via coordinator were utilizing COPY infrastructure
of PG to send selected tuples to the target worker nodes.

After this PR, INSERT...SELECT queries with ON CONFLICT or RETURNING
clauses will be performed in two phases via coordinator. In the first
phase selected tuples will be saved to the intermediate table which
is colocated with target table of the INSERT...SELECT query. Note that,
a utility function to save results to the colocated intermediate result
also implemented as a part of this commit. In the second phase, INSERT..
SELECT query is directly run on the worker node using the intermediate
table as the source table.
2018-11-30 15:29:12 +03:00
Hanefi Onaldi 088a2ef66a throw an error when a subquery has grouping set clause 2018-11-30 13:11:32 +03:00
Nils Dijk f9520be011
Round robin queries to reference tables with task_assignment_policy set to `round-robin` (#2472)
Description: Support round-robin `task_assignment_policy` for queries to reference tables.

This PR allows users to query multiple placements of shards in a round robin fashion. When `citus.task_assignment_policy` is set to `'round-robin'` the planner will use a round robin scheduling feature when multiple shard placements are available.

The primary use-case is spreading the load of reference table queries to all the nodes in the cluster instead of hammering only the first placement of the reference table. Since reference tables share the same path for selecting the shards with single shard queries that have multiple placements (`citus.shard_replication_factor > 1`) this setting also allows users to spread the query load on these shards.

For modifying queries we do not apply a round-robin strategy. This would be negated by an extra reordering step in the executor for such queries where a `first-replica` strategy is enforced.
2018-11-15 15:11:15 +01:00
Marco Slot f383e4f307
Description: Refactor code that handles DDL commands from one file into a module
The file handling the utility functions (DDL) for citus organically grew over time and became unreasonably large. This refactor takes that file and refactored the functionality into separate files per command. Initially modeled after the directory and file layout that can be found in postgres.

Although the size of the change is quite big there are barely any code changes. Only one two functions have been added for readability purposes:

- PostProcessIndexStmt which is extracted from PostProcessUtility
- PostProcessAlterTableStmt which is extracted from multi_ProcessUtility

A README.md has been added to `src/backend/distributed/commands` describing the contents of the module and every file in the module.
We need more documentation around the overloading of the COPY command, for now the boilerplate has been added for people with better knowledge to fill out.
2018-11-14 13:36:27 +01:00
Murat Tuncer cc401a2616 Create function_utils for pg function call related utilities 2018-11-07 15:29:38 +03:00
Onder Kalaci 9e2e2a7300 Make sure to access PARAM_EXTERN accurately in PG 11
PG 11 has change the way that PARAM_EXTERN is processed.
This commit ensures that Citus follows the same pattern.

For details see the related Postgres commit:
6719b238e8
2018-10-25 21:55:03 +03:00
Jason Petersen ae9a98c2d1
Attempt to address planner context crashes
Both of these are a bit of a shot in the dark. In one case, we noticed
a stack trace where a caller received a null pointer and attempted to
dereference the memory context field (at 0x010). In the other, I saw
that any error thrown from within AdjustParseTree could keep the stack
from being cleaned up (presumably if we push we should always pop).

Both stack traces were collected during times of high memory pressure
and locally reproducing the problem locally or otherwise has been very
tricky (i.e. it hasn't been reproduced reliably at all).
2018-10-18 08:41:51 -06:00
Jason Petersen 9fb951c312
Fix user-facing typos
Lintian found these (presumably by looking in the text section and
running them through e.g. aspell).
2018-10-09 16:54:03 -07:00
Marco Slot 1ca9a5b867 Do not allow unresolved parameters in INSERT...SELECT 2018-09-24 14:12:04 +02:00
Onder Kalaci c1b5a04f6e Allow partitioned tables with replication factor > 1
With this commit, we all partitioned distributed tables with
replication factor > 1. However, we also have many restrictions.

In summary, we disallow all kinds of modifications (including DDLs)
on the partition tables. Instead, the user is allowed to run the
modifications over the parent table.

The necessity for such a restriction have two aspects:
   - We need to acquire shard resource locks appropriately
   - We need to handle marking partitions INVALID in case
     of any failures. Note that, in theory, the parent table
     should also become INVALID, which is too aggressive.
2018-09-21 14:40:41 +03:00
Onder Kalaci 41d606b575 Use tree walker instad of mutator in relation visibility
This commit uses *_walker instead of *_mutator for performance reasons.
Given that we're only updating a functionId in the tree, the approach
seems fine.
2018-09-18 09:33:01 +03:00
velioglu bd30e3e908 Add support for writing to reference tables from MX nodes 2018-08-27 18:15:04 +03:00
Onder Kalaci 910ea392f5 Prevent multiple placements of a single shard to lead huge memory allocations 2018-08-22 19:25:01 +03:00
Onder Kalaci cb481f55cf Prevent excessive number of unnecessary range table traversal 2018-08-22 11:45:00 +03:00
mehmet furkan şahin ef9f38b68d ApplyLogRedaction noop func is added 2018-08-17 14:48:54 -07:00
Onder Kalaci 974cbf11a5 Hide shard names on MX worker nodes
This commit by default enables hiding shard names on MX workers
by simple replacing `pg_table_is_visible()` calls with
`citus_table_is_visible()` calls on the MX worker nodes. The latter
function filters out tables that are known to be shards.

The main motivation of this change is a better UX. The functionality
can be opted out via a GUC.

We also added two views, namely citus_shards_on_worker and
citus_shard_indexes_on_worker such that users can query
them to see the shards and their corresponding indexes.

We also added debug messages such that the filtered tables can
be interactively seen by setting the level to DEBUG1.
2018-08-07 14:21:45 +03:00
Nils Dijk 2d13900230
error on unsupported changing of distirbution column in ON CONFLICT for INSERT ... SELECT 2018-07-23 15:18:21 +02:00
Nils Dijk 6a15e1c9fc
extract ErrorIfOnConflictNotSupported function for reuse 2018-07-23 12:20:10 +02:00
mehmet furkan şahin 3afa7f425d Topn aggregates are supported 2018-07-10 14:33:42 +03:00
Murat Tuncer f20258ef10 Expand count distinct support
We can now support more complex count distinct operations by
pulling necessary columns to coordinator and evalutating the
aggreage at coordinator.

It supports broad range of expression with the restriction that
the expression must contain a column.
2018-07-06 09:44:20 +03:00
mehmet furkan şahin 06217be326 hll aggregate functions are supported natively 2018-07-04 16:41:09 +03:00
Murat Tuncer e532755a6e Fix bug in partition column extraction
added strip_implicit_coercion prior to
checking if the expression is Const.
This is important to find values for types
like bigint.
2018-07-02 18:08:16 +03:00
Murat Tuncer 4d35b92016 Add groundwork for citus_stat_statements api 2018-06-27 14:20:03 +03:00
velioglu 53b2e81d01 Adds SELECT ... FOR UPDATE support for router plannable queries 2018-06-18 13:55:17 +03:00
Marco Slot fd4ff29f2f Add a debug message with distribution column value 2018-06-05 15:09:17 +03:00
velioglu caa27161ca Check volatile functions in modify queries 2018-05-08 11:16:40 +03:00
Marco Slot 2f9c8c6af0 Allow DML commands with unreferenced SELECT CTEs 2018-05-03 14:53:26 +02:00
Marco Slot f8cfe07fd1 Support intermediate results in distributed INSERT..SELECT 2018-05-03 14:42:28 +02:00
Marco Slot 90cdfff602 Implement recursive planning for DML statements 2018-05-03 14:42:28 +02:00
Onder Kalaci 317dd02a2f Implement single repartitioning on hash distributed tables
* Change worker_hash_partition_table() such that the
     divergence between Citus planner's hashing and
     worker_hash_partition_table() becomes the same.

   * Rename single partitioning to single range partitioning.

   * Add single hash repartitioning. Basically, logical planner
     treats single hash and range partitioning almost equally.
     Physical planner, on the other hand, treats single hash and
     dual hash repartitioning almost equally (except for JoinPruning).

   * Add a new GUC to enable this feature
2018-05-02 18:50:55 +03:00
velioglu 32bcd610c1 Support modify queries with multiple tables
With this commit we begin to support modify queries with multiple
tables if these queries are pushdownable.
2018-05-02 16:22:26 +03:00
velioglu d9fa69c031 Refactor query pushdown related logic 2018-05-02 15:03:09 +03:00
velioglu 121ff39b26 Removes large_table_shard_count GUC 2018-04-29 10:34:50 +02:00
Onder Kalaci 832c91e28c Move processing each part of the query into its own functions
This commit doesn't change any of the logic at all.

Instead, the goal is to:

 * Get rid of any code duplication
 * Incremental changes to the optimizer made it slightly hard
   to follow the code, improve that and make it easier to
   implement new features
 * Simplify the code by moving each part of query processing (e.g.,
   DISTINCT, LIMIT etc) into its own function
 * Make the interaction between each part of the query more
   obvious (e.g., How DISTINCT affects LIMIT etc)
2018-04-27 17:32:38 +03:00
Marco Slot 304b3a41ba Cache the partition column Var 2018-04-26 14:58:16 -06:00
Murat Tuncer a6fe5ca183 PG11 compatibility update
- changes in ruleutils_11.c is reflected
- vacuum statement api change is handled. We now allow
  multi-table vacuum commands.
- some other function header changes are reflected
- api conflicts between PG11 and earlier versions
  are handled by adding shims in version_compat.h
- various regression tests are fixed due output and
  functionality in PG1
- no change is made to support new features in PG11
  they need to be handled by new commit
2018-04-26 11:29:43 +03:00
Onder Kalaci ac8f2f1e6d Eliminate code duplication in WorkerExtendedOpNode()
Before this commit, we had code duplication in the
WorkerExtendedOpNode(). The duplication was
noticeable and any change is prone to bugs.

The PR consists of 4 commits. Each commit incrementally
fixes the problem by moving certain parts of the duplicated
code into smaller, better-documented functions.
2018-04-25 08:54:59 +03:00
Onder Kalaci ee748d9140 Unify extendedOpNode Processing
Before this commit, we had a divergence among
the creation of master/worker extended op nodes.

This commit moves the related parts into a single place
and allows the creation of master/extended op nodes to
share a common data structure.
2018-04-24 11:56:38 +03:00
Onder Kalaci 814f0e3acc Ensure Citus never try to access a not planned subquery
PostgreSQL might remove some of the subqueries when they do not
contribute to the query result at all. Citus should not try to
access such subqueries during planning.
2018-04-20 13:52:00 +03:00
mehmet furkan şahin e5a5502b16 Adds support for multiple ANDs in Having
This PR adds support for multiple AND expressions in Having
for pushdown planner. We simply make a call to make_ands_explicit
from MultiLogicalPlanOptimize for the having qual in
workerExtendedOpNode.
2018-04-16 14:14:48 +03:00
velioglu 82b2d21b0c Convert broadcast join to reference join
After this commit large_table_shard_count wont be used to
check whether broadcast join, which is renamed as reference
join, can be applied. Reference join can only be applied over
reference tables.
2018-04-13 12:58:14 +03:00
velioglu 1b92812be2 Add co-placement check to CoPartition function 2018-04-13 12:13:08 +03:00
Marco Slot ee132c5ead Prune shards once per relation in subquery pushdown 2018-04-10 20:33:07 +02:00
velioglu 72dfe4a289 Adds colocation check to local join 2018-04-04 22:49:27 +03:00
velioglu 698d585fb5 Remove broadcast join logic
After this change all the logic related to shard data fetch logic
will be removed. Planner won't plan any ShardFetchTask anymore.
Shard fetch related steps in real time executor and task-tracker
executor have been removed.
2018-03-30 11:45:19 +03:00
Murat Tuncer 1440caeef2
Fix incorrect limit pushdown when distinct clause is not superset of group by (#2035)
Pushing down limit and order by into workers may produce
wrong output when distinct on() clause has expressions,
aggregates, or window functions.

This checking allows pushing down of limits only if
distinct clause is a superset of group by clause. i.e. it contains all clauses in group by.
2018-03-07 13:24:56 +03:00
Onder Kalaci 40b898b59f Improve error messages for INSERT queries that have subqueries 2018-03-05 14:46:47 +02:00
Murat Tuncer 76f6883d5d
Add support for window functions that can be pushed down to worker (#2008)
This is the first of series of window function work.

We can now support window functions that can be pushed down to workers.
Window function must have distribution column in the partition clause
 to be pushed down.
2018-03-01 19:07:07 +03:00
Marco Slot e79db17b91 Update comment in WorkerAggregateExpressionList 2018-02-27 23:48:25 +01:00
Murat Tuncer e13c5beced
Fix worker query when order by avg aggregate is used (#2024)
We push down order by to worker query when limit is specified
(with some other additional checks). If the query has an expression
on an aggregate or avg aggregate by itself, and there is an order
by on this particular target we may send wrong order by to worker
query with potential to affect query result.

The fix creates a auxilary target entry in the worker query and
uses that target entry for sorting.
2018-02-28 12:12:54 +03:00
Metin Doslu bcf660475a Add support for modifying CTEs 2018-02-27 15:08:32 +02:00
velioglu 78e6d990a2 Fix master plan of the query with distinct, aggregate and group by clauses.
Before this PR, we were trusting on the columns of group by about
guaranteeing the uniqueness of the results. However, this assumption
is correct only if the columns in the group by is subset of columns
in the distinct clause. It can be wrong if we have part of group by
columns and some aggregation columns in the distinct clause. With
this PR, we add distinct plan on top of aggregate plan when necessary.
2018-02-26 15:30:15 +03:00
Onder Kalaci 1c930c96a3 Support non-co-located joins between subqueries
With #1804 (and related PRs), Citus gained the ability to
plan subqueries that are not safe to pushdown.

There are two high-level requirements for pushing down subqueries:

   * Individual subqueries that require a merge step (i.e., GROUP BY
     on non-distribution key, or LIMIT in the subquery etc). We've
     handled such subqueries via #1876.

    * Combination of subqueries that are not joined on distribution keys.
      This commit aims to recursively plan some of such subqueries to make
      the whole query safe to pushdown.

The main logic behind non colocated subquery joins is that we pick
an anchor range table entry and check for distribution key equality
of any  other subqueries in the given query. If for a given subquery,
we cannot find distribution key equality with the anchor rte, we
recursively plan that subquery.

We also used a hacky solution for picking relations as the anchor range
table entries. The hack is that we wrap them into a subquery. This is only
necessary since some of the attribute equivalance checks are based on
queries rather than range table entries.
2018-02-26 13:50:37 +02:00
Onder Kalaci 7b57e0562a Add infrastructure for detecting non-colocated subqueries 2018-02-26 13:28:25 +02:00
Onder Kalaci 4d70c86645 Leaf level recursive planning for non colocated subqueries
With this commit, we enable recursive planning for the subqueries
that are not joined on the distribution keys.
2018-02-26 13:28:24 +02:00
Onder Kalaci e998703ff8 Enable restriction eq. checks for top level set operations
We used to only support pushdownable set operations inside a
subquery, however, we could easily expand the restriction
checks to cover top level set operations as well.
2018-02-26 13:28:24 +02:00
Onder Kalaci e8aa532a90 Refactor checks for distribution key equality
Change some function names, ensure we stick to Citus'
function order rules etc.
2018-02-26 13:28:24 +02:00
Markus Sintonen 6202e80d06 Implemented jsonb_agg, json_agg, jsonb_object_agg, json_object_agg 2018-02-18 00:19:18 +02:00
velioglu 195ac948d2 Recursively plan subqueries in WHERE clause when FROM recurs 2018-02-13 19:52:12 +03:00
Marco Slot ee6a751798 Only copy distributed plan when modifying it 2018-02-12 16:30:55 +01:00
Onder Kalaci 94c5ac6ebb Remove duplicate join restrictions
We use PostgreSQL hooks to accumulate the join restrictions
and PostgreSQL gives us all the join paths it tries while
deciding on the join order. Thus, for queries that have many
joins, this function is likely to remove lots of duplicate join
restrictions. This becomes relevant for Citus on query pushdown
check peformance.
2018-02-12 18:35:05 +02:00
Onder Kalaci c228d8ff3d Refactor equivalance generation related codes
This commit changes the APIs for restriction generation to make future
changes simpler.
2018-02-12 18:35:04 +02:00
Onder Kalaci 2f2d350924 Refactor relation restriction related codes
This commit moves some of the functions to a more relevant
source file.
2018-02-12 18:35:04 +02:00
Murat Tuncer 901b543e20 Fix count distinct using field select on top level query
We were allowing count distict queries even if they were
not directly on columns if the query is grouped on
distribution column.

When performing these checks we were skipping subqueries
because they also perform this check in a more concise manner.
We relied on oid SUBQUERY_RELATION_ID (10000) to decide if
a given RTE relation id denotes a subquery, however, we also
use SUBQUERY_PUSHDOWN_RELATION_ID (10001) for some subqueries.

We skip both type of subqueries with this change.
2018-02-06 13:16:10 +03:00
metdos 35f864bcaf Respect enable_hashagg in the master planner 2018-02-05 15:06:00 +02:00
metdos 3d540d961c Fix typo in grouping_is_sortable() 2018-02-05 12:10:19 +02:00
Onder Kalaci a1bbdf2d44 Outer joins should also use subquery pushdown planner if join
clause is not supported

This change allows unsupported clauses to go through query pushdown
planner instead of erroring out as we already do for non-outer joins.
2017-12-29 16:40:47 +02:00
Marco Slot 09c09f650f Recursively plan set operations when leaf nodes recur 2017-12-26 13:46:55 +02:00
mehmet furkan şahin 446893234a unsupported subquery error messages are fixed 2017-12-25 15:10:59 +03:00
mehmet furkan şahin 57bc86e23d new debug output for subplans 2017-12-25 09:50:51 +03:00
Murat Tuncer 87c6f306f1
Fix join clause eq restrictions (#1884)
We used to error out if the join clause includes filters like
t1.a < t2.a even if other filter like t1.key = t2.key exists.

Recently we lifted that restriction in subquery planning by
not lifting that restriction and focusing on equivalance classes
provided by postgres.

This checkin forwards previously erroring out real-time queries
due to join clauses to subquery planner and let it handle the
join even if the query does not have a subquery.

We are now pushing down queries that do not have any
subqueries in it. Error message looked misleading, changed to a more descriptive one.
2017-12-22 12:16:14 +03:00
Murat Tuncer a9cf0c3e66
Fix CTE column alias issue (#1893)
We were creating intermediate query result's target
names from subquery target list. Now we also check
if cte re-defines its column name aliases, and create
intermediate result query accordingly.
2017-12-22 09:39:40 +03:00
Onder Kalaci 0d5a4b9c72 Recursively plan subqueries that are not safe to pushdown
With this commit, Citus recursively plans subqueries that
are not safe to pushdown, in other words, requires a merge
step.

The algorithm is simple: Recursively traverse the query from bottom
up (i.e., bottom meaning the leaf queries). On each level, check
whether the query is safe to pushdown (or a single repartition
subquery). If the answer is yes, do not touch that subquery. If the
answer is no, plan the subquery seperately (i.e., create a subPlan
for it) and replace the subquery with a call to
`read_intermediate_results(planId, subPlanId)`. During the the
execution, run the subPlans first, and make them avaliable to the
next query executions.

Some of the queries hat this change allows us:

   * Subqueries with LIMIT
   * Subqueries with GROUP BY/DISTINCT on non-partition keys
   * Subqueries involving re-partition joins, router queries
   * Mixed usage of subqueries and CTEs (i.e., use CTEs in
     subqueries as well). Nested subqueries as long as we
     support the subquery inside the nested subquery.
   * Subqueries with local tables (i.e., those subqueries
     has the limitation that they have to be leaf subqueries)

   * VIEWs on the distributed tables just works (i.e., the
     limitations mentioned below still applies to views)

Some of the queries that is still NOT supported:

  * Corrolated subqueries that are not safe to pushdown
  * Window function on non-partition keys
  * Recursively planned subqueries or CTEs on the outer
    side of an outer join
  * Only recursively planned subqueries and CTEs in the FROM
    (i.e., not any distributed tables in the FROM) and subqueries
    in WHERE clause
  * Subquery joins that are not on the partition columns (i.e., each
    subquery is individually joined on partition keys but not the upper
    level subquery.)
  * Any limitation that logical planner applies such as aggregate
    distincts (except for count) when GROUP BY is on non-partition key,
    or array_agg with ORDER BY
2017-12-21 08:37:40 +02:00
Onder Kalaci e12ea914b9 Refactor ErrorIfQueryNotSupported to defer errors 2017-12-20 09:03:49 +02:00
Onder Kalaci 71ce42b936 Refactor RecursivelyPlanSubqueriesAndCTEs() to make it ready
to work with subqueries
2017-12-20 09:03:47 +02:00
Marco Slot 5e0539efa3 Plan CTEs when subquery pushdown is on 2017-12-19 16:34:56 +01:00
Marco Slot 44a1ea631a Show distributed subplan ID in EXPLAIN output 2017-12-19 16:34:56 +01:00
Marco Slot 7dab078e67 Set cost estimates for read_intermediate_result 2017-12-18 16:23:44 +01:00
Marco Slot 74bd33d0cc Revert "Plan CTEs when subquery pushdown is on"
This reverts commit e3b953b8e3.
2017-12-17 22:34:20 +01:00
Marco Slot aca5f35ab9 Revert "Show distributed subplan ID in EXPLAIN output"
This reverts commit 686b079272.
2017-12-17 22:34:04 +01:00
Marco Slot e3b953b8e3 Plan CTEs when subquery pushdown is on 2017-12-17 21:49:36 +01:00
Marco Slot 686b079272 Show distributed subplan ID in EXPLAIN output 2017-12-16 11:32:01 +01:00
Marco Slot ea6b98fda4 Allow count(distinct) in queries with a subquery 2017-12-15 15:24:26 +01:00
Marco Slot 5a69fc1b17 Relax checks on recurring tuples in FROM with sublinks 2017-12-15 11:56:06 +01:00
Marco Slot 2e2b4e81fa Add support for CTEs in distributed queries 2017-12-14 09:32:55 +01:00
Marco Slot 66f9f1d6cd Make some intermediate results functions public 2017-12-14 09:32:55 +01:00
Onder Kalaci 86b2d9420c Treat recurring tuples as reference table for GROUP BY checks
read_intermediate_results() and immutable functions are implemented.
Empty join trees seems not applicable here.
2017-12-13 14:55:42 +02:00
Marco Slot 60a1e31671 Allow queries with local tables in NeedsDistributedPlanning 2017-12-07 16:20:23 +01:00
Marco Slot d8fea4efb8 Revert "Allow queries with local tables in NeedsDistributedPlanning"
This reverts commit d2bac081e8.
2017-12-07 11:19:11 +01:00
Marco Slot d2bac081e8 Allow queries with local tables in NeedsDistributedPlanning 2017-12-07 11:02:16 +01:00
Onder Kalaci c42a92afd2 Fix bug related to incrementing an index not properly 2017-12-07 08:50:57 +02:00