1740 Commits (8f9ef63e8a993eb5a576aa7ca28463ee63f202b6)

Author SHA1 Message Date
Halil Ozan Akgul 8ce4f20061 Fixes the bug of grants on public schema propagation 2020-02-05 18:05:58 +03:00
Hadi Moshayedi 9dd14fa90d Rename discarded target list items in repartitioned INSERT/SELECT 2020-02-05 11:06:44 +01:00
Onder Kalaci c7e2309f4c Improve single hash-repartitioning with numeric (or non-int) types
We used to treat the shard interval array that we passed as numeric[].
However, it should be int[], as the shard ranges are int[].
2020-02-04 20:30:04 +01:00
Hadi Moshayedi bc1a800f70 Use current user for repartition join temp schemas.
Otherwise when using a less privileged user we might get
errors when trying to create the schema.
2020-02-04 09:48:20 -08:00
Hadi Moshayedi 264530311a Don't use distributed insert/select for repartitioned joins 2020-02-03 13:13:30 -08:00
Marco Slot be77d3304f Fixup 2020-02-03 11:59:55 +01:00
Marco Slot b0fd6aa006 If a reference table was read over multiple connections, do not assign a connection 2020-02-03 11:54:29 +01:00
Onder Kalaci 2f274a4fce Make sure to go deeper into the functions to search for PARAMs
For example, a PARAM might reside inside a function just because
of a type cast, such as the following:

```
               {FUNCEXPR
               :funcid 1740
               :funcresulttype 1700
               :funcretset false
               :funcvariadic false
               :funcformat 2
               :funccollid 0
               :inputcollid 0
               :args (
                  {PARAM
                  :paramkind 0
                  :paramid 15
                  :paramtype 23
                  :paramtypmod -1
                  :paramcollid 0
                  :location 356
                  }
               )
```

We should recursively check the expression before bailing out.
2020-02-03 09:36:12 +01:00
Philip Dubé d43c80d4d8 pullUpIntermediateRows should not be true when groupedByDisjointPartitionColumn is true
This was causing 'SELECT id, stddev(y_int) FROM tbl GROUP BY id' to push down stddev without GROUP BY
2020-01-30 21:18:08 +00:00
Philip Dubé 84a500ffc6 CitusRemoveDirectory: loop when directory is not empty
Sometimes during errors workers will create files while we're deleting intermediate directories

example:
DEBUG:  could not remove file "base/pgsql_job_cache/10_0_431": Directory not empty
DETAIL:  WARNING from localhost:57637
2020-01-30 20:02:08 +00:00
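The retry behaviour described in 84a500ffc6 can be illustrated with a minimal Python sketch. Citus itself is written in C; the function name `remove_directory`, the `max_attempts` bound, and the scratch directory are assumptions made for this example and are not taken from the Citus source.

```python
import errno
import os
import shutil
import tempfile


def remove_directory(path, max_attempts=10):
    """Recursively remove `path`, retrying if files appear during removal.

    Hypothetical helper: it only mirrors the idea of looping while the
    directory is reported as "not empty".
    """
    for _ in range(max_attempts):
        try:
            shutil.rmtree(path)
            return True
        except FileNotFoundError:
            if not os.path.exists(path):
                # The directory is already gone, so there is nothing to do.
                return True
            # An entry vanished while we were walking the tree; retry.
        except OSError as error:
            if error.errno != errno.ENOTEMPTY:
                raise
            # A concurrent writer created a file after the directory was
            # listed; loop and try again.
    return False


# Usage: create a scratch directory with one file, then remove it.
scratch = tempfile.mkdtemp()
open(os.path.join(scratch, "10_0_431"), "w").close()
removed = remove_directory(scratch)
```

Bounding the number of attempts is a design choice of this sketch; it avoids spinning forever if another process keeps writing into the directory.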
Philip Dubé 5fccc56d3e Expand the set of aggregates which cannot have LIMIT approximated
Previously we only prevented AVG from being pushed down, but this is incorrect:
- array_agg, while somewhat nonsensical to order by, will potentially be missing values
- combinefunc aggregation will raise errors about cstrings not being comparable (while we also can't know if the aggregate is commutative)

This commit limits approximating LIMIT pushdown when ordering by aggregates to:
min, max, sum, count, bit_and, bit_or, every, any
Which means of those we previously supported, we now exclude:
avg, array_agg, jsonb_agg, jsonb_object_agg, json_agg, json_object_agg, hll_add, hll_union, topn_add, topn_union
2020-01-30 17:45:18 +00:00
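The allowlist introduced by 5fccc56d3e can be sketched as a simple membership check. This is an illustration in Python, not the C code from Citus; the names `LIMIT_APPROXIMATION_AGGREGATES` and `can_approximate_limit` are hypothetical, and only the aggregate names come from the commit message.

```python
# Aggregates the commit message lists as still eligible for approximated
# LIMIT pushdown when they appear in ORDER BY.
LIMIT_APPROXIMATION_AGGREGATES = frozenset(
    {"min", "max", "sum", "count", "bit_and", "bit_or", "every", "any"}
)


def can_approximate_limit(order_by_aggregates):
    """Return True only if every ORDER BY aggregate is on the allowlist."""
    return all(
        name.lower() in LIMIT_APPROXIMATION_AGGREGATES
        for name in order_by_aggregates
    )


allowed = can_approximate_limit(["sum", "COUNT"])
rejected_avg = can_approximate_limit(["avg"])
rejected_mixed = can_approximate_limit(["sum", "array_agg"])
```

An allowlist is the conservative choice here: any aggregate not explicitly known to be safe, including user-defined ones, is excluded by default.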
Önder Kalacı 8584cb005b
Do not evaluate functions on the coordinator for SELECT queries (#3440)
Previously, the logic for evaluating the functions and the parameters
was the same. That ended up evaluating the functions inaccurately
on the coordinator. Instead, split the function evaluation logic
from the parameter evaluation logic.
2020-01-30 08:47:28 +01:00
Önder Kalacı 412fe719f7
Hide citus.enable_ddl_propagation setting (#3437)
As it is powerful and can cause metadata inconsistency. See the following steps:

(Note that we cannot use PGC_SUSET because on Citus MX we need this flag for non-
superusers as well)

```SQL
CREATE TABLE test_ref_table(key int);
SELECT create_reference_table('test_ref_table');

SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition;
┌────────────────┬──────────────┐
│  logicalrelid  │ logicalrelid │
├────────────────┼──────────────┤
│ test_ref_table │        16831 │
└────────────────┴──────────────┘
(1 row)

Time: 0.929 ms

SELECT relname FROM pg_class WHERE oid = 16831;
┌────────────────┐
│    relname     │
├────────────────┤
│ test_ref_table │
└────────────────┘
(1 row)

Time: 0.785 ms

SET citus.enable_ddl_propagation TO off;

 DROP TABLE test_ref_table ;

SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition;
┌──────────────┬──────────────┐
│ logicalrelid │ logicalrelid │
├──────────────┼──────────────┤
│ 16831        │        16831 │
└──────────────┴──────────────┘
(1 row)
Time: 0.972 ms

SELECT relname FROM pg_class WHERE oid = 16831;
┌─────────┐
│ relname │
├─────────┤
└─────────┘
(0 rows)

Time: 0.908 ms

 SELECT master_add_node('localhost', 9703);
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 5.028 ms
!>

```
2020-01-29 10:17:53 +01:00
SaitTalhaNisanci 94bd563ff0
switch back to old memory context in cache local plan for task (#3428) 2020-01-27 13:00:46 +03:00
Önder Kalacı 4519d3411d
Improve the representation of used sub plans (#3411)
Previously, we've identified the usedSubPlans by only looking
to the subPlanId.

With this commit, we're expanding it to also include information
on the location of the subPlan.

This is useful to distinguish the cases where the subPlan is used
either on only HAVING or both HAVING and any other part of the query.
2020-01-24 10:47:14 +01:00
Philip Dubé 50c5e814c8 CurrentDatabaseName: return const char* as we're borrowing from cache 2020-01-23 22:49:35 +00:00
Hadi Moshayedi 1dc19215eb Don't error for ENOENT in CitusRemoveDirectory.
For concurrency reasons, this can happen even if initial stat succeeded.
2020-01-23 10:07:54 -08:00
Hadi Moshayedi 3e1004c232 Change DistributedResultFragment::nodeId to uint32.
This is to match the type of WorkerNode::nodeId.
2020-01-23 09:33:15 -08:00
Önder Kalacı ef7d1ea91d
Locally execute queries that don't need any data access (#3410)
* Update shardPlacement->nodeId to uint

As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.

* Relax the restrictions on using the local execution

Previously, whenever any local execution happens, we disabled further
commands to do any remote queries. The basic motivation for doing that
is to prevent any accesses in the same transaction block to access the
same placements over multiple sessions: one is local session the other
is remote session to the same placement.

However, the current implementation does not distinguish whether a local
access is to a placement or not. For example, we could have local accesses
that only touch intermediate results. In that case, we should not
apply the same restrictions, as they serve no purpose.

So, this is a pre-requisite for executing the intermediate result only
queries locally.

* Update the error messages

As the underlying implementation has changed, reflect it in the error
messages.

* Keep track of connections to local node

With this commit, we're adding infrastructure to track if any connection
to the same local host is done or not.

The main motivation for doing this is that we were previously more
conservative about choosing local execution. Simply, we disallowed
local execution if any connection to any remote node had been made. However,
if we want to use local execution for intermediate-result-only queries,
this would be annoying because we expect all queries to touch a remote node
before the final query.

Note that this approach is still limiting in Citus MX case, but for now
we can ignore that.

* Formalize the concept of Local Node

Also some minor refactoring while creating the dummy placement

* Write intermediate results locally when the results are only needed locally

Before this commit, Citus always broadcast all the intermediate
results to remote nodes. However, it is not always necessary to push
the results to remote nodes.

There are two notable cases for doing that:

   (a) When the query consists of only intermediate results
   (b) When the query is a zero shard query

In both of the above cases, we don't need to access any data on the shards. So,
it is a valuable optimization to skip pushing the results to remote nodes.

The pattern mentioned in (a) is actually a common pattern that Citus users
use in practice. For example, if you have the following query:

WITH cte_1 AS (...), cte_2 AS (....), ... cte_n AS (...)
SELECT ... FROM cte_1 JOIN cte_2 .... JOIN cte_n ...;

The final query could be operating only on intermediate results. With this patch,
the intermediate results of the ctes are not unnecessarily pushed to remote
nodes.

* Add specific regression tests

As there are edge cases in Citus MX and with round-robin policy,
use the same queries on those cases as well.

* Fix failure tests

By forcing local execution not to be used for intermediate results, since
all the tests expect the results to be pushed remotely.

* Fix flaky test

* Apply code-review feedback

Mostly style changes

* Limit the max value of pg_dist_node_seq to reserve for internal use
2020-01-23 18:28:34 +01:00
Onder Kalaci a0dff301c7 Update shardPlacement->nodeId to uint
As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.
2020-01-23 13:00:24 +01:00
Jelte Fennema c62b756f34
Fix new method of locking shard distribution metadata (#3407)
In #3374 a new way of locking shard distribution metadata was
implemented. However, this was only done in the function
`LockShardDistributionMetadata` and not in
`TryLockShardDistributionMetadata`. This is bad, since it causes these
locks to not block each other in some cases.

This commit fixes this issue by sharing the code that sets the locktag
between the two functions.
2020-01-22 16:44:17 +01:00
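The fix in c62b756f34 amounts to building the lock tag in one place so the blocking and non-blocking variants always contend on the same lock. A rough in-process analogy in Python follows; Citus uses PostgreSQL's lock manager and `LOCKTAG`, so the dictionary of `threading.Lock` objects and all function names here are assumptions for illustration only. The tag fields (colocation id plus shard interval index) follow the description of #3374.

```python
import threading
from collections import defaultdict

# One lock object per tag; a stand-in for a real lock manager.
_locks = defaultdict(threading.Lock)


def shard_distribution_lock_tag(colocation_id, shard_interval_index):
    """Single source of truth for what gets locked."""
    return ("shard_distribution_metadata", colocation_id, shard_interval_index)


def lock_shard_distribution_metadata(colocation_id, shard_interval_index):
    """Blocking variant: waits until the lock is available."""
    tag = shard_distribution_lock_tag(colocation_id, shard_interval_index)
    _locks[tag].acquire()


def try_lock_shard_distribution_metadata(colocation_id, shard_interval_index):
    """Non-blocking variant: returns False if the lock is already held."""
    tag = shard_distribution_lock_tag(colocation_id, shard_interval_index)
    return _locks[tag].acquire(blocking=False)


# Usage: the try-lock variant now conflicts with the blocking variant.
lock_shard_distribution_metadata(1, 0)
conflicting = try_lock_shard_distribution_metadata(1, 0)
independent = try_lock_shard_distribution_metadata(1, 1)
```

If the two functions computed their tags separately and one of them was not updated, `conflicting` would come back `True`, which is the kind of non-blocking behaviour the commit message describes.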
Jelte Fennema cd5259a25a
Do not place new shards with shards in TO_DELETE state (#3408)
When creating a new distributed table, the shards would colocate with shards
with SHARD_STATE_TO_DELETE (shardstate = 4). This means that if that state was
because of a shard move, the new shard would be created on two nodes and it
would not get deleted since its shard state would be 1.
2020-01-22 14:52:12 +01:00
Onder Kalaci 4be69bbf6f Fix reference table issue 2020-01-20 18:45:18 +00:00
Halil Ozan Akgul b40f067d05 Adds propagation for grant on schema commands 2020-01-20 14:51:28 +03:00
Philip Dubé fdcc413559 Code cleanup of adaptive_executor, connection_management, placement_connection
adaptive_executor: sort includes, use foreach_ptr, remove lies from FinishDistributedExecution docs
connection_management: rename msecs, which isn't milliseconds
placement_connection: small typos
2020-01-17 17:44:47 +00:00
Onder Kalaci 2f0ef8bc36 Apply feedback 1 2020-01-17 16:06:04 +01:00
Onder Kalaci 0bf1e81e33 Cache local plans on BeginScan 2020-01-17 16:02:57 +01:00
Onder Kalaci 5dc454cdad Exclude localPlannedStatements from copy distributedPlan 2020-01-17 16:02:57 +01:00
Onder Kalaci ff12df411b Add LocalPlannedStatement struct 2020-01-17 16:02:57 +01:00
Onder Kalaci 3833a7e686 Fix issues for CTE inlining on Postgres 11
Comment from code:

/*
 * We had to implement this hack because on Postgres11 and below, the originalQuery
 * and the query would have significant differences in terms of CTEs where CTEs
 * would not be inlined on the query (as standard_planner() wouldn't inline CTEs
 * on PG 11 and below).
 *
 * Instead, we prefer to pass the inlined query to the distributed planning. We rely
 * on the fact that the query includes subqueries, and it'd definitely go through
 * query pushdown planning. During query pushdown planning, the only relevant query
 * tree is the original query.
 */
2020-01-17 11:59:02 +01:00
Jelte Fennema 246435be7e
Lazy query deparsing executable queries (#3350)
Deparsing and parsing a query can be heavy on CPU. When locally executing 
the query we don't need to do this in theory most of the time.

This PR is the first step in allowing to skip deparsing and parsing
the query in these cases, by lazily creating the query string and
storing the query in the task. Future commits will make use of this and
not deparse and parse the query anymore, but use the one from the task
directly.
2020-01-17 11:49:43 +01:00
Hadi Moshayedi 6cf1c01660 Don't use repartitioned INSERT/SELECT for repartition joins 2020-01-16 23:40:31 -08:00
Hadi Moshayedi 5eeb07124f Repartitioned INSERT/SELECT: include job id in result id prefix 2020-01-16 23:24:52 -08:00
Hadi Moshayedi a079278b0c Repartitioned INSERT/SELECT: Add a GUC to enable/disable it 2020-01-16 23:24:52 -08:00
Hadi Moshayedi ce5eea4885 INSERT/SELECT: make SELECT column names unique 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 8635396cea Repartitioned INSERT/SELECT: Test rollback behaviour 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 97072c9eb1 INSERT/SELECT: show method in EXPLAIN output 2020-01-16 23:24:52 -08:00
Hadi Moshayedi fe548b762f Repartitioned INSERT/SELECT: Test CTEs 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 494cc383cc Repartitioned INSERT/SELECT: Enable RETURNING 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 44a2aede16 Don't start a coordinated transaction on workers.
Otherwise transaction hooks of Citus kick in and might cause unwanted errors.
2020-01-16 23:24:52 -08:00
Hadi Moshayedi 42c3c03b85 Handle extra columns added in ExpandWorkerTargetEntry() in repartitioned INSERT/SELECT 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 89463f9760 Repartitioned INSERT/SELECT: cast columns in SELECT targets 2020-01-16 23:24:52 -08:00
Hadi Moshayedi d67a384350 Enable repartitioned INSERT/SELECT ON CONFLICT. 2020-01-16 23:24:52 -08:00
Hadi Moshayedi b4e5f4b10a Implement INSERT ... SELECT with repartitioning 2020-01-16 23:24:52 -08:00
Hadi Moshayedi ced876358d INSERT/SELECT: Refactor out AddInsertSelectCasts 2020-01-16 23:24:52 -08:00
Hadi Moshayedi d449c1857c INSERT/SELECT: Use ExecutePlan* instead of ExecuteSelect* 2020-01-16 23:24:52 -08:00
Jelte Fennema 0ee1eab070 Make tests fail with a useful error message 2020-01-16 18:30:30 +01:00
Marco Slot 82f1fffa28 Fix epoll_ctl() error message on connection error 2020-01-16 06:40:57 +01:00
Onder Kalaci dc17c2658e Defer shard pruning for fast-path router queries to execution
This is purely to enable better performance with prepared statements.
Before this commit, the fast path queries with prepared statements
where the distribution key includes a parameter always went through
distributed planning. After this change, we only go through distributed
planning on the first 5 executions.
2020-01-16 16:59:36 +01:00
Onder Kalaci 933d666c0d Do not forget to copy fastPathRouterPlan@DistributedPlan 2020-01-16 16:39:20 +01:00
Halil Ozan Akgul c5539d20d9 Adds alter table schema propagation 2020-01-16 17:04:16 +03:00
Nils Dijk b6e09eb691
Fix: distributed function with table reference in declare (#3384)
DESCRIPTION: Fixes a problem when adding a new node due to tables referenced in a function's body

Fixes #3378 

It was reported that `master_add_node` would fail if a distributed function has a table name referenced in its declare section of the body. By default postgres validates the body of a function on creation. This is not a problem in the normal case as tables are replicated to the workers when we distribute functions.

However, when a new node is added we first create dependencies on the workers before we try to create any tables, and the original tables get created out of band when the metadata gets synced to the new node. This causes the function body validator to raise an error that the table is not on the worker.

To mitigate this issue we set `check_function_bodies` to `off` right before we are creating the function.

The added test shows this does resolve the issue. (issue can be reproduced on the commit without the fix)
2020-01-16 14:21:54 +01:00
Jelte Fennema e76281500c
Replace shardId lock with lock on colocation+shardIntervalIndex (#3374)
This new locking pattern makes sure that some deadlocks that could
happen during rebalancing cannot occur anymore.
2020-01-16 13:14:01 +01:00
Onder Kalaci 81d8178625 Note that we'll drop the GUC after PG 11 support dropped 2020-01-16 12:28:15 +01:00
Onder Kalaci 64560b07be Update regression tests-2
In this commit, we're introducing a way to prevent CTE inlining via a GUC.

The GUC is used in all the tests where PG 11 and PG 12 tests would diverge
otherwise.

Note that, in PG 12, the restriction information for CTEs is generated. It
means that for some queries involving CTEs, Citus planner (router planner/
pushdown planner) may behave differently. So, via the GUC, we prevent
tests from diverging on PG 11 vs PG 12.

When we drop PG 11 support, we should get rid of the GUC, and mark
relevant ctes as MATERIALIZED, which does the same thing.
2020-01-16 12:28:15 +01:00
Onder Kalaci 5cb203b276 Update regression tests-1
This set of tests has changed in both PG 11 and PG 12.
The changes are only about CTE inlining kicking in both
versions, and yielding the exact same distributed planning.
2020-01-16 12:28:15 +01:00
Onder Kalaci efb1577d06 Handle CTE aliases accurately
Basically, make sure to update the column name with the CTEs alias
if we need to do so.
2020-01-16 12:28:15 +01:00
Onder Kalaci 05d600dd8f Call CTE inlining in Citus planner
The idea is simple: Inline CTEs(if any), try distributed planning.
If the planning yields a successful distributed plan, simply return
it.

If the planning fails, fallback to distributed planning on the query
tree where CTEs are not inlined. In that case, if the planning failed
just because of the CTE inlining, via recursive planning, the same
query would yield a successful plan.

A very basic set of examples:

WITH cte_1 AS (SELECT * FROM test_table)
SELECT
	*, row_number() OVER ()
FROM
	cte_1;

or

WITH a AS (SELECT * FROM test_table),
b AS (SELECT * FROM test_table)
SELECT * FROM  a JOIN b ON (a.value> b.value);
2020-01-16 12:28:15 +01:00
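The "inline, try, fall back" flow described in 05d600dd8f can be sketched as follows. This is a Python illustration, not the Citus planner: `PlanningError`, `plan_with_cte_inlining`, and the toy `toy_inline` and `toy_planner` helpers are all hypothetical, and the condition under which the toy planner fails is made up purely to exercise the fallback path.

```python
class PlanningError(Exception):
    """Raised by a planner callback when it cannot produce a plan."""


def plan_with_cte_inlining(query, inline_ctes, distributed_plan):
    """Plan the inlined query first; fall back to the original on failure."""
    try:
        return distributed_plan(inline_ctes(query))
    except PlanningError:
        # Planning may have failed only because of inlining, so retry on
        # the query tree where the CTEs are left in place.
        return distributed_plan(query)


def toy_inline(query):
    # Stand-in for real CTE inlining: just mark the query as inlined.
    return "inlined(" + query + ")"


def toy_planner(query):
    # Made-up rule: pretend window functions cannot be planned once inlined.
    if query.startswith("inlined(") and "OVER" in query:
        raise PlanningError("cannot plan window function after inlining")
    return "plan for " + query


simple = plan_with_cte_inlining(
    "WITH a AS (...) SELECT * FROM a", toy_inline, toy_planner
)
windowed = plan_with_cte_inlining(
    "WITH a AS (...) SELECT row_number() OVER () FROM a",
    toy_inline,
    toy_planner,
)
```

Keeping the original query untouched is what makes the fallback possible, which matches the stated reason for Citus taking ownership of inlining on the original query.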
Onder Kalaci 01a5800ee8 Add Citus' CTE inlining functions
With this commit we add the necessary Citus function to inline CTEs
in a queryTree.

You might ask, why do we need to inline CTEs if Postgres is already
going to do it?

Few reasons behind this decision:

- One technical note here is that Citus does the recursive CTE planning
  by checking the originalQuery which is the query that has not gone
  through the standard_planner().

  CTEs in Citus are super powerful. They are practically key for full SQL
  coverage for multi-shard queries. With CTEs, you can always reduce
  any multi-shard query into a router query via recursive
  planning (thus full SQL coverage).
  We cannot let CTE inlining break that. The main idea is Citus should
  be able to retry planning if anything goes wrong after CTE inlining.

  So, by taking ownership of CTE inlining on the originalQuery, Citus
  can fallback to recursive planning of CTEs if the planning with the
  inlined query fails. It could have been a lot harder if we had relied
  on standard_planner() to have the inlined CTEs on the original query.

- We want to have this feature in PostgreSQL 11 as well, but Postgres
  only inlines in version 12
2020-01-16 12:28:15 +01:00
Onder Kalaci 1856ab6cdd Copy & paste code from Postgres source
All the code in this commit is direct copy & paste from Postgres
source code.

We can classify the copy&paste code into two:

- Copy paste from CTE inline patch from postgres
  (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=608b167f9f9c4553c35bb1ec0eab9ddae643989b)
  These include the functions inline_cte(), inline_cte_walker(),
  contain_dml(), contain_dml_walker().
  It also includes the code in function PostgreSQLCTEInlineCondition().
  We prefer to extract that code into a separate function, because
  (a) we'll re-use the logic later (b) we added one check for PG_11

  Finally, the struct "inline_cte_walker_context" is also copied from
  the same Postgres commit.

- Copy paste from the other parts of the Postgres code

  In order to implement CTE inlining in Postgres 12, the hackers
  modified query_tree_walker()/range_table_walker() in commit
  18c0da88a5

  Since Citus needs to support the same logic in PG 11, we copied & pasted
  those functions (and related flags) with the names pg_12_query_tree_walker()
  and pg_12_range_table_walker()
2020-01-16 12:28:15 +01:00
Philip Dubé 4d9a733c2f Fix inserting multiple values with row expression partition column causing the insert to be ignored
Raise an error instead of silently inserting nothing if we hit this condition in the future
2020-01-15 21:10:50 +00:00
Philip Dubé 4989c9a15c PlacementExecutionDone: We may mark placements as failed multiple times, but should only act the first time. 2020-01-15 18:20:01 +00:00
Marco Slot f1a0582973 Make ApplyLogRedaction a macro and redefine ereport 2020-01-13 18:24:36 +01:00
Marco Slot 06709ee108 Always use NOTICE in log_remote_commands and avoid redaction when possible 2020-01-13 18:24:36 +01:00
Marco Slot 90056f7d3c Remove copy from worker for append-partitioned table 2020-01-13 23:03:40 -08:00
Philip Dubé ccabf19090 Propagate DROP ROUTINE, ALTER ROUTINE
In two places I've made code more straightforward by using ROUTINE in our own codegen

Two changes which may seem extraneous:

AppendFunctionName was updated to not use pg_get_function_identity_arguments.
This is because that function includes ORDER BY when printing an aggregate like my_rank.
While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres,
ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not.

Tests were updated to use macaddr over integer. Using integer is flaky, our logic
could sometimes end up on tables like users_table. I originally wanted to use money,
but money isn't hashable.
2020-01-13 15:37:46 +00:00
Philip Dubé 4b5d6c3ebe Rename RelayFileState to ShardState
Replace FILE_ prefix with SHARD_STATE_
2020-01-12 05:57:53 +00:00
Philip Dubé e71386af33 Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT
Also use stack allocation for walkerContext in multi_logical_optimizer
2020-01-10 16:54:00 +00:00
Hadi Moshayedi 40ba2cdd6e Test RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Hadi Moshayedi 527d7d41c1 Implement RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Philip Dubé 281aacce9b Fix row-gather for subqueries being handled by task-tracker
task-tracker has specific logic for MultiPartition when GROUP BY is missing

We were ending up in this code path because row-gather removes GROUP BY
2020-01-10 01:51:37 +00:00
Hadi Moshayedi e1e383cb59 Don't override xact id assigned by coordinator on workers.
We might need to send commands from workers to other workers. In
these cases we shouldn't override the xact id assigned by coordinator,
or otherwise we won't read the consistent set of result files
across the nodes.
2020-01-09 11:09:11 -08:00
Hadi Moshayedi c7c460e843 PartitionTasklistResults: Use different queries per placement
We need to know which placement succeeded in executing the worker_partition_query_result() call. Otherwise we wouldn't know which node to fetch from. This change allows that by introducing Task::perPlacementQueryStrings.
2020-01-09 10:55:58 -08:00
Hadi Moshayedi f38d0e5b3f Partitioned task list results. 2020-01-09 10:32:58 -08:00
Philip Dubé 73c06fae3b Introduce GetDistributeObjectOps to organize dispatch of logic dependent on node/object type 2020-01-09 18:24:29 +00:00
Philip Dubé bf7d86a3e8 Fix typo: aggragate -> aggregate 2020-01-07 01:16:09 +00:00
Philip Dubé 863bf49507 Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default 2020-01-07 01:16:04 +00:00
Jelte Fennema 5b0baea72c Refactor distributed_planner for better understandability 2020-01-06 14:23:38 +01:00
Onder Kalaci 5a1e752726 Apply feedback - add fastPath field to plan 2020-01-06 12:42:43 +01:00
Onder Kalaci 13a9b55695 Skip expensive checks when fast-path query
The definition of fast-path query is very strict. So, we don't need
to do some extra checks.
2020-01-06 12:42:43 +01:00
Onder Kalaci 7f3ab7892d Skip shard pruning when possible
We're already traversing the queryTree and finding the distribution
key value, so pass it to the later stages of the planning.
2020-01-06 12:42:43 +01:00
Onder Kalaci ca293116fa Reduce calls to FastPathRouterQuery()
Before this commit, we called it twice during planning. Instead,
we save the information and pass it.
2020-01-06 12:42:43 +01:00
Onder Kalaci c8f14c9f6c Make sure to update shard states of partitions on failures
Fixes #3331

In #2389, we've implemented support for partitioned tables with rep > 1.
The implementation is limiting the use of modification queries on the
partitions. In fact, we error out when any partition is modified via
EnsurePartitionTableNotReplicated().

However, we seem to have forgotten an important case, where the parent table's
partition is marked as INVALID. In that case, at least one of the partitions
becomes INVALID. However, we do not mark partitions as INVALID ever.

If the user queries the partition table directly, Citus could happily send
the query to INVALID placements -- which are not marked as INVALID.

This PR fixes it by marking the placements of the partitions as INVALID
as well.

The shard placement repair logic already re-creates all the partitions,
so should be fine in that front.
2020-01-06 12:26:08 +01:00
Önder Kalacı 0c70a5470e
Allow RETURNING in fast-path queries (#3352)
* Allow RETURNING in fast-path queries

Because there is no specific reason for that.
2020-01-03 13:42:50 +00:00
Önder Kalacı a174eb4f7b
Do not go through standard_planner() for INSERTs (#3348)
That seems unnecessary. We already have the notion of FastPath queries,
simply add it there.
2020-01-03 12:15:22 +00:00
Marco Slot ba39d72fe1 Fix incorrect union all pushdown issue 2020-01-01 09:03:50 +01:00
Jelte Fennema 3a042e4611 Allow cartesian products on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 61e2501645 Make any expression with two or more tables a join expression 2019-12-27 15:05:51 +01:00
Jelte Fennema 4233cd0d9d Allow non equi joins on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 7642928be1
Makefile fix DESTDIR together with cleanup (#3342)
This should fix this build issue: redmine.postgresql.org/issues/5032
2019-12-27 10:34:57 +01:00
Marco Slot b21b6905ae Do not repeat GROUP BY distribution_column on coordinator
Allow arbitrary aggregates to be pushed down in these scenarios
2019-12-25 01:33:41 +00:00
Marco Slot a2ddfecd86 Fix inconsistent shard metadata issue 2019-12-24 08:01:32 +01:00
Hadi Moshayedi d7aea7fa10 Implement partitioned intermediate results. 2019-12-24 03:53:39 -08:00
Marco Slot b37ef0e394 Fix error in distributed queries when shards are on the coordinator 2019-12-24 06:36:43 +01:00
Philip Dubé e9bbdb8f31 Fix handling of empty intermediate results when distributing custom aggregates 2019-12-23 17:27:52 +00:00
Philip Dubé f007b7f91d Also fix reindent inconsistencies with fake_fdw.c 2019-12-20 08:27:47 +00:00
Hadi Moshayedi 08eb0ade31 Fix reindent version inconsistencies.
Different versions of reindent tool reformatted citus_custom_scan.c
and citus_copyfuncs.c differently. So some developers spent some
extra attention not to commit these two files after reindent.

This PR tries to address this.
2019-12-19 23:10:34 -08:00
Jelte Fennema b655c02352
Add the necessary changes for rebalance strategies on enterprise (#3325)
This commit adds the SQL and C changes necessary to support custom rebalance
strategies in the Enterprise version of Citus.
2019-12-19 15:23:08 +01:00
Hadi Moshayedi ef487e0792 Implement fetch_intermediate_results 2019-12-18 10:46:35 -08:00
Hadi Moshayedi 249508d267 Estimate cost of read_intermediate_results() 2019-12-17 13:51:51 -08:00
Hadi Moshayedi 113bd1e5f1 Implement read_intermediate_results 2019-12-17 13:51:16 -08:00
SaitTalhaNisanci 7ff4ce2169
Add adaptive executor support for repartition joins (#3169)
* WIP

* wip

* add basic logic to run a single job with repartitioning joins with adaptive executor

* fix some warnings and return in ExecuteDependedTasks if there is none

* Add the logic to run depended jobs in adaptive executor

The execution of depended tasks logic is changed. With the current
logic:
- All tasks are created from the top level task list.
- At one iteration:
	- CurTasks whose dependencies are executed are found.
	- CurTasks are executed in parallel with adaptive executor main
logic.
- The iteration is repeated until all tasks are completed.

* Separate adaptive executor repartitioning logic

* Remove duplicate parts

* cleanup directories and schemas

* add basic repartition tests for adaptive executor

* Use the first placement to fetch data

In task tracker, when there are replicas, we try to fetch from a replica
for which a map task is succeeded. TaskExecution is used for this,
however TaskExecution is not used in adaptive executor. So we cannot use
the same thing as task tracker.

Since adaptive executor fails when a map task fails (There is no retry
logic yet). We know that if we try to execute a fetch task, all of its
map tasks already succeeded, so we can just use the first one to fetch
from.

* fix clean directories logic

* do not change the search path while creating a udf

* Enable repartition joins with adaptive executor with only enable_repartition_joins guc

* Add comments to adaptive_executor_repartition

* don't run adaptive executor repartition test in parallel with other tests

* execute cleanup only in the top level execution

* do cleanup only in the top level execution

* not begin a transaction if repartition query is used

* use new connections for repartition specific queries

New connections are opened to send repartition specific queries. The
opened connections will be closed at the FinishDistributedExecution.

While sending repartition queries no transaction is begun so that
we can see all changes.

* error if a modification was done prior to repartition execution

* not start a transaction if a repartition query and sql task, and clean temporary files and schemas at each subplan level

* fix cleanup logic

* update tests

* add missing function comments

* add test for transaction with DDL before repartition query

* do not close repartition connections in adaptive executor

* rollback instead of commit in repartition join test

* use close connection instead of shutdown connection

* remove unnecessary connection list, ensure schema owner before removing directory

* rename ExecuteTaskListRepartition

* put fetch query string in planner not executor as we currently support only replication factor = 1 with adaptive executor and repartition query and we know the query string in the planner phase in that case

* split adaptive executor repartition to DAG execution logic and repartition logic

* apply review items

* apply review items

* use an enum for remote transaction state and fix cleanup for repartition

* add outside transaction flag to find connections that are unclaimed instead of always opening a new transaction

* fix style

* wip

* rename removejobdir to partition cleanup

* do not close connections at the end of repartition queries

* do repartition cleanup in pg catch

* apply review items

* decide whether to use transaction or not at execution creation

* rename isOutsideTransaction and add missing comment

* not error in pg catch while doing cleanup

* use replication factor of the creation time, not current time to decide if task tracker should be chosen

* apply review items

* apply review items

* apply review item
2019-12-17 19:09:45 +03:00
Marco Slot 2f568ad5a5 Forbid using connections that sent intermediate results for data access and vice versa 2019-12-17 11:49:13 +01:00
Marco Slot f4031dd477 Clean up transaction block usage logic in adaptive executor 2019-12-17 10:48:19 +01:00
Nils Dijk bfc3d2eb90
make sure to correctly decrement ExecutorLevel (#3311)
DESCRIPTION: Fix counter that keeps track of internal depth in executor

While reviewing #3302 I ran into the `ExecutorLevel` variable which used a variable to keep the original value to restore on successful exit. I haven't explored the full space and if it is possible to get into an inconsistent state. However using `PG_TRY`/`PG_CATCH` seems generally more correct.

Given very bad things will happen if this level is not reset, I kept the failsafe of setting the variable back to 0 on the `XactCallback` but I did add an assert to treat it as a developer bug.
2019-12-16 20:50:13 +01:00
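The idea behind restoring `ExecutorLevel` on both the success and the error path can be sketched outside PostgreSQL. The following is a minimal illustration, not the actual patch: `setjmp`/`longjmp` stand in for `PG_TRY`/`PG_CATCH`, and `executor_level`, `run_at_next_level`, `ok_task` and `failing_task` are hypothetical names. Unlike the real code, which would rethrow the error, this sketch reports the failure through a return value.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for the executor depth counter. */
static int executor_level = 0;

/* Innermost active error handler; stand-in for PostgreSQL's exception stack. */
static jmp_buf *current_handler = NULL;

/* Raise an "error" by jumping to the innermost handler. */
static void
raise_error(void)
{
	assert(current_handler != NULL);
	longjmp(*current_handler, 1);
}

typedef void (*task_fn)(void);

/*
 * Run a task one level deeper. The counter is decremented on both the
 * normal path and the error path. Returns 0 on success, 1 if the task raised.
 */
static int
run_at_next_level(task_fn task)
{
	jmp_buf handler;
	jmp_buf *saved_handler = current_handler;
	int failed = 0;

	executor_level++;
	current_handler = &handler;

	if (setjmp(handler) == 0)
	{
		task();
	}
	else
	{
		/* error path: we arrive here through longjmp */
		failed = 1;
	}

	/* shared cleanup, reached on success and on error */
	current_handler = saved_handler;
	executor_level--;
	return failed;
}

/* Example tasks (hypothetical). */
static void
ok_task(void)
{
	assert(executor_level == 1);
}

static void
failing_task(void)
{
	raise_error();
}
```

Calling `run_at_next_level(failing_task)` leaves `executor_level` at its previous value, which is the property the commit is after.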
Marco Slot 5f656e22db Fix issue in IsMultiStatementTransaction detection 2019-12-16 17:01:43 +01:00
SaitTalhaNisanci 2829c601dd
replace Begin words in coordinated transactions with use (#3293) 2019-12-16 10:40:31 +03:00
SaitTalhaNisanci a2f2107e6a
refactor MapTaskList in multi physical planner (#3297) 2019-12-13 22:41:49 +03:00
Marco Slot 1633123d78 Fix crash in IN (NULL) queries 2019-12-13 08:35:54 +01:00
Hadi Moshayedi e7a6cc0801 Fix some typos from #3280 2019-12-12 13:29:26 -08:00
SaitTalhaNisanci 420e21919b
refactor extract distributed insert values rte (#3287) 2019-12-12 23:47:44 +03:00
Marco Slot e7a8db5493 Fix issue with some zero-shard modifications 2019-12-12 07:19:10 +01:00
SaitTalhaNisanci 2c040d2c8f
use a function for duplicate code in connection state machine (#3209) 2019-12-12 17:55:38 +03:00
SaitTalhaNisanci a0fe8646e0
add IsHoldOffCancellationReceived utility function (#3290) 2019-12-12 17:32:59 +03:00
SaitTalhaNisanci 053fe18404
not continue in sequential execution if a cancellation is received (#3289) 2019-12-12 17:22:30 +03:00
Hadi Moshayedi 939d3c955b Don't plan function joins locally 2019-12-11 16:53:29 -08:00
Hadi Moshayedi 067d92a7f6 Don't plan joins between ref tables and views locally 2019-12-11 14:31:34 -08:00
Hadi Moshayedi e3e174f30f Fix the way we check for local/reference table joins in the executor 2019-12-11 12:50:20 -08:00
SaitTalhaNisanci 13204487e9
remove copyright years (#3286) 2019-12-11 21:14:08 +03:00
SaitTalhaNisanci d10f97998c rename REMOTE_TRANS_INVALID to REMOTE_TRANS_NOT_STARTED 2019-12-11 15:24:18 +03:00
Marco Slot 133b8e1e0e Move coordinator insert..select logic into executor 2019-12-10 11:21:35 -08:00
Marco Slot 486c620a3c Fix inserts into local tables with distributed subqueries 2019-12-10 10:17:18 +01:00
Philip Dubé fcf2fd819b Add distributioncolumncollation to pg_dist_colocation
Use partition column's collation for range distributed tables
Don't allow non deterministic collations for hash distributed tables
CoPartitionedTables: don't compare unequal types
2019-12-09 19:51:40 +00:00
Philip Dubé d138bb89bf Support creating collations as part of dependency resolution. Propagate ALTER/DROP on distributed collations
Propagate CREATE COLLATION when outside transaction
2019-12-09 04:42:51 +00:00
Alexander Pyhalov 6174a4d3d6 Fix build on illumos 2019-12-06 14:40:47 +01:00
Marco Slot 6a9c0ea7fe Fix errors in DML with sublinks hidden by null expressions 2019-12-06 14:25:04 +01:00
Hadi Moshayedi d28beb3711 Detect SQL UDF Calls. 2019-12-05 14:31:05 -08:00
Philip Dubé 5a17fd6d9d Test more reference/local cases, also ALTER ROLE
Test ALTER ROLE doesn't deadlock when coordinator added, or propagate from mx workers

Consolidate wait_until_metadata_sync & verify_metadata to multi_test_helpers
2019-12-03 22:23:14 +00:00
Philip Dubé 1597fbb369 aggregate_support test: test DISTINCT, ORDER BY, FILTER, & no intermediate results
Previously,
- we'd push down ORDER BY, but this doesn't order intermediate results between workers
- we'd keep FILTER on master aggregate, which would raise an error about unexpected cstrings
2019-12-03 15:46:01 +00:00
Philip Dubé 5fcc169a3a Stray depended to dependent tidy up 2019-12-03 15:28:32 +00:00
Marco Slot bb3bc10f0c Fix segfault in column_to_column_name 2019-12-01 23:57:25 +01:00
Marco Slot b1b13e394e Fix segfault when executing DDL via UDF 2019-12-01 22:54:41 +01:00
Marco Slot 4c8d43c5d0 Bump repo version to 9.2devel 2019-11-29 07:33:39 +01:00
Nils Dijk 1ef1667ddb
add gitref to the output of citus_version (#3246)
DESCRIPTION: add gitref to the output of citus_version

During debugging of custom builds it is hard to know the exact version of the citus build you are using. This patch will add a human readable/understandable git reference to the build of citus which can be retrieved by calling `citus_version();`.
2019-11-29 15:54:09 +01:00
Marco Slot 16d1ad3666 Remove distinction between SQL_TASK and ROUTER_TASK 2019-11-29 05:58:29 +01:00
SaitTalhaNisanci aeec3d1544
fix typo in dependent jobs and dependent task (#3244) 2019-11-28 23:47:28 +03:00
Philip Dubé 0d04ff1692 RECORD: Add support for more expression types
- OpExpr
- NullIfExpr
- MinMaxExpr
- CoalesceExpr
- CaseExpr

Also fix case where ARRAY[(1,2), NULL] was rejected
2019-11-27 17:07:22 +00:00
Philip Dubé 168e11cc9b Implement support for RECORD[] where we support RECORD
Support for ARRAY[] expressions is limited to having a consistent shape,
eg ARRAY[(int,text),(int,text)] as opposed to ARRAY[(int,text),(float,text)] or ARRAY[(int,text),(int,text,float)]
2019-11-27 15:02:43 +00:00
Hadi Moshayedi 2268a9cae6 Error for metadata commands if any metadata node is out-of-sync (#3226)
* Error for metadata commands if any metadata node is out-of-sync

* Make the functions have separate APIs for all workers/metadata workers
2019-11-27 09:52:57 +01:00
Önder Kalacı 1cfbeb89ec
Make NodeCanHaveDistTablePlacements() public (#3229)
Since it is required in rebalancer.
2019-11-26 12:15:38 +01:00
Marco Slot 60b741927f Add missing include to deparse_function_stmts.c 2019-11-24 06:04:22 +01:00
Philip Dubé 261a9de42d Fix typos:
VAR_SET_VALUE_KIND -> VAR_SET_VALUE kind
beginnig -> beginning
plannig -> planning
the the -> the
er then -> er than
2019-11-25 23:24:13 +00:00
Marco Slot 4b0ac4b0dd Properly escape ALTER FUNCTION .. SET deparsing. Also test 2019-11-25 23:01:30 +00:00
Philip Dubé 3c10c27b13 GetFunctionAlterOwnerCommand: use format_procedure_qualified
distributed_functions: test a function with a quote in name
AppendDefElemSet: quote variable names
2019-11-25 23:01:30 +00:00
Philip Dubé a81e6a81ab Fix distributed aggregation for non superuser roles
Moves support functions to pg_catalog for now. We'd prefer a different solution
for when we're creating these support functions dynamically
2019-11-25 20:46:25 +00:00
Khashayar Fereidani f81785ad14 Fix underflow initialization of default values
The "memset" initialization of queryWindowClause and queryOrderByLimit underflows these variables.
Because of the invalid sizeof usage, this part of the program could cause a buffer overflow and corrupt function return data in future changes.
2019-11-25 19:25:51 +00:00
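As a general illustration of this class of bug (not the actual Citus code), the sketch below shows how passing the `sizeof` of the wrong, smaller type to `memset` leaves part of a variable uninitialized. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical structs; the second is larger than the first. */
typedef struct SmallClause
{
	int count;
} SmallClause;

typedef struct LargeClause
{
	int count;
	int limit;
	int offset;
} LargeClause;

/* Returns `offset` after a clear that uses the wrong size. */
static int
clear_with_wrong_size(void)
{
	LargeClause large;

	memset(&large, 0xFF, sizeof(large));    /* simulate stack garbage */
	memset(&large, 0, sizeof(SmallClause)); /* BUG: clears only the first field */
	return large.offset;
}

/* Returns `offset` after a clear that uses the variable's own size. */
static int
clear_with_right_size(void)
{
	LargeClause large;

	memset(&large, 0xFF, sizeof(large));    /* simulate stack garbage */
	memset(&large, 0, sizeof(large));       /* clears the whole variable */
	return large.offset;
}
```

Using `sizeof(variable)` rather than `sizeof(SomeType)` keeps the size correct even if the variable's type changes later.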
Onur TIRTIR bef32624c3
Escape extension name in extension command propagation (#3218) 2019-11-24 12:16:10 +03:00
Philip Dubé 99164398bf Fix potential segfault from standard_planner inlining functions 2019-11-21 18:47:36 +00:00
Philip Dubé c563e0825c Strip trailing whitespace and add final newline (#3186)
This brings files in line with our editorconfig file
2019-11-21 14:25:37 +01:00
Jelte Fennema 1d8dde232f
Automatically convert useless declarations using regex replace (#3181)
* Add declaration removal to CI

* Convert declarations
2019-11-21 13:47:29 +01:00
Onur TIRTIR 9961297d7b Improve extension command propagation logic and tests
* Improve extension command propagation tests

* patch for hardcoded citus extension name

(cherry picked from commit 0bb3dbac0afabda10e8928f9c17eda048dc4361a)
2019-11-21 11:24:39 +03:00
Marco Slot e0cccf7f9a Move C files into the appropriate directory 2019-11-16 11:36:17 +01:00
Hanefi Onaldi d82f3e9406
Introduce intermediate result broadcasting
In plain words, each distributed plan pulls the necessary intermediate
results to the worker nodes that the plan hits. This is primarily useful
in three ways. 

(i) If the distributed plan that uses intermediate
result(s) is a router query, then the intermediate results are only
broadcasted to a single node.

(ii) If a distributed plan consists of only intermediate results, which
is not uncommon, the intermediate results are broadcasted to a single
node only.

(iii) If a distributed query hits a sub-set of the shards in multiple
workers, the intermediate results will be broadcasted to the relevant
node(s).

The final item (iii) becomes crucial for append/range distributed
tables where typically the distributed queries hit a small subset of
shards/workers.

To do this, for each query for which Citus creates a distributed plan, we keep
track of the subPlans used in the queryTree and save them in the distributed
plan. Just before Citus executes each subPlan, it first records every worker
node that the distributed plan hits and marks every subPlan to be broadcast
to these nodes. Later, for each subPlan that is itself a distributed plan,
Citus does this operation recursively, since these distributed plans may
access different subPlans, and those have to be recorded as well.
2019-11-20 15:26:36 +03:00
Philip Dubé b7fef5c31a Miscellaneous cleanup in prep for collation propagation 2019-11-19 17:28:59 +00:00
Onur TIRTIR 26c306d188
Add extensions to distributed object propagation infrastructure (#3185) 2019-11-19 17:56:28 +03:00
SaitTalhaNisanci 2cb82ae9bd
create a utility method to mark tasks as failed (#3150) 2019-11-19 16:35:56 +03:00
SaitTalhaNisanci 306d159072
refactor AfterXacthodtConnectionHandling (#3202) 2019-11-19 14:50:23 +03:00
Marco Slot 622462cad7 Return early in CitusHasBeenLoaded when creating a different extension 2019-11-15 03:00:20 +01:00
Önder Kalacı 40fa3862ce
Prevent Citus extension becoming distributed object (#3197)
Prevent Citus extension being distributed

Because that could prevent doing rolling upgrades, where users may
prefer to upgrade the version on the coordinator but not the workers.

There could be some other edge cases, so I'd prefer to keep Citus
extension outside the picture for now.
2019-11-18 16:57:10 +01:00
Halil Ozan Akgul 5ae7b219ff Create the ALTER ROLE propagation 2019-11-18 18:31:28 +03:00
Nils Dijk 217890af5f
Feature: Expression in reference join (#3180)
DESCRIPTION: Expression in reference join

Fixed: #2582

This patch allows arbitrary expressions in the join clause when joining to a reference table. An example of such joins could be found in CHbenCHmark queries 7, 8, 9 and 11; `mod((s_w_id * s_i_id),10000) = su_suppkey` and `ascii(substr(c_state,1,1)) = n2.n_nationkey`. Since the join is on a reference table these queries are able to be pushed down to the workers.

To implement these queries we will widen the `IsJoinClause` predicate to not check if the expressions are of type `Var` after stripping the implicit coercions. Instead, a clause counts as a join clause when the `Var`s in it come from more than one table.

This allows more clauses to pass into the logical planner's `MultiNodeTree(...)` planning function. To compensate for this we tighten down the `LocalJoin`, `SinglePartitionJoin` and `DualPartitionJoin` to check for direct column references when planning. This allows the planner to work with arbitrary join expressions on reference tables.
2019-11-18 16:25:46 +01:00
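The widened predicate described in this commit can be sketched as follows. This is a simplified illustration under stated assumptions: a clause is reduced to an array of table identifiers, one per `Var` it references, and `vars_span_multiple_tables` is a hypothetical name, not the real `IsJoinClause` implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true when the Vars of a clause reference more than one table.
 * var_table_ids holds one (hypothetical) table identifier per Var.
 */
static bool
vars_span_multiple_tables(const int *var_table_ids, size_t var_count)
{
	for (size_t i = 1; i < var_count; i++)
	{
		if (var_table_ids[i] != var_table_ids[0])
		{
			return true;
		}
	}

	/* zero Vars, one Var, or all Vars from the same table */
	return false;
}
```

Under this definition, an expression such as `mod((s_w_id * s_i_id),10000) = su_suppkey` qualifies as a join clause when `su_suppkey` belongs to a different table than `s_w_id` and `s_i_id`, even though neither side is a bare column reference.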
Önder Kalacı a4c90b6ee1
Make distributed object dependency logic follow up to extensions (#3195)
With this commit, we're slightly changing the dependency traversal
logic to enable extension propagation.

The main idea is to "follow" the extension dependencies, but do not
"apply" them.

Since some extension dependencies are base types, and base types
could have circular dependencies, we implement a logic to prevent
revisiting an already visited object.
2019-11-17 17:21:21 +01:00
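The "follow but do not apply" traversal with protection against circular dependencies can be sketched as a depth-first walk with a visited set. This is a minimal illustration, assuming a small adjacency-matrix graph; `DependencyGraph`, `Traversal`, `is_extension_owned` and `follow_dependencies` are hypothetical names and not the Citus data structures.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_OBJECTS 8

/* Hypothetical graph: depends_on[a][b] is true when object a depends on b. */
typedef struct DependencyGraph
{
	int object_count;
	bool depends_on[MAX_OBJECTS][MAX_OBJECTS];
	bool is_extension_owned[MAX_OBJECTS]; /* followed, but never applied */
} DependencyGraph;

typedef struct Traversal
{
	bool visited[MAX_OBJECTS];
	int applied[MAX_OBJECTS]; /* objects to create, dependencies first */
	int applied_count;
} Traversal;

static void
follow_dependencies(const DependencyGraph *graph, int object, Traversal *state)
{
	/* an already visited object is skipped, which breaks cycles */
	if (state->visited[object])
	{
		return;
	}
	state->visited[object] = true;

	for (int dep = 0; dep < graph->object_count; dep++)
	{
		if (graph->depends_on[object][dep])
		{
			follow_dependencies(graph, dep, state);
		}
	}

	/* extension-owned objects are traversed but not recorded */
	if (!graph->is_extension_owned[object])
	{
		state->applied[state->applied_count++] = object;
	}
}
```

Because each object is marked before its dependencies are followed, two base types that depend on each other are each visited exactly once.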
Hadi Moshayedi d9dcba25e3 Plan reference/local table joins locally 2019-11-15 07:36:50 -08:00
Onder Kalaci 90943a6ce6 Do not include coordinator shards when round-robin is selected
When the user picks "round-robin" policy, the aim is that the load
is distributed across nodes. However, for reference tables on the
coordinator, since local execution kicks in immediately, round-robin
is ignored.

With this change, we're excluding the placement on the coordinator.
Although the approach seems a little bit invasive because of
modifications in the placement list, that sounds acceptable.

We could have done this in some other ways such as:

1) Add a field to "Task->roundRobinPlacement" (or such), which is
updated as the first element after RoundRobinPolicy is applied.
During the execution, if that placement is local to the coordinator,
skip it and try the other remote placements.

2) On TaskAccessesLocalNode()@local_execution.c, check
task_assignment_policy, if round-robin selected and there is local
placement on the coordinator, skip it. However, task assignment is done
on planning, but this decision is happening on the execution, which
could create weird edge cases.
2019-11-15 06:03:32 -08:00
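The chosen approach, dropping the coordinator's placement from the list before round-robin is applied, can be sketched as below. This is an illustration only: `Placement`, `remove_local_placements` and `pick_round_robin` are hypothetical names, and the fallback of keeping the full list when every placement is local is an assumption of this sketch, not something the commit states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shard placement: which node holds it and whether it is local. */
typedef struct Placement
{
	int node_id;
	bool is_local; /* true for the placement on the coordinator */
} Placement;

/*
 * Copies the non-local placements into `remote` and returns their count.
 * Assumption of this sketch: if every placement is local, keep them all.
 */
static size_t
remove_local_placements(const Placement *all, size_t count, Placement *remote)
{
	size_t remote_count = 0;

	for (size_t i = 0; i < count; i++)
	{
		if (!all[i].is_local)
		{
			remote[remote_count++] = all[i];
		}
	}

	if (remote_count == 0)
	{
		for (size_t i = 0; i < count; i++)
		{
			remote[i] = all[i];
		}
		return count;
	}

	return remote_count;
}

/* Picks a placement by rotating over the list with the task index. */
static const Placement *
pick_round_robin(const Placement *placements, size_t count,
				 unsigned int task_index)
{
	assert(count > 0);
	return &placements[task_index % count];
}
```

Filtering before the rotation means local execution never gets a chance to short-circuit the policy, which is the effect the commit describes.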
Hadi Moshayedi 15af1637aa Replicate reference tables to coordinator. 2019-11-15 05:50:19 -08:00
Hadi Moshayedi cb011bb30f Propagate isactive to metadata nodes. 2019-11-15 05:48:42 -08:00
SaitTalhaNisanci b9b7fd7660
add IsLoggableLevel utility function (#3149)
* add IsLoggableLevel utility function

* add function comment for IsLoggableLevel

* put ApplyLogRedaction to logutils
2019-11-15 14:59:13 +03:00
Jelte Fennema 1b2c438e69
Rename variables to not shadow globals in RHEL6 (#3194)
Fixes #2839
2019-11-15 12:12:24 +01:00
Jelte Fennema a8bd2d58f5
Update SQL definitions to prepare for drain node functionality (#3179) 2019-11-15 10:11:56 +01:00
Jelte Fennema 4b9b4b0995
Don't warn for declaration-after-statement since we only support GNU99 (#3132)
This change was actually already intended in #3124. However, the
postgres Makefile manually enables this warning too. This way we undo
that.

To confirm that it works two functions were changed to make use of not
having the warning anymore.
2019-11-15 09:46:06 +01:00
Philip Dubé 495c0f5117 Phase 1 implementation of custom aggregates
Phase 1 seeks to implement minimal infrastructure, so does not include:
	- dynamic generation of support aggregates to handle multiple arguments
	- configuration methods to direct aggregation strategy,
		or mark an aggregate's serialize/deserialize as safe to operate across nodes

Aggregates can be distributed when:
	- they have a single argument
	- they have a combinefunc
	- their transition type is not a pseudotype
2019-11-14 19:01:24 +00:00
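The three conditions listed in this commit amount to a simple predicate. The sketch below restates them in code; `AggregateInfo` and `can_distribute_aggregate` are hypothetical names, and the struct is a simplification rather than the catalog representation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the aggregate properties the check needs. */
typedef struct AggregateInfo
{
	int argument_count;
	bool has_combine_function;
	bool transition_type_is_pseudotype;
} AggregateInfo;

/* True when the aggregate meets all three Phase 1 conditions. */
static bool
can_distribute_aggregate(const AggregateInfo *aggregate)
{
	return aggregate->argument_count == 1 &&
		   aggregate->has_combine_function &&
		   !aggregate->transition_type_is_pseudotype;
}
```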
Philip Dubé edc7a2ee38 Improve RECORD support 2019-11-14 18:32:22 +00:00
Philip Dubé eb35743c3f Remove citus.worker_list_file & master_initialize_node_metadata 2019-11-13 00:49:58 +00:00
Philip Dubé 48552bfffe Call DestReceiver rDestroy before it goes out of scope
CitusCopyDestReceiverDestroy: call hash_destroy on shardStateHash & connectionStateHash
2019-11-12 15:03:07 +00:00
Jelte Fennema adc6ca6100
Make simple in queries on unique columns work with repartition join (#3171)
This is necessary to support Q20 of the CHbenCHmark: #2582.

To summarize the fix: The subquery is converted into an INNER JOIN on a
table. This fixes the issue, since an INNER JOIN on a table is already
supported by the repartition planner.

The replacement happens as follows:
1. Postgres replaces `col in (subquery)` with a SEMI JOIN (subquery) on col = subquery_result
2. If this subquery is simple enough Postgres will replace it with a
   regular read from a table
3. If the subquery returns unique results (e.g. a primary key) Postgres
   will convert the SEMI JOIN into an INNER JOIN during the planning. It
   will not change this in the rewritten query though.
4. We check if Postgres sends us any SEMI JOINs during its join order
   planning. If it doesn't, we replace all SEMI JOINs in the rewritten
   query with INNER JOIN (which we already support).
2019-11-11 13:44:28 +01:00
SaitTalhaNisanci 57380fd668
remove duplicated method in multi_logical_optimizer (#3166) 2019-11-11 13:51:21 +03:00
Philip Dubé ad86c1b866 AcquireDistributedLockOnRelations: escape relation names 2019-11-08 21:23:01 +00:00
Philip Dubé e8ecbbfcb3 Escape transaction names 2019-11-08 21:23:01 +00:00
Jelte Fennema 9fb897a074
Fix queries with repartition joins and group by unique column (#3157)
Postgres doesn't require you to add all columns that are in the target list to
the GROUP BY when you group by a unique column (or columns). It even actively
removes these group by clauses when you do.

This is normally fine, but for repartition joins it is not. The reason for this
is that the temporary tables don't have these primary key columns. So when the
worker executes the query it will complain that it is missing columns in the
group by.

This PR fixes that by adding an ANY_VALUE aggregate around each variable in
the target list that is not contained in the group by or in an aggregate.
This is done only for repartition joins.

The ANY_VALUE aggregate chooses the value from an undefined row in the
group.
2019-11-08 15:36:18 +01:00
SaitTalhaNisanci 02b359623f
remove duplicate code in citus_dist_stat_activity (#3165) 2019-11-08 15:41:32 +03:00
Önder Kalacı 0b3d4e55d9
Local execution should not change hasReturning for distributed tables (#3160)
It looks like the logic that prevents RETURNING on reference tables from
producing duplicate entries, which come from local and remote executions,
leads to missing some tuples for distributed tables.

With this PR, we're ensuring to kick in the logic for reference tables
only.
2019-11-08 12:49:56 +01:00
Philip Dubé 72c3d64ead Rename OpenConnectionsToAllNodes to OpenConnectionsToAllWorkerNodes 2019-11-07 17:50:22 +00:00
Philip Dubé 2fc45e5897 create_distributed_function: accept aggregates
Adds support for OCLASS_PROC to worker_create_or_replace_object
2019-11-06 18:23:37 +00:00
Hadi Moshayedi e00d1546f3 Don't maintain replicationfactor of reference tables 2019-11-05 07:23:14 -08:00
Onder Kalaci 471703bfaf DEBUG only when the function is distributed
Otherwise, we're seeing this message way too often.
2019-11-05 15:08:35 +00:00
Önder Kalacı 960cd02c67
Remove real time router executors (#3142)
* Remove unused executor codes

This removes all of the real-time executor code. Some functions
in the router executor still remain because they are
common functions. We'll move them to more appropriate places
in the follow-up commits.

* Move GUCs to transaction mngnt and remove unused struct

* Update test output

* Get rid of references of real-time executor from code

* Warn if real-time executor is picked

* Remove lots of unused connection codes

* Removed unused code for connection restrictions

Real-time and router executors cannot handle re-using of the existing
connections within a transaction block.

Adaptive executor and COPY can re-use the connections. So, there is no
reason to keep the code around for applying the restrictions in the
placement connection logic.
2019-11-05 12:48:10 +01:00
Jelte Fennema f0c35ad134 Include fmgr.h, don't duplicate FunctionCallInfo typedef 2019-11-04 17:10:33 +00:00
SaitTalhaNisanci 7c410e3cd7
pass CitusCustomState directly to adaptive executor (#3151) 2019-11-01 19:57:32 +03:00
Önder Kalacı ffd89e4e01
Include all relevant relations in the ExtractRangeTableRelationWalker (#3135)
We've changed the logic for pulling RTE_RELATIONs in #3109 and
non-colocated subquery joins and partitioned tables.
@onurctirtir found these steps, from which I traced back and found the issues.

While looking into it in more detail, we decided to expand the list in a
way that the callers get all the relevant RTE_RELATIONs: RELKIND_RELATION,
RELKIND_PARTITIONED_TABLE, RELKIND_FOREIGN_TABLE and RELKIND_MATVIEW.
These are all relation kinds that Citus planner is aware of.
2019-11-01 16:06:58 +01:00
Onur TIRTIR d3f68bf44f
Fix view is not distributed error when view is used in modify statements (#3104) 2019-11-01 16:34:01 +03:00
SaitTalhaNisanci c7ceca3216
update outdated comment in JobExecutorType (#3148) 2019-11-01 11:36:56 +03:00
SaitTalhaNisanci 70e46703aa
Fix debug1 message in JobExecutorType (#3147)
When the citus.enable_repartition_joins GUC is set to on, and we have
the adaptive executor, there was a typo in the debug message, which was
saying realtime executor, not adaptive executor.
2019-11-01 11:14:19 +03:00
Marco Slot 51c64c70c9 Do not try to sync metadata on standby coordinator 2019-10-30 05:15:45 +01:00
SaitTalhaNisanci dadbe86af1
refactor some of hard coded values in citus gucs (#3137)
* refactor some of hard coded values in citus gucs

* rename GUC_ALLOW_ALL to GUC_STANDARD
2019-10-30 10:35:39 +03:00
Marco Slot 067657af26 Disallow distributed functions with distribution arguments unless replication_model is streaming 2019-10-26 23:57:59 +02:00
SaitTalhaNisanci 29d45bd1b9
Do not assign InvalidOid for local execution while extracting parameters (#3131)
* do not assign InvalidOid for local execution while extracting parameters

* rename functions

* rename parameter and replace function
2019-10-28 14:28:22 +03:00
Önder Kalacı dceaddbe4d
Remove real-time/router executors (step 1) (#3125)
See #3125 for details on each item.

* Remove real-time/router executor tests-1

These are the ones which don't have '_%d' in the test
output files.

* Remove real-time/router executor tests-2

These are the ones which have '_%d' in the test
output files.

* Move the tests outputs to correct place

* Make sure that single shard commits use 2PC on adaptive executor

It looks like we've messed up the tests in #2891. Fixing them back.

* Use adaptive executor for all router queries

This becomes important because when task-tracker is picked, we
used to pick router executor, which doesn't make sense.

* Remove explicit references to real-time/router executors in the tests

* JobExecutorType never picks real-time/router executors

* Make sure to go incremental in test output numbers

* Even users cannot pick real-time anymore

* Do not use real-time/router custom scans

* Get rid of unnecessary normalizations

* Reflect unneeded normalizations

* Get rid of unnecessary test output file
2019-10-25 10:54:54 +02:00
Marco Slot a1162b2023 Rename 9.1 upgrade script to upgrade from 9.0-2 2019-10-23 00:08:17 +02:00
Marco Slot 04040e0a37 Revoke usage from the citus schema 2019-10-23 00:08:17 +02:00
Jelte Fennema a5010e5b17
Add extra foreach convenience macros (#3117)
This completely hides `ListCell` from the user of the loop

Example usage:
```c
WorkerNode *workerNode = NULL;

foreach_ptr(workerNode, workerNodeList) {
	// Do stuff with workerNode
}
```

Instead of:
```c
ListCell *workerNodeCell = NULL;

foreach(workerNodeCell, workerNodeList) {
	WorkerNode *workerNode = lfirst(workerNodeCell);
	// Do stuff with workerNode
}
```
2019-10-23 16:49:12 +02:00
Philip Dubé b2f084d7f5 UnsetMetadataSyncedForAll: use CatalogTupleUpdateWithInfo 2019-10-23 00:45:11 +00:00
Onder Kalaci a208f8b151 Fix memory leak on ReceiveResults
It turns out that TupleDescGetAttInMetadata() allocates quite a lot
of memory. And, if the target list is long and there are too many rows
returning, the leak becomes apparent.

You can reproduce the issue without the fix with the following commands:

```SQL

CREATE TABLE users_table (user_id int, time timestamp, value_1 int, value_2 int, value_3 float, value_4 bigint);
SELECT create_distributed_table('users_table', 'user_id');

insert into users_table SELECT i, now(), i, i, i, i FROM generate_series(0,99999)i;

-- load faster

-- 200,000
INSERT INTO users_table SELECT * FROM users_table;

-- 400,000
INSERT INTO users_table SELECT * FROM users_table;

-- 800,000
INSERT INTO users_table SELECT * FROM users_table;

-- 1,600,000
INSERT INTO users_table SELECT * FROM users_table;

-- 3,200,000
INSERT INTO users_table SELECT * FROM users_table;

-- 6,400,000
INSERT INTO users_table SELECT * FROM users_table;

-- 12,800,000
INSERT INTO users_table SELECT * FROM users_table;

-- making the target list entry wider speeds up the leak to show up
 select *,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,* FROM users_table ;

 ```
2019-10-22 17:22:26 +02:00
Jelte Fennema 78e495e030
Add shouldhaveshards to pg_dist_node (#2960)
This is an improvement over #2512.

This adds the boolean shouldhaveshards column to pg_dist_node. When it's false, create_distributed_table for new colocation groups will not create shards on that node. Reference tables will still be created on nodes where it is false.
2019-10-22 16:47:16 +02:00
Hanefi Onaldi 7ebda04494
Update all c-style comments in migration files 2019-10-21 16:05:53 +03:00
Jelte Fennema 7abedc38b0
Support subqueries in HAVING (#3098)
Areas for further optimization:
- Don't save subquery results to a local file on the coordinator when the subquery is not in the having clause
- Push the HAVING with subquery to the workers if there's a group by on the distribution column
- Don't push down the results to the workers when we don't push down the HAVING clause, only the coordinator needs it

Fixes #520
Fixes #756
Closes #2047
2019-10-16 16:40:14 +02:00
Onur TIRTIR 3bfb2a078b
Make changes on if-statement in ExtractRangeTableList for further walker types (#3110) 2019-10-16 15:50:09 +03:00
Onur TIRTIR d5f83dc110
Refactor range table walkers (#3109) 2019-10-16 01:20:49 +03:00
SaitTalhaNisanci 94a7e6475c
Remove copyright years (#2918)
* Update year as 2012-2019

* Remove copyright years
2019-10-15 17:44:30 +03:00
Philip Dubé 74cb168205 Remove Postgres 10 support 2019-10-11 21:56:56 +00:00
Hadi Moshayedi b50d216536 Fix a typo 2019-10-10 10:44:41 -07:00
Philip Dubé 4063e7ca67 CALL delegation: apply strip_implicit_coercions to distribution argument 2019-10-10 17:42:43 +00:00
Philip Dubé dd490b6376 Cache whether an object is in pg_dist_object. Avoids redundant lookups for non-distributed objects 2019-10-10 14:50:38 +00:00
Nils Dijk 4a4a220945
Fix enum add value order and pg12 (#3082)
DESCRIPTION: Fix order for enum values and correctly support pg12

PG 12 introduces `ALTER TYPE ... ADD VALUE ...` during transactions. Earlier versions would error out when called in a transaction, hence we connect to workers outside of the transaction which could cause inconsistencies on pg12 now that postgres doesn't error with this syntax anymore.

During the implementation of this fix it became apparent there was an error with the ordering of enum labels when the type was recreated. A patch and test have been included.
2019-10-07 17:16:19 +02:00
Jelte Fennema 01da11f264
Change citus truncate trigger to AFTER and add more upgrade tests (#3070)
* Add more upgrade tests

* Fix citus trigger generation after upgrade

citus_truncate_trigger runs before truncate when created by create_distributed_table:
492d1b2cba/src/backend/distributed/commands/create_distributed_table.c (L1163)

* Remove pg_dist_jobid_seq
2019-10-07 16:43:04 +02:00
Onder Kalaci 3be72ce42f Make sure that distributed functions always have the correct user
Objectives:

(a) both super user and regular user should have the correct owner for the function on the worker
(b) The transactional semantics would work fine for both super user and regular user
(c) a user who is neither a super user nor the function owner would get a reasonable error message when trying to distribute the function

Co-authored-by: @serprex
2019-10-04 21:38:49 +00:00
Marco Slot 1a3a174f67 Grant usage on schema citus to public 2019-10-04 12:26:08 +02:00
Marco Slot 89377ee578 Move RowExclusiveLock to start in SyncMetadataToNodes 2019-10-04 12:07:41 +02:00
Hadi Moshayedi 217db2a03e Don't block for locks in SyncMetadataToNodes() 2019-10-03 16:53:36 -07:00
Hadi Moshayedi ae915493e6 Don't send metadata commands to not-synced workers.
Otherwise some of the dependencies might not exist yet and
commands will error out.
2019-10-03 16:52:25 -07:00
Marco Slot 0b4b63e647 Drop the rebalancer before creating new UDFs 2019-10-03 16:08:58 +02:00
Marco Slot 2e50306cf8 Check command type in TryToDelegateFunctionCall 2019-10-03 15:37:15 +02:00
Hanefi Onaldi bd416ef68f Fix empty FROM clauses in PG12 2019-10-01 19:54:11 +00:00
Jelte Fennema ec4a165eec Improve isolation test block detection (#3055) 2019-10-01 14:10:15 +02:00
Jelte Fennema 40f785e6d8 Move citus_isolation_test_session_is_blocked to separate udf sql file 2019-10-01 14:10:15 +02:00
Philip Dubé 89d35e9692 Attempt to force custom plans for prepared statements when trying to delegate function calls
We discern between PARAM_EXEC & PARAM_EXTERN:
d52eaa0948/src/include/nodes/primnodes.h (L211)
According to primnodes.h we should only run into PARAM_EXEC or PARAM_EXTERN
2019-09-30 23:49:14 +00:00
Philip Dubé 29f1ea079b PG_VERSION_NUM > 110000 should be PG_VERSION_NUM >= 110000
Also fix a > 12000 typo
2019-09-30 23:37:43 +00:00
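The off-by-one in the version check can be illustrated with a small sketch. PostgreSQL encodes version 11.0 as `PG_VERSION_NUM` 110000, so a strict `>` comparison excludes the first 11 release. The macro and function names below are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* PostgreSQL 11.0 reports PG_VERSION_NUM 110000; 11.1 reports 110001. */
#define PG11_VERSION_NUM 110000

/* Buggy form: wrongly treats 11.0 itself as older than PostgreSQL 11. */
static bool
is_pg11_or_later_buggy(int version_num)
{
	return version_num > PG11_VERSION_NUM;
}

/* Fixed form: includes 11.0. */
static bool
is_pg11_or_later(int version_num)
{
	return version_num >= PG11_VERSION_NUM;
}
```

The two forms only disagree for exactly 110000, which is why such a bug can go unnoticed once minor releases are in use.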
Hadi Moshayedi 5e97e5c98e Don't push down queries when in subqueries/ctes 2019-09-30 14:22:05 -07:00
Marco Slot 35bef0f3db Avoid caching connections from backends that service internal connections 2019-09-28 08:32:10 +02:00
Nils Dijk 01b26cf91a
Disallow distributed functions for functions depending on an extension (#3049)
DESCRIPTION: Disallow distributed functions for functions depending on an extension

Functions depending on an extension cannot (yet) be distributed by citus. If we would allow this it would cause issues with our dependency following mechanism as we stop following objects depending on an extension.

By not allowing functions to be distributed when they depend on an extension as well as not allowing to make distributed functions depend on an extension we won't break the ability to add new nodes. Allowing functions depending on extensions to be distributed at the moment could cause problems in that area.
2019-09-30 15:19:47 +02:00
Nils Dijk 473cbc0115
Propagate CREATE OR REPLACE FUNCTION to workers for distributed functions (#3043)
DESCRIPTION: Propagate CREATE OR REPLACE FUNCTION

Distributed functions could be replaced, which should be propagated to the workers to keep the function in sync between all nodes.

Due to the complexity of deparsing the `CreateFunctionStmt` we actually produce the plan during the processing phase of our utilityhook. Since the changes have already been made in the catalog tables we can reuse `pg_get_functiondef` to get us the generated `CREATE OR REPLACE` sql.
2019-09-30 12:41:17 +02:00
Jelte Fennema 82ec918b29
Add explain summary support (#3046)
Fixes #2922 and also adds explain analyze regression tests
2019-09-30 10:58:49 +02:00
Nils Dijk 9c2c50d875
Hookup function/procedure deparsing to our utility hook (#3041)
DESCRIPTION: Propagate ALTER FUNCTION statements for distributed functions

Using the implemented deparser for function statements to propagate changes to both functions and procedures that are previously distributed.
2019-09-27 22:06:49 +02:00
Philip Dubé 363409a0c2 Propagate REINDEX TABLE & REINDEX INDEX 2019-09-27 18:14:53 +00:00
Hanefi Onaldi 66b9f2e887 Deparsing and qualifying for FUNCTION/PROCEDURE statements (#3014)
This PR aims to add all the necessary logic to qualify and deparse all possible `{ALTER|DROP} .. {FUNCTION|PROCEDURE}` queries.

As Procedures are introduced in PG11, the code contains many PG version checks. I tried my best to make it easy to clean up once we drop PG10 support.


Here are some caveats:
- I assumed that the parse tree is a valid one. There are some queries that are not allowed, but are still parsed successfully by the Postgres parser. Such queries will result in errors at execution time. (e.g. `ALTER PROCEDURE p STRICT` -> `STRICT` action is valid for functions but not procedures. Postgres decides to parse them nevertheless.)
2019-09-27 19:02:52 +02:00
Marco Slot 2868e02a3d Implement SELECT function call delegation.
When a function is marked as colocated with a distributed table,
we try delegating queries of kind "SELECT func(...)" to workers.

We currently only support this simple form, and don't delegate
forms like "SELECT f1(...), f2(...)", "SELECT f1(...) FROM ...",
or function calls inside transactions.

As a side effect, we also fix the transactional semantics of DO blocks.
Previously we didn't consider a DO block a multi-statement transaction.
Now we do.

Co-authored-by: Marco Slot <marco@citusdata.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
2019-09-27 09:13:25 -07:00
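The delegation rule described in 2868e02a3d can be sketched as a small predicate. This is a minimal illustration in Python, not the Citus implementation (which is written in C); `SelectQuery`, its fields, and `can_delegate` are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class SelectQuery:
    """Hypothetical, simplified stand-in for a parsed SELECT statement."""
    target_functions: list            # names of function calls in the target list
    has_from_clause: bool = False     # True for "SELECT f1(...) FROM ..."
    in_transaction_block: bool = False


def can_delegate(query, colocated_functions):
    """Return True only for the simple form "SELECT func(...)"."""
    if query.in_transaction_block:
        return False                  # function calls inside transactions stay local
    if query.has_from_clause:
        return False                  # "SELECT f1(...) FROM ..." is not delegated
    if len(query.target_functions) != 1:
        return False                  # "SELECT f1(...), f2(...)" is not delegated
    # The single function must be marked as colocated with a distributed table.
    return query.target_functions[0] in colocated_functions
```

Under these assumptions, `can_delegate(SelectQuery(["f1"]), {"f1"})` holds, while adding a second function, a FROM clause, or an open transaction block turns delegation off.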
Jelte Fennema dab16be283
Set default threshold on get_rebalance_table_shards_plan to 0, like rebalance_table_shards (#3039)
In this PR the default `threshold` of `rebalance_table_shards` was set to 0: https://github.com/citusdata/shard_rebalancer/pull/73
However, the default for get_rebalance_table_shards_plan was not updated. This
can cause the confusing situation where the actual steps run by
`rebalance_table_shards` are not the same as the ones returned by
`get_rebalance_table_shards_plan`.
2019-09-27 17:21:36 +02:00
Marco Slot 32a11bdf6c Return early for common commands in the utility hook (#3031)
We started copying parse trees by default further on in `multi_ProcessUtility`. That's not a problem for maintenance commands, but the cost might be noticeable for things like `PREPARE` and `EXECUTE`, which might happen thousands of times per second. Add a few common commands to the check at the start.
2019-09-26 11:43:35 +02:00
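The early-return idea in 32a11bdf6c can be sketched as follows. This is an illustrative Python model only, assuming a parse tree represented as a dict; `COMMON_COMMANDS`, `process_utility`, and the two callback parameters are hypothetical names, and the commit itself names only `PREPARE` and `EXECUTE` as examples of hot commands.

```python
import copy

# Hypothetical set of frequently executed command kinds (assumption for this sketch).
COMMON_COMMANDS = {"PREPARE", "EXECUTE"}


def process_utility(parse_tree, standard_process, distributed_process):
    """Skip the parse-tree copy for common commands; copy for everything else."""
    if parse_tree["kind"] in COMMON_COMMANDS:
        # Hot path: hand the original tree straight to the standard handler.
        return standard_process(parse_tree)
    # Slower path: work on a private copy so later steps can modify it freely.
    tree_copy = copy.deepcopy(parse_tree)
    return distributed_process(tree_copy)
```

The point of the check is ordering: it has to come before the copy, otherwise the per-statement copying cost is paid even for commands that need no further processing.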
Philip Dubé 4f60e3a149 Feedback 2019-09-24 17:31:09 +00:00
Marco Slot ca478defeb Deparse CALL statement instead of using original query string 2019-09-24 17:31:09 +00:00
Philip Dubé 90e1f1442a Annotated tests for multi_mx_call.
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
2019-09-24 17:31:09 +00:00
Marco Slot e269d990c9 Cast the distribution argument value when possible 2019-09-24 17:31:09 +00:00
Philip Dubé 432a8ef85b Hadi's feedback
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
2019-09-24 17:31:09 +00:00
Philip Dubé bc1ad67eb5 Distribute CALL on distributed procedures to metadata workers
Lots taken from https://github.com/citusdata/citus/pull/2829
2019-09-24 17:31:09 +00:00
Onder Kalaci 18de78f386 Relax the colocation checks for distributed functions
As long as the types can be coerced, it is safe to push down
functions.
2019-09-24 16:31:08 +02:00
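The relaxed check in 18de78f386 can be pictured as moving from strict type equality to "equal or coercible". The Python sketch below is an assumption-laden illustration: `COERCIBLE` is a tiny hand-picked subset of casts and `types_compatible` is a hypothetical helper; the actual coercion rules are defined by PostgreSQL, not by this table.

```python
# Illustrative subset only (assumption); PostgreSQL defines the real cast rules.
COERCIBLE = {("int2", "int4"), ("int2", "int8"), ("int4", "int8")}


def types_compatible(arg_type, column_type):
    """Accept an exact match, or a pair listed as coercible."""
    return arg_type == column_type or (arg_type, column_type) in COERCIBLE
```

With this rule, a function whose distribution argument is `int4` would be accepted against an `int8` distribution column, whereas a strict equality check would reject it.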
Marco Slot 42be8afd74 Swap pg_dist_node groupid and nodeid sequences 2019-09-24 12:03:44 +02:00
Hadi Moshayedi 48078a30e6 Fix wait_until_metadata_sync() for postgres 12.
Postgres 12 now has an assertion that the calls to WaitLatchOrSocket
handle postmaster death.
2019-09-23 14:15:35 -07:00
Philip Dubé 06faba91c0 Include ifdefs for pg12 API changes, update local_shard_execution test to avoid CTE inlining 2019-09-23 20:22:35 +00:00
Onder Kalaci d37745bfc7 Sync metadata to worker nodes after create_distributed_function
Since the distributed functions are useful when the workers have
metadata, we automatically sync it.

We also sync after master_add_node(), but lazily: we let the daemon
sync it. That's mainly because the metadata syncing cannot be done
in transaction blocks, and we don't want to add lots of transactional
limitations to master_add_node() and create_distributed_function().
2019-09-23 18:30:53 +02:00
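The lazy approach described in d37745bfc7 can be modeled roughly as "mark now, sync later". The following Python toy is a sketch under that assumption; `Coordinator`, `add_node`, and `daemon_tick` are hypothetical names and do not mirror Citus internals.

```python
class Coordinator:
    """Toy model of deferred metadata sync (names are hypothetical)."""

    def __init__(self):
        self.synced = {}  # node name -> whether metadata has been synced

    def add_node(self, name):
        # Only record that the node still needs metadata. No sync happens
        # here, so this step carries no extra transactional limitations.
        self.synced[name] = False

    def daemon_tick(self):
        # Background pass: sync every node that is still marked unsynced
        # and report which nodes were handled in this pass.
        pending = [n for n, ok in self.synced.items() if not ok]
        for n in pending:
            self.synced[n] = True
        return pending
```

In this model, adding a node is cheap and transaction-friendly, and the cost of syncing is paid outside the caller's transaction on the next daemon pass.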
Marco Slot 5f23b951c7 Support serial and smallserial when syncing metadata 2019-09-23 17:39:21 +02:00
Marco Slot e58d76c5f6 Fix assert failure in bare SELECT FROM reference table FOR UPDATE in MX 2019-09-23 17:00:09 +02:00