Commit Graph

1288 Commits (c48f0ca7e5dc070a2125673546b19614a7395d1b)

Author SHA1 Message Date
Jelte Fennema 685b54b3de
Semmle: Check for NULL in some places where it might occur (#3509)
Semmle reported quite some places where we use a value that could be NULL. Most of these are not actually a real issue, but better to be on the safe side with these things and make the static analysis happy.
2020-02-27 10:45:29 +01:00
Hadi Moshayedi 1b3e58f0c3 Merge branch 'improve-shard-pruning' of https://github.com/MarkusSintonen/citus into MarkusSintonen-improve-shard-pruning 2020-02-26 07:13:33 -08:00
Nils Dijk a77ed9cd23
Refactor master query to be planned by postgres' planner (#3326)
DESCRIPTION: Replace the query planner for the coordinator part with the postgres planner

Closes #2761 

Citus had a simple rule based planner for the query executed on the query coordinator. This planner grew over time with the addigion of SQL support till it was getting close to the functionality of the postgres planner. Except the code was brittle and its complexity rose which made it hard to add new SQL support.

Given its resemblance with the postgres planner it was a long outstanding wish to replace our hand crafted planner with the well supported postgres planner. This patch replaces our planner with a call to postgres' planner.

Due to the functionality of the postgres planner we needed to support both projections and filters/quals on the citus custom scan node. When a sort operation is planned above the custom scan it might require fields to be reordered in the custom scan before returning the tuple (projection). The postgres planner assumes every custom scan node implements projections. Because we controlled the plan that was created we prevented reordering in the custom scan and never had implemented it before.

A same optimisation applies to having clauses that could have been where clauses. Instead of applying the filter as a having on the aggregate it will push it down into the plan which could reach a custom scan node.

For both filters and projections we have implemented them when tuples are read from the tuple store. If no projections or filters are required it will directly return the tuple from the tuple store. Otherwise it will loop tuples from the tuple store through the filter and projection until a tuple is found and returned.

Besides filters being pushed down a side effect of having quals that could have been a where clause is that a call to read intermediate result could be called before the first tuple is fetched from the custom scan. This failed because the intermediate result would only be pulled to the coordinator on the first tuple fetch. To overcome this problem we do run the distributed subplans now before we run the postgres executor. This ensures the intermediate result is present on the coordinator in time. We do account for total time instrumentation by removing the instrumentation before handing control to the psotgres executor and update the timings our self.

For future SQL support it is enough to create a valid query structure for the part of the query to be executed on the query coordinating node. As a utility we do serialise and print the query at debug level4 for engineers to inspect what kind of query is being planned on the query coordinator.
2020-02-25 14:39:56 +01:00
Philip Dubé 025cb94159 Fix multi_task_string_size sometimes leaking intermediate files 2020-02-24 16:33:34 +00:00
Philip Dubé bcf54c5014 Address a couple issues with maintenace daemon management:
- Stop the daemon when citus extension is dropped
- Bail on maintenance daemon startup if myDbData is started with a non-zero pid
- Stop maintenance daemon from spawning itself
- Don't use postgres die, just wrap proc_exit(0)
- Assert(myDbData->workerPid == MyProcPid)

The two issues were that multiple daemons could be running for a database,
or that a daemon would be leftover after DROP EXTENSION citus
2020-02-21 16:49:01 +00:00
Nils Dijk 6ee82c381e
Add missing pieces for version bump of #3482 (#3523) 2020-02-21 12:35:29 +01:00
Onur Tirtir 926a1a61b9 change "relation" with "table" in error messages related with foreign keys on reference tables 2020-02-20 09:58:47 +03:00
Onur Tirtir 001089783c Fix null relation name issue in CheckConflictingRelationAccesses 2020-02-19 19:10:35 +03:00
Philip Dubé d7a4ffdc46 Add test for issue, does not reproduce issue 2020-02-18 23:45:17 +00:00
Marco Slot 038e5999cb Implement direct COPY table TO stdout 2020-02-17 15:15:10 +01:00
Jelte Fennema 15f1173b1d
Semmle: Ensure permissions of private keys are 0600 (#3506)
When using --allow-group-access option from initdb our keys and
certificates would be created with 0640 permissions. Which is a pretty
serious security issue: This changes that. This would not be exploitable
though, since postgres would not actually enable SSL and would output
the following message in the logs:

```
DETAIL:  File must have permissions u=rw (0600) or less if owned by the database user, or permissions u=rw,g=r (0640) or less if owned by root.
```

Since citus still expected the cluster to have SSL enabled handshakes
between workers and coordinator would fail. So instead of a security
issue the cluster would simply be unusable.
2020-02-17 12:58:40 +01:00
Markus Sintonen 099e266a6c Force task executor 2020-02-16 01:32:52 +02:00
Markus Sintonen cf8319b992 Add comment, add subquery NOT tests 2020-02-16 01:21:10 +02:00
Markus Sintonen 3d3d615040 Add comment about NOT_EXPR. Treat it as invalid constraint for safety. 2020-02-15 16:54:38 +02:00
Philip Dubé 7382c8be00 Clean up from code review
Only change to behavior is:
- don't ignore array const's constcollid in SAORestrictions
- don't end lines with commas in DebugLogPruningInstance
2020-02-14 17:58:23 +00:00
Markus Sintonen cdedb98c54 Improve shard pruning logic to understand OR-conditions.
Previously a limitation in the shard pruning logic caused multi distribution value queries to always go into all the shards/workers whenever query also used OR conditions in WHERE clause.

Related to https://github.com/citusdata/citus/issues/2593 and https://github.com/citusdata/citus/issues/1537
There was no good workaround for this limitation. The limitation caused quite a bit of overhead with simple queries being sent to all workers/shards (especially with setups having lot of workers/shards).

An example of a previous plan which was inadequately pruned:
```
EXPLAIN SELECT count(*) FROM orders_hash_partitioned
	WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22);
                                                          QUERY PLAN
---------------------------------------------------------------------
 Aggregate  (cost=0.00..0.00 rows=0 width=0)
   ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)
         Task Count: 4
         Tasks Shown: One of 4
         ->  Task
               Node: host=localhost port=xxxxx dbname=regression
               ->  Aggregate  (cost=13.68..13.69 rows=1 width=8)
                     ->  Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned  (cost=0.00..13.68 rows=1 width=0)
                           Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22)))
(9 rows)
```

After this commit the task count is what one would expect from the query defining multiple distinct values for the distribution column:
```
EXPLAIN SELECT count(*) FROM orders_hash_partitioned
	WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22);
                                                          QUERY PLAN
---------------------------------------------------------------------
 Aggregate  (cost=0.00..0.00 rows=0 width=0)
   ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)
         Task Count: 2
         Tasks Shown: One of 2
         ->  Task
               Node: host=localhost port=xxxxx dbname=regression
               ->  Aggregate  (cost=13.68..13.69 rows=1 width=8)
                     ->  Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned  (cost=0.00..13.68 rows=1 width=0)
                           Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22)))
(9 rows)
```

"Core" of the pruning logic works as previously where it uses `PrunableInstances` to queue ORable valid constraints for shard pruning.
The difference is that now we build a compact internal representation of the query expression tree with PruningTreeNodes before actual shard pruning is run.

Pruning tree nodes represent boolean operators and the associated constraints of it. This internal format allows us to have compact representation of the query WHERE clauses which allows "core" pruning logic to work with OR-clauses correctly.

For example query having
`WHERE (o_orderkey IN (1,2)) AND (o_custkey=11 OR (o_shippriority > 1 AND o_shippriority < 10))`
gets transformed into:
1. AND(o_orderkey IN (1,2), OR(X, AND(X, X)))
2. AND(o_orderkey IN (1,2), OR(X, X))
3. AND(o_orderkey IN (1,2), X)
Here X is any set of unknown condition(s) for shard pruning.

This allow the final shard pruning to correctly recognize that shard pruning is done with the valid condition of `o_orderkey IN (1,2)`.

Another example with unprunable condition in query
`WHERE (o_orderkey IN (1,2)) OR (o_custkey=11 AND o_custkey=22)`
gets transformed into:
1. OR(o_orderkey IN (1,2), AND(X, X))
2. OR(o_orderkey IN (1,2), X)

Which is recognized as unprunable due to the OR condition between distribution column and unknown constraint -> goes to all shards.

Issue https://github.com/citusdata/citus/issues/1537 originally suggested transforming the query conditions into a full disjunctive normal form (DNF),
but this process of transforming into DNF is quite a heavy operation. It may "blow up" into a really large DNF form with complex queries having non trivial `WHERE` clauses.

I think the logic for shard pruning could be simplified further but I decided to leave the "core" of the shard pruning untouched.
2020-02-14 17:58:13 +00:00
Jelte Fennema 3d8efe303e
Fix flaky test introduced by #3374 (#3504)
Since #3374 multi_utilities is not safe to run in parallel anymore. This
is because it now also shows locks on shards created outside it's own
test. This is not really possible to fix.

Example of flaky test:
- https://circleci.com/gh/citusdata/citus/89995
- https://circleci.com/gh/citusdata/citus/90017
2020-02-14 16:07:33 +01:00
Jelte Fennema 5ef3e83ce4
Make multi_utilities test take 2 seconds instead of 20 (#3507)
On worker 2 it was waiting for dustbunnies_990001 to be
vacuumed/analyzed. This table doesn't actually exist, so that never
happend. Now it waits for the correct table and throws an error if it
waits more than 10 seconds.
2020-02-14 15:38:51 +01:00
Onder Kalaci 975c4c2264 Do not prune shards if the distribution key is NULL
The root of the problem is that, standard_planner() converts the following qual

```
   {OPEXPR
   :opno 98
   :opfuncid 67
   :opresulttype 16
   :opretset false
   :opcollid 0
   :inputcollid 100
   :args (
      {VAR
      :varno 1
      :varattno 1
      :vartype 25
      :vartypmod -1
      :varcollid 100
      :varlevelsup 0
      :varnoold 1
      :varoattno 1
      :location 45
      }
      {CONST
      :consttype 25
      :consttypmod -1
      :constcollid 100
      :constlen -1
      :constbyval false
      :constisnull true
      :location 51
      :constvalue <>
      }
   )
   :location 49
   }
```

To

```
(
   {CONST
   :consttype 16
   :consttypmod -1
   :constcollid 0
   :constlen 1
   :constbyval true
   :constisnull true
   :location -1
   :constvalue <>
   }
)
```

So, Citus doesn't deal with NULL values in real-time or non-fast path router queries.

And, in the FastPathRouter planner, we check constisnull in DistKeyInSimpleOpExpression().
However, in deferred pruning case, we do not check for isnull for const.

Thus, the fix consists of two parts:
- Let PruneShards() not crash when NULL parameter is passed
- For deferred shard pruning in fast-path queries, explicitly check that we have CONST which is not NULL
2020-02-13 15:00:31 +01:00
Onur Tirtir cd8210d516
Bump citus version to 9.3devel (#3482) 2020-02-13 16:22:05 +03:00
Onur Tirtir 39df51e903
Introduce objects to dist. infrastructure when updating Citus (#3477)
Mark existing objects that are not included in distributed object infrastructure
in older versions of Citus (but now should be) as distributed, after updating
Citus successfully.
2020-02-07 18:07:59 +03:00
Nils Dijk d5433400f9
Fix: Unnecessary repartition on joins with more than 4 tables (#3473)
DESCRIPTION: Fix unnecessary repartition on joins with more than 4 tables

In 9.1 we have introduced support for all CH-benCHmark queries by widening our definitions of joins to include joins with expressions in them. This had the undesired side effect of Q5 regressing on its plan by implementing a repartition join.

It turned out this regression was not directly related to widening of the join clause, nor the schema employed by CH-benCHmark. Instead it had to do with 4 or more tables being joined in a chain. A chain meaning:

```sql
SELECT * FROM a,b,c,d WHERE a.part = b.part AND b.part = c.part AND ....
```

Due to how our join order planner was implemented it would only keep track of 1 of the partition columns when comparing if the join could be executed locally. This manifested in a join chain of 4 tables to _always_ be executed as a repartition join. 3 tables joined in a chain would have the middle table shared by the two outer tables causing the local join possibility to be found.

With this patch we keep a  unique list (or set) of all partition columns participating in the join. When a candidate table is checked for a possibility to execute a local join it will check if there is any partition column in that set that matches an equality join clause on the partition column of the candidate table.

By taking into account all partition columns in the left relation it will now find the local join path on >= 4 tables joined in a chain. 

fixes: #3276
2020-02-06 15:07:07 +01:00
Philip Dubé c252811884 dont: don't, wont: won't, acylic: acyclic 2020-02-05 17:32:22 +00:00
Halil Ozan Akgul 8ce4f20061 Fixes the bug of grants on public schema propagation 2020-02-05 18:05:58 +03:00
SaitTalhaNisanci 89dc7d5e41
remove outdated information in citus upgrade readme (#3471) 2020-02-05 13:31:02 +03:00
Marco Slot 64ca5c9acb Add additional INSERT..SELECT repartition tests 2020-02-05 11:06:44 +01:00
Hadi Moshayedi 9dd14fa90d Rename discarded target list items in repartitioned INSERT/SELECT 2020-02-05 11:06:44 +01:00
Onder Kalaci c7e2309f4c Improve single hash-repartitioning with numeric (or non-int) types
We used to treat the shard interval array that we passed as numeric[].
However, it should be int[], as the shard ranges are int[].
2020-02-04 20:30:04 +01:00
Hadi Moshayedi bc1a800f70 Use current user for repartition join temp schemas.
Otherwise when using a less privileged user we might get
errors when trying to create the schema.
2020-02-04 09:48:20 -08:00
Hadi Moshayedi 890e23e734 Update multi_insert_select_non_pushable_queries 2020-02-03 13:13:30 -08:00
Hadi Moshayedi 5818bcd27e Update with_dml 2020-02-03 13:13:30 -08:00
Hadi Moshayedi 46f60e1ac0 Update multi_insert_select_conflict 2020-02-03 13:13:30 -08:00
Hadi Moshayedi 05f58c9ec5 Update multi_insert_select 2020-02-03 13:13:30 -08:00
Hadi Moshayedi 264530311a Don't use distributed insert/select for repartitioned joins 2020-02-03 13:13:30 -08:00
Onder Kalaci 8be1b0112d Add failure test for parallel reference table join 2020-02-03 19:35:07 +01:00
Marco Slot a6bd6c657e Add tests that exercise parallel reference table join logic 2020-02-03 11:54:29 +01:00
Onder Kalaci 2f274a4fce Make sure to go deeper into the functions to search for PARAMs
For example, a PARAM might reside inside a function just because
of a casting of a type such as the follows:

```
               {FUNCEXPR
               :funcid 1740
               :funcresulttype 1700
               :funcretset false
               :funcvariadic false
               :funcformat 2
               :funccollid 0
               :inputcollid 0
               :args (
                  {PARAM
                  :paramkind 0
                  :paramid 15
                  :paramtype 23
                  :paramtypmod -1
                  :paramcollid 0
                  :location 356
                  }
               )
```

We should recursively check the expression before bailing out.
2020-02-03 09:36:12 +01:00
Philip Dubé db2eac5658 diff-filter: use utf8 encoding, not ascii 2020-01-31 00:03:17 +00:00
Hadi Moshayedi 9d988b3437 Add insert/select connection leak tests 2020-01-30 14:09:07 -08:00
Philip Dubé d43c80d4d8 pullUpIntermediateRows should not be true when groupedByDisjointPartitionColumn is true
This was causing 'SELECT id, stdev(y_int) FROM tbl GROUP BY id' to push down stddev without group by
2020-01-30 21:18:08 +00:00
Philip Dubé 5fccc56d3e Expand the set of aggregates which cannot have LIMIT approximated
Previously we only prevented AVG from being pushed down, but this is incorrect:
- array_agg, while somewhat non sensical to order by, will potentially be missing values
- combinefunc aggregation will raise errors about cstrings not being comparable (while we also can't know if the aggregate is commutative)

This commit limits approximating LIMIT pushdown when ordering by aggregates to:
min, max, sum, count, bit_and, bit_or, every, any
Which means of those we previously supported, we now exclude:
avg, array_agg, jsonb_agg, jsonb_object_agg, json_agg, json_object_agg, hll_add, hll_union, topn_add, topn_union
2020-01-30 17:45:18 +00:00
Önder Kalacı 8584cb005b
Do not evaluate functions on the coordinator for SELECT queries (#3440)
Previously, the logic for evaluting the functions and the parameters
were the same. That ended-up evaluting the functions inaccurately
on the coordinator. Instead, split the function evaluation logic
from parameter evalution logic.
2020-01-30 08:47:28 +01:00
Önder Kalacı e9c17b71a4
Add missing ORDER BY (#3441)
As it causes some random failures
2020-01-29 17:36:32 +01:00
Philip Dubé 40ce531850 Update diff-filter to handle lines removed by normalization
Add a script to test our diff logic

pg_regress_multi updated to rely on $PATH having copy_modified to fix testing with VPATH
2020-01-28 15:39:40 +00:00
Jelte Fennema b9eee70fa5 Fix random output ordering in CTE inlining test (#3434) 2020-01-27 16:38:27 +01:00
Jelte Fennema c38446b5f5 Replace denormalized test output with normalized at the end of the run 2020-01-24 11:42:38 +01:00
Önder Kalacı 4519d3411d
Improve the representation of used sub plans (#3411)
Previously, we've identified the usedSubPlans by only looking
to the subPlanId.

With this commit, we're expanding it to also include information
on the location of the subPlan.

This is useful to distinguish the cases where the subPlan is used
either on only HAVING or both HAVING and any other part of the query.
2020-01-24 10:47:14 +01:00
Philip Dubé 69dde460de See what flaky multi_extension test is doing with roles 2020-01-23 21:50:40 +00:00
Philip Dubé 5e55a36172 Avoid obscuring regression test diffs with normalization
First, diff is updated to not update the files in-place

For some reason diff is being called multiple times,
so $file1.unmodified becomes normalized on second invocation

Secondly, diff-filter updates output to come from the unmodified version

Normalization is serving two purposes:
- avoid diff noise in regressions
- avoid diff noise in commits when expected result is updated

The first purpose only wants to reduce the lines which diff registers,
whereas the second wants those changes to be committed
2020-01-23 18:51:23 +00:00
Önder Kalacı ef7d1ea91d
Locally execute queries that don't need any data access (#3410)
* Update shardPlacement->nodeId to uint

As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.

* Relax the restrictions on using the local execution

Previously, whenever any local execution happens, we disabled further
commands to do any remote queries. The basic motivation for doing that
is to prevent any accesses in the same transaction block to access the
same placements over multiple sessions: one is local session the other
is remote session to the same placement.

However, the current implementation does not distinguish local accesses
being to a placement or not. For example, we could have local accesses
that only touches intermediate results. In that case, we should not
implement the same restrictions as they become useless.

So, this is a pre-requisite for executing the intermediate result only
queries locally.

* Update the error messages

As the underlying implementation has changed, reflect it in the error
messages.

* Keep track of connections to local node

With this commit, we're adding infrastructure to track if any connection
to the same local host is done or not.

The main motivation for doing this is that we've previously were more
conservative about not choosing local execution. Simply, we disallowed
local execution if any connection to any remote node is done. However,
if we want to use local execution for intermediate result only queries,
this'd be annoying because we expect all queries to touch remote node
before the final query.

Note that this approach is still limiting in Citus MX case, but for now
we can ignore that.

* Formalize the concept of Local Node

Also some minor refactoring while creating the dummy placement

* Write intermediate results locally when the results are only needed locally

Before this commit, Citus used to always broadcast all the intermediate
results to remote nodes. However, it is possible to skip pushing
the results to remote nodes always.

There are two notable cases for doing that:

   (a) When the query consists of only intermediate results
   (b) When the query is a zero shard query

In both of the above cases, we don't need to access any data on the shards. So,
it is a valuable optimization to skip pushing the results to remote nodes.

The pattern mentioned in (a) is actually a common patterns that Citus users
use in practice. For example, if you have the following query:

WITH cte_1 AS (...), cte_2 AS (....), ... cte_n (...)
SELECT ... FROM cte_1 JOIN cte_2 .... JOIN cte_n ...;

The final query could be operating only on intermediate results. With this patch,
the intermediate results of the ctes are not unnecessarily pushed to remote
nodes.

* Add specific regression tests

As there are edge cases in Citus MX and with round-robin policy,
use the same queries on those cases as well.

* Fix failure tests

By forcing not to use local execution for intermediate results since
all the tests expects the results to be pushed remotely.

* Fix flaky test

* Apply code-review feedback

Mostly style changes

* Limit the max value of pg_dist_node_seq to reserve for internal use
2020-01-23 18:28:34 +01:00