Commit Graph

1562 Commits (e4e0c65203fde365aaccd5d5e1d4c8ef272581e9)

Author SHA1 Message Date
Jelte Fennema e4e0c65203 Semmle: Check for NULL in some places where it might occur (#3509)
Semmle reported quite some places where we use a value that could be NULL. Most of these are not actually a real issue, but better to be on the safe side with these things and make the static analysis happy.

(cherry picked from commit 685b54b3de)
2020-03-25 09:19:15 +01:00
Jelte Fennema 5dd61c8319 Fix Makefile so that it builds safestringlib correctly on OSX
(cherry picked from commit eb8e099f09)
2020-03-25 09:17:22 +01:00
Jelte Fennema 2ad9ca6726 Add clean-full to also clean full builds of vendored libraries
(cherry picked from commit 8e7eaaf949)
2020-03-25 09:17:16 +01:00
Jelte Fennema 1d0c3f36fc Make SafeSnprintf work on PG11
(cherry picked from commit 62bf571ced)
2020-03-25 09:16:51 +01:00
Jelte Fennema d0e4bc5d22 Add pg11 snprintf file to repo for use in pg11 when it's not compiled
(cherry picked from commit 7d24cebc80)
2020-03-25 09:16:43 +01:00
Jelte Fennema 2f063d0316 Convert unsafe APIs to safe ones
(cherry picked from commit 8de8b62669)
2020-03-25 09:16:31 +01:00
Jelte Fennema e0736d3da7 Remove READFUNCs (#3536)
We don't actually use these functions anymore since merging #1477.

Advantages of removing:
1. They add work whenever we add a new node.
2. They contain some usage of stdlib APIs that are banned by Microsoft.
   Removing it means we don't have to replace those with safe ones.

(cherry picked from commit 2a9fccc7a0)
2020-03-25 09:12:26 +01:00
Jelte Fennema b451c31c4c Semmle: Fix obvious issues (#3502)
Fixes some obvious issues found by the Semmle static analysis tool.

(cherry picked from commit 00d667c41d)
2020-03-25 09:11:27 +01:00
Jelte Fennema e2d49c6122 Semmle: Fix possible infite loops caused by overflow (#3503)
Comparison between differently sized integers in loop conditions can cause
infinite loops. This can happen when doing something like this:

```c
int64 very_big = MAX_INT32 + 1;
for (int32 i = 0; i < very_big; i++) {
    // do something
}
// never reached because i overflows before it can reach the value of very_big
```

(cherry picked from commit 3f7c5a5cf6)
2020-03-25 09:11:21 +01:00
Jelte Fennema 568f057ba0 Semmle: Ensure permissions of private keys are 0600 (#3506)
When using --allow-group-access option from initdb our keys and
certificates would be created with 0640 permissions. Which is a pretty
serious security issue: This changes that. This would not be exploitable
though, since postgres would not actually enable SSL and would output
the following message in the logs:

```
DETAIL:  File must have permissions u=rw (0600) or less if owned by the database user, or permissions u=rw,g=r (0640) or less if owned by root.
```

Since citus still expected the cluster to have SSL enabled handshakes
between workers and coordinator would fail. So instead of a security
issue the cluster would simply be unusable.

(cherry picked from commit 15f1173b1d)
2020-03-25 09:11:13 +01:00
Onder Kalaci db1a0835f3 Improve definition of RelationInfoContainsOnlyRecurringTuples
Before this commit, we considered !ContainsRecurringRTE() enough
for NotContainsOnlyRecurringTuples. However, instead, we can check
for existince of any distributed table.

DESCRIPTION: Fixes a bug that causes wrong results with complex outer joins
2020-03-09 17:29:13 +01:00
Hanefi Onaldi 8d979b4752
Fix early exit bug on intermediate result pruning
There are 2 problems with our early exit strategy that this commit fixes:

1- When we decide that a subplan results are sent to all worker nodes,
we used to skip traversing the whole distributed plan, instead of
skipping only the subplan.

2- We used to consider all available nodes in the cluster (secondaries
and inactive nodes as well as active primaries) when deciding on early
exit strategy. This resulted in failures to early exit when there are
secondaries or inactive nodes.

(cherry picked from commit c0ad44f975)
2020-03-05 16:47:58 +03:00
Marco Slot ca44697723 Refactor CitusBeginScan into separate DML / SELECT paths 2020-03-05 13:04:39 +01:00
Onder Kalaci fd89760a29 For composite types, add cast to the parameter to ease remote node detect
the type.
2020-03-05 13:04:39 +01:00
SaitTalhaNisanci a96ff3cd6c create temp schemas in parallel (#3540)
(cherry picked from commit 82d22b34fe)
2020-03-03 17:13:09 +03:00
SaitTalhaNisanci 741ca1d33e send repartition cleanup jobs in parallel to all workers (#3485)
* send repartition cleanup jobs in parallel to all workers

* add review items

(cherry picked from commit d94c3fd43d)
2020-03-03 17:13:02 +03:00
Marco Slot 946b7a1a49 Make merge tables during re-partitioning unlogged
(cherry picked from commit c7f123947e)
2020-03-03 17:12:54 +03:00
Onder Kalaci bcc675cf84 Do not prune shards if the distribution key is NULL
The root of the problem is that, standard_planner() converts the following qual

```
   {OPEXPR
   :opno 98
   :opfuncid 67
   :opresulttype 16
   :opretset false
   :opcollid 0
   :inputcollid 100
   :args (
      {VAR
      :varno 1
      :varattno 1
      :vartype 25
      :vartypmod -1
      :varcollid 100
      :varlevelsup 0
      :varnoold 1
      :varoattno 1
      :location 45
      }
      {CONST
      :consttype 25
      :consttypmod -1
      :constcollid 100
      :constlen -1
      :constbyval false
      :constisnull true
      :location 51
      :constvalue <>
      }
   )
   :location 49
   }
```

To

```
(
   {CONST
   :consttype 16
   :consttypmod -1
   :constcollid 0
   :constlen 1
   :constbyval true
   :constisnull true
   :location -1
   :constvalue <>
   }
)
```

So, Citus doesn't deal with NULL values in real-time or non-fast path router queries.

And, in the FastPathRouter planner, we check constisnull in DistKeyInSimpleOpExpression().
However, in deferred pruning case, we do not check for isnull for const.

Thus, the fix consists of two parts:
- Let PruneShards() not crash when NULL parameter is passed
- For deferred shard pruning in fast-path queries, explicitly check that we have CONST which is not NULL
2020-02-13 17:22:49 +01:00
Onur Tirtir 39df51e903
Introduce objects to dist. infrastructure when updating Citus (#3477)
Mark existing objects that are not included in distributed object infrastructure
in older versions of Citus (but now should be) as distributed, after updating
Citus successfully.
2020-02-07 18:07:59 +03:00
Nils Dijk d5433400f9
Fix: Unnecessary repartition on joins with more than 4 tables (#3473)
DESCRIPTION: Fix unnecessary repartition on joins with more than 4 tables

In 9.1 we have introduced support for all CH-benCHmark queries by widening our definitions of joins to include joins with expressions in them. This had the undesired side effect of Q5 regressing on its plan by implementing a repartition join.

It turned out this regression was not directly related to widening of the join clause, nor the schema employed by CH-benCHmark. Instead it had to do with 4 or more tables being joined in a chain. A chain meaning:

```sql
SELECT * FROM a,b,c,d WHERE a.part = b.part AND b.part = c.part AND ....
```

Due to how our join order planner was implemented it would only keep track of 1 of the partition columns when comparing if the join could be executed locally. This manifested in a join chain of 4 tables to _always_ be executed as a repartition join. 3 tables joined in a chain would have the middle table shared by the two outer tables causing the local join possibility to be found.

With this patch we keep a  unique list (or set) of all partition columns participating in the join. When a candidate table is checked for a possibility to execute a local join it will check if there is any partition column in that set that matches an equality join clause on the partition column of the candidate table.

By taking into account all partition columns in the left relation it will now find the local join path on >= 4 tables joined in a chain. 

fixes: #3276
2020-02-06 15:07:07 +01:00
Philip Dubé ecad4aa5e6 Fill in jobIdList field of DistributedExecution
Pass down jobIdList from ExecuteTasksInDependencyOrder

Also clean up comment for ExecuteTaskListOutsideTransaction
2020-02-05 17:32:22 +00:00
Philip Dubé c252811884 dont: don't, wont: won't, acylic: acyclic 2020-02-05 17:32:22 +00:00
Halil Ozan Akgul 8ce4f20061 Fixes the bug of grants on public schema propagation 2020-02-05 18:05:58 +03:00
Hadi Moshayedi 9dd14fa90d Rename discarded target list items in repartitioned INSERT/SELECT 2020-02-05 11:06:44 +01:00
Onder Kalaci c7e2309f4c Improve single hash-repartitioning with numeric (or non-int) types
We used to treat the shard interval array that we passed as numeric[].
However, it should be int[], as the shard ranges are int[].
2020-02-04 20:30:04 +01:00
Hadi Moshayedi bc1a800f70 Use current user for repartition join temp schemas.
Otherwise when using a less privileged user we might get
errors when trying to create the schema.
2020-02-04 09:48:20 -08:00
Hadi Moshayedi 264530311a Don't use distributed insert/select for repartitioned joins 2020-02-03 13:13:30 -08:00
Marco Slot be77d3304f Fixup 2020-02-03 11:59:55 +01:00
Marco Slot b0fd6aa006 If reference tables was read over multiple connections, do not assign connection 2020-02-03 11:54:29 +01:00
Onder Kalaci 2f274a4fce Make sure to go deeper into the functions to search for PARAMs
For example, a PARAM might reside inside a function just because
of a casting of a type such as the follows:

```
               {FUNCEXPR
               :funcid 1740
               :funcresulttype 1700
               :funcretset false
               :funcvariadic false
               :funcformat 2
               :funccollid 0
               :inputcollid 0
               :args (
                  {PARAM
                  :paramkind 0
                  :paramid 15
                  :paramtype 23
                  :paramtypmod -1
                  :paramcollid 0
                  :location 356
                  }
               )
```

We should recursively check the expression before bailing out.
2020-02-03 09:36:12 +01:00
Philip Dubé d43c80d4d8 pullUpIntermediateRows should not be true when groupedByDisjointPartitionColumn is true
This was causing 'SELECT id, stdev(y_int) FROM tbl GROUP BY id' to push down stddev without group by
2020-01-30 21:18:08 +00:00
Philip Dubé 84a500ffc6 CitusRemoveDirectory: loop when directory is not empty
Sometimes during errors workers will create files while we're deleting intermediate directories

example:
DEBUG:  could not remove file "base/pgsql_job_cache/10_0_431": Directory not empty
DETAIL:  WARNING from localhost:57637
2020-01-30 20:02:08 +00:00
Philip Dubé 5fccc56d3e Expand the set of aggregates which cannot have LIMIT approximated
Previously we only prevented AVG from being pushed down, but this is incorrect:
- array_agg, while somewhat non sensical to order by, will potentially be missing values
- combinefunc aggregation will raise errors about cstrings not being comparable (while we also can't know if the aggregate is commutative)

This commit limits approximating LIMIT pushdown when ordering by aggregates to:
min, max, sum, count, bit_and, bit_or, every, any
Which means of those we previously supported, we now exclude:
avg, array_agg, jsonb_agg, jsonb_object_agg, json_agg, json_object_agg, hll_add, hll_union, topn_add, topn_union
2020-01-30 17:45:18 +00:00
Önder Kalacı 8584cb005b
Do not evaluate functions on the coordinator for SELECT queries (#3440)
Previously, the logic for evaluting the functions and the parameters
were the same. That ended-up evaluting the functions inaccurately
on the coordinator. Instead, split the function evaluation logic
from parameter evalution logic.
2020-01-30 08:47:28 +01:00
Önder Kalacı 412fe719f7
Hide citus.enable_ddl_propagation setting (#3437)
As that is powerful and cause metadata inconsistency. See the following steps:

(Note that we cannot use PGC_SUSET because on Citus MX we need this flag for non-
superusers as well)

```SQL
CREATE TABLE test_ref_table(key int);
SELECT create_reference_table('test_ref_table');

SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition;
┌────────────────┬──────────────┐
│  logicalrelid  │ logicalrelid │
├────────────────┼──────────────┤
│ test_ref_table │        16831 │
└────────────────┴──────────────┘
(1 row)

Time: 0.929 ms

SELECT relname FROM pg_class WHERE oid = 16831;
┌────────────────┐
│    relname     │
├────────────────┤
│ test_ref_table │
└────────────────┘
(1 row)

Time: 0.785 ms

SET citus.enable_ddl_propagation TO off;

 DROP TABLE test_ref_table ;

SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition;
┌──────────────┬──────────────┐
│ logicalrelid │ logicalrelid │
├──────────────┼──────────────┤
│ 16831        │        16831 │
└──────────────┴──────────────┘
(1 row)
Time: 0.972 ms

SELECT relname FROM pg_class WHERE oid = 16831;
┌─────────┐
│ relname │
├─────────┤
└─────────┘
(0 rows)

Time: 0.908 ms

 SELECT master_add_node('localhost', 9703);
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 5.028 ms
!>

```
2020-01-29 10:17:53 +01:00
SaitTalhaNisanci 94bd563ff0
switch back to old memory context in cache local plan for task (#3428) 2020-01-27 13:00:46 +03:00
Önder Kalacı 4519d3411d
Improve the representation of used sub plans (#3411)
Previously, we've identified the usedSubPlans by only looking
to the subPlanId.

With this commit, we're expanding it to also include information
on the location of the subPlan.

This is useful to distinguish the cases where the subPlan is used
either on only HAVING or both HAVING and any other part of the query.
2020-01-24 10:47:14 +01:00
Philip Dubé 50c5e814c8 CurrentDatabaseName: return const char* as we're borrowing from cache 2020-01-23 22:49:35 +00:00
Hadi Moshayedi 1dc19215eb Don't error for ENOENT in CitusRemoveDirectory.
For concurrency reasons, this can happen even if initial stat succeeded.
2020-01-23 10:07:54 -08:00
Hadi Moshayedi 3e1004c232 Change DistributedResultFragment::nodeId to uint32.
This is to match the type of WorkerNode::nodeId.
2020-01-23 09:33:15 -08:00
Önder Kalacı ef7d1ea91d
Locally execute queries that don't need any data access (#3410)
* Update shardPlacement->nodeId to uint

As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.

* Relax the restrictions on using the local execution

Previously, whenever any local execution happens, we disabled further
commands to do any remote queries. The basic motivation for doing that
is to prevent any accesses in the same transaction block to access the
same placements over multiple sessions: one is local session the other
is remote session to the same placement.

However, the current implementation does not distinguish local accesses
being to a placement or not. For example, we could have local accesses
that only touches intermediate results. In that case, we should not
implement the same restrictions as they become useless.

So, this is a pre-requisite for executing the intermediate result only
queries locally.

* Update the error messages

As the underlying implementation has changed, reflect it in the error
messages.

* Keep track of connections to local node

With this commit, we're adding infrastructure to track if any connection
to the same local host is done or not.

The main motivation for doing this is that we've previously were more
conservative about not choosing local execution. Simply, we disallowed
local execution if any connection to any remote node is done. However,
if we want to use local execution for intermediate result only queries,
this'd be annoying because we expect all queries to touch remote node
before the final query.

Note that this approach is still limiting in Citus MX case, but for now
we can ignore that.

* Formalize the concept of Local Node

Also some minor refactoring while creating the dummy placement

* Write intermediate results locally when the results are only needed locally

Before this commit, Citus used to always broadcast all the intermediate
results to remote nodes. However, it is possible to skip pushing
the results to remote nodes always.

There are two notable cases for doing that:

   (a) When the query consists of only intermediate results
   (b) When the query is a zero shard query

In both of the above cases, we don't need to access any data on the shards. So,
it is a valuable optimization to skip pushing the results to remote nodes.

The pattern mentioned in (a) is actually a common patterns that Citus users
use in practice. For example, if you have the following query:

WITH cte_1 AS (...), cte_2 AS (....), ... cte_n (...)
SELECT ... FROM cte_1 JOIN cte_2 .... JOIN cte_n ...;

The final query could be operating only on intermediate results. With this patch,
the intermediate results of the ctes are not unnecessarily pushed to remote
nodes.

* Add specific regression tests

As there are edge cases in Citus MX and with round-robin policy,
use the same queries on those cases as well.

* Fix failure tests

By forcing not to use local execution for intermediate results since
all the tests expects the results to be pushed remotely.

* Fix flaky test

* Apply code-review feedback

Mostly style changes

* Limit the max value of pg_dist_node_seq to reserve for internal use
2020-01-23 18:28:34 +01:00
Onder Kalaci a0dff301c7 Update shardPlacement->nodeId to uint
As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.
2020-01-23 13:00:24 +01:00
Jelte Fennema c62b756f34
Fix new method of locking shard distribition metadata (#3407)
In #3374 a new way of locking shard distribution metadata was
implemented. However, this was only done in the function
`LockShardDistributionMetadata` and not in
`TryLockShardDistributionMetadata`. This is bad, since it causes these
locks to not block eachother in some cases.

This commit fixes this issue by sharing the code that sets the locktag
between the two function.
2020-01-22 16:44:17 +01:00
Jelte Fennema cd5259a25a
Do not place new shards with shards in TO_DELETE state (#3408)
When creating a new distributed table. The shards would colocate with shards
with SHARD_STATE_TO_DELETE (shardstate = 4). This means if that state was
because of a shard move the new shard would be created on two nodes and it
would not get deleted since it's shard state would be 1.
2020-01-22 14:52:12 +01:00
Onder Kalaci 4be69bbf6f Fix reference table issue 2020-01-20 18:45:18 +00:00
Halil Ozan Akgul b40f067d05 Adds propagation for grant on schema commands 2020-01-20 14:51:28 +03:00
Philip Dubé fdcc413559 Code cleanup of adaptive_executor, connection_management, placement_connection
adaptive_executor: sort includes, use foreach_ptr, remove lies from FinishDistributedExecution docs
connection_management: rename msecs, which isn't milliseconds
placement_connection: small typos
2020-01-17 17:44:47 +00:00
Onder Kalaci 2f0ef8bc36 Apply feedback 1 2020-01-17 16:06:04 +01:00
Onder Kalaci 0bf1e81e33 Cache local plans on BeginScan 2020-01-17 16:02:57 +01:00
Onder Kalaci 5dc454cdad Exclude localPlannedStatements from copy distributedPlan 2020-01-17 16:02:57 +01:00