Commit Graph

3364 Commits (31573a01d7c64bd5c3a4b8f4aebc12c70bb6d987)

Author SHA1 Message Date
Philip Dubé d43c80d4d8 pullUpIntermediateRows should not be true when groupedByDisjointPartitionColumn is true
This was causing 'SELECT id, stdev(y_int) FROM tbl GROUP BY id' to push down stddev without group by
2020-01-30 21:18:08 +00:00
Philip Dubé 84a500ffc6 CitusRemoveDirectory: loop when directory is not empty
Sometimes during errors workers will create files while we're deleting intermediate directories

example:
DEBUG:  could not remove file "base/pgsql_job_cache/10_0_431": Directory not empty
DETAIL:  WARNING from localhost:57637
2020-01-30 20:02:08 +00:00
Philip Dubé 5fccc56d3e Expand the set of aggregates which cannot have LIMIT approximated
Previously we only prevented AVG from being pushed down, but this is incorrect:
- array_agg, while somewhat non sensical to order by, will potentially be missing values
- combinefunc aggregation will raise errors about cstrings not being comparable (while we also can't know if the aggregate is commutative)

This commit limits approximating LIMIT pushdown when ordering by aggregates to:
min, max, sum, count, bit_and, bit_or, every, any
Which means of those we previously supported, we now exclude:
avg, array_agg, jsonb_agg, jsonb_object_agg, json_agg, json_object_agg, hll_add, hll_union, topn_add, topn_union
2020-01-30 17:45:18 +00:00
Önder Kalacı 8584cb005b
Do not evaluate functions on the coordinator for SELECT queries (#3440)
Previously, the logic for evaluting the functions and the parameters
were the same. That ended-up evaluting the functions inaccurately
on the coordinator. Instead, split the function evaluation logic
from parameter evalution logic.
2020-01-30 08:47:28 +01:00
Önder Kalacı e9c17b71a4
Add missing ORDER BY (#3441)
As it causes some random failures
2020-01-29 17:36:32 +01:00
Önder Kalacı 412fe719f7
Hide citus.enable_ddl_propagation setting (#3437)
As that is powerful and cause metadata inconsistency. See the following steps:

(Note that we cannot use PGC_SUSET because on Citus MX we need this flag for non-
superusers as well)

```SQL
CREATE TABLE test_ref_table(key int);
SELECT create_reference_table('test_ref_table');

SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition;
┌────────────────┬──────────────┐
│  logicalrelid  │ logicalrelid │
├────────────────┼──────────────┤
│ test_ref_table │        16831 │
└────────────────┴──────────────┘
(1 row)

Time: 0.929 ms

SELECT relname FROM pg_class WHERE oid = 16831;
┌────────────────┐
│    relname     │
├────────────────┤
│ test_ref_table │
└────────────────┘
(1 row)

Time: 0.785 ms

SET citus.enable_ddl_propagation TO off;

 DROP TABLE test_ref_table ;

SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition;
┌──────────────┬──────────────┐
│ logicalrelid │ logicalrelid │
├──────────────┼──────────────┤
│ 16831        │        16831 │
└──────────────┴──────────────┘
(1 row)
Time: 0.972 ms

SELECT relname FROM pg_class WHERE oid = 16831;
┌─────────┐
│ relname │
├─────────┤
└─────────┘
(0 rows)

Time: 0.908 ms

 SELECT master_add_node('localhost', 9703);
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 5.028 ms
!>

```
2020-01-29 10:17:53 +01:00
Philip Dubé 40ce531850 Update diff-filter to handle lines removed by normalization
Add a script to test our diff logic

pg_regress_multi updated to rely on $PATH having copy_modified to fix testing with VPATH
2020-01-28 15:39:40 +00:00
Jelte Fennema b9eee70fa5 Fix random output ordering in CTE inlining test (#3434) 2020-01-27 16:38:27 +01:00
SaitTalhaNisanci 94bd563ff0
switch back to old memory context in cache local plan for task (#3428) 2020-01-27 13:00:46 +03:00
Jelte Fennema c38446b5f5 Replace denormalized test output with normalized at the end of the run 2020-01-24 11:42:38 +01:00
Önder Kalacı 4519d3411d
Improve the representation of used sub plans (#3411)
Previously, we've identified the usedSubPlans by only looking
to the subPlanId.

With this commit, we're expanding it to also include information
on the location of the subPlan.

This is useful to distinguish the cases where the subPlan is used
either on only HAVING or both HAVING and any other part of the query.
2020-01-24 10:47:14 +01:00
Philip Dubé 50c5e814c8 CurrentDatabaseName: return const char* as we're borrowing from cache 2020-01-23 22:49:35 +00:00
Philip Dubé 69dde460de See what flaky multi_extension test is doing with roles 2020-01-23 21:50:40 +00:00
Philip Dubé 5e55a36172 Avoid obscuring regression test diffs with normalization
First, diff is updated to not update the files in-place

For some reason diff is being called multiple times,
so $file1.unmodified becomes normalized on second invocation

Secondly, diff-filter updates output to come from the unmodified version

Normalization is serving two purposes:
- avoid diff noise in regressions
- avoid diff noise in commits when expected result is updated

The first purpose only wants to reduce the lines which diff registers,
whereas the second wants those changes to be committed
2020-01-23 18:51:23 +00:00
Hadi Moshayedi 1dc19215eb Don't error for ENOENT in CitusRemoveDirectory.
For concurrency reasons, this can happen even if initial stat succeeded.
2020-01-23 10:07:54 -08:00
Hadi Moshayedi 3e1004c232 Change DistributedResultFragment::nodeId to uint32.
This is to match the type of WorkerNode::nodeId.
2020-01-23 09:33:15 -08:00
Önder Kalacı ef7d1ea91d
Locally execute queries that don't need any data access (#3410)
* Update shardPlacement->nodeId to uint

As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.

* Relax the restrictions on using the local execution

Previously, whenever any local execution happens, we disabled further
commands to do any remote queries. The basic motivation for doing that
is to prevent any accesses in the same transaction block to access the
same placements over multiple sessions: one is local session the other
is remote session to the same placement.

However, the current implementation does not distinguish local accesses
being to a placement or not. For example, we could have local accesses
that only touches intermediate results. In that case, we should not
implement the same restrictions as they become useless.

So, this is a pre-requisite for executing the intermediate result only
queries locally.

* Update the error messages

As the underlying implementation has changed, reflect it in the error
messages.

* Keep track of connections to local node

With this commit, we're adding infrastructure to track if any connection
to the same local host is done or not.

The main motivation for doing this is that we've previously were more
conservative about not choosing local execution. Simply, we disallowed
local execution if any connection to any remote node is done. However,
if we want to use local execution for intermediate result only queries,
this'd be annoying because we expect all queries to touch remote node
before the final query.

Note that this approach is still limiting in Citus MX case, but for now
we can ignore that.

* Formalize the concept of Local Node

Also some minor refactoring while creating the dummy placement

* Write intermediate results locally when the results are only needed locally

Before this commit, Citus used to always broadcast all the intermediate
results to remote nodes. However, it is possible to skip pushing
the results to remote nodes always.

There are two notable cases for doing that:

   (a) When the query consists of only intermediate results
   (b) When the query is a zero shard query

In both of the above cases, we don't need to access any data on the shards. So,
it is a valuable optimization to skip pushing the results to remote nodes.

The pattern mentioned in (a) is actually a common patterns that Citus users
use in practice. For example, if you have the following query:

WITH cte_1 AS (...), cte_2 AS (....), ... cte_n (...)
SELECT ... FROM cte_1 JOIN cte_2 .... JOIN cte_n ...;

The final query could be operating only on intermediate results. With this patch,
the intermediate results of the ctes are not unnecessarily pushed to remote
nodes.

* Add specific regression tests

As there are edge cases in Citus MX and with round-robin policy,
use the same queries on those cases as well.

* Fix failure tests

By forcing not to use local execution for intermediate results since
all the tests expects the results to be pushed remotely.

* Fix flaky test

* Apply code-review feedback

Mostly style changes

* Limit the max value of pg_dist_node_seq to reserve for internal use
2020-01-23 18:28:34 +01:00
Onder Kalaci a0dff301c7 Update shardPlacement->nodeId to uint
As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.
2020-01-23 13:00:24 +01:00
Hadi Moshayedi be647ad944 Output filenames in ensure_no_intermediate_data_leak
This can helpful in guiding us where to look when this test fails.
For example, if the result file has repartitioned_results_ prefix,
then we need to look into repartitioned insert/select. Otherwise
it is probably a CTE or a subquery.
2020-01-22 11:12:16 -08:00
Jelte Fennema c62b756f34
Fix new method of locking shard distribition metadata (#3407)
In #3374 a new way of locking shard distribution metadata was
implemented. However, this was only done in the function
`LockShardDistributionMetadata` and not in
`TryLockShardDistributionMetadata`. This is bad, since it causes these
locks to not block eachother in some cases.

This commit fixes this issue by sharing the code that sets the locktag
between the two function.
2020-01-22 16:44:17 +01:00
Jelte Fennema cd5259a25a
Do not place new shards with shards in TO_DELETE state (#3408)
When creating a new distributed table. The shards would colocate with shards
with SHARD_STATE_TO_DELETE (shardstate = 4). This means if that state was
because of a shard move the new shard would be created on two nodes and it
would not get deleted since it's shard state would be 1.
2020-01-22 14:52:12 +01:00
Onder Kalaci 4be69bbf6f Fix reference table issue 2020-01-20 18:45:18 +00:00
Halil Ozan Akgul b40f067d05 Adds propagation for grant on schema commands 2020-01-20 14:51:28 +03:00
Philip Dubé fdcc413559 Code cleanup of adaptive_executor, connection_management, placement_connection
adaptive_executor: sort includes, use foreach_ptr, remove lies from FinishDistributedExecution docs
connection_management: rename msecs, which isn't milliseconds
placement_connection: small typos
2020-01-17 17:44:47 +00:00
Onder Kalaci 2f0ef8bc36 Apply feedback 1 2020-01-17 16:06:04 +01:00
Onder Kalaci fd17e4578e Improve tests 2020-01-17 16:02:57 +01:00
Onder Kalaci 0bf1e81e33 Cache local plans on BeginScan 2020-01-17 16:02:57 +01:00
Onder Kalaci 08d148d43e Make TaskAccessesLocalNode external function 2020-01-17 16:02:57 +01:00
Onder Kalaci 5dc454cdad Exclude localPlannedStatements from copy distributedPlan 2020-01-17 16:02:57 +01:00
Onder Kalaci ff12df411b Add LocalPlannedStatement struct 2020-01-17 16:02:57 +01:00
Onder Kalaci 016f561e45 Ingest data for cte_inline tests 2020-01-17 12:46:00 +01:00
Onder Kalaci 3833a7e686 Fix issues for CTE inlining on Postgres 11
Comment from code:

/*
 * We had to implement this hack because on Postgres11 and below, the originalQuery
 * and the query would have significant differences in terms of CTEs where CTEs
 * would not be inlined on the query (as standard_planner() wouldn't inline CTEs
 * on PG 11 and below).
 *
 * Instead, we prefer to pass the inlined query to the distributed planning. We rely
 * on the fact that the query includes subqueries, and it'd definitely go through
 * query pushdown planning. During query pushdown planning, the only relevant query
 * tree is the original query.
 */
2020-01-17 11:59:02 +01:00
Jelte Fennema 246435be7e
Lazy query deparsing executable queries (#3350)
Deparsing and parsing a query can be heavy on CPU. When locally executing 
the query we don't need to do this in theory most of the time.

This PR is the first step in allowing to skip deparsing and parsing
the query in these cases, by lazily creating the query string and
storing the query in the task. Future commits will make use of this and
not deparse and parse the query anymore, but use the one from the task
directly.
2020-01-17 11:49:43 +01:00
Hadi Moshayedi 6cf1c01660 Don't use repartitioned INSERT/SELECT for repartition joins 2020-01-16 23:40:31 -08:00
Hadi Moshayedi 5eeb07124f Repartitioned INSERT/SELECT: include job id in result id prefix 2020-01-16 23:24:52 -08:00
Hadi Moshayedi a079278b0c Repartitioned INSERT/SELECT: Add a GUC to enable/disable it 2020-01-16 23:24:52 -08:00
Hadi Moshayedi ce5eea4885 INSERT/SELECT: make SELECT column names unique 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 3258d87f3e Isolation tests for INSERT/SELECT repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 8b27a9a195 More range partitioned tests 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 8635396cea Repartitioned INSERT/SELECT: Test rollback behaviour 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 43218eebf6 Failure tests for INSERT/SELECT repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 665b33dca1 MX tests for INSERT/SELECT repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi af2349f21f Repartitioned INSERT/SELECT: Add a prepared statement test 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 97072c9eb1 INSERT/SELECT: show method in EXPLAIN output 2020-01-16 23:24:52 -08:00
Hadi Moshayedi b143d9588a Repartitioned INSERT/SELECT: Test GROUP BY 2020-01-16 23:24:52 -08:00
Hadi Moshayedi fe548b762f Repartitioned INSERT/SELECT: Test CTEs 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 494cc383cc Repartitioned INSERT/SELECT: Enable RETURNING 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 4b14347fc3 Tests for DML followed by insert/select repartition 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 44a2aede16 Don't start a coordinated transaction on workers.
Otherwise transaction hooks of Citus kick in and might cause unwanted errors.
2020-01-16 23:24:52 -08:00
Hadi Moshayedi 42c3c03b85 Handle extra columns added in ExpandWorkerTargetEntry() in repartitioned INSERT/SELECT 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 89463f9760 Repartitioned INSERT/SELECT: cast columns in SELECT targets 2020-01-16 23:24:52 -08:00
Hadi Moshayedi d67a384350 Enable repartitioned INSERT/SELECT ON CONFLICT. 2020-01-16 23:24:52 -08:00
Hadi Moshayedi b4e5f4b10a Implement INSERT ... SELECT with repartitioning 2020-01-16 23:24:52 -08:00
Hadi Moshayedi ced876358d INSERT/SELECT: Refactor out AddInsertSelectCasts 2020-01-16 23:24:52 -08:00
Hadi Moshayedi d449c1857c INSERT/SELECT: Use ExecutePlan* instead of ExecuteSelect* 2020-01-16 23:24:52 -08:00
Hadi Moshayedi e30580e2bd Add ORDER BY to multi_row_insert.sql 2020-01-16 15:20:39 -08:00
Jelte Fennema 0ee1eab070 Make tests fail with a useful error message 2020-01-16 18:30:30 +01:00
Jelte Fennema cb5154cf03 Add more failing tests, of which some have bad error messages 2020-01-16 18:30:30 +01:00
Marco Slot 82f1fffa28 Fix epoll_ctl() error message on connection error 2020-01-16 06:40:57 +01:00
Onder Kalaci dc17c2658e Defer shard pruning for fast-path router queries to execution
This is purely to enable better performance with prepared statements.
Before this commit, the fast path queries with prepared statements
where the distribution key includes a parameter always went through
distributed planning. After this change, we only go through distributed
planning on the first 5 executions.
2020-01-16 16:59:36 +01:00
Onder Kalaci 933d666c0d Do not forget to copy fastPathRouterPlan@DistributedPlan 2020-01-16 16:39:20 +01:00
Halil Ozan Akgul c5539d20d9 Adds alter table schema propagation 2020-01-16 17:04:16 +03:00
Nils Dijk b6e09eb691
Fix: distributed function with table reference in declare (#3384)
DESCRIPTION: Fixes a problem when adding a new node due to tables referenced in a functions body

Fixes #3378 

It was reported that `master_add_node` would fail if a distributed function has a table name referenced in its declare section of the body. By default postgres validates the body of a function on creation. This is not a problem in the normal case as tables are replicated to the workers when we distribute functions.

However when a new node is added we first create dependencies on the workers before we try to create any tables, and the original tables get created out of bound when the metadata gets synced to the new node. This causes the function body validator to raise an error the table is not on the worker.

To mitigate this issue we set `check_function_bodies` to `off` right before we are creating the function.

The added test shows this does resolve the issue. (issue can be reproduced on the commit without the fix)
2020-01-16 14:21:54 +01:00
Jelte Fennema e76281500c
Replace shardId lock with lock on colocation+shardIntervalIndex (#3374)
This new locking pattern makes sure that some deadlocks that could
happend during rebalancing cannot occur anymore.
2020-01-16 13:14:01 +01:00
Jelte Fennema 86343bcc8f Re-add test that broke with GUC workaround 2020-01-16 12:34:50 +01:00
Jelte Fennema 6b9b633695 Add more tests for prepared statements 2020-01-16 12:28:15 +01:00
Jelte Fennema 43a3fdd12f Fix comment 2020-01-16 12:28:15 +01:00
Jelte Fennema fe3827e499 Add tests for [NOT] MATERIALEZED 2020-01-16 12:28:15 +01:00
Onder Kalaci 326dfab44a Fix a query which triggers an existing bug, see https://github.com/citusdata/citus/issues/3189#issuecomment-571497051 2020-01-16 12:28:15 +01:00
Onder Kalaci 81d8178625 Note that we'll drop the GUC after PG 11 support dropped 2020-01-16 12:28:15 +01:00
Onder Kalaci c653923960 Update regression tests 6
Local execution and CTE pushdown
2020-01-16 12:28:15 +01:00
Onder Kalaci 3818be45a6 Update regression tests-5
Failure tests that rely on intermediate results
2020-01-16 12:28:15 +01:00
Onder Kalaci 1e85938b46 Update regression tests-4
Update the MX tests. Similar to the previous commits, prevent CTE
inlining in some cases to prevent divergent test outputs.
2020-01-16 12:28:15 +01:00
Onder Kalaci fc07bd7c5b Update regression tests-3
Update the regression tests which only change in PG 12.
2020-01-16 12:28:15 +01:00
Onder Kalaci 64560b07be Update regression tests-2
In this commit, we're introducing a way to prevent CTE inlining via a GUC.

The GUC is used in all the tests where PG 11 and PG 12 tests would diverge
otherwise.

Note that, in PG 12, the restriction information for CTEs are generated. It
means that for some queries involving CTEs, Citus planner (router planner/
pushdown planner) may behave differently. So, via the GUC, we prevent
tests to diverge on PG 11 vs PG 12.

When we drop PG 11 support, we should get rid of the GUC, and mark
relevant ctes as MATERIALIZED, which does the same thing.
2020-01-16 12:28:15 +01:00
Onder Kalaci 5cb203b276 Update regression tests-1
These set of tests has changed in both PG 11 and PG 12.
The changes are only about CTE inlining kicking in both
versions, and yielding the exact same distributed planning.
2020-01-16 12:28:15 +01:00
Onder Kalaci 421bf68516 Add the specific regression tests
With this commit, we're adding the specific tests for CTE inlining.
The test has a different output file for pg 11, because as mentioned
in the previous commits, PG 12 generates more restriction information
for CTEs.
2020-01-16 12:28:15 +01:00
Onder Kalaci efb1577d06 Handle CTE aliases accurately
Basically, make sure to update the column name with the CTEs alias
if we need to do so.
2020-01-16 12:28:15 +01:00
Onder Kalaci 05d600dd8f Call CTE inlining in Citus planner
The idea is simple: Inline CTEs(if any), try distributed planning.
If the planning yields a successful distributed plan, simply return
it.

If the planning fails, fallback to distributed planning on the query
tree where CTEs are not inlined. In that case, if the planning failed
just because of the CTE inlining, via recursive planning, the same
query would yield a successful plan.

A very basic set of examples:

WITH cte_1 AS (SELECT * FROM test_table)
SELECT
	*, row_number() OVER ()
FROM
	cte_1;

or

WITH a AS (SELECT * FROM test_table),
b AS (SELECT * FROM test_table)
SELECT * FROM  a JOIN b ON (a.value> b.value);
2020-01-16 12:28:15 +01:00
Onder Kalaci 01a5800ee8 Add Citus' CTE inlining functions
With this commit we add the necessary Citus function to inline CTEs
in a queryTree.

You might ask, why do we need to inline CTEs if Postgres is already
going to do it?

Few reasons behind this decision:

- One techinal node here is that Citus does the recursive CTE planning
  by checking the originalQuery which is the query that has not gone
  through the standard_planner().

  CTEs in Citus is super powerful. It is practically key for full SQL
  coverage for multi-shard queries. With CTEs, you can always reduce
  any query multi-shard query into a router query via recursive
  planning (thus full SQL coverage).
  We cannot let CTE inlining break that. The main idea is Citus should
  be able to retry planning if anything goes after CTE inlining.

  So, by taking ownership of CTE inlining on the originalQuery, Citus
  can fallback to recursive planning of CTEs if the planning with the
  inlined query fails. It could have been a lot harder if we had relied
  on standard_planner() to have the inlined CTEs on the original query.

- We want to have this feature in PostgreSQL 11 as well, but Postgres
  only inlines in version 12
2020-01-16 12:28:15 +01:00
Onder Kalaci 1856ab6cdd Copy & paste code from Postgres source
All the code in this commit is direct copy & paste from Postgres
source code.

We can classify the copy&paste code into two:

- Copy paste from CTE inline patch from postgres
  (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=608b167f9f9c4553c35bb1ec0eab9ddae643989b)
  These include the functions inline_cte(), inline_cte_walker(),
  contain_dml(), contain_dml_walker().
  It also include the code in function PostgreSQLCTEInlineCondition().
  We prefer to extract that code into a seperate function, because
  (a) we'll re-use the logic later (b) we added one check for PG_11

  Finally, the struct "inline_cte_walker_context" is also copied from
  the same Postgres commit.

- Copy paste from the other parts of the Postgres code

  In order to implement CTE inlining in Postgres 12, the hackers
  modified the query_tree_walker()/range_table_walker() with the
  18c0da88a5

  Since Citus needs to support the same logic in PG 11, we copy & pasted
  that functions (and related flags) with the names pg_12_query_tree_walker()
  and pg_12_range_table_walker()
2020-01-16 12:28:15 +01:00
Philip Dubé 4d9a733c2f Fix inserting multiple values with row expression partition column causing the insert to be ignored
Raise an error instead of silently inserting nothing if we hit this condition in the future
2020-01-15 21:10:50 +00:00
Philip Dubé 4989c9a15c PlacementExecutionDone: We may mark placements as failed multiple times, but should only act the first time. 2020-01-15 18:20:01 +00:00
Marco Slot f1a0582973 Make ApplyLogRedaction a macro and redefine ereport 2020-01-13 18:24:36 +01:00
Marco Slot 06709ee108 Always use NOTICE in log_remote_commands and avoid redaction when possible 2020-01-13 18:24:36 +01:00
Marco Slot 90056f7d3c Remove copy from worker for append-partitioned table 2020-01-13 23:03:40 -08:00
Philip Dubé 62524d152d mitmscripts/fluent.py: use atomic increment 2020-01-13 20:35:08 +00:00
Philip Dubé ccabf19090 Propagate DROP ROUTINE, ALTER ROUTINE
In two places I've made code more straight forward by using ROUTINE in our own codegen

Two changes which may seem extraneous:

AppendFunctionName was updated to not use pg_get_function_identity_arguments.
This is because that function includes ORDER BY when printing an aggregate like my_rank.
While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres,
ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not.

Tests were updated to use macaddr over integer. Using integer is flaky, our logic
could sometimes end up on tables like users_table. I originally wanted to use money,
but money isn't hashable.
2020-01-13 15:37:46 +00:00
Philip Dubé 4b5d6c3ebe Rename RelayFileState to ShardState
Replace FILE_ prefix with SHARD_STATE_
2020-01-12 05:57:53 +00:00
Philip Dubé e71386af33 Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT
Also use stack allocation for walkerContext in multi_logical_optimizer
2020-01-10 16:54:00 +00:00
Hadi Moshayedi 40ba2cdd6e Test RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Hadi Moshayedi 527d7d41c1 Implement RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Philip Dubé 281aacce9b Fix row-gather for subqueries being handled by task-tracker
task-tracker has specific logic for MultiPartition when GROUP BY is missing

We were ending up in this code path because row-gather removes GROUP BY
2020-01-10 01:51:37 +00:00
Hadi Moshayedi e1e383cb59 Don't override xact id assigned by coordinator on workers.
We might need to send commands from workers to other workers. In
these cases we shouldn't override the xact id assigned by coordinator,
or otherwise we won't read the consistent set of result files
accross the nodes.
2020-01-09 11:09:11 -08:00
Hadi Moshayedi bb65669186 Failure tests for PartitionTasklistResults 2020-01-09 10:55:58 -08:00
Hadi Moshayedi c7c460e843 PartitionTasklistResults: Use different queries per placement
We need to know which placement succeeded in executing the worker_partition_query_result() call. Otherwise we wouldn't know which node to fetch from. This change allows that by introducing Task::perPlacementQueryStrings.
2020-01-09 10:55:58 -08:00
Hadi Moshayedi f38d0e5b3f Partitioned task list results. 2020-01-09 10:32:58 -08:00
Philip Dubé 73c06fae3b Introduce GetDistributeObjectOps to organize dispatch of logic dependent on node/object type 2020-01-09 18:24:29 +00:00
Jelte Fennema 9724e25065 Normalize plan numbers in insert_select output 2020-01-07 10:34:08 +01:00
Philip Dubé bf7d86a3e8 Fix typo: aggragate -> aggregate 2020-01-07 01:16:09 +00:00
Philip Dubé 863bf49507 Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default 2020-01-07 01:16:04 +00:00
Jelte Fennema 5b0baea72c Refactor distributed_planner for better understandability 2020-01-06 14:23:38 +01:00
Onder Kalaci 5a1e752726 Apply feedback - add fastPath field to plan 2020-01-06 12:42:43 +01:00
Onder Kalaci 13a9b55695 Skip expensive checks when fast-path query
The definition of fast-path query is very strict. So, we don't need
to do some extra checks.
2020-01-06 12:42:43 +01:00
Onder Kalaci 7f3ab7892d Skip shard pruning when possible
We're already traversing the queryTree and finding the distribution
key value, so pass it to the later stages of the planning.
2020-01-06 12:42:43 +01:00
Onder Kalaci ca293116fa Reduce calls to FastPathRouterQuery()
Before this commit, we called it twice durning planning. Instead,
we save the information and pass it.
2020-01-06 12:42:43 +01:00
Onder Kalaci c8f14c9f6c Make sure to update shard states of partitions on failures
Fixes #3331

In #2389, we've implemented support for partitioned tables with rep > 1.
The implementation is limiting the use of modification queries on the
partitions. In fact, we error out when any partition is modified via
EnsurePartitionTableNotReplicated().

However, we seem to forgot an important case, where the parent table's
partition is marked as INVALID. In that case, at least one of the partition
becomes INVALID. However, we do not mark partitions as INVALID ever.

If the user queries the partition table directly, Citus could happily send
the query to INVALID placements -- which are not marked as INVALID.

This PR fixes it by marking the placements of the partitions as INVALID
as well.

The shard placement repair logic already re-creates all the partitions,
so should be fine in that front.
2020-01-06 12:26:08 +01:00
Jelte Fennema 3c770516eb
Commenting out flaky intermediate data leak test (#3359)
check-multi apparently has an intermediate data leak, so commenting out
that test for now. This was introduced by #3349 

Examples:
- https://app.circleci.com/jobs/github/citusdata/citus/74675
- https://app.circleci.com/jobs/github/citusdata/citus/74683
- https://app.circleci.com/jobs/github/citusdata/citus/74763
2020-01-06 11:55:01 +01:00
Jelte Fennema d29ce8965c
Actually check that test output normalization is applied in CI (#3358)
Fixup of an issue with #3336 that caused CI not to check correctly that
normalized test output was committed.
2020-01-06 10:37:34 +01:00
Jelte Fennema 4a20ba3bfc Merge remote-tracking branch 'origin/master' into normalized-test-output 2020-01-06 09:36:04 +01:00
Jelte Fennema 2e4e1c030f Make sure the expected .out file always exists when running diff on it 2020-01-06 09:32:03 +01:00
Jelte Fennema 16bcf15e16 Remove unused normalization rule 2020-01-06 09:32:03 +01:00
Jelte Fennema 634ea80009 Add a basic testing README including normalization explanation 2020-01-06 09:32:03 +01:00
Jelte Fennema 7c3e8e150e Normalize tests: s/Subplan [0-9]+\_/Subplan XXX\_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema acd12a6de5 Normalize tests: s/read_intermediate_result\('[0-9]+_/read_intermediate_result('XXX_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema 21dbd4e55d Normalize tests: s/generating subplan [0-9]+\_/generating subplan XXX\_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema 58723dd8b0 Normalize tests: s/DEBUG: Plan [0-9]+/DEBUG: Plan XXX/g 2020-01-06 09:32:03 +01:00
Jelte Fennema 34c5532e9c Add commented out rules to normalize Plan numbers 2020-01-06 09:32:03 +01:00
Jelte Fennema 38ac28b4b8 Normalize tests: intermediate_results 2020-01-06 09:32:03 +01:00
Jelte Fennema 0c6983a80e Normalize tests: pg12 changes 2020-01-06 09:32:03 +01:00
Jelte Fennema 7730bd449c Normalize tests: Remove trailing whitespace 2020-01-06 09:32:03 +01:00
Jelte Fennema 6353c9907f Normalize tests: Line info varies between versions 2020-01-06 09:32:03 +01:00
Jelte Fennema bf2c203908 Normalize tests: solation_ref2ref_foreign_keys 2020-01-06 09:32:03 +01:00
Jelte Fennema 7b2c769a5d Normalize tests: normalize file names for partitioned files 2020-01-06 09:32:03 +01:00
Jelte Fennema 98bab9caab Normalize tests: ignore WAL warnings 2020-01-06 09:32:03 +01:00
Jelte Fennema 5c0f955ab9 Normalize tests: ignore could not consume warnings 2020-01-06 09:32:03 +01:00
Jelte Fennema dc3cff991f Normalize tests: normalize failed task ids 2020-01-06 09:32:03 +01:00
Jelte Fennema d0ade90cd0 Normalize tests: pkey constraints for multi_insert_select 2020-01-06 09:32:03 +01:00
Jelte Fennema 704e1d2bc8 Normalize tests: shard table names for multi_name_lengths 2020-01-06 09:32:03 +01:00
Jelte Fennema 1c4ea6836b Normalize tests: shard table names for multi_insert_select_conflict 2020-01-06 09:32:03 +01:00
Jelte Fennema 27997c054e Normalize tests: shard table names for foreign_key_restrection_enforcement 2020-01-06 09:32:03 +01:00
Jelte Fennema 432b5baac7 Normalize tests: shard table names for custom_aggregate_support 2020-01-06 09:32:03 +01:00
Jelte Fennema 0c23caeb75 Normalize tests: shard table names for multi_subtransactions 2020-01-06 09:32:03 +01:00
Jelte Fennema 883ee9121f Normalize tests: shard table names in foreign_key_to_reference_table 2020-01-06 09:32:03 +01:00
Jelte Fennema 7f3de68b0d Normalize tests: header separator length 2020-01-06 09:32:03 +01:00
Philip Dubé 566246ecd4 End regression tests with ensure_no_intermediate_data_leak
Also update tests to clean up jobs when they're directly testing job udfs
2020-01-03 18:59:02 +00:00
Önder Kalacı 0c70a5470e
Allow RETURNING in fast-path queries (#3352)
* Allow RETURNING in fast-path queries

Because there is no specific reason for that.
2020-01-03 13:42:50 +00:00
Önder Kalacı a174eb4f7b
Do not go through standard_planner() for INSERTs (#3348)
That seems unnecessary. We already have the notion of FastPath queries,
simply add it there.
2020-01-03 12:15:22 +00:00
Jelte Fennema 75a9c25acd Normalize tests: s/node group [12] (but|does)/node group \1/ 2020-01-03 11:46:01 +01:00
Jelte Fennema 96434e898f Normalize tests: s/assigned task [0-9]+ to node/assigned task to node/ 2020-01-03 11:45:22 +01:00
Jelte Fennema 7b833466ba Normalize tests: s/shard [0-9]+/shard xxxxx/g 2020-01-03 11:44:30 +01:00
Jelte Fennema 8b5fe8aa17 Normalize tests: s/placement [0-9]+/placement xxxxx/g 2020-01-03 11:42:48 +01:00
Jelte Fennema f21f00544e Normalize tests: s/ port=[0-9]+ / port=xxxxx /g 2020-01-03 11:42:09 +01:00
Jelte Fennema 8c5c0dd74c Normalize tests: s/localhost:[0-9]+/localhost:xxxxx/g 2020-01-03 11:40:50 +01:00
Jelte Fennema a1ff2117bf Ignore .modified and .unmodified files in git 2020-01-03 11:38:12 +01:00
Jelte Fennema 7630029a7f Keep an .unmodified file for debugging 2020-01-03 11:30:08 +01:00
Jelte Fennema 8fae3ed800 Remove trailing whitespace during normalization in test output 2020-01-03 11:30:08 +01:00
Jelte Fennema b815425d2c Make diff normalize our test output files in place 2020-01-03 11:30:08 +01:00
Jelte Fennema 5fee9d04c9 Uncomment local execution EXPLAIN ANALYZE tests 2020-01-02 18:56:32 +00:00
Marco Slot ba39d72fe1 Fix incorrect union all pushdown issue 2020-01-01 09:03:50 +01:00
Jelte Fennema cf88bdf833 Add tests for complex joins on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 3a042e4611 Allow cartesian products on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 61e2501645 Make any expression with two or more tables a join expression 2019-12-27 15:05:51 +01:00
Jelte Fennema 4233cd0d9d Allow non equi joins on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 7642928be1
Makefile fix DESTDIR together with cleanup (#3342)
This should fix this build issue: redmine.postgresql.org/issues/5032
2019-12-27 10:34:57 +01:00
Marco Slot b21b6905ae Do not repeat GROUP BY distribution_column on coordinator
Allow arbitrary aggregates to be pushed down in these scenarios
2019-12-25 01:33:41 +00:00
Philip Dubé a6ffcab59d CREATE EXTENSION is propagated now 2019-12-24 21:04:37 +00:00
Marco Slot a2ddfecd86 Fix inconsistent shard metadata issue 2019-12-24 08:01:32 +01:00
Hadi Moshayedi d7aea7fa10 Implement partitioned intermediate results. 2019-12-24 03:53:39 -08:00
Marco Slot b37ef0e394 Fix error in distributed queries when shards are on the coordinator 2019-12-24 06:36:43 +01:00
Philip Dubé e9bbdb8f31 Fix handling of empty intermediate results when distributing custom aggregates 2019-12-23 17:27:52 +00:00
Philip Dubé f007b7f91d Also fix reindent inconsistencies with fake_fdw.c 2019-12-20 08:27:47 +00:00
Hadi Moshayedi 08eb0ade31 Fix reindent version inconsistencies.
Different versions of reindent tool reformatted citus_custom_scan.c
and citus_copyfuncs.c differently. So some developers spent some
extra attention not to commit these two files after reindent.

This PR tries to address this.
2019-12-19 23:10:34 -08:00
Jelte Fennema b655c02352
Add the necessary changes for rebalance strategies on enterprise (#3325)
This commit adds the SQL and C changes necessary to support custom rebalance
strategies in the Enterprise version of Citus.
2019-12-19 15:23:08 +01:00
Hadi Moshayedi ef487e0792 Implement fetch_intermediate_results 2019-12-18 10:46:35 -08:00
Hadi Moshayedi 249508d267 Estimate cost of read_intermediate_results() 2019-12-17 13:51:51 -08:00
Hadi Moshayedi 113bd1e5f1 Implement read_intermediate_results 2019-12-17 13:51:16 -08:00
SaitTalhaNisanci 7ff4ce2169
Add adaptive executor support for repartition joins (#3169)
* WIP

* wip

* add basic logic to run a single job with repartioning joins with adaptive executor

* fix some warnings and return in ExecuteDependedTasks if there is none

* Add the logic to run depended jobs in adaptive executor

The execution of depended tasks logic is changed. With the current
logic:
- All tasks are created from the top level task list.
- At one iteration:
	- CurTasks whose dependencies are executed are found.
	- CurTasks are executed in parallel with adapter executor main
logic.
- The iteration is repeated until all tasks are completed.

* Separate adaptive executor repartioning logic

* Remove duplicate parts

* cleanup directories and schemas

* add basic repartion tests for adaptive executor

* Use the first placement to fetch data

In task tracker, when there are replicas, we try to fetch from a replica
for which a map task is succeeded. TaskExecution is used for this,
however TaskExecution is not used in adaptive executor. So we cannot use
the same thing as task tracker.

Since adaptive executor fails when a map task fails (There is no retry
logic yet). We know that if we try to execute a fetch task, all of its
map tasks already succeeded, so we can just use the first one to fetch
from.

* fix clean directories logic

* do not change the search path while creating a udf

* Enable repartition joins with adaptive executor with only enable_reparitition_joins guc

* Add comments to adaptive_executor_repartition

* dont run adaptive executor repartition test in paralle with other tests

* execute cleanup only in the top level execution

* do cleanup only in the top level ezecution

* not begin a transaction if repartition query is used

* use new connections for repartititon specific queries

New connections are opened to send repartition specific queries. The
opened connections will be closed at the FinishDistributedExecution.

While sending repartition queries no transaction is begun so that
we can see all changes.

* error if a modification was done prior to repartition execution

* not start a transaction if a repartition query and sql task, and clean temporary files and schemas at each subplan level

* fix cleanup logic

* update tests

* add missing function comments

* add test for transaction with DDL before repartition query

* do not close repartition connections in adaptive executor

* rollback instead of commit in repartition join test

* use close connection instead of shutdown connection

* remove unnecesary connection list, ensure schema owner before removing directory

* rename ExecuteTaskListRepartition

* put fetch query string in planner not executor as we currently support only replication factor = 1 with adaptive executor and repartition query and we know the query string in the planner phase in that case

* split adaptive executor repartition to DAG execution logic and repartition logic

* apply review items

* apply review items

* use an enum for remote transaction state and fix cleanup for repartition

* add outside transaction flag to find connections that are unclaimed instead of always opening a new transaction

* fix style

* wip

* rename removejobdir to partition cleanup

* do not close connections at the end of repartition queries

* do repartition cleanup in pg catch

* apply review items

* decide whether to use transaction or not at execution creation

* rename isOutsideTransaction and add missing comment

* not error in pg catch while doing cleanup

* use replication factor of the creation time, not current time to decide if task tracker should be chosen

* apply review items

* apply review items

* apply review item
2019-12-17 19:09:45 +03:00
Marco Slot 2f568ad5a5 Forbid using connections that sent intermediate results for data access and vice versa 2019-12-17 11:49:13 +01:00
Onur TIRTIR 8092529a2c
Split propagate extension test and add alternative output (#3314)
* Split extension name tests from propagate_extension_commands.sql

* Add alternative output for escape_extension_name.sql
2019-12-17 13:49:16 +03:00
Marco Slot f4031dd477 Clean up transaction block usage logic in adaptive executor 2019-12-17 10:48:19 +01:00
Nils Dijk bfc3d2eb90
make sure to correctly decrement ExecutorLevel (#3311)
DESCRIPTION: Fix counter that keeps track of internal depth in executor

While reviewing #3302 I ran into the `ExecutorLevel` variable which used a variable to keep the original value to restore on successful exit. I haven't explored the full space and if it is possible to get into an inconsistent state. However using `PG_TRY`/`PG_CATCH` seems generally more correct.

Given very bad things will happen if this level is not reset, I kept the failsafe of setting the variiable back to 0 on the `XactCallback` but I did add an assert to treat it as a developer bug.
2019-12-16 20:50:13 +01:00
Marco Slot 5f656e22db Fix issue in IsMultiStatementTransaction detection 2019-12-16 17:01:43 +01:00
SaitTalhaNisanci 2829c601dd
replace Begin words in coordinated transactions with use (#3293) 2019-12-16 10:40:31 +03:00
SaitTalhaNisanci a2f2107e6a
refactor MapTaskList in multi physical planner (#3297) 2019-12-13 22:41:49 +03:00
Marco Slot 1633123d78 Fix crash in IN (NULL) queries 2019-12-13 08:35:54 +01:00
Hadi Moshayedi e7a6cc0801 Fix some typos from #3280 2019-12-12 13:29:26 -08:00
SaitTalhaNisanci 420e21919b
refactor extract distributed insert values rte (#3287) 2019-12-12 23:47:44 +03:00
Marco Slot e7a8db5493 Fix issue with some zero-shard modifications 2019-12-12 07:19:10 +01:00
SaitTalhaNisanci 2c040d2c8f
use a function for duplicate code in connection state machine (#3209) 2019-12-12 17:55:38 +03:00
SaitTalhaNisanci a0fe8646e0
add IsHoldOffCancellationReceived utility function (#3290) 2019-12-12 17:32:59 +03:00
SaitTalhaNisanci 053fe18404
not continue in sequential execution if a cancellation is received (#3289) 2019-12-12 17:22:30 +03:00
Hadi Moshayedi 383d34f51b Tests for multi-statement transactions with subqueries or ctes 2019-12-11 19:54:15 -08:00
Hadi Moshayedi 939d3c955b Don't plan function joins locally 2019-12-11 16:53:29 -08:00
Hadi Moshayedi 067d92a7f6 Don't plan joins between ref tables and views locally 2019-12-11 14:31:34 -08:00
Hadi Moshayedi e3e174f30f Fix the way we check for local/reference table joins in the executor 2019-12-11 12:50:20 -08:00
SaitTalhaNisanci 13204487e9
remove copyright years (#3286) 2019-12-11 21:14:08 +03:00
SaitTalhaNisanci c2823c9349
remove unused targets from makefile (#3283) 2019-12-11 20:37:56 +03:00
SaitTalhaNisanci d10f97998c rename REMOTE_TRANS_INVALID to REMOTE_TRANS_NOT_STARTED 2019-12-11 15:24:18 +03:00
Önder Kalacı fecf61ef1f
Add missing ORDER BY in a CTE (#3282)
Otherwise, the query output might not be consistent.
2019-12-11 10:24:54 +01:00
Marco Slot 133b8e1e0e Move coordinator insert..select logic into executor 2019-12-10 11:21:35 -08:00
Marco Slot 486c620a3c Fix inserts into local tables with distributed subqueries 2019-12-10 10:17:18 +01:00
SaitTalhaNisanci 8e5041885d Refactor isolation tests (#3062)
Currently in mx isolation tests the setup is the same except the creation of tables. Isolation framework lets us define multiple `setup` stages, therefore I thought that we can put the `mx_setup` to one file and prepend this prior to running tests. 

How the structure works:
- cpp is used before running isolation tests to preprocess spec files. This way we can include any file we want to. Currently this is used to include mx common part.
- spec files are put to `/build/specs` for clear separation between generated files and template files
- a symbolic link is created for `/expected` in `build/expected/`.
- when running isolation tests, as the `inputdir`, `build` is passed so it runs the spec files from `build/specs` and checks the expected output from `build/expected`.

`/specs` is renamed as `/spec` because postgres first look at the `specs` file under current directory, so this is renamed to avoid that since we are running the isolation tests from `build/specs` now.

Note: now we use `//` instead of `#` in comments in spec files, because cpp interprets `#` as a directive and it ignores `//`.
2019-12-10 16:12:54 +01:00
Önder Kalacı 5395ce6480
We don't support PG 10 anymore, so the sed rule can go away (#3277) 2019-12-10 12:40:43 +01:00
Önder Kalacı f027e9dd77
Improve Recursive CTE tests (#3274)
Postgres keeps track of recursive CTEs in the queryTree in two ways:

   - queryTree->hasRecursive is set to true, whenever a RECURSIVE CTE
     is used in the SQL. Citus checks for it
   - If the CTE is actually a recursive one (a.k.a., references itself)
     Postgres marks CommonTableExpr->cterecursive as true as well

The tests that are changed in the PR doesn't cover (b), and this becomes
an issue with CTE inlining (#3161). In that case, Citus/Postgres can inline
such CTEs, and the queries works with Citus.

However, this tests intend to check if there is any recursive CTE in the queryTree.
So, we're actually making the CTEs recursive CTEs by referring itself.

We'll add cases where a recursive CTE works by inlining in #3161.
2019-12-10 09:38:45 +01:00
Philip Dubé fcf2fd819b Add distributioncolumncollation to to pg_dist_colocation
Use partition column's collation for range distributed tables
Don't allow non deterministic collations for hash distributed tables
CoPartitionedTables: don't compare unequal types
2019-12-09 19:51:40 +00:00
Philip Dubé d138bb89bf Support creating collations as part of dependency resolution. Propagate ALTER/DROP on distributed collations
Propagate CREATE COLLATION when outside transaction
2019-12-09 04:42:51 +00:00
Alexander Pyhalov 6174a4d3d6 Fix build on illumos 2019-12-06 14:40:47 +01:00
Marco Slot 6a9c0ea7fe Fix errors in DML with sublinks hidden by null expressions 2019-12-06 14:25:04 +01:00
Hadi Moshayedi d28beb3711 Detect SQL UDF Calls. 2019-12-05 14:31:05 -08:00
Philip Dubé 5a17fd6d9d Test more reference/local cases, also ALTER ROLE
Test ALTER ROLE doesn't deadlock when coordinator added, or propagate from mx workers

Consolidate wait_until_metadata_sync & verify_metadata to multi_test_helpers
2019-12-03 22:23:14 +00:00
Philip Dubé 1597fbb369 aggregate_support test: test DISTINCT, ORDER BY, FILTER, & no intermediate results
Previously,
- we'd push down ORDER BY, but this doesn't order intermediate results between workers
- we'd keep FILTER on master aggregate, which would raise an error about unexpected cstrings
2019-12-03 15:46:01 +00:00
Philip Dubé 5fcc169a3a Stray depended to dependent tidy up 2019-12-03 15:28:32 +00:00
Marco Slot bb3bc10f0c Fix segfault in column_to_column_name 2019-12-01 23:57:25 +01:00
Marco Slot b1b13e394e Fix segfault when executing DDL via UDF 2019-12-01 22:54:41 +01:00
Marco Slot 4c8d43c5d0 Bump repo version to 9.2devel 2019-11-29 07:33:39 +01:00
Nils Dijk 1ef1667ddb
add gitref to the output of citus_version (#3246)
DESCRIPTION: add gitref to the output of citus_version

During debugging of custom builds it is hard to know the exact version of the citus build you are using. This patch will add a human readable/understandable git reference to the build of citus which can be retrieved by calling `citus_version();`.
2019-11-29 15:54:09 +01:00
Marco Slot 16d1ad3666 Remove distinction between SQL_TASK and ROUTER_TASK 2019-11-29 05:58:29 +01:00
SaitTalhaNisanci aeec3d1544
fix typo in dependent jobs and dependent task (#3244) 2019-11-28 23:47:28 +03:00
Philip Dubé 0d04ff1692 RECORD: Add support for more expression types
- OpExpr
- NullIfExpr
- MinMaxExpr
- CoalesceExpr
- CaseExpr

Also fix case where ARRAY[(1,2), NULL] was rejected
2019-11-27 17:07:22 +00:00
Philip Dubé 168e11cc9b Implement support for RECORD[] where we support RECORD
Support for ARRAY[] expressions is limited to having a consistent shape,
eg ARRAY[(int,text),(int,text)] as opposed to ARRAY[(int,text),(float,text)] or ARRAY[(int,text),(int,text,float)]
2019-11-27 15:02:43 +00:00
Hadi Moshayedi 2268a9cae6 Error for metadata commands if any metadata node is out-of-sync (#3226)
* Error for metadata commands if any metadata node is out-of-sync

* Make the functions have separate APIs for all workers/metadata workers
2019-11-27 09:52:57 +01:00
Marco Slot 2329157406 Swap aggregate_support tests to simplify enterprise merge 2019-11-26 13:39:18 +01:00
Önder Kalacı 1cfbeb89ec
Make NodeCanHaveDistTablePlacements() public (#3229)
Since it is required in rebalancer.
2019-11-26 12:15:38 +01:00
Marco Slot 60b741927f Add missing include to deparse_function_stmts.c 2019-11-24 06:04:22 +01:00
Philip Dubé 261a9de42d Fix typos:
VAR_SET_VALUE_KIND -> VAR_SET_VALUE kind
beginnig -> beginning
plannig -> planning
the the -> the
er then -> er than
2019-11-25 23:24:13 +00:00
Marco Slot 4b0ac4b0dd Properly escape ALTER FUNCTION .. SET deparsing. Also test 2019-11-25 23:01:30 +00:00
Philip Dubé 3c10c27b13 GetFunctionAlterOwnerCommand: use format_procedure_qualified
distributed_functions: test a function with a quote in name
AppendDefElemSet: quote variable names
2019-11-25 23:01:30 +00:00
Philip Dubé a81e6a81ab Fix distributed aggregation for non superuser roles
Moves support functions to pg_catalog for now. We'd prefer a different solution
for when we're creating these support functions dynamically
2019-11-25 20:46:25 +00:00
Khashayar Fereidani f81785ad14 Fix underflow initialization of default values
Initialization of queryWindowClause and queryOrderByLimit "memset" underflow these variables.
It's possible due to the invalid usage sizeof this part of the program cause buffer overflow and function return data corruption in future changes.
2019-11-25 19:25:51 +00:00
Onur TIRTIR bef32624c3
Escape extension name in extension command propagation (#3218) 2019-11-24 12:16:10 +03:00
Philip Dubé 99164398bf Fix potential segfault from standard_planner inlining functions 2019-11-21 18:47:36 +00:00
Philip Dubé c563e0825c Strip trailing whitespace and add final newline (#3186)
This brings files in line with our editorconfig file
2019-11-21 14:25:37 +01:00
Jelte Fennema 1d8dde232f
Automatically convert useless declarations using regex replace (#3181)
* Add declaration removal to CI

* Convert declarations
2019-11-21 13:47:29 +01:00
Onur TIRTIR 9961297d7b Improve extension command propagation logic and tests
* Improve extension command propagation tests

* patch for hardcoded citus extension name

(cherry picked from commit 0bb3dbac0afabda10e8928f9c17eda048dc4361a)
2019-11-21 11:24:39 +03:00
Marco Slot e0cccf7f9a Move C files into the appropriate directory 2019-11-16 11:36:17 +01:00
Hanefi Onaldi d82f3e9406
Introduce intermediate result broadcasting
In plain words, each distributed plan pulls the necessary intermediate
results to the worker nodes that the plan hits. This is primarily useful
in three ways. 

(i) If the distributed plan that uses intermediate
result(s) is a router query, then the intermediate results are only
broadcasted to a single node.

(ii) If a distributed plan consists of only intermediate results, which
is not uncommon, the intermediate results are broadcasted to a single
node only.

(iii) If a distributed query hits a sub-set of the shards in multiple
workers, the intermediate results will be broadcasted to the relevant
node(s).

The final item (iii) becomes crucial for append/range distributed
tables where typically the distributed queries hit a small subset of
shards/workers.

To do this, for each query that Citus creates a distributed plan, we keep
track of the subPlans used in the queryTree, and save it in the distributed
plan. Just before Citus executes each subPlan, Citus first keeps track of
every worker node that the distributed plan hits, and marks every subPlan
should be broadcasted to these nodes. Later, for each subPlan which is a
distributed plan, Citus does this operation recursively since these
distributed plans may access to different subPlans, and those have to be
recorded as well.
2019-11-20 15:26:36 +03:00
Philip Dubé b7fef5c31a Miscellaneous cleanup in prep for collation propagation 2019-11-19 17:28:59 +00:00
Jelte Fennema 1ed05be82c
Flaky test: Fix recover_prepared_transactions (#3205)
Failed test: https://app.circleci.com/jobs/github/citusdata/citus/35994

We now always take a new connection
2019-11-19 17:49:13 +01:00
Jelte Fennema 1ac96f228b
Flaky test: Force correct plan (#3203)
Failing test: https://app.circleci.com/jobs/github/citusdata/citus/23148
2019-11-19 17:11:05 +01:00
Onur TIRTIR 26c306d188
Add extensions to distributed object propagation infrastructure (#3185) 2019-11-19 17:56:28 +03:00
SaitTalhaNisanci 2cb82ae9bd
create a utility method to mark tasks as failed (#3150) 2019-11-19 16:35:56 +03:00
SaitTalhaNisanci 306d159072
refactor AfterXacthodtConnectionHandling (#3202) 2019-11-19 14:50:23 +03:00
Jelte Fennema 87f57eb92b
Fix verify_metadata not returning consistent results (#3199)
Failing test: https://app.circleci.com/jobs/github/citusdata/citus/58827
2019-11-19 11:02:35 +01:00
Hanefi Onaldi e3ad4aba94
Bump 9.1devel
* Add Changelog entry for 9.0.1
* Bump citus version to 9.1devel
2019-11-19 10:35:57 +03:00
Marco Slot 622462cad7 Return early in CitusHasBeenLoaded when creating a different extension 2019-11-15 03:00:20 +01:00
Önder Kalacı 40fa3862ce
Prevent Citus extension becoming distributed object (#3197)
Prevent Citus extension being distributed

Because that could prevent doing rolling upgrades, where users may
prefer to upgrade the version on the coordinator but not the workers.

There could be some other edge cases, so I'd prefer to keep Citus
extension outside the picture for now.
2019-11-18 16:57:10 +01:00
Halil Ozan Akgul 5ae7b219ff Create the ALTER ROLE propagation 2019-11-18 18:31:28 +03:00
Nils Dijk 217890af5f
Feature: Expression in reference join (#3180)
DESCRIPTION: Expression in reference join

Fixed: #2582

This patch allows arbitrary expressions in the join clause when joining to a reference table. An example of such joins could be found in CHbenCHmark queries 7, 8, 9 and 11; `mod((s_w_id * s_i_id),10000) = su_suppkey` and `ascii(substr(c_state,1,1)) = n2.n_nationkey`. Since the join is on a reference table these queries are able to be pushed down to the workers.

To implement these queries we will widen the `IsJoinClause` predicate to not check if the expressions are a type `Var` after stripping the implicit coerciens. Instead we define a join clause when the `Var`'s in a clause come from more than 1 table.

This allows more clauses to pass into the logical planner's `MultiNodeTree(...)` planning function. To compensate for this we tighten down the `LocalJoin`, `SinglePartitionJoin` and `DualPartitionJoin` to check for direct column references when planning. This allows the planner to work with arbitrary join expressions on reference tables.
2019-11-18 16:25:46 +01:00
Önder Kalacı a4c90b6ee1
Make distributed object dependency logic follow upto extensions (#3195)
With this commit, we're slightly changing the dependency traversal
logic to enable extension propagation.

The main idea is to "follow" the extension dependencies, but do not
"apply" them.

Since some extension dependencies are base types, and base types
could have circular dependencies, we implement a logic to prevent
revisiting an already visited object.
2019-11-17 17:21:21 +01:00
Hadi Moshayedi d9dcba25e3 Plan reference/local table joins locally 2019-11-15 07:36:50 -08:00
Onder Kalaci 90943a6ce6 Do not include coordinator shards when round-robin is selected
When the user picks "round-robin" policy, the aim is that the load
is distributed across nodes. However, for reference tables on the
coordinator, since local execution kicks in immediately, round-robin
is ignored.

With this change, we're excluding the placement on the coordinator.
Although the approach seems a little bit invasive because of
modifications in the placement list, that sounds acceptable.

We could have done this in some other ways such as:

1) Add a field to "Task->roundRobinPlacement" (or such), which is
updated as the first element after RoundRobinPolicy is applied.
During the execution, if that placement is local to the coordinator,
skip it and try the other remote placements.

2) On TaskAccessesLocalNode()@local_execution.c, check
task_assignment_policy, if round-robin selected and there is local
placement on the coordinator, skip it. However, task assignment is done
on planning, but this decision is happening on the execution, which
could create weird edge cases.
2019-11-15 06:03:32 -08:00
Hadi Moshayedi 15af1637aa Replicate reference tables to coordinator. 2019-11-15 05:50:19 -08:00
Hadi Moshayedi cb011bb30f Propagate isactive to metadata nodes. 2019-11-15 05:48:42 -08:00
SaitTalhaNisanci b9b7fd7660
add IsLoggableLevel utility function (#3149)
* add IsLoggableLevel utility function

* add function comment for IsLoggableLevel

* put ApplyLogRedaction to logutils
2019-11-15 14:59:13 +03:00
Jelte Fennema 1b2c438e69
Rename variables to not shadow globals in RHEL6 (#3194)
Fixes #2839
2019-11-15 12:12:24 +01:00
Jelte Fennema a8bd2d58f5
Update SQL definitions to prepare for drain node functionality (#3179) 2019-11-15 10:11:56 +01:00
Jelte Fennema 4b9b4b0995
Don't warn for declaration-after-statement since we only support GNU99 (#3132)
This change was actually already intended in #3124. However, the
postgres Makefile manually enables this warning too. This way we undo
that.

To confirm that it works two functions were changed to make use of not
having the warning anymore.
2019-11-15 09:46:06 +01:00
Philip Dubé 495c0f5117 Phase 1 implementation of custom aggregates
Phase 1 seeks to implement minimal infrastructure, so does not include:
	- dynamic generation of support aggregates to handle multiple arguments
	- configuration methods to direct aggregation strategy,
		or mark an aggregate's serialize/deserialize as safe to operate across nodes

Aggregates can be distributed when:
	- they have a single argument
	- they have a combinefunc
	- their transition type is not a pseudotype
2019-11-14 19:01:24 +00:00
Philip Dubé edc7a2ee38 Improve RECORD support 2019-11-14 18:32:22 +00:00
Philip Dubé eb35743c3f Remove citus.worker_list_file & master_initialize_node_metadata 2019-11-13 00:49:58 +00:00
Philip Dubé 48552bfffe Call DestReceiver rDestroy before it goes out of scope
CitusCopyDestReceiverDestroy: call hash_destroy on shardStateHash & connectionStateHash
2019-11-12 15:03:07 +00:00
Jelte Fennema adc6ca6100
Make simple in queries on unique columns work with repartion join (#3171)
This is necassery to support Q20 of the CHbenCHmark: #2582.

To summarize the fix: The subquery is converted into an INNER JOIN on a
table. This fixes the issue, since an INNER JOIN on a table is already
supported by the repartion planner.

The way this replacement is happening.:
1. Postgres replaces `col in (subquery)` with a SEMI JOIN (subquery) on col = subquery_result
2. If this subquery is simple enough Postgres will replace it with a
   regular read from a table
3. If the subquery returns unique results (e.g. a primary key) Postgres
   will convert the SEMI JOIN into an INNER JOIN during the planning. It
   will not change this in the rewritten query though.
4. We check if Postgres sends us any SEMI JOINs during its join order
   planning, if it doesn't we replace all SEMI JOINs in the rewritten
   query with INNER JOIN (which we already support).
2019-11-11 13:44:28 +01:00
SaitTalhaNisanci 57380fd668
remove duplicated method in multi_logical_optimizer (#3166) 2019-11-11 13:51:21 +03:00
Önder Kalacı 460f000218
Remove failure tests related to real-time executor (#3174)
Since we've removed the executor, we don't need the specific tests.
Since the tests are already using adaptive executor, they were passing.
But, we've plenty of extra tests for adaptive executor, so seems safe
to remove.
2019-11-11 10:18:37 +01:00
Philip Dubé ad86c1b866 AcquireDistributedLockOnRelations: escape relation names 2019-11-08 21:23:01 +00:00
Philip Dubé e8ecbbfcb3 Escape transaction names 2019-11-08 21:23:01 +00:00
Jelte Fennema 9fb897a074
Fix queries with repartition joins and group by unique column (#3157)
Postgres doesn't require you to add all columns that are in the target list to
the GROUP BY when you group by a unique column (or columns). It even actively
removes these group by clauses when you do.

This is normally fine, but for repartition joins it is not. The reason for this
is that the temporary tables don't have these primary key columns. So when the
worker executes the query it will complain that it is missing columns in the
group by.

This PR fixes that by adding an ANY_VALUE aggregate around each variable in
the target list that does is not contained in the group by or in an aggregate.
This is done only for repartition joins.

The ANY_VALUE aggregate chooses the value from an undefined row in the
group.
2019-11-08 15:36:18 +01:00
SaitTalhaNisanci 02b359623f
remove duplicate code in citus_dist_stat_activity (#3165) 2019-11-08 15:41:32 +03:00
Önder Kalacı 0b3d4e55d9
Local execution should not change hasReturning for distributed tables (#3160)
It looks like the logic to prevent RETURNING in reference tables to
have duplicate entries that comes from local and remote executions
leads to missing some tuples for distributed tables.

With this PR, we're ensuring to kick in the logic for reference tables
only.
2019-11-08 12:49:56 +01:00
Philip Dubé 9a31837647 isolation_create_restore_point: test reference tables too 2019-11-07 17:50:22 +00:00
Philip Dubé 72c3d64ead Rename OpenConnectionsToAllNodes to OpenConnectionsToAllWorkerNodes 2019-11-07 17:50:22 +00:00
Philip Dubé 2fc45e5897 create_distributed_function: accept aggregates
Adds support for OCLASS_PROC to worker_create_or_replace_object
2019-11-06 18:23:37 +00:00
Hadi Moshayedi e00d1546f3 Don't maintain replicationfactor of reference tables 2019-11-05 07:23:14 -08:00
Onder Kalaci 471703bfaf DEBUG only when the function is distributed
Otherwise, we're seeing this message way to often.
2019-11-05 15:08:35 +00:00
Önder Kalacı 960cd02c67
Remove real time router executors (#3142)
* Remove unused executor codes

All of the codes of real-time executor. Some functions
in router executor still remains there because there
are common functions. We'll move them to accurate places
in the follow-up commits.

* Move GUCs to transaction mngnt and remove unused struct

* Update test output

* Get rid of references of real-time executor from code

* Warn if real-time executor is picked

* Remove lots of unused connection codes

* Removed unused code for connection restrictions

Real-time and router executors cannot handle re-using of the existing
connections within a transaction block.

Adaptive executor and COPY can re-use the connections. So, there is no
reason to keep the code around for applying the restrictions in the
placement connection logic.
2019-11-05 12:48:10 +01:00
Jelte Fennema f0c35ad134 Include fmgr.h, don't duplicate FunctionCallInfo typedef 2019-11-04 17:10:33 +00:00
SaitTalhaNisanci 7c410e3cd7
pass CitusCustomState directly to adaptive executor (#3151) 2019-11-01 19:57:32 +03:00
Önder Kalacı ffd89e4e01
Include all relevant relations in the ExtractRangeTableRelationWalker (#3135)
We've changed the logic for pulling RTE_RELATIONs in #3109 and
non-colocated subquery joins and partitioned tables.
@onurctirtir found this steps where I traced back and found the issues.

While looking into it in more detail, we decided to expand the list in a
way that the callers get all the relevant RTE_RELATIONs RELKIND_RELATION,
RELKIND_PARTITIONED_TABLE, RELKIND_FOREIGN_TABLE and RELKIND_MATVIEW.
These are all relation kinds that Citus planner is aware of.
2019-11-01 16:06:58 +01:00
Onur TIRTIR d3f68bf44f
Fix view is not distributed error when view is used in modify statements (#3104) 2019-11-01 16:34:01 +03:00
SaitTalhaNisanci c7ceca3216
update outdated comment in JobExecutorType (#3148) 2019-11-01 11:36:56 +03:00
SaitTalhaNisanci 70e46703aa
Fix debug1 message in JobExecutorType (#3147)
When citus.enable_repartition_joins guc is set to on, and we have
adaptive executor, there was a typo in the debug message, which was
saying realtime executor no adaptive executor.
2019-11-01 11:14:19 +03:00
Marco Slot 51c64c70c9 Do not try to sync metadata on standby coordinator 2019-10-30 05:15:45 +01:00
SaitTalhaNisanci dadbe86af1
refactor some of hard coded values in citus gucs (#3137)
* refactor some of hard coded values in citus gucs

* rename GUC_ALLOW_ALL to GUC_STANDARD
2019-10-30 10:35:39 +03:00
Marco Slot 03cae27782 Add tests for distributing functions with replication_model statement 2019-10-26 23:57:59 +02:00
Marco Slot 067657af26 Disallow distributed functions with distribution arguments unless replication_model is streaming 2019-10-26 23:57:59 +02:00
SaitTalhaNisanci 29d45bd1b9
Do not assign InvalidOid for local execution while extracting parameters (#3131)
* do not assign InvalidOid for local execution while extracting parameters

* rename functions

* rename parameter and replace function
2019-10-28 14:28:22 +03:00
Önder Kalacı dceaddbe4d
Remove real-time/router executors (step 1) (#3125)
See #3125 for details on each item.

* Remove real-time/router executor tests-1

These are the ones which doesn't have '_%d' in the test
output files.

* Remove real-time/router executor tests-2

These are the ones which has in the test
output files.

* Move the tests outputs to correct place

* Make sure that single shard commits use 2PC on adaptive executor

It looks like we've messed the tests in #2891. Fixing back.

* Use adaptive executor for all router queries

This becomes important because when task-tracker is picked, we
used to pick router executor, which doesn't make sense.

* Remove explicit references to real-time/router executors in the tests

* JobExecutorType never picks real-time/router executors

* Make sure to go incremental in test output numbers

* Even users cannot pick real-time anymore

* Do not use real-time/router custom scans

* Get rid of unnecessary normalizations

* Reflect unneeded normalizations

* Get rid of unnecessary test output file
2019-10-25 10:54:54 +02:00
Marco Slot b8c8fd4612 Fix run_command_on_colocated_placements tests 2019-10-23 00:08:17 +02:00
Marco Slot a1162b2023 Rename 9.1 upgrade script to upgrade from 9.0-2 2019-10-23 00:08:17 +02:00
Marco Slot 04040e0a37 Revoke usage from the citus schema 2019-10-23 00:08:17 +02:00
Jelte Fennema a5010e5b17
Add extra foreach convenience macros (#3117)
This completely hides `ListCell` to the user of the loop

Example usage:
```c
WorkerNode *workerNode = NULL;

foreach_ptr(workerNode, workerNodeList) {
	// Do stuff with workerNode
}
```

Instead of:
```c
ListCell *workerNodeCell = NULL;

foreach(cell, workerNodeList) {
    WorkerNode *workerNode = lfirst(workerNodeCell);
	// Do stuff with workerNode
}
```
2019-10-23 16:49:12 +02:00
Onder Kalaci c2460a1c31 Add upgrade test for distributed functions
Simply make sure that Citus can pushdown functions after pg upgrade.
2019-10-23 12:07:51 +02:00
Philip Dubé b2f084d7f5 UnsetMetadataSyncedForAll: use CatalogTupleUpdateWithInfo 2019-10-23 00:45:11 +00:00
Philip Dubé 2a969fe4bb ssl_by_default: remove stray PG10 check 2019-10-23 00:27:54 +00:00
Philip Dubé 2204a17dbd isolation_multiuser_locking: reorder GRANT to avoid deadlock on enterprise 2019-10-22 21:10:55 +00:00
Onder Kalaci a208f8b151 Fix memory leak on ReceiveResults
It turns out that TupleDescGetAttInMetadata() allocates quite a lot
of memory. And, if the target list is long and there are too many rows
returning, the leak becomes appereant.

You can reproduce the issue wout the fix with the following commands:

```SQL

CREATE TABLE users_table (user_id int, time timestamp, value_1 int, value_2 int, value_3 float, value_4 bigint);
SELECT create_distributed_table('users_table', 'user_id');

insert into users_table SELECT i, now(), i, i, i, i FROM generate_series(0,99999)i;

-- load faster

-- 200,000
INSERT INTO users_table SELECT * FROM users_table;

-- 400,000
INSERT INTO users_table SELECT * FROM users_table;

-- 800,000
INSERT INTO users_table SELECT * FROM users_table;

-- 1,600,000
INSERT INTO users_table SELECT * FROM users_table;

-- 3,200,000
INSERT INTO users_table SELECT * FROM users_table;

-- 6,400,000
INSERT INTO users_table SELECT * FROM users_table;

-- 12,800,000
INSERT INTO users_table SELECT * FROM users_table;

-- making the target list entry wider speeds up the leak to show up
 select *,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,* FROM users_table ;

 ```
2019-10-22 17:22:26 +02:00
Jelte Fennema 78e495e030
Add shouldhaveshards to pg_dist_node (#2960)
This is an improvement over #2512.

This adds the boolean shouldhaveshards column to pg_dist_node. When it's false, create_distributed_table for new collocation groups will not create shards on that node. Reference tables will still be created on nodes where it is false.
2019-10-22 16:47:16 +02:00
Hanefi Onaldi 7ebda04494
Update all c-style comments in migration files 2019-10-21 16:05:53 +03:00
Halil Ozan Akgul 5f04ac774f Adds the tests for refresh materialized views 2019-10-17 16:00:56 +03:00
Jelte Fennema 7abedc38b0
Support subqueries in HAVING (#3098)
Areas for further optimization:
- Don't save subquery results to a local file on the coordinator when the subquery is not in the having clause
- Push the the HAVING with subquery to the workers if there's a group by on the distribution column
- Don't push down the results to the workers when we don't push down the HAVING clause, only the coordinator needs it

Fixes #520
Fixes #756
Closes #2047
2019-10-16 16:40:14 +02:00
Onur TIRTIR 3bfb2a078b
Make changes on if-statement in ExtractRangeTableList for furhter walker types (#3110) 2019-10-16 15:50:09 +03:00
Onur TIRTIR d5f83dc110
Refactor range table walkers (#3109) 2019-10-16 01:20:49 +03:00
SaitTalhaNisanci 94a7e6475c
Remove copyright years (#2918)
* Update year as 2012-2019

* Remove copyright years
2019-10-15 17:44:30 +03:00
Jelte Fennema 9b2f4d71ac
Make sure some MX tests use defined shard_ids (#3103) 2019-10-12 22:46:14 +02:00
Philip Dubé 74cb168205 Remove Postgres 10 support 2019-10-11 21:56:56 +00:00
Hadi Moshayedi b50d216536 Fix a typo 2019-10-10 10:44:41 -07:00
Philip Dubé 4063e7ca67 CALL delegation: apply strip_implicit_coercions to distribution argument 2019-10-10 17:42:43 +00:00
Philip Dubé 7ffd78b6e0 isolation_multiuser_locking
Introduce a test which checks that locks are only acquired when a user has necessary permissions
Currently tests REINDEX, CREATE INDEX, TRUNCATE
2019-10-10 16:58:41 +00:00
Philip Dubé dd490b6376 Cache whether an object is in pg_dist_object. Avoids redundant lookups for non-distributed objects 2019-10-10 14:50:38 +00:00
SaitTalhaNisanci 83667436f7
add support to run citus upgrade tests locally (#3083)
* add support to run citus upgrade tests locally

* dont build tars if they already exist

* use current code instead of master for upgrade

* always build the current code

* copy the current citus code to have isolated citus upgrade tests

* fix configure and simplify copy
2019-10-08 15:32:44 +03:00
Nils Dijk 4a4a220945
Fix enum add value order and pg12 (#3082)
DESCRIPTION: Fix order for enum values and correctly support pg12

PG 12 introduces `ALTER TYPE ... ADD VALUE ...` during transactions. Earlier versions would error out when called in a transaction, hence we connect to workers outside of the transaction which could cause inconsistencies on pg12 now that postgres doesn't error with this syntax anymore.

During the implementation of this fix it became apparent there was an error with the ordering of enum labels when the type was recreated. A patch and test have been included.
2019-10-07 17:16:19 +02:00
Jelte Fennema 01da11f264
Change citus truncate trigger to AFTER and add more upgrade tests (#3070)
* Add more upgrade tests

* Fix citus trigger generation after upgrade

citus_truncate_trigger runs before truncate when created by create_distributed_table:
492d1b2cba/src/backend/distributed/commands/create_distributed_table.c (L1163)

* Remove pg_dist_jobid_seq
2019-10-07 16:43:04 +02:00
Onder Kalaci 3be72ce42f Make sure that distributed functions always have the correct user
Objectives:

(a) both super user and regular user should have the correct owner for the function on the worker
(b) The transactional semantics would work fine for both super user and regular user
(c) non-super-user and non-function owner would get a reasonable error message if tries to distribute the function

Co-authored-by: @serprex
2019-10-04 21:38:49 +00:00
SaitTalhaNisanci c547664fae
Add Citus upgrade tests with its job (#3003)
* Add initial citus upgrade test

* Add restart databases and run tests in all nodes

* Add output for citus versions 8.0 8.1 8.2 and 8.3

* Add verify step for citus upgrade

* Add target for citus upgrade test in makefile

* Add check citus upgrade job

* Fix installation file path and add missing tar

* Run citus upgrade for v8.0 v8.1 v8.2 and v8.3

* Create upgrade_common file and rename upgrade check

* Add pg version to citus upgrade test

* Test with postgres 10 and 11 in citus upgrade tests

* Add readme for citus upgrade test

* Add some basic tests to citus upgrade tests

* Add citus upgrade mixed mode test

* Remove citus artifacts before installing another one

* Refactor citus upgrade test according to reviews

* quick and dirty rewrite of citus upgrade tests to support local execution.

I think we need to change the makefile in such a way that the tar files can be injected from the circle ci config file.

Also I removed some of the citus version checks you had to not have the requirement to pass that in separately from the pre tar file. I am not super happy with it, but two flags that need to be kept in sync is also not desirable. Instead I print out the citus version that is installed per node. This will not cause a failure if they are not what one would expect but it lets us verify we are running the expected version.

* use latest citusupgradetester in circleci

* update readme and use common alias for upgrade_common import
2019-10-04 17:44:49 +03:00
Marco Slot 1a3a174f67 Grant usage on schema citus to public 2019-10-04 12:26:08 +02:00
Marco Slot 89377ee578 Move RowExclusiveLock to start in SyncMetadataToNodes 2019-10-04 12:07:41 +02:00
Hadi Moshayedi 217db2a03e Don't block for locks in SyncMetadataToNodes() 2019-10-03 16:53:36 -07:00
Hadi Moshayedi ae915493e6 Don't send metadata commands to not-synced workers.
Otherwise some of the dependencies might not exist yet and
commands will error out.
2019-10-03 16:52:25 -07:00
Marco Slot 0b4b63e647 Drop the rebalancer before creating new UDFs 2019-10-03 16:08:58 +02:00
Marco Slot 2e50306cf8 Check command type in TryToDelegateFunctionCall 2019-10-03 15:37:15 +02:00
Jelte Fennema 9833c07070 Improve upgrade test runner
- Update certifi in regress pipenv
- Use normal test ports for upgrade tests
- Make diff behave correctly for upgrade tests
2019-10-03 13:10:11 +02:00
Halil Ozan Akgul bda8f6f87b Created tests for distribution to reference table foreign keys on mx 2019-10-03 09:31:13 +03:00
SaitTalhaNisanci 19bdca14d8
Add jobs to run tests with pg 12 (#3033)
* Add PG12 test outputs

* Add jobs to run tests with pg 12

* use POSIX collate for compatibility between pg10/pg11/pg12

* do not override the new default value when running vanilla tests

* fix 2 problems with pg12 tests

* update pg12 images with pg12 rc1

* remove pg10 jobs

* Revert "Add PG12 test outputs"

This reverts commit f3545b92ef.

* change images to use latest instead of dev

* add missing coverage flags
2019-10-02 15:33:12 +03:00
Halil Ozan Akgul e5906bead2 Created isolation tests for update, delete, upsert on reference tables with MX. 2019-10-02 10:11:21 +03:00
Hanefi Onaldi bd416ef68f Fix empty FROM clauses in PG12 2019-10-01 19:54:11 +00:00
Halil Ozan Akgul 1d7030a651 Created isolation tests for select for update on reference tables with MX. 2019-10-01 16:29:15 +03:00
Jelte Fennema ec4a165eec Improve isolation test block detection (#3055) 2019-10-01 14:10:15 +02:00
Jelte Fennema 40f785e6d8 Move citus_isolation_test_session_is_blocked to separate udf sql file 2019-10-01 14:10:15 +02:00
Philip Dubé 89d35e9692 Attempt to force custom plans for prepared statements when trying to delegate function calls
We discern between PARAM_EXEC & PARAM_EXTERN:
d52eaa0948/src/include/nodes/primnodes.h (L211)
According to primnodes.h we should only run into PARAM_EXEC or PARAM_EXTERN
2019-09-30 23:49:14 +00:00
Philip Dubé 29f1ea079b PG_VERSION_NUM > 110000 should be PG_VERSION_NUM >= 110000
Also fix a > 12000 typo
2019-09-30 23:37:43 +00:00
Hadi Moshayedi 5e97e5c98e Don't push down queries when in subqueries/ctes 2019-09-30 14:22:05 -07:00
Marco Slot 35bef0f3db Avoid caching connections from backends that servicei internal connections 2019-09-28 08:32:10 +02:00
Nils Dijk 01b26cf91a
Disallow distributed functions for functions depending on an extension (#3049)
DESCRIPTION: Disallow distributed functions for functions depending on an extension

Functions depending on an extension cannot (yet) be distributed by citus. If we would allow this it would cause issues with our dependency following mechanism as we stop following objects depending on an extension.

By not allowing functions to be distributed when they depend on an extension as well as not allowing to make distributed functions depend on an extension we won't break the ability to add new nodes. Allowing functions depending on extensions to be distributed at the moment could cause problems in that area.
2019-09-30 15:19:47 +02:00
Nils Dijk 473cbc0115
Propagate CREATE OR REPLACE FUNCTION to workers for distributed functions (#3043)
DESCRIPTION: Propagate CREATE OR REPLACE FUNCTION

Distributed functions could be replaced, which should be propagated to the workers to keep the function in sync between all nodes.

Due to the complexity of deparsing the `CreateFunctionStmt` we actually produce the plan during the processing phase of our utilityhook. Since the changes have already been made in the catalog tables we can reuse `pg_get_functiondef` to get us the generated `CREATE OR REPLACE` sql.
2019-09-30 12:41:17 +02:00
Jelte Fennema 82ec918b29
Add explain summary support (#3046)
Fixes #2922 and also adds explain analyze regression tests
2019-09-30 10:58:49 +02:00
Nils Dijk 9c2c50d875
Hookup function/procedure deparsing to our utility hook (#3041)
DESCRIPTION: Propagate ALTER FUNCTION statements for distributed functions

Using the implemented deparser for function statements to propagate changes to both functions and procedures that are previously distributed.
2019-09-27 22:06:49 +02:00
Philip Dubé 363409a0c2 Propagate REINDEX TABLE & REINDEX INDEX 2019-09-27 18:14:53 +00:00
Hanefi Onaldi 66b9f2e887 Deparsing and qualifiying for FUNCTION/PROCEDURE statements (#3014)
This PR aims to add all the necessary logic to qualify and deparse all possible `{ALTER|DROP} .. {FUNCTION|PROCEDURE}` queries.

As Procedures are introduced in PG11, the code contains many PG version checks. I tried my best to make it easy to clean up once we drop PG10 support.


Here are some caveats:
- I assumed that the parse tree is a valid one. There are some queries that are not allowed, but still are parsed successfully by postgres planner. Such queries will result in errors in execution time. (e.g. `ALTER PROCEDURE p STRICT` -> `STRICT` action is valid for functions but not procedures. Postgres decides to parse them nevertheless.)
2019-09-27 19:02:52 +02:00
Marco Slot 2868e02a3d Implement SELECT function call delegation.
When a function is marked as colocated with a distributed table,
we try delegating queries of kind "SELECT func(...)" to workers.

We currently only support this simple form, and don't delegate
forms like "SELECT f1(...), f2(...)", "SELECT f1(...) FROM ...",
or function calls inside transactions.

As a side effect, we also fix the transactional semantics of DO blocks.
Previously we didn't consider a DO block a multi-statement transaction.
Now we do.

Co-authored-by: Marco Slot <marco@citusdata.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
2019-09-27 09:13:25 -07:00
Jelte Fennema dab16be283
Set default threshold on get_rebalance_table_shards_plan to 0, like rebalance_table_shards (#3039)
In this PR the default `threshold` of `rebalance_table_shards` was set to 0: https://github.com/citusdata/shard_rebalancer/pull/73
However, the default for get_rebalance_table_shards_plan was not updated. This
can cause the confusing situation where the actual steps run by
`rebalance_table_shards` are not the same as the ones returned by
`get_rebalance_table_shards_plan`.
2019-09-27 17:21:36 +02:00
Halil Ozan Akgul 824a69587c Created isolation tests for insert select on MX 2019-09-26 17:40:36 +03:00
Marco Slot 32a11bdf6c Return early for common commands in the utility hook (#3031)
We started copying parse trees by default further on in `multi_ProcessUtility`. That's not a problem for maintenance command, but might register for things like `PREPARE` and `EXECUTE`, which might happen thousands of times per second. Add a few common commands to the check at the start.
2019-09-26 11:43:35 +02:00
Halil Ozan Akgul d56ab6274c Created isolation tests for drop, alter, index and select for update on MX. 2019-09-26 10:47:14 +03:00
Halil Ozan Akgul d426fb2159 Created isolation tests for truncate on MX. 2019-09-25 16:51:20 +03:00
Halil Ozan Akgul 62b6852923 Created isolation tests for copy on MX. 2019-09-25 15:36:05 +03:00
Onder Kalaci 219f3676a0 Improve some tests around local execution and CTE inlining on pg 12 2019-09-25 10:53:19 +02:00
Philip Dubé 4f60e3a149 Feedback 2019-09-24 17:31:09 +00:00
Marco Slot c1e43b25da Use the new create_distributed_function API in some call tests 2019-09-24 17:31:09 +00:00
Marco Slot ca478defeb Deparse CALL statement instead of using original query string 2019-09-24 17:31:09 +00:00
Philip Dubé 90e1f1442a Annotated tests for multi_mx_call.
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
2019-09-24 17:31:09 +00:00
Marco Slot e269d990c9 Cast the distribution argument value when possible 2019-09-24 17:31:09 +00:00
Philip Dubé c95d46b4f3 Extend multi_mx_call with some of Hadi's suggestions for better test coverage 2019-09-24 17:31:09 +00:00
Philip Dubé 432a8ef85b Hadi's feedback
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
2019-09-24 17:31:09 +00:00
Philip Dubé 16b8d17aba Test: multi_mx_call 2019-09-24 17:31:09 +00:00
Philip Dubé bc1ad67eb5 Distribute CALL on distributed procedures to metadata workers
Lots taken from https://github.com/citusdata/citus/pull/2829
2019-09-24 17:31:09 +00:00
Onder Kalaci 18de78f386 Relax the colocation checks for distributed functions
As long as the types can be coerced, it is safe to pushdown
functions.
2019-09-24 16:31:08 +02:00
Jelte Fennema 0f90c2497e Use synchronous replication for follower tests 2019-09-24 15:51:49 +02:00
Jelte Fennema 78ccc323d1 Remove stuff needed only for PG 9.6 from test runner 2019-09-24 15:51:49 +02:00
Jelte Fennema bd2103e597 Remove flappy test 2019-09-24 14:15:33 +02:00
Jelte Fennema 897ec1bdeb Revert "Temporarily disable flappy test"
This reverts commit 4b4459ee62.
2019-09-24 14:15:33 +02:00
Marco Slot 42be8afd74 Swap pg_dist_node groupid and nodeid sequences 2019-09-24 12:03:44 +02:00
Marco Slot 0dea485c68 Fix misspelling in multi_colocation_utils 2019-09-24 11:27:30 +02:00
Marco Slot 4b4459ee62 Temporarily disable flappy test 2019-09-24 11:02:34 +02:00
Hadi Moshayedi 48078a30e6 Fix wait_until_metadata_sync() for postgres 12.
Postgres 12 now has an assertion that the calls to WaitLatchOrSocket
handle postmaster death.
2019-09-23 14:15:35 -07:00
Philip Dubé 06faba91c0 Include ifdefs for pg12 API changes, update local_shard_executiuon test to avoid CTE inlining 2019-09-23 20:22:35 +00:00
Onder Kalaci d37745bfc7 Sync metadata to worker nodes after create_distributed_function
Since the distributed functions are useful when the workers have
metadata, we automatically sync it.

Also, after master_add_node(). We do it lazily and let the deamon
sync it. That's mainly because the metadata syncing cannot be done
in transaction blocks, and we don't want to add lots of transactional
limitations to master_add_node() and create_distributed_function().
2019-09-23 18:30:53 +02:00
Marco Slot 5f23b951c7 Support serial and smallserial when syncing metadata 2019-09-23 17:39:21 +02:00
Marco Slot e58d76c5f6 Fix assert failure in bare SELECT FROM reference table FOR UPDATE in MX 2019-09-23 17:00:09 +02:00
SaitTalhaNisanci 71e7047e65
Enhance pg upgrade tests (#3002)
* Enhance pg upgrade tests

* Add a specific upgrade test for pg_dist_partition

We store the index of distribution column, and when a column with an
index that is smaller than distribution column index is dropped before
an upgrade, the index should still match the distribution column after
an upgrade
2019-09-23 17:37:14 +03:00
Marco Slot d85d77634d Handle anonymous composite types on the target list 2019-09-23 14:53:02 +02:00
Halil Ozan Akgul b55b275a30 Created isolation tests for update, delete and upsert on MX 2019-09-23 14:13:29 +03:00
Onder Kalaci d7e2968120 Add parameters to create_distributed_function()
With this commit, we're changing the API for create_distributed_function()
such that users can provide the distribution argument and the colocation
information.
2019-09-22 21:53:33 +02:00
Onder Kalaci e1fe8d60b4 Make sure that functions are also listed in SupportedDependencyByCitus
We've recently merged two commits, db5d03931d
and eccba1d4c3, which actually operates
on the very similar places.

It turns out that we've an integration issue, where master_add_node()
fails to replicate the functions to newly added node.
2019-09-20 11:02:50 +02:00
Hadi Moshayedi d24cefd055 Set active snapshot before SyncMetadataToNodes(). 2019-09-19 09:00:25 -07:00
Nils Dijk 72015faeb2
fix disable_object_propagation test for pg12 2019-09-19 17:40:24 +02:00
Hanefi Onaldi ed11b9590c
Add distributed func creation queries in dependency replication logic 2019-09-18 20:07:45 +03:00
Hadi Moshayedi d2f2acc4b2 Make master_update_node citus-ha friendly. 2019-09-18 09:32:54 -07:00
Hadi Moshayedi 76f3933b05 Add metadatasynced, and sync on master_update_node()
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
2019-09-18 09:32:54 -07:00
Nils Dijk db5d03931d
Feature disable object propagation (#2986)
DESCRIPTION: Provide a GUC to turn of the new dependency propagation functionality

In the case the dependency propagation functionality introduced in 9.0 causes issues to a cluster of a user they can turn it off almost completely. The only dependency that will still be propagated and kept track of is the schema to emulate the old behaviour.

GUC to change is `citus.enable_object_propagation`. When set to `false` the functionality will be mostly turned off. Be aware that objects marked as distributed in `pg_dist_object` will still be kept in the catalog as a distributed object. Alter statements to these objects will not be propagated to workers and may cause desynchronisation.
2019-09-18 17:16:22 +02:00
Philip Dubé ac14f1dd49 pg12 doesn't support client_min_messages as 'fatal' 2019-09-17 20:37:06 +00:00
Nils Dijk 2b7f5552c8
Fix: rename remote type on conflict (#2983)
DESCRIPTION: Rename remote types during type propagation

To prevent data to be destructed when a remote type differs from the type on the coordinator during type propagation we wanted to rename the type instead of `DROP CASCADE`.

This patch removes the `DROP` logic and adds the creation of a rename statement to a free name.
2019-09-17 18:54:10 +02:00
Nils Dijk 0a3152d09c
Add feature flag to turn off create type propagation (#2982)
DESCRIPTION: Add feature flag to turn off create type propagation

When `citus.enable_create_type_propagation` is set to `false` citus will not propagate `CREATE TYPE` statements to the workers. Types are still distributed when tables that depend on these types are distributed.
2019-09-17 15:50:06 +02:00
Halil Ozan Akgul 5333296a54 Created isolation tests for select on MX 2019-09-17 12:44:45 +03:00
Philip Dubé 964020097d Merge two conflicting pg_dist_object headers 2019-09-16 19:19:21 +00:00
Onder Kalaci cde6b02858 Add columns to pg_dist_object for distributed functions
This PR simply adds the columns to pg_dist_object and
implements the necessary metadata changes to keep track of
distribution argument of the functions/procedures.
2019-09-16 17:28:04 +02:00
Jelte Fennema af9fb9f785
Fix depend arguments for OSX clang cpp (#2978)
A better fix for #2975. Apparently for OSX cpp -MF and -MT shouldn't have a
space in between the flag and their value. Without the space it still works for
gcc as well.
2019-09-16 15:22:07 +02:00
Halil Ozan Akgul 7cde785031 Added the MX isolation tests for insert 2019-09-16 15:49:43 +03:00
Jelte Fennema 31fac3b90e
Don't generate SQL files twice by not making directories a target (#2977) 2019-09-16 12:53:17 +02:00
Önder Kalacı 13947a63ce Don't use flags that mac clang doesn't support as it does on other platforms (#2975) 2019-09-16 11:44:06 +02:00
Hanefi Onaldi 8f2a3a0604
Introduce create_distributed_function(regproc) UDF (#2961)
This PR aims to add the minimal set of changes required to start
distributing functions. You can use create_distributed_function(regproc)
UDF to distribute a function.

    SELECT create_distributed_function('add(int,int)');

The function definition should include the param types to properly
identify the correct function that we wish to distribute
2019-09-13 23:27:46 +03:00
Philip Dubé fb10edcb9d isolation_add_node_vs_reference_table_operations: test add in parallel with create_reference_table 2019-09-13 18:13:58 +00:00
Philip Dubé 492d1b2cba ActivePrimaryNodeList: add lockMode parameter 2019-09-13 17:44:56 +00:00
Philip Dubé 5e5f4628a0 Fix pg12 compile 2019-09-13 17:25:30 +00:00
Jelte Fennema 4bbf65d913
Change SQL migration build process for easier reviews (#2951)
@thanodnl told me it was a bit of a problem that it's impossible to see
the history of a UDF in git. The only way to do so is by reading all the
sql migration files from new to old. Another problem is that it's also
hard to review the changed UDF during code review, because to find out
what changed you have to do the same. I thought of a IMHO better (but
not perfect) way to handle this.

We keep the definition of a UDF in sql/udfs/{name_of_udf}/latest.sql.
That file we change whenever we need to make a change to the the UDF. On
top of that you also make a snapshot of the file in
sql/udfs/{name_of_udf}/{migration-version}.sql (e.g. 9.0-1.sql) by
copying the contents. This way you can easily view what the actual
changes were by looking at the latest.sql file.

There's still the question on how to use these files then. Sadly
postgres doesn't allow inclusion of other sql files in the migration sql
file (it does in psql using \i). So instead I used the C preprocessor+
make to compile a sql/xxx.sql to a build/sql/xxx.sql file. This final
build/sql/xxx.sql file has every occurence of #include "somefile.sql" in
sql/xxx.sql replaced by the contents of somefile.sql.
2019-09-13 18:44:27 +02:00
Nils Dijk 2879689441
Distribute Types to worker nodes (#2893)
DESCRIPTION: Distribute Types to worker nodes

When to propagate
==============

There are two logical moments that types could be distributed to the worker nodes
 - When they get used ( just in time distribution )
 - When they get created ( proactive distribution )

The just in time distribution follows the model used by how schema's get created right before we are going to create a table in that schema, for types this would be when the table uses a type as its column.

The proactive distribution is suitable for situations where it is benificial to have the type on the worker nodes directly. They can later on be used in queries where an intermediate result gets created with a cast to this type.

Just in time creation is always the last resort, you cannot create a distributed table before the type gets created. A good example use case is; you have an existing postgres server that needs to scale out. By adding the citus extension, add some nodes to the cluster, and distribute the table. The type got created before citus existed. There was no moment where citus could have propagated the creation of a type.

Proactive is almost always a good option. Types are not resource intensive objects, there is no performance overhead of having 100's of types. If you want to use them in a query to represent an intermediate result (which happens in our test suite) they just work.

There is however a moment when proactive type distribution is not beneficial; in transactions where the type is used in a distributed table.

Lets assume the following transaction:

```sql
BEGIN;
CREATE TYPE tt1 AS (a int, b int);
CREATE TABLE t1 AS (a int PRIMARY KEY, b tt1);
SELECT create_distributed_table('t1', 'a');
\copy t1 FROM bigdata.csv
```

Types are node scoped objects; meaning the type exists once per worker. Shards however have best performance when they are created over their own connection. For the type to be visible on all connections it needs to be created and committed before we try to create the shards. Here the just in time situation is most beneficial and follows how we create schema's on the workers. Outside of a transaction block we will just use 1 connection to propagate the creation.

How propagation works
=================

Just in time
-----------

Just in time propagation hooks into the infrastructure introduced in #2882. It adds types as a supported object in `SupportedDependencyByCitus`. This will make sure that any object being distributed by citus that depends on types will now cascade into types. When types are depending them self on other objects they will get created first.

Creation later works by getting the ddl commands to create the object by its `ObjectAddress` in `GetDependencyCreateDDLCommands` which will dispatch types to `CreateTypeDDLCommandsIdempotent`.

For the correct walking of the graph we follow array types, when later asked for the ddl commands for array types we return `NIL` (empty list) which makes that the object will not be recorded as distributed, (its an internal type, dependant on the user type).

Proactive distribution
---------------------

When the user creates a type (composite or enum) we will have a hook running in `multi_ProcessUtility` after the command has been applied locally. Running after running locally makes that we already have an `ObjectAddress` for the type. This is required to mark the type as being distributed.

Keeping the type up to date
====================

For types that are recorded in `pg_dist_object` (eg. `IsObjectDistributed` returns true for the `ObjectAddress`) we will intercept the utility commands that alter the type.
 - `AlterTableStmt` with `relkind` set to `OBJECT_TYPE` encapsulate changes to the fields of a composite type.
 - `DropStmt` with removeType set to `OBJECT_TYPE` encapsulate `DROP TYPE`.
 - `AlterEnumStmt` encapsulates changes to enum values.
    Enum types can not be changed transactionally. When the execution on a worker fails a warning will be shown to the user the propagation was incomplete due to worker communication failure. An idempotent command is shown for the user to re-execute when the worker communication is fixed.

Keeping types up to date is done via the executor. Before the statement is executed locally we create a plan on how to apply it on the workers. This plan is executed after we have applied the statement locally.

All changes to types need to be done in the same transaction for types that have already been distributed and will fail with an error if parallel queries have already been executed in the same transaction. Much like foreign keys to reference tables.
2019-09-13 17:46:07 +02:00
Jelte Fennema e4cfea3751 Correctly add schema when distributing sequence definitons
Fixes 2958
2019-09-13 17:19:35 +02:00
Jelte Fennema 579a40dfa5 Add make check-base-mx 2019-09-13 17:19:35 +02:00
Jelte Fennema 389086102a
Refactor 9 argument function to use a struct (#2952)
For another PR I needed to add another column which would require to add
another argument to an already 9 argument function signature. In this
case it would be a boolean flag and there were already two boolean flags
in there. In my experience it becomes really easy to mess up the order
of these flags at that point. Especially because the type system doesn't
distinguish between the 3 different booleans with completely different
meanings.

So I refactored these signatures to receive a struct containing most of
these arguments. Like that you don't mess up orderening, because the
meaning of the boolean is not order dependent but fieldname dependent.
It also makes it possible to set good shared defaults for this struct.
2019-09-13 15:49:53 +02:00
Halil Ozan Akgul 4d34b79b87 There were two multi insert - single insert tests but no multi insert - multi insert test. Fixed it. 2019-09-13 16:09:11 +03:00
Nils Dijk 05f0668cdc
Fix: schema leak onto create index statement cache (#2964)
DESCRIPTION: Fix schema leak on CREATE INDEX statement

When a CREATE INDEX is cached between execution we might leak the schema name onto the cached statement of an earlier execution preventing the right index to be created.

Even though the cache is cleared when the search_path changes we can trigger this behaviour by having the schema already on the search path before a colliding table is created in a schema earlier on the `search_path`. When calling an unqualified create index via a function (used to trigger the caching behaviour) we see that the index is created on the wrong table after the schema leaked onto the statement.

By copying the complete `PlannedStmt` and `utilityStmt` during our planning phase for distributed ddls we make sure we are not leaking the schema name onto a cached data structure.

Caveat; COPY statements already have a lot of parsestree copying ongoing without directly putting it back on the `pstmt`. We should verify that copies modify the statement and potentially copy the complete `pstmt` there already.
2019-09-13 14:04:23 +02:00
Hadi Moshayedi 48ff4691a0 Return nodeid instead of record in some UDFs 2019-09-12 12:46:21 -07:00
Philip Dubé ae1171a373 Test invalid aggregate 2019-09-12 16:55:05 +00:00
Philip Dubé 2aa6852dea Begin searching AggregateNames from 1, not 0 2019-09-12 16:55:05 +00:00
Jelte Fennema d6deb062aa Add shard rebalancer stubs 2019-09-12 16:40:25 +02:00
Jelte Fennema 58012054c9 Add an extra advisory lock tag class 2019-09-12 16:40:25 +02:00
Jelte Fennema eb7e45d556 Make LookupNodeForGroup extern 2019-09-12 16:40:25 +02:00
Jelte Fennema 257406fda7 Fix ArrayObjectCount for zero sized arrays 2019-09-12 16:40:25 +02:00
Jelte Fennema de5174f763 include postgres.h into some of our .h files to silence warnings 2019-09-12 16:40:25 +02:00
Jelte Fennema 4ebdf5989b Add check-minimal to test Makefile 2019-09-12 16:40:25 +02:00
Onder Kalaci 0b0c779c77 Introduce the concept of Local Execution
/*
 * local_executor.c
 *
 * The scope of the local execution is locally executing the queries on the
 * shards. In other words, local execution does not deal with any local tables
 * that are not shards on the node that the query is being executed. In that sense,
 * the local executor is only triggered if the node has both the metadata and the
 * shards (e.g., only Citus MX worker nodes).
 *
 * The goal of the local execution is to skip the unnecessary network round-trip
 * happening on the node itself. Instead, identify the locally executable tasks and
 * simply call PostgreSQL's planner and executor.
 *
 * The local executor is an extension of the adaptive executor. So, the executor uses
 * adaptive executor's custom scan nodes.
 *
 * One thing to note that Citus MX is only supported with replication factor = 1, so
 * keep that in mind while continuing the comments below.
 *
 * On the high level, there are 3 slightly different ways of utilizing local execution:
 *
 * (1) Execution of local single shard queries of a distributed table
 *
 *      This is the simplest case. The executor kicks at the start of the adaptive
 *      executor, and since the query is only a single task the execution finishes
 *      without going to the network at all.
 *
 *      Even if there is a transaction block (or recursively planned CTEs), as long
 *      as the queries hit the shards on the same, the local execution will kick in.
 *
 * (2) Execution of local single queries and remote multi-shard queries
 *
 *      The rule is simple. If a transaction block starts with a local query execution,
 *      all the other queries in the same transaction block that touch any local shard
 *      have to use the local execution. Although this sounds restrictive, we prefer to
 *      implement in this way, otherwise we'd end-up with as complex scenarious as we
 *      have in the connection managements due to foreign keys.
 *
 *      See the following example:
 *      BEGIN;
 *          -- assume that the query is executed locally
 *          SELECT count(*) FROM test WHERE key = 1;
 *
 *          -- at this point, all the shards that reside on the
 *          -- node is executed locally one-by-one. After those finishes
 *          -- the remaining tasks are handled by adaptive executor
 *          SELECT count(*) FROM test;
 *
 *
 * (3) Modifications of reference tables
 *
 *		Modifications to reference tables have to be executed on all nodes. So, after the
 *		local execution, the adaptive executor keeps continuing the execution on the other
 *		nodes.
 *
 *		Note that for read-only queries, after the local execution, there is no need to
 *		kick in adaptive executor.
 *
 *  There are also few limitations/trade-offs that is worth mentioning. First, the
 *  local execution on multiple shards might be slow because the execution has to
 *  happen one task at a time (e.g., no parallelism). Second, if a transaction
 *  block/CTE starts with a multi-shard command, we do not use local query execution
 *  since local execution is sequential. Basically, we do not want to lose parallelism
 *  across local tasks by switching to local execution. Third, the local execution
 *  currently only supports queries. In other words, any utility commands like TRUNCATE,
 *  fails if the command is executed after a local execution inside a transaction block.
 *  Forth, the local execution cannot be mixed with the executors other than adaptive,
 *  namely task-tracker, real-time and router executors. Finally, related with the
 *  previous item, COPY command cannot be mixed with local execution in a transaction.
 *  The implication of that any part of INSERT..SELECT via coordinator cannot happen
 *  via the local execution.
 */
2019-09-12 11:51:25 +02:00
Marco Slot 810aca8d41 Drop foreign key from pg_dist_poolinfo to pg_dist_node 2019-09-10 09:52:19 +02:00
SaitTalhaNisanci e132d579f2
Change --new-bindir flag description to be consistent (#2950) 2019-09-11 15:36:39 +03:00
SaitTalhaNisanci 0f170cb75f
Use variables instead of hardcoded tmp dirs (#2944) 2019-09-11 13:25:18 +03:00
Onder Kalaci 485189c0b6 Make sure that lost connections are handled properly
Before this patch, when a connection is lost, we'd have the following
situation:

    - Pop a task execution from readyQueue
    - Lost connection
    - Fail the session/pool. -> This step was not acting properly
      because we've popped the task, but not set to session->currentTask
      yet

After the patch:

    - Pop a task execution from readyQueue
    - Immediately set it to session->currentTask
    - Lost connection
    - Fail the session/pool. -> At this step, failing the
      session would trigger query failures (or failovers)
      properly.
2019-09-10 17:54:27 +02:00
SaitTalhaNisanci d99deab7d9
Add upgrade postgres version test (#2940)
* Add creating a citus cluster script

Creating a citus cluster is automated.
Before running this script:
- Citus should be installed and its control file should be added to postgres. (make install)
- Postgres should be installed.

* Initialize upgrade test table and fill

* Finalize the layout of upgrade tests

Postgres upgrade function is added.
The newly added UDFs(citus_prepare_pg_upgrade, citus_finish_pg_upgrade) are used to
perform upgrade.

* Refactor upgrade test and add config file

* Add schedules for upgrade testing

* Use pg_regress for upgrade tests

pg_regress is used for creating a simple distributed table in
upgrade tests. After upgrading another schedule is used to verify
that the distributed table exists. Router and realtime queries are
used for verifying.

* Run upgrade tests as a postgres user in a temp dir

postgres user is used for psql to be consistent at running tests.
A temp dir is created and the temp dir's permissions are changed so
that postgres user can access it. All psql commands are now run with
postgres user.

"Select * from t" query is changed as "Select * from t order by a"
so that the result is always in the same order.

* Add docopt and arguments for the upgrade script

Docopt dependency is added to parse flags in script.
Some refactoring in variable names is done.

* Add readme for upgrade tests

* Refactor upgrade tests

Use relative data path instead of absolute assuming that this script will
always be run from 'src/test/regress'
Remove 'citus-path' flag
Use specific version for docopt instead of *
Use named args in string formatting

* Resolve a security problem

Instead of using string formatting in subprocess.call, arguments
list is used. Otherwise users could do shell injection.
Shell = True is removed from subprocess call as it is not recommended
to use this.

* Add how the test works to readme

* Refactor some variables to be consistent

* Update upgrade script based on the reviews

It was possible that postgres server would stay running even when the script
crashes, atexit library is used to ensure that we always do a teardown where we stop
the databases.

Some formatting is done in the code for better readability.

Config class is used instead of a dictonary.

A target for upgrade test is added to makefile.

Unused flags/functions/variables are removed.

* Format commands and remove unnecessary flag from readme
2019-09-10 17:56:04 +03:00
Philip Dubé b301cf628a Test worker_cleanup_job_schema_cache actually drops schemas 2019-09-05 16:52:24 +00:00
Philip Dubé 8979fd038b worker_check_invalid_arguments: invalid task/job ids 2019-09-05 16:52:24 +00:00
Philip Dubé 5f9e88b260 multi_multiuser: test that worker_merge_files_and_query doesn't allow privilege escalation 2019-09-05 16:52:24 +00:00
Philip Dubé a28b82d67d get_catalog_object_by_oid requires an extra parameter in pg12 2019-09-05 16:38:07 +00:00
Nils Dijk 511e715ee3
Remove early escape in walking pg_depend (#2930)
This is a bug that got in when we inlined the body of a function into this loop. Earlier revisions had two loops, hence a function that would be reused.

With a return instead of a continue the list of dependencies being walked is dependent on the order in which we find them in pg_depend. This became apparent during pg12 compatibility. The order of entries in pg12 was luckily different causing a random test to fail due to this return.

By changing it to a continue we only skip the entries that we don’t want to follow instead of skipping all entries that happen to be found later.

sidefix for more stable isolation tests around ensure dependency
2019-09-05 18:03:34 +02:00
Philip Dubé bdd30bb181 Don't allow distributing by a generated column 2019-09-04 14:50:17 +00:00
Philip Dubé 41dca121e2 Support GENERATE ALWAYS AS STORED 2019-09-04 14:50:17 +00:00
Nils Dijk 936d546a3c
Refactor Ensure Schema Exists to Ensure Dependecies Exists (#2882)
DESCRIPTION: Refactor ensure schema exists to dependency exists

Historically we only supported schema's as table dependencies to be created on the workers before a table gets distributed. This PR puts infrastructure in place to walk pg_depend to figure out which dependencies to create on the workers. Currently only schema's are supported as objects to create before creating a table.

We also keep track of dependencies that have been created in the cluster. When we add a new node to the cluster we use this catalog to know which objects need to be created on the worker.

Side effect of knowing which objects are already distributed is that we don't have debug messages anymore when creating schema's that are already created on the workers.
2019-09-04 14:10:20 +02:00
Philip Dubé 28d964240f Remove CheckForUpdates
https://reports.citusdata.com/v1/releases/latest
We haven't updated the version CheckForUpdates sees since 7.1.0
2019-09-03 21:11:25 +00:00
Philip Dubé 4d26829d50 Remove normalized_tests.lst, don't normalize check-vanilla 2019-09-03 17:25:00 +00:00
Philip Dubé da00c62eea create_distributed_table: include COLLATE on columns 2019-08-29 14:22:54 +00:00
Philip Dubé 32ef459025 backend_data.c: include max_wal_senders in calculating maxBackend, matches changes in pg12's InitializeMaxBackends 2019-08-28 21:24:33 +00:00
Jelte Fennema cbecf97c84
Move tuplestore setup to a helper function (#2898)
* Add tuplestore helpers

* More detailed error messages in tuplestore

* Add CreateTupleDescCopy to SetupTuplestore

* Use new SetupTuplestore helper function

* Remove unnecessary copy

* Remove comment about undefined behaviour
2019-08-27 09:11:08 +02:00
Philip Dubé eba3828ef7 ColocatedShardIntervalList: sort 2019-08-26 17:42:41 +00:00
Matthias Kurz fc069dc611 Test SET LOCAL propagation when GUC is used in RLS policy 2019-08-22 20:29:52 +00:00
Philip Dubé 6b0d8ed83d SortList in FinalizedShardPlacementList, makes 3 failure tests consistent between 11/12 2019-08-22 19:30:56 +00:00
Philip Dubé 693d4695d7 Create a test 'pg12' for pg12 features & error on unsupported new features
Unsupported new features: COPY FROM WHERE, GENERATED ALWAYS AS, non-heap table access methods
2019-08-22 19:30:56 +00:00
Philip Dubé e84fcc0b12 Modify tests to be consistent between versions
Normalize
UNION to prevent optimization
Remove WITH OIDS
Sort ddl events
client_min_messages no longer accepts FATAL
2019-08-22 19:30:50 +00:00
Philip Dubé e5cd298a98 pg12 revised layout of FunctionCallInfoData
See a9c35cf85c

clang raises a warning due to FunctionCall2InfoData technically being variable sized
This is fine, as the struct is the size we want it to be. So silence the warning
2019-08-22 19:02:35 +00:00
Philip Dubé bee779e7d4 planner/distributed_planner.c: get_func_cost replaced with add_function_cost in pg12 2019-08-22 19:02:10 +00:00
Philip Dubé be3285828f Collations matter for hashing strings in pg12
See https://www.postgresql.org/docs/12/collation.html#COLLATION-NONDETERMINISTIC
2019-08-22 18:58:37 +00:00
Philip Dubé fe10ca453d Implement FileCompat to abstract pg12 requiring API consumer to track file offsets 2019-08-22 18:57:47 +00:00
Philip Dubé 018ad1c58e pg12: version_compat.h, tuples, oids, misc 2019-08-22 18:57:23 +00:00
Philip Dubé 9643ff580e Update commands/vacuum.c with pg12 changes
Adds support for SKIP_LOCKED, INDEX_CLEANUP, TRUNCATE
Removes broken assert
2019-08-22 18:56:54 +00:00
Philip Dubé 68c4b71f93 Fix up includes with pg12 changes 2019-08-22 18:56:21 +00:00
Philip Dubé fbc3e346e8 ruleutils_12.c
Produced this file by copying ruleutils_11.c,
then comparing postgres ruleutils.c changes between REL_11_STABLE & REL_12_STABLE
2019-08-22 18:56:05 +00:00
Hadi Moshayedi 6be1bacddd Fix distributed deadlock for TRUNCATE 2019-08-22 11:03:53 -07:00
Hadi Moshayedi a5b087c89b Support FKs between reference tables 2019-08-21 16:11:27 -07:00
Hadi Moshayedi a3578a6e60 Sort load_shard_placement_array by worker name/port 2019-08-21 14:35:05 -07:00
Philip Dubé 7bf7e41594 commands/index.c: Fix assertion typo 2019-08-21 18:54:05 +00:00
Philip Dubé f4b90419ae Raise an error when REINDEX TABLE or INDEX is invoked on a distributed relation 2019-08-21 17:03:14 +00:00
Philip Dubé db5a7f49a7 Task Tracker: fix error being copy pasted from above block 2019-08-21 15:44:01 +00:00
Philip Dubé f62d4a6712 citus_rm_job_directory for multi_query_directory_cleanup 2019-08-19 17:04:42 +00:00
Philip Dubé 9777f22e1e Avoid invalid array accesses to partitionFileArray 2019-08-19 17:04:42 +00:00
Philip Dubé f4ca02664a single_shard_commit_protocol: GUC_NO_SHOW_ALL 2019-08-18 12:54:32 +00:00
Hadi Moshayedi c582eb89c8 Add some missing locks. 2019-08-15 12:34:31 -07:00
Philip Dubé f4e513b3d4 Introduce citus.single_shard_commit_protocol for if users want 1PC on writes to replicas 2019-08-15 18:49:40 +00:00
Philip Dubé cd951fa9ca Avoid multiple pg_dist_colocation records being created for reference tables
master_deactivate_node is updated to decrement the replication factor
Otherwise deactivation could have create_reference_table produce a second record

UpdateColocationGroupReplicationFactor is renamed UpdateColocationGroupReplicationFactorForReferenceTables
& the implementation looks up the record based on distributioncolumntype == InvalidOid, rather than by id
Otherwise the record's replication factor fails to be maintained when there are no reference tables
2019-08-13 17:21:02 +00:00
Nils Dijk be6b7bec69
Add UDF citus_(prepare|finish)_pg_upgrade to aid with upgrading citus (#2877)
DESCRIPTION: Add functions to help with postgres upgrades

Currently there is [a list of manual steps](https://docs.citusdata.com/en/v8.2/admin_guide/upgrading_citus.html?highlight=upgrade#upgrading-postgresql-version-from-10-to-11) to perform during a postgres upgrade. These steps guarantee our catalog tables are kept and counter values are maintained across upgrades.

Having more than 1 command in our docs for users to manually execute during upgrades is error prone for both the user, and our docs. There are already 2 catalog tables that have been introduced to citus that have not been added to our docs for backing up during upgrades (`pg_authinfo` and `pg_dist_poolinfo`).

As we add more functionality to citus we run into situations where there are more steps required either before or after the upgrade. At the same time, when we move catalog tables to a place where the contents will be maintained automatically during upgrades we could have less steps in our docs. This will come to a hard to maintain matrix of citus versions and steps to be performed.

Instead we could take ownership of these steps within the extension itself. This PR introduces two new functions for the user to use instead of long lists of error prone instructions to follow.
 - `citus_prepare_pg_upgrade`
    This function should be called by the user right before shutting down the cluster. This will ensure all citus catalog tables are backed up in a location where the information will be retained during an upgrade.
- `citus_finish_pg_upgrade`
    This function should be called right after a pg_upgrade of the cluster. This will restore the catalog tables to the state before the upgrade happend.

Both functions need to be executed both on the coordinator and on all the workers, in the same fashion our current documentation instructs to do.

There are two known problems with this function in its current form, which is also a problem with our docs. We should schedule time in the future to improve on this, but having it automated now is better as we are about to add extra steps to take after upgrades.
 - When you install citus in a clean cluster we do enable ssl for communication between the coordinator and the workers. If an upgrade to a clean cluster is performed we do not setup ssl on the new cluster causing the communication to fail.
 - There are no automated tests added in this PR to execute an upgrade test durning every build. 
    Our current test infrastructure does not allow for 2 versions of postgres to exist in the same environment. We will need to invest time to create a new testing harness that could run the following scenario:
      1. Create cluster
      2. Run extensible scripts to execute arbitrary statements on this cluster
      3. Perform an upgrade by preparing, upgrading and finishing
      4. Run extensible scripts to verify all objects created by earlier scripts exists in correct form in the upgraded cluster

    Given the non trivial amount of work involved for such a suite I'd like to land this before we have 
automated testing.

On a side note; As the reviewer noticed, the tables created in the public namespace are not visible in `psql` with `\d`. The backup catalog tables have the same name as the tables in `pg_catalog`. Due to postgres internals `pg_catalog` is first in the search path and therefore the non-qualified name would alwasy resolve to `pg_catalog.pg_dist_*`. Internally this is called a non-visible table as it would resolve to a different table without a qualified name. Only visible tables are shown with `\d`.
2019-08-13 15:53:10 +02:00
Hadi Moshayedi 009d8b7401 Some cleanup 2019-08-12 15:38:52 -07:00
Philip Dubé 5459c01956 multi_partitioning_utils: version_above_ten 2019-08-09 15:25:59 +00:00
Philip Dubé e0f19fb58c multi_partitioning_1.out 2019-08-09 15:25:59 +00:00
Philip Dubé 5e835e7565 Fix multi_repair_shards. There's already a group/shardid entry, pg11 gives us back the inserted one, pg12 gives us the preexisting one 2019-08-09 15:25:59 +00:00
Philip Dubé 66ce2d2d2d Materialize c1 to keep subplan ids in sync 2019-08-09 15:25:59 +00:00
Philip Dubé 9065ef429c foreign_key_to_reference_table: terse to avoid differing order of drop cascade details 2019-08-09 15:25:59 +00:00
Philip Dubé 0d9e5bde9c window_functions: 'ORDER BY time' when using lag(time) & coordinator_plan 2019-08-09 15:25:59 +00:00
Philip Dubé 7992077fd9 multi_modifying_xacts: don't differ in output if reference table select tries broken worker first 2019-08-09 15:25:59 +00:00
Philip Dubé 546b71ac18 multi_router_planner: be terse for ctes with false wheres 2019-08-09 15:25:59 +00:00
Philip Dubé a523a5b773 multi_null_minmax_value_pruning: no versioning & coordinator_plan 2019-08-09 15:25:59 +00:00
Philip Dubé 871dabdc63 Force CTE materialization in pg12 2019-08-09 15:25:59 +00:00
Philip Dubé 667c67891e intermediate_results: COSTS OFF 2019-08-09 15:25:59 +00:00
Philip Dubé b2ea806d8a extra_float_digits=0 2019-08-09 15:25:59 +00:00
Philip Dubé 705d1bf0e0 Use PG_JOB_CACHE_DIR 2019-08-09 15:25:59 +00:00
Onder Kalaci 060ac11476 Do not record relation accessess unnecessarily
Before this commit, we've recorded the relation accesses in 3 different
places
    - FindPlacementListConnection         -- applies all executor in tx block
    - StartPlacementExecutionOnSession()  -- adaptive executor only
    - StartPlacementListConnection()      -- router/real-time only

This is different than Citus 8.2, and could lead to query execution times
increase considerably on multi-shard commands in transaction block
that are on partitioned tables.

Benchmarks:

```
1+8 c5.4xlarge cluster

Empty distributed partitioned table with 365 partitions: https://gist.github.com/onderkalaci/1edace4ed6bd6f061c8a15594865bb51#file-partitions_365-sql

./pgbench -f /tmp/multi_shard.sql -c10 -j10 -P 1 -T 120 postgres://citus:w3r6KLJpv3mxe9E-NIUeJw@c.fy5fkjcv45vcepaogqcaskmmkee.db.citusdata.com:5432/citus?sslmode=require

cat  /tmp/multi_shard.sql
BEGIN;
	DELETE FROM collections_list;
	DELETE FROM collections_list;
	DELETE FROM collections_list;
COMMIT;
cat  /tmp/single_shard.sql
BEGIN;
	DELETE FROM collections_list WHERE key = :aid;
	DELETE FROM collections_list WHERE key = :aid;
	DELETE FROM collections_list WHERE key = :aid;
COMMIT;

cat  /tmp/mix.sql
BEGIN;
	DELETE FROM collections_list WHERE key = :aid;
	DELETE FROM collections_list WHERE key = :aid;
	DELETE FROM collections_list WHERE key = :aid;

	DELETE FROM collections_list;
	DELETE FROM collections_list;
	DELETE FROM collections_list;
COMMIT;
```

The table shows `latency average` of pgbench runs explained above, so we have a pretty solid improvement even over 8.2.2.

| Test  | Citus 8.2.2  |  Citus 8.3.1   | Citus 8.3.2 (this branch)  | Citus 8.3.1 (FKEYs disabled via GUC)  |
| ------------- | ------------- | ------------- |------------- | ------------- |
|multi_shard |  2370.083 ms  |3605.040 ms |1324.094 ms |1247.255 ms  |
| single_shard  | 85.338 ms  |120.934 ms  |73.216 ms  | 78.765 ms |
| mix  | 2434.459 ms | 3727.080 ms  |1306.456 ms  | 1280.326 ms |
2019-08-08 18:42:08 +02:00
Onder Kalaci 35ee896f3d Get rid of an unnecessary parameter
targetPoolSize parameter for ExecuteUtilityTaskListWithoutResults
becomes obsolete, just remove it.
2019-08-07 19:35:56 +02:00
Onder Kalaci b2e01d0745 Refactor switching to sequential mode
We don't need to wait until the execution. As soon as we realize
that we need sequential execution, we should do it.
2019-08-07 19:35:56 +02:00
Hadi Moshayedi b1ab805ce2 Fix a typo in foreign_key_restriction_enforcement 2019-08-02 16:06:52 -07:00
Philip Dubé b77c52f95b PlanRouterQuery: don't store list of list of shard intervals in relationShardList 2019-08-02 14:08:57 +00:00
Philip Dubé fdc0ef6392 Adaptive executor: use 2PC when replication_factor > 1 2019-08-01 23:55:12 +00:00
Philip Dubé 19bcb1b4f7 multi_modifications: extend to demonstrate issue in adaptive executor 2019-08-01 23:55:04 +00:00
Philip Dubé 064bd66a20 Avoid segfault in logging queries 2019-07-31 15:28:46 +00:00
Philip Dubé 3982b4635f CompareShardIntervals: if intervals are equal, compare id. Works around sort being unstable 2019-07-26 16:13:36 +00:00
Philip Dubé 0e233c63a3 multi_colocation_utils: sort by nodeport, not placementid
multi_copy: replace smgr with aclitem, smgr is removed in pg12
2019-07-25 14:33:43 +00:00
Marco Slot e2bc09838e Use ereport instead of elog in adaptive executor 2019-07-23 20:40:32 +02:00
Marco Slot bd111366b0 Skip CheckConnectionTimeout when checkForPoolTimeout is false 2019-07-23 20:40:32 +02:00
Marco Slot a3811b1e55 Avoid FindWorkerNode calls in adaptive executor 2019-07-23 20:40:32 +02:00
Marco Slot 4444d92dbc Set initial pool size to cached connection count 2019-07-23 20:40:32 +02:00
Marco Slot 4c0c33365e Avoid creating a redundant event set at the start 2019-07-23 20:40:32 +02:00
Marco Slot 32e7a80960 Avoid unnecessary calls to PQconsumeInput 2019-07-23 20:40:32 +02:00
Marco Slot 71ad5c095b Use ModifyWaitEvent when only wait flags changed 2019-07-23 20:40:32 +02:00
Philip Dubé 50144b75d0 Add check-empty to testing Makefile
Don't create functions multiple times
Move ALTER TABLEs to their declaration
Remove DROP FUNCTIONS IF EXISTS, OR REPLACE
2019-07-24 11:03:54 -07:00
Philip Dubé acbaa38a62 Squash migrations for versions 5/6, don't use WITH OIDS 2019-07-24 11:03:29 -07:00
Hanefi Onaldi 8127297999 update workerNodeList after sorting 2019-07-23 20:57:07 +00:00
Philip Dubé 6598c68993 Fix multi_prune_shard_list & don't set next_shard_id unnecessarily in multi_null_minmax_value_pruning 2019-07-23 19:44:18 +00:00
Marco Slot efbe58eab2 Fix SQL schema version, we skipped 8.3 2019-07-17 16:05:25 +02:00
Philip Dubé 0915027389 DistributedPlan: replace operation with modLevel
This causes no behaviorial changes, only organizes better to implement modifying CTEs

Also rename ExtactInsertRangeTableEntry to ExtractResultRelationRTE,
as the source of this function didn't match the documentation

Remove Task's upsertQuery in favor of ROW_MODIFY_NONCOMMUTATIVE

Split up AcquireExecutorShardLock into more internal functions

Tests: Normalize multi_reference_table multi_create_table_constraints
2019-07-16 13:58:18 -07:00
Hanefi Onaldi 0bdec52761
Fix default_version in citus.control file (#2840) 2019-07-11 14:24:51 +03:00
Philip Dubé befd0caddd Tests: normalize sql_procedure and custom_aggregate_support
Also fix typo in multi_insert_select
2019-07-10 14:36:17 +00:00
Hanefi Onaldi 5a6eba6ba9
Bump Citus to 8.4devel 2019-07-10 15:26:10 +03:00
Nils Dijk 791cc26a86
Fix an issue with subquery map merge jobs as non-root
Also automated all manual tests around multi user isolation for internal citus udf's

automate upgrade_to_reference_table tests
add negative tests for lock_relation_if_exists
add tests for permissions on worker_cleanup_job_schema_cache
add tests for worker_fetch_partition_file
add tests for worker_merge_files_into_table
fix problem with worker_merge_files_and_run_query when run as non-super user and add tests for behaviour
2019-07-10 12:40:05 +02:00
Hadi Moshayedi 46608e42f9 Add hyperscale tutorial to the regression tests. 2019-07-10 10:47:55 +02:00
Hadi Moshayedi 91d8a41ecd Don't modify cache entry in RelationShardListForShardCreate() 2019-07-09 12:44:48 -07:00
Marco Slot 70434bc716 Increase slow start time in test to make valgrind tests pass 2019-07-08 06:04:13 +02:00
Hadi Moshayedi 032167c553 Fix Assert() in ProcessVariableSetStmt() 2019-07-05 14:11:22 -07:00
Marco Slot 07d2266e11 Fix RESET and other types of SET 2019-07-05 19:30:48 +02:00
Marco Slot 97334ff1ec Copy WorkerNode before returning in FindWorkerNode 2019-07-05 09:35:53 +02:00
Hadi Moshayedi 5d59aab38d Increase valgrind's max-stackframe 2019-07-04 14:19:41 +02:00
Hadi Moshayedi d233887d68 Fix multi_extension in check-multi-vg 2019-07-04 13:03:46 +02:00
Hadi Moshayedi 47aa95d00d Fix a NULL dereference. 2019-07-03 16:26:49 -07:00
Hadi Moshayedi 805a2ac602 Fix a use after free in adaptive executor 2019-07-02 10:12:13 -07:00
Marco Slot d6c667946c Fix citus_executor_name mapping by reimplementing it in C 2019-06-29 22:38:29 +02:00
Marco Slot 70c0d96507 Track partition key for adaptive executor in CitusEndScan 2019-06-29 21:37:15 +02:00
Önder Kalacı 40da78c6fd
Introduce the adaptive executor (#2798)
With this commit, we're introducing the Adaptive Executor. 


The commit message consists of two distinct sections. The first part explains
how the executor works. The second part consists of the commit messages of
the individual smaller commits that resulted in this commit. The readers
can search for the each of the smaller commit messages on 
https://github.com/citusdata/citus and can learn more about the history
of the change.

/*-------------------------------------------------------------------------
 *
 * adaptive_executor.c
 *
 * The adaptive executor executes a list of tasks (queries on shards) over
 * a connection pool per worker node. The results of the queries, if any,
 * are written to a tuple store.
 *
 * The concepts in the executor are modelled in a set of structs:
 *
 * - DistributedExecution:
 *     Execution of a Task list over a set of WorkerPools.
 * - WorkerPool
 *     Pool of WorkerSessions for the same worker which opportunistically
 *     executes "unassigned" tasks from a queue.
 * - WorkerSession:
 *     Connection to a worker that is used to execute "assigned" tasks
 *     from a queue and may execute unasssigned tasks from the WorkerPool.
 * - ShardCommandExecution:
 *     Execution of a Task across a list of placements.
 * - TaskPlacementExecution:
 *     Execution of a Task on a specific placement.
 *     Used in the WorkerPool and WorkerSession queues.
 *
 * Every connection pool (WorkerPool) and every connection (WorkerSession)
 * have a queue of tasks that are ready to execute (readyTaskQueue) and a
 * queue/set of pending tasks that may become ready later in the execution
 * (pendingTaskQueue). The tasks are wrapped in a ShardCommandExecution,
 * which keeps track of the state of execution and is referenced from a
 * TaskPlacementExecution, which is the data structure that is actually
 * added to the queues and describes the state of the execution of a task
 * on a particular worker node.
 *
 * When the task list is part of a bigger distributed transaction, the
 * shards that are accessed or modified by the task may have already been
 * accessed earlier in the transaction. We need to make sure we use the
 * same connection since it may hold relevant locks or have uncommitted
 * writes. In that case we "assign" the task to a connection by adding
 * it to the task queue of specific connection (in
 * AssignTasksToConnections). Otherwise we consider the task unassigned
 * and add it to the task queue of a worker pool, which means that it
 * can be executed over any connection in the pool.
 *
 * A task may be executed on multiple placements in case of a reference
 * table or a replicated distributed table. Depending on the type of
 * task, it may not be ready to be executed on a worker node immediately.
 * For instance, INSERTs on a reference table are executed serially across
 * placements to avoid deadlocks when concurrent INSERTs take conflicting
 * locks. At the beginning, only the "first" placement is ready to execute
 * and therefore added to the readyTaskQueue in the pool or connection.
 * The remaining placements are added to the pendingTaskQueue. Once
 * execution on the first placement is done the second placement moves
 * from pendingTaskQueue to readyTaskQueue. The same approach is used to
 * fail over read-only tasks to another placement.
 *
 * Once all the tasks are added to a queue, the main loop in
 * RunDistributedExecution repeatedly does the following:
 *
 * For each pool:
 * - ManageWorkPool evaluates whether to open additional connections
 *   based on the number unassigned tasks that are ready to execute
 *   and the targetPoolSize of the execution.
 *
 * Poll all connections:
 * - We use a WaitEventSet that contains all (non-failed) connections
 *   and is rebuilt whenever the set of active connections or any of
 *   their wait flags change.
 *
 *   We almost always check for WL_SOCKET_READABLE because a session
 *   can emit notices at any time during execution, but it will only
 *   wake up WaitEventSetWait when there are actual bytes to read.
 *
 *   We check for WL_SOCKET_WRITEABLE just after sending bytes in case
 *   there is not enough space in the TCP buffer. Since a socket is
 *   almost always writable we also use WL_SOCKET_WRITEABLE as a
 *   mechanism to wake up WaitEventSetWait for non-I/O events, e.g.
 *   when a task moves from pending to ready.
 *
 * For each connection that is ready:
 * - ConnectionStateMachine handles connection establishment and failure
 *   as well as command execution via TransactionStateMachine.
 *
 * When a connection is ready to execute a new task, it first checks its
 * own readyTaskQueue and otherwise takes a task from the worker pool's
 * readyTaskQueue (on a first-come-first-serve basis).
 *
 * In cases where the tasks finish quickly (e.g. <1ms), a single
 * connection will often be sufficient to finish all tasks. It is
 * therefore not necessary that all connections are established
 * successfully or open a transaction (which may be blocked by an
 * intermediate pgbouncer in transaction pooling mode). It is therefore
 * essential that we take a task from the queue only after opening a
 * transaction block.
 *
 * When a command on a worker finishes or the connection is lost, we call
 * PlacementExecutionDone, which then updates the state of the task
 * based on whether we need to run it on other placements. When a
 * connection fails or all connections to a worker fail, we also call
 * PlacementExecutionDone for all queued tasks to try the next placement
 * and, if necessary, mark shard placements as inactive. If a task fails
 * to execute on all placements, the execution fails and the distributed
 * transaction rolls back.
 *
 * For multi-row INSERTs, tasks are executed sequentially by
 * SequentialRunDistributedExecution instead of in parallel, which allows
 * a high degree of concurrency without high risk of deadlocks.
 * Conversely, multi-row UPDATE/DELETE/DDL commands take aggressive locks
 * which forbids concurrency, but allows parallelism without high risk
 * of deadlocks. Note that this is unrelated to SEQUENTIAL_CONNECTION,
 * which indicates that we should use at most one connection per node, but
 * can run tasks in parallel across nodes. This is used when there are
 * writes to a reference table that has foreign keys from a distributed
 * table.
 *
 * Execution finishes when all tasks are done, the query errors out, or
 * the user cancels the query.
 *
 *-------------------------------------------------------------------------
 */



All the commits involved here:
* Initial unified executor prototype

* Latest changes

* Fix rebase conflicts to master branch

* Add missing variable for assertion

* Ensure that master_modify_multiple_shards() returns the affectedTupleCount

* Adjust intermediate result sizes

The real-time executor uses COPY command to get the results
from the worker nodes. Unified executor avoids that which
results in less data transfer. Simply adjust the tests to lower
sizes.

* Force one connection per placement (or co-located placements) when requested

The existing executors (real-time and router) always open 1 connection per
placement when parallel execution is requested.

That might be useful under certain circumstances:

(a) User wants to utilize as much as CPUs on the workers per
distributed query
(b) User has a transaction block which involves COPY command

Also, lots of regression tests rely on this execution semantics.
So, we'd enable few of the tests with this change as well.

* For parameters to be resolved before using them

For the details, see PostgreSQL's copyParamList()

* Unified executor sorts the returning output

* Ensure that unified executor doesn't ignore sequential execution of DDLJob's

Certain DDL commands, mainly creating foreign keys to reference tables,
should be executed sequentially. Otherwise, we'd end up with a self
distributed deadlock.

To overcome this situaiton, we set a flag `DDLJob->executeSequentially`
and execute it sequentially. Note that we have to do this because
the command might not be called within a transaction block, and
we cannot call `SetLocalMultiShardModifyModeToSequential()`.

This fixes at least two test: multi_insert_select_on_conflit.sql and
multi_foreign_key.sql

Also, I wouldn't mind scattering local `targetPoolSize` variables within
the code. The reason is that we'll soon have a GUC (or a global
variable based on a GUC) that'd set the pool size. In that case, we'd
simply replace `targetPoolSize` with the global variables.

* Fix 2PC conditions for DDL tasks

* Improve closing connections that are not fully established in unified execution

* Support foreign keys to reference tables in unified executor

The idea for supporting foreign keys to reference tables is simple:
Keep track of the relation accesses within a transaction block.
    - If a parallel access happens on a distributed table which
      has a foreign key to a reference table, one cannot modify
      the reference table in the same transaction. Otherwise,
      we're very likely to end-up with a self-distributed deadlock.
    - If an access to a reference table happens, and then a parallel
      access to a distributed table (which has a fkey to the reference
      table) happens, we switch to sequential mode.

Unified executor misses the function calls that marks the relation
accesses during the execution. Thus, simply add the necessary calls
and let the logic kick in.

* Make sure to close the failed connections after the execution

* Improve comments

* Fix savepoints in unified executor.

* Rebuild the WaitEventSet only when necessary

* Unclaim connections on all errors.

* Improve failure handling for unified executor

   - Implement the notion of errorOnAnyFailure. This is similar to
     Critical Connections that the connection managament APIs provide
   - If the nodes inside a modifying transaction expand, activate 2PC
   - Fix few bugs related to wait event sets
   - Mark placement INACTIVE during the execution as much as possible
     as opposed to we do in the COMMIT handler
   - Fix few bugs related to scheduling next placement executions
   - Improve decision on when to use 2PC

Improve the logic to start a transaction block for distributed transactions

- Make sure that only reference table modifications are always
  executed with distributed transactions
- Make sure that stored procedures and functions are executed
  with distributed transactions

* Move waitEventSet to DistributedExecution

This could also be local to RunDistributedExecution(), but in that case
we had to mark it as "volatile" to avoid PG_TRY()/PG_CATCH() issues, and
cast it to non-volatile when doing WaitEventSetFree(). We thought that
would make code a bit harder to read than making this non-local, so we
move it here. See comments for PG_TRY() in postgres/src/include/elog.h
and "man 3 siglongjmp" for more context.

* Fix multi_insert_select test outputs

Two things:
   1) One complex transaction block is now supported. Simply update
      the test output
   2) Due to dynamic nature of the unified executor, the orders of
      the errors coming from the shards might change (e.g., all of
      the queries on the shards would fail, but which one appears
      on the error message?). To fix that, we simply added it to
      our shardId normalization tool which happens just before diff.

* Fix subeury_and_cte test

The error message is updated from:
	failed to execute task
To:
        more than one row returned by a subquery or an expression

which is a lot clearer to the user.

* Fix intermediate_results test outputs

Simply update the error message from:
	could not receive query results
to
	result "squares" does not exist

which makes a lot more sense.

* Fix multi_function_in_join test

The error messages update from:
     Failed to execute task XXX
To:
     function f(..) does not exist

* Fix multi_query_directory_cleanup test

The unified executor does not create any intermediate files.

* Fix with_transactions test

A test case that just started to work fine

* Fix multi_router_planner test outputs

The error message is update from:
	Could not receive query results
To:
	Relation does not exists

which is a lot more clearer for the users

* Fix multi_router_planner_fast_path test

The error message is update from:
	Could not receive query results
To:
	Relation does not exists

which is a lot more clearer for the users

* Fix isolation_copy_placement_vs_modification by disabling select_opens_transaction_block

* Fix ordering in isolation_multi_shard_modify_vs_all

* Add executor locks to unified executor

* Make sure to allocate enought WaitEvents

The previous code was missing the waitEvents for the latch and
postmaster death.

* Fix rebase conflicts for master rebase

* Make sure that TRUNCATE relies on unified executor

* Implement true sequential execution for multi-row INSERTS

Execute the individual tasks executed one by one. Note that this is different than
MultiShardConnectionType == SEQUENTIAL_CONNECTION case (e.g., sequential execution
mode). In that case, running the tasks across the nodes in parallel is acceptable
and implemented in that way.

However, the executions that are qualified here would perform poorly if the
tasks across the workers are executed in parallel. We currently qualify only
one class of distributed queries here, multi-row INSERTs. If we do not enforce
true sequential execution, concurrent multi-row upserts could easily form
a distributed deadlock when the upserts touch the same rows.

* Remove SESSION_LIFESPAN flag in unified_executor

* Apply failure test updates

We've changed the failure behaviour a bit, and also the error messages
that show up to the user. This PR covers majority of the updates.

* Unified executor honors citus.node_connection_timeout

With this commit, unified executor errors out if even
a single connection cannot be established within
citus.node_connection_timeout.

And, as a side effect this fixes failure_connection_establishment
test.

* Properly increment/decrement pool size variables

Before this commit, the idle and active connection
counts were not properly calculated.

* insert_select_executor goes through unified executor.

* Add missing file for task tracker

* Modify ExecuteTaskListExtended()'s signature

* Sort output of INSERT ... SELECT ... RETURNING

* Take partition locks correctly in unified executor

* Alternative implementation for force_max_query_parallelization

* Fix compile warnings in unified executor

* Fix style issues

* Decrement idleConnectionCount when idle connection is lost

* Always rebuild the wait event sets

In the previous implementation, on waitFlag changes, we were only
modifying the wait events. However, we've realized that it might
be an over optimization since (a) we couldn't see any performance
benefits (b) we see some errors on failures and because of (a)
we prefer to disable it now.

* Make sure to allocate enough sized waitEventSet

With multi-row INSERTs, we might have more sessions than
task*workerCount after few calls of RunDistributedExecution()
because the previous sessions would also be alive.

Instead, re-allocate events when the connectino set changes.

* Implement SELECT FOR UPDATE on reference tables

On master branch, we do two extra things on SELECT FOR UPDATE
queries on reference tables:
   - Acquire executor locks
   - Execute the query on all replicas

With this commit, we're implementing the same logic on the
new executor.

* SELECT FOR UPDATE opens transaction block even if SelectOpensTransactionBlock disabled

Otherwise, users would be very confused and their logic is very likely
to break.

* Fix build error

* Fix the newConnectionCount calculation in ManageWorkerPool

* Fix rebase conflicts

* Fix minor test output differences

* Fix citus indent

* Remove duplicate sorts that is added with rebase

* Create distributed table via executor

* Fix wait flags in CheckConnectionReady

* failure_savepoints output for unified executor.

* failure_vacuum output (pg 10) for unified executor.

* Fix WaitEventSetWait timeout in unified executor

* Stabilize failure_truncate test output

* Add an ORDER BY to multi_upsert

* Fix regression test outputs after rebase to master

* Add executor.c comment

* Rename executor.c to adaptive_executor.c

* Do not schedule tasks if the failed placement is not ready to execute

Before the commit, we were blindly scheduling the next placement executions
even if the failed placement is not on the ready queue. Now, we're ensuring
that if failed placement execution is on a failed pool or session where the
execution is on the pendingQueue, we do not schedule the next task. Because
the other placement execution should be already running.

* Implement a proper custom scan node for adaptive executor

- Switch between the executors, add GUC to set the pool size
- Add non-adaptive regression test suites
- Enable CIRCLE CI for non-adaptive tests
- Adjust test output files

* Add slow start interval to the executor

* Expose max_cached_connection_per_worker to user

* Do not start slow when there are cached connections

* Consider ExecutorSlowStartInterval in NextEventTimeout

* Fix memory issues with ReceiveResults().

* Disable executor via TaskExecutorType

* Make sure to execute the tests with the other executor

* Use task_executor_type to enable-disable adaptive executor

* Remove useless code

* Adjust the regression tests

* Add slow start regression test

* Rebase to master

* Fix test failures in adaptive executor.

* Rebase to master - 2

* Improve comments & debug messages

* Set force_max_query_parallelization in isolation_citus_dist_activity

* Force max parallelization for creating shards when asked to use exclusive connection.

* Adjust the default pool size

* Expand description of max_adaptive_executor_pool_size GUC

* Update warnings in FinishRemoteTransactionCommit()

* Improve session clean up at the end of execution

Explicitly list all the states that the execution might end,
otherwise warn.

* Remove MULTI_CONNECTION_WAIT_RETRY which is not used at all

* Add more ORDER BYs to multi_mx_partitioning
2019-06-28 14:04:40 +02:00
Philip Dubé 4e54c1525d Isolation tests: consistently name COMMIT '-commit' 2019-06-27 07:32:39 +02:00
Hanefi Onaldi 4e08477fed Add test case for issue 2575 2019-06-26 17:12:28 +02:00
Hanefi Onaldi 7e8fd49b94 Create Schemas as superuser on all shard/table creation UDFs
- All the schema creations on the workers will now be  via superuser connections
- If a shard is being repaired or a shard is replicated, we will create the
  schema only in the relevant worker; and in all the other cases where a schema
  creation is needed, we will block operations until we ensure the schema exists
  in all the workers
2019-06-26 17:12:28 +02:00
Philip Dubé aa0c47848e subquery_and_cte: test rejecting volatile ctes
Also update isolation_citus_dist_activity from after merge
2019-06-26 16:27:07 +02:00
Philip Dubé db7fdb1854 Router planner: bail on volatile functions in CTEs 2019-06-26 10:32:01 +02:00
Philip Dubé 5c62f9935a Router planner: reject SELECT FOR UPDATE ctes 2019-06-26 10:32:01 +02:00
Philip Dubé 18575ccfd3 Add tests to subquery_and_cte, update check-multi-mx expected results 2019-06-26 10:32:01 +02:00
Philip Dubé 77efec04a0 Router Planner: accept SELECT_CMD ctes in modification queries 2019-06-26 10:32:01 +02:00
Philip Dubé 84fe626378 multi_router_planner: refactor error propagation 2019-06-26 10:32:01 +02:00
Philip Dubé 9ed6dd5570 Ignore compile_commands.json, fix typo 2019-06-26 10:32:01 +02:00
Hadi Moshayedi 25a984bab4 Normalize multi_name_lengths. 2019-06-25 14:18:33 +02:00
Hadi Moshayedi 3d0a521295 Show just coordinator plan in some test outputs. 2019-06-24 12:24:30 +02:00
Onder Kalaci ad93d6feea Change the order of placement access added to the list
This is to make sure that the error messages related to foreign keys
to reference tables shows the exact placement access name instead of
SELECT.
2019-06-23 11:32:58 +02:00
Nils Dijk eb98f2d13a
Fix null pointer caused by partial initialization of ConnParamsHashEntry (#2789)
It has been reported a null pointer dereference could be triggered in FreeConnParamsHashEntryFields. Likely cause is an error in GetConnParams which will leave the cached ConnParamsHashEntry in a state that would cause the null pointer dereference in a subsequent connection establishment to the same server. This has been simulated by inserting ereport(ERROR, ...) at certain places in the code.

Not only would ConnParamsHashEntry be in a state that would cause a crash, it was also leaking memory in the ConnectionContext due to the loss of pointers as they are only stored on the ConnParamsHashEntry at the end of the function.

This patch rewrites both the GetConnParams to store pointers 'durably' at every point in the code so that an error would not lose the pointer as well as FreeConnParamsHashEntryFields in a way that it can clear half initialised ConnParamsHashEntry's in a safer manner.
2019-06-21 18:16:43 +02:00
Hanefi Onaldi 7a6eb2aba0
Fix one regression test that fails on enterprise (#2786)
GRANT queries are propagated on Enterprise. If a user attempts to
create a user and run a GRANT query before creating it on workers, we
fail. This issue does not happen in community as the user needs to run
the GRANTs on the workers manually.
2019-06-21 15:46:28 +03:00
Nils Dijk 5df1b49bed
Feature: optionally force master_update_node during failover (#2773)
When `master_update_node` is called to update a node's location it waits for appropriate locks to become available. This is useful during normal operation as new operations will be blocked till after the metadata update while running operations have time to finish.

When `master_update_node` is called after a node failure it is less useful to wait for running operations to finish as they can't. The lock being held indicates an operation that once attempted to commit will fail as the machine already failed. Now the downside is the failover is postponed till the termination point of the operation. This has been observed by users to take a significant amount of time causing the rest of the system to be observed unavailable.

With this patch it is possible in such situations to invoke `master_update_node` with 2 optional arguments:
 - `force` (bool defaults to `false`): When called with true the update of the metadata will be forced to proceed by terminating conflicting backends. A cancel is not enough as the backend might be in idle time (eg. an interactive session, or going back and forth between an appliaction), therefore a more intrusive solution of termination is used here.
 - `lock_cooldown` (int defaults to `10000`): This is the time in milliseconds before conflicting backends are terminated. This is to allow the backends to finish cleanly before terminating them. This allows the user to set an upperbound to the expected time to complete the metadata update, eg. performing the failover.

The functionality is implemented by spawning a background worker that has the task of helping a certain backend in acquiring its locks. The backend is either terminated on successful execution of the metadata update, or once the memory context of the expression gets reset, eg. on a cancel of the statement.
2019-06-21 12:03:15 +02:00
Jason Petersen d4e1172247 Implement propagation of SET LOCAL commands
Adds support for propagation of SET LOCAL commands to all workers
involved in a query. For now, SET SESSION (i.e. plain SET) is not
supported whatsoever, though this code is intended as somewhat of a
base for implementing such support in the future.

As SET LOCAL modifications are scoped to the body of a BEGIN/END xact
block, queries wishing to use SET LOCAL propagation must be within such
a block. In addition, subsequent modifications after e.g. any SAVEPOINT
or ROLLBACK statements will correspondingly push or pop variable mod-
ifications onto an internal stack such that the behavior of changed
values across the cluster will be identical to such behavior on e.g.
single-node PostgreSQL (or equivalently, what values are visible to
the end user by running SHOW on such variables on the coordinator).

If nodes enter the set of participants at some point after SET LOCAL
modifications (or SAVEPOINT, ROLLBACK, etc.) have occurred, the SET
variable state is eagerly propagated to them upon their entrance (this
is identical to, and indeed just augments, the existing logic for the
propagation of the SAVEPOINT "stack").

A new GUC (citus.propagate_set_commands) has been added to control this
behavior. Though the code suggests the valid settings are 'none', 'local',
'session', and 'all', only 'none' (the default) and 'local' are presently
implemented: attempting to use other values will result in an error.
2019-06-20 16:15:43 -07:00
Jason Petersen 1dec6c5163 Change BeginCoordinatedTransaction to internal linkage
It's only ever called from a single file, so having it be extern didn't
make a whole lot of sense.
2019-06-20 13:44:06 -07:00
Jason Petersen 2349e8e75c Remove extraneous comments around PG header change 2019-06-20 13:37:53 -07:00
Hadi Moshayedi 4bbae02778 Make COPY compatible with unified executor. 2019-06-20 19:53:40 +02:00
Hadi Moshayedi 2e6d04df7b Refactor ExecuteModifyTasksSequentially. 2019-06-20 18:38:57 +02:00
Hadi Moshayedi d4f3e2809d Use normalization for multi_subtransaction output 2019-06-19 17:54:33 +02:00
Hadi Moshayedi 83f6c7dab4 Fix subxact release crash 2019-06-19 17:43:10 +02:00
Onder Kalaci 2b0c4accda Apply feedback 2019-06-19 10:03:58 +02:00
Onder Kalaci 3a04374a9e Refactor relation shard list creation during placement creation
This change is to make further refactoring even simpler such as
using the executor for shard creation.
2019-06-19 10:03:58 +02:00
Onder Kalaci 4fd1fcbbef Refactor shard creation logic
This is a preperation for the new executor, where creating shards
would go through the executor. So, explicitly generate the commands
for further processing.
2019-06-19 10:03:58 +02:00
Philip Dubé 4bfcf5b665 Enable Werror for all warnings
Changes to ruleutils match changes made upstream to silence gcc fallthrough warnings
2019-06-18 14:43:54 -07:00
Hadi Moshayedi b240854b8c Use SendCancelationRequest() in ShutdownConnection() 2019-06-18 12:10:05 +02:00
Hadi Moshayedi c42b22f8fd Fix test name detection in bin/diff 2019-06-17 11:31:42 +02:00
Philip Dubé 342d423725 Fix join alias resolution
FROM (query) alias ignored renaming
In nested subqueries the select list would rename, while the join alias would not respect that
2019-06-12 17:25:07 -07:00
Marco Slot c1ac794b77 enable_statistics_collection defaults to off 2019-06-05 18:43:26 +02:00
Hadi Moshayedi 85325e0098 Refactor ScanStateGetExecutorState into its own function. 2019-06-05 09:16:43 -07:00
Hadi Moshayedi 0b01c59fa6 Refactor ScanStateGetTupleDescriptor() into a function. 2019-06-04 15:19:49 -07:00
Hadi Moshayedi 8e2d328530 Search all outer node levels for lateral join params. 2019-06-04 10:14:05 -07:00
Philip Dubé b5ced403d8 Also check rewrittenQuery jointree for outer join 2019-06-04 07:47:35 -07:00
Hadi Moshayedi dee5bc31b4 Refactor ShardIdForTuple() to a separate function. 2019-06-02 09:48:15 -07:00
Marco Slot c1566d464b Fix failure and isolation tests
On top of citus.max_cached_conns_per_worker GUC, with this commit
we're updating the regression tests to comply with the new behaviour.
2019-05-29 14:42:31 +02:00
Marco Slot bb3a96eacb Cache a configurable number of connections at xact end 2019-05-29 13:24:31 +02:00
Onder Kalaci d46b92d79a Add order by to multi_mx_schema_support 2019-05-28 12:23:28 +02:00
Onder Kalaci fa2a6e4d8f Add order by to multi_mx_router_planner 2019-05-28 12:23:28 +02:00
Onder Kalaci 0a7a173eee Add order by to multi_mx_reference_table 2019-05-28 12:23:28 +02:00
Onder Kalaci 1553e12ee4 Add order by to multi_subquery_complex_reference_clause 2019-05-28 12:06:57 +02:00
Hadi Moshayedi 23207a43e0 Fix a typo: WITH CARDINALITY -> WITH ORDINALITY 2019-05-24 15:49:17 -07:00
Philip Dubé b8871d9ff4 Propagate more ALTER FOREIGN TABLE to workers 2019-05-24 12:54:05 -07:00
Marco Slot b3fcf2a48f Deprecate master_modify_multiple_shards 2019-05-24 15:22:06 +02:00
Marco Slot 7fa5d36057 Stop using master_modify_multiple_shards in TRUNCATE 2019-05-24 14:35:46 +02:00
exialin 59e54de54d Minor code clean-up 2019-05-24 14:26:26 +02:00
Hanefi Onaldi 7443191397
Improve tests for round robin & router queries 2019-05-24 14:16:56 +03:00
Hanefi Onaldi 4d737177e6
Remove redundant active placement filters and unneded sort operations
If a query is router executable, it hits a single shard and therefore has a
single task associated with it. Therefore there is no need to sort the task list
that has a single element.

Also we already have a list of active shard placements, sending it in param
and reuse it.
2019-05-24 14:16:50 +03:00
Hanefi Onaldi b935dfb8c8
Cleanup deleted function declaration 2019-05-24 14:04:26 +03:00
Philip Dubé 16886b3c63 Fix misc typos 2019-05-23 17:23:27 -07:00
Onder Kalaci f1a80a609f Fix wrong test output
If replication factor eqauls to 2 and there are two worker nodes,
even if two modifications hit different shards, Citus doesn't use
2PC. The reason is that it doesn't fit into the definition of
"expanding participating worker nodes".

Thus, we're simply fixing the test to fit in the comment
on top of it.
2019-05-21 19:12:37 +03:00
Hadi Moshayedi 8ae47e1244 Fix comments for RemoteFileDestReceiverStartup and CitusCopyDestReceiverStartup 2019-05-21 09:03:22 -07:00
Onder Kalaci f76abfe470 Add ORDER BY to multi_router_planner 2019-05-21 15:54:33 +03:00
Onder Kalaci f06a79563d Add ORDER BY to multi_foreign_key 2019-05-21 15:54:03 +03:00
Hadi Moshayedi dce9260c0e Fix an include in recusive_planning.c 2019-05-20 18:57:03 -07:00
Murat Tuncer 3fe482adbc Fix DistShardCacheHash initialization
InitializeCaches() method may prematurely set
performedInitialization without actually creating
DistShardCacheHash.

Fix makes sure flag is set only if DistShardCacheHash is created successfully.

Also introduced a new memory context to allocate aforementioned hash tables.
If allocation/initialization fails for any reason we make sure
memory is reclaimed by deleting the memory context.
2019-05-15 16:47:44 +03:00
Hanefi Onaldi 4030d603eb
Merge pull request #2691 from citusdata/update_changelog
Add 8.1.2 and 8.2.1 changelog entries
2019-05-15 09:18:58 +03:00
Onder Kalaci 5d68a13139 Add order by to multi_shard_update_delete 2019-05-02 20:09:33 +03:00
Onder Kalaci 2c76b4bc46 Add order by to multi_function_in_join test 2019-05-02 20:05:25 +03:00
Onder Kalaci 495b6e9b62 Refactor Parallel Relation Access Recording
Instead of scattering the code around, we move all the
logic into a single function.

This will help supporting foreign keys to reference tables
in the unified executor with a single line of change, just
calling this function.
2019-05-02 18:12:33 +03:00
Onder Kalaci 3d871c5334 Add some ORDER BYs to make the test output consistent 2019-05-02 18:00:46 +03:00
Hadi Moshayedi 32ecb6884c Test ROLLBACK TO SAVEPOINT with multi-shard CTE failures 2019-05-01 09:33:43 -07:00
Hadi Moshayedi aafd22dffa Fix savepoint rollback for INSERT INTO ... SELECT. 2019-05-01 09:33:43 -07:00
Hadi Moshayedi b69a762e0b Fix savepoint rollback after multi-shard update failure. 2019-05-01 09:33:43 -07:00
Jason Petersen 71d5d1c865 Enable variable shadowing warnings; fix all
Rather than wait for another place like the previous commit to bite us,
I think we should turn on this warning.
2019-04-30 13:24:25 -06:00
Jason Petersen 1125fc9da0 Fix self-strncmp in ConstrIsFKToReferenceTable
Make the function do what I assume was intended.
2019-04-30 13:24:25 -06:00
Hadi Moshayedi a9f7c1e8cb Normalize test result/expected files before doing diff. 2019-04-30 10:19:23 -07:00
Hadi Moshayedi c9b1d9c2d1 Check all placements aren't inactive 2019-04-26 10:04:55 -07:00
Hadi Moshayedi 7b1d03772d Don't schedule tasks on inactive nodes. 2019-04-26 10:04:54 -07:00
Onder Kalaci 82813a8796 Add ORDER BYs to multi_subquery and subqueries_deep tests 2019-04-24 13:36:11 +03:00
Onder Kalaci 004f28e18c Sort output of RETURNING
The feature is only intended for getting consistent outputs for the regression tests.

RETURNING does not have any ordering gurantees and with unified executor, the ordering
of query executions on the shards are also becoming unpredictable. Thus, we're enforcing
ordering when a GUC is set.

We implicitly add an `ORDER BY` something equivalent of
	`
	  RETURNING expr1, expr2, .. ,exprN
	  ORDER BY expr1, expr2, .. ,exprN
	`

As described in the code comments as well, this is probably not the most
performant approach we could implement. However, since we're only
targeting regression tests, I don't see any issues with that. If we
decide to expand this to a feature to users, we should revisit the
implementation and improve the performance.
2019-04-24 11:51:19 +03:00
Onder Kalaci 64b323d9eb Add ORDER BY to set_operations 2019-04-23 11:51:58 +03:00
Onder Kalaci 913ffc9dcd Add ORDER BY to multi_subquery_in_where_clause 2019-04-23 11:46:00 +03:00
Onder Kalaci 753163b4d8 Be less verbose for printing worker ports in intermediate_results 2019-04-17 14:57:20 +03:00
Onder Kalaci b3af5b2cc4 Add order by multi_mx_modifications 2019-04-17 14:57:20 +03:00
Onder Kalaci a159bd9aed Add order by window_functions 2019-04-17 14:57:20 +03:00
Jason Petersen 4b9519e7d6 Check for non-extended constraint before extending
This will only apply to DROP and VALIDATE commands; see the lengthy
comment in multi_create_table_constraints.sql for more explanation.
2019-04-15 23:14:21 -06:00
Jason Petersen 5a017c684c Add repro case for #2484 2019-04-15 23:14:11 -06:00
Onder Kalaci 6d81fc518c Add order by subquery_complex_target_list 2019-04-10 19:55:41 +03:00
Onder Kalaci 58e90ad60d Add order by multi_outer_join 2019-04-09 12:53:57 +03:00
Onder Kalaci 298e95c441 Add order by multi_shard_update_delete 2019-04-09 12:41:46 +03:00
Onder Kalaci 6a8e2c260a Add order by multi_insert_select 2019-04-09 12:28:57 +03:00
Onder Kalaci af096a898c Add order by subquery_and_cte 2019-04-09 12:19:10 +03:00
Onder Kalaci 56a1a39fd4 Add order by multi_subquery_complex_queries 2019-04-09 12:12:26 +03:00
Onder Kalaci 4effa8c1f8 Add order by multi_schema_support 2019-04-09 11:52:08 +03:00
Onder Kalaci 92e87738dd Make sure that the regression test output is durable to different execution orders
Mostly add order bys and suppress worker node ports in the test
outputs.
2019-04-08 11:48:08 +03:00
Onder Kalaci 7d872a343a Rename MultiConnectionState to MultiConnectionPollState 2019-04-05 11:50:11 +03:00
Onder Kalaci fb38dc3136 Ensure that stack resizing logic works expected
This commit has two goals:

(a) Ensure to access both edges of the allocated stack
(b) Ensure that any compiler optimizations to prevent the
    function optimized away.

Stack size after the patch:
 sudo grep -A 1 stack /proc/2119/smaps
7ffe305a6000-7ffe307a9000 rw-p 00000000 00:00 0                          [stack]
Size:               2060 kB

Stack size before the patch:
 sudo grep -A 1 stack /proc/3610/smaps
7fff09957000-7fff09978000 rw-p 00000000 00:00 0                          [stack]
Size:                132 kB
2019-04-03 10:58:19 +03:00
Murat Tuncer 1424f75ec9 Support columns referencing an aliased joins
We used to rely on PG function flatten_join_alias_vars
to resolve actual columns referenced in target entry list.

The function goes deep and finds the actual relation. This logic
usually works fine. However, when joins are given an alias, inner
relation names are not visible to target entry entry. Thus relation
resolving should stop when we the target entry column refers an
rte of an aliased join.

We stopped using PG function and provided our own flatten function.
2019-03-26 09:46:22 +03:00
Jason Petersen 4c7f78bd7e Code review feedback 2019-03-25 22:07:27 -05:00
Jason Petersen 6a0dc7756e Formatting fixes
Noticed a lot of weird lines wrapped at 80; our standard is 90.
2019-03-22 20:32:19 -06:00
Jason Petersen 6acf52660c Always coerce RHS of pruning op to part. key type
Our assumption that strip_implicit_coercions would leave us with a bi-
nary-compatible type to that of the partition key was wrong. Instead,
we should ensure the RHS of the comparison we perform is proactively
coerced into a compatible type (at least binary compatible).
2019-03-22 20:32:19 -06:00
Jason Petersen 5baa257c91 Add second assert to guard against future changes
This isn't entirely necessary but I feel safer with it here.
2019-03-22 20:32:19 -06:00
Jason Petersen 69adb627c3 Add Assert that will crash before coercion fix is in 2019-03-22 20:32:19 -06:00
Hadi Moshayedi ff1d4f697a
Ignore test_times.log (#2638) 2019-03-22 10:29:01 -07:00
Nils Dijk feaac69769
Implementation for asycn FinishConnectionListEstablishment (#2584) 2019-03-22 17:30:42 +01:00
Marco Slot e3b7e74f43 Allow rescan in DECLARE .. WITH HOLD 2019-03-22 11:25:55 +01:00
Jason Petersen a2c6f596f9 Address code review comments 2019-03-21 11:59:52 -06:00
Jason Petersen 04aa34da68 Invalidate ConnParamsHash at config reload
At configuration reload, we free all "global" (i.e. GUC-set) connection
parameters, but these may still have live references in the connection
parameters hash. By marking the entries as invalid, we can ensure they
will not be used after free.
2019-03-21 00:03:35 -06:00
Jason Petersen 00d836e5a3 alloc non-global conn. params in provided context
Having DATA-segment string literals made blindly freeing the keywords/
values difficult, so I've switched to allocating all in the provided
context; because of this (and with the knowledge of the end point of
the global parameters), we can safely pfree non-global parameters when
we come across an invalid connection parameter entry.
2019-03-21 00:03:35 -06:00
Marco Slot e8152d9b6d Only look in top-level rtable in ExtractFirstDistributedTableId 2019-03-20 12:14:46 +03:00
Marco Slot ee6a0b6943 Speed up RTE walkers
Do it in two ways (a) re-use the rte list as much as possible instead of
re-calculating over and over again (b) Limit the recursion to the relevant
parts of the query tree
2019-03-20 12:14:46 +03:00
Marco Slot 5ff1821411 Cache the current database name
Purely for performance reasons.
2019-03-20 12:14:46 +03:00
Marco Slot 0ea4e52df5 Add nodeId to shardPlacements and use it for shard placement comparisons
Before this commit, shardPlacements were identified with shardId, nodeName
and nodeport. Instead of using nodeName and nodePort, we now use nodeId
since it apparently has performance benefits in several places in the
code.
2019-03-20 12:14:46 +03:00
Onder Kalaci 41d8c4030a Add some more regression tests for outer join pushdown 2019-03-19 11:49:38 +03:00
Onder Kalaci ad5ff1d01a Some queries lead to infinite recursion with recurisve planning
The rule for infinite recursion is the following:

    - If the query contains a subquery which is recursively planned, and
      no other subqueries can be recursively planned due to correlation
      (e.g., LATERAL joins), the planner keeps recursing again and again.

One interesting thing here is that even if a subquery contains only intermediate
result(s), we re-recursively plan that. In the end, the logic in the code does the following:

  - Try recursive planning any of the subqueries in the query tree
     - If any subquery is recursively planned, call the planner again
        where the subquery is replaced with the intermediate result.
        - Try recursively planning any of the queries
          - If any subquery is recursively planned, call the planner again
            where the subquery (in this case it is already intermediate result)
            is replaced with the intermediate result.
              - Try recursively planning any of the queries
                - If any subquery is recursively planned, call the planner again
                  where the subquery (in this case it is already intermediate result)
                  is replaced with the intermediate result.
                  - Try recursively planning any of the queries
                    - If any subquery is recursively planned, call the planner again
                      where the subquery (in this case it is already intermediate result)
                      is replaced with the intermediate result.
                      ......
2019-03-18 10:35:00 +03:00
Marco Slot f2abf2b8e5 Functions are treated as transaction blocks 2019-03-15 16:34:08 -06:00
Marco Slot 4b9bd54ae0 Remove create_insert_proxy_for_table 2019-03-15 14:13:03 -06:00
exialin 84b853e1b5 Fix some typos (#2620) 2019-03-14 16:48:31 -07:00
Hadi Moshayedi a9e6d06a98 Skip execution of ALTER TABLE constraint checks on the coordinator 2019-03-14 15:40:56 -07:00
Hadi Moshayedi cdd3b15ac8 Fix distributed deadlock for ALTER TABLE ... ATTACH PARTITION.
Following scenario resulted in distributed deadlock before this commit:

CREATE TABLE partitioning_test(id int, time date) PARTITION BY RANGE (time);
CREATE TABLE partitioning_test_2009 (LIKE partitioning_test);
CREATE TABLE partitioning_test_reference(id int PRIMARY KEY, subid int);

SELECT create_distributed_table('partitioning_test_2009', 'id'),
       create_distributed_table('partitioning_test', 'id'),
       create_reference_table('partitioning_test_reference');

ALTER TABLE partitioning_test ADD CONSTRAINT partitioning_reference_fkey FOREIGN KEY (id) REFERENCES partitioning_test_reference(id) ON DELETE CASCADE;
ALTER TABLE partitioning_test_2009 ADD CONSTRAINT partitioning_reference_fkey_2009 FOREIGN KEY (id) REFERENCES partitioning_test_reference(id) ON DELETE CASCADE;

ALTER TABLE partitioning_test ATTACH PARTITION partitioning_test_2009 FOR VALUES FROM ('2009-01-01') TO ('2010-01-01');
2019-03-14 15:28:37 -07:00
Hadi Moshayedi f19feb742c
Remove never assigned colocatedRelation from CreateDistributedTable (#2479) 2019-03-12 14:50:18 -07:00
Hanefi Onaldi 419f52884f
Merge branch 'master' into improve-mitmproxy-documentation 2019-03-12 07:16:01 -07:00
Murat Tuncer 2681231c98 Create column aliases for shard tables in worker queries when requested 2019-03-07 12:54:42 +03:00
Hadi Moshayedi f4d3b94e22
Fix some of the casts for groupId (#2609)
A small change which partially addresses #2608.
2019-03-05 12:06:44 -08:00
velioglu faf50849d7 Enhance pushdown planning logic to handle full outer joins with using clause
Since flattening query may flatten outer joins' columns into coalesce expr that is
in the USING part, and that was not expected before this commit, these queries were
erroring out. It is fixed by this commit with considering coalesce expression as well.
2019-03-05 11:49:30 +03:00
Onder Kalaci 26f569abd8 Make sure to clear PGresult on few places
This leads to a memory leak otherwise.
2019-02-28 13:44:34 +03:00
Jason Petersen 3df2f51881
Turn on style-checking, fix lingering violations
We'd been ignoring updating uncrustify for some time now because I'd
thought these were misclassifications that would require an update in
our rules to address. Turns out they're legit, so I'm checking them in.
2019-02-26 23:01:40 -07:00
Jason Petersen 5817bc3cce
Add test-timing script
Through some clever stream redirections and options, we can get decent
timing data for each of our tests.
2019-02-26 23:01:40 -07:00
Jason Petersen 5db45bac45
Enable CircleCI
The configuration for the build is in the YAML file; the changes to the
regression runner are backward-compatible with Travis and just add the
logic to detect whether our custom (isolation- and vanilla-enabled) pkg
is present.
2019-02-26 22:17:26 -07:00
Onder Kalaci f706772b2f Round-robin task assignment policy relies on local transaction id
Before this commit, round-robin task assignment policy was relying
on the taskId. Thus, even inside a transaction, the tasks were
assigned to different nodes. This was especially problematic
while reading from reference tables within transaction blocks.
Because, we had to expand the distributed transaction to many
nodes that are not necessarily already in the distributed transaction.
2019-02-22 19:26:38 +03:00
Onder Kalaci e521e7e39c Apply feedback 2019-02-22 18:14:30 +03:00
Onder Kalaci 407d0e30f5 Fix selectForUpdate bug 2019-02-21 18:21:41 +03:00
Onder Kalaci f144bb4911 Introduce fast path router planning
In this context, we define "Fast Path Planning for SELECT" as trivial
queries where Citus can skip relying on the standard_planner() and
handle all the planning.

For router planner, standard_planner() is mostly important to generate
the necessary restriction information. Later, the restriction information
generated by the standard_planner is used to decide whether all the shards
that a distributed query touches reside on a single worker node. However,
standard_planner() does a lot of extra things such as cost estimation and
execution path generations which are completely unnecessary in the context
of distributed planning.

There are certain types of queries where Citus could skip relying on
standard_planner() to generate the restriction information. For queries
in the following format, Citus does not need any information that the
standard_planner() generates:

  SELECT ... FROM single_table WHERE distribution_key = X;  or
  DELETE FROM single_table WHERE distribution_key = X; or
  UPDATE single_table SET value_1 = value_2 + 1 WHERE distribution_key = X;

Note that the queries might not be as simple as the above such that
GROUP BY, WINDOW FUNCIONS, ORDER BY or HAVING etc. are all acceptable. The
only rule is that the query is on a single distributed (or reference) table
and there is a "distribution_key = X;" in the WHERE clause. With that, we
could use to decide the shard that a distributed query touches reside on
a worker node.
2019-02-21 13:27:01 +03:00
Nils Dijk 1623c44fc7 Simplify make file for citus sql files 2019-02-19 21:29:20 -05:00
Hanefi Onaldi 148dcad0bb
More documentation and stale comments rewritten 2019-02-04 20:21:51 +03:00
Hanefi Onaldi 825666f912
Query samples in docs and better errors 2019-02-04 19:20:02 +03:00
Hanefi Onaldi 574b071113
Add wrapper function introduced in PG11 for compatibility 2019-02-04 19:20:02 +03:00
Hanefi Onaldi 1106e14385
Wrap functions in subqueries
remove debug logs to fix travis tests

Support RowType functions in joins

Regression tests for a custom type function in join
2019-02-04 19:19:29 +03:00
Hanefi Onaldi c5c3d6d0a3
Update the mitmscripts readme
and migrate readme to markdown
and create contribution guidelines
2019-02-04 11:30:05 +03:00
Hanefi Onaldi 4dd1f5784b
Failure&cancellation tests for mx metadata sync
Failure&Cancellation tests for initial start_metadata_sync() calls
to worker and DDL queries that send metadata syncing messages to an MX node

Also adds message type definitions for messages that are exchanged
during metadata syncing
-
2019-02-01 11:50:25 +03:00
Murat Tuncer b36b59dd4f Relax reference table restrictions in subquery union pushdowns
We used to error out if there is a reference table
in the query participating a union. This has caused
pushdownable queries to be evaluated in coordinator.

Now we let reference tables inside union queries as long
as there is a distributed table in from clause.

Existing join checks (reference table on the outer part)
sufficient enought that we do not need check the join relation
of reference tables.
2019-01-31 15:34:29 +03:00
Onder Kalaci ec67381ba2 Queries with only intermediate results do not rely on task assignment policy
Previously we allowed task assignment policy to have affect on router queries
with only intermediate results. However, that is erroneous since the code-path
that assigns placements relies on shardIds and placements, which doesn't exists
for intermediate results.

With this commit, we do not apply task assignment policies when a router query
hits only intermediate results.
2019-01-28 17:59:17 +03:00
Murat Tuncer cd5213abee Set sequential mode execution GUC for alter partitioned table
PG recently started propagating foreign key constraints
to partition tables. This came with a select query
to validate the the constaint.

We are already setting sequential mode execution for this
command. In order for validation select query to respect
this setting we need to explicitly set the GUC.

This commit also handles detach partition part.
2019-01-25 15:28:07 +03:00
velioglu 1bb0ec316a Reset planner restriction context instead of popping with recursive planning 2019-01-17 14:35:16 +03:00
Jason Petersen 339e6e661e
Remove 9.6 (#2554)
Removes support and code for PostgreSQL 9.6

cr: @velioglu
2019-01-16 13:11:24 -07:00
Nils Dijk 3f2bac18df
Add make target to run regression tests in isolation with vagrant
Also allow `multi_alter_table_add_constraints` to run in isolation
2019-01-16 11:41:09 +01:00
Marco Slot 1656b519c4 Plan outer joins through pushdown planning 2019-01-05 20:55:27 +01:00
Murat Tuncer b389bebda1 Move repeated code to a function 2019-01-03 17:19:01 +03:00
Murat Tuncer a72d959735 Fix multi_view tests 2019-01-03 17:07:26 +03:00
Murat Tuncer 2ed7d24591 Fix having clause bug for complex joins
We update column attributes of various clauses for a query
inluding target columns, select clauses when we introduce
new range table entries in the query.

It seems having clause column attributes were not updated.

This fix resolves the issue
2019-01-03 17:07:26 +03:00
Murat Tuncer ec36030fae Move functions calls that can fail to outside of spinlock
We had recently fixed a spinlock issue due to functions
failing, but spinlock is not being released.

This is the continuation of that work to eliminate possible
regression of the issue. Function calls that are moved out of
spinlock scope are macros and plain type casting. However,
depending on the configuration they have an alternate implementation
in PG source that performs memory allocation.

This commit moves last bit of codes to out of spinlock for completion purposes.
2019-01-03 15:59:56 +03:00
Murat Tuncer 3b95a03c3e
Merge branch 'master' into fix_spinlock_use 2018-12-25 14:41:21 +03:00
Hadi Moshayedi 38579d52d0
Speed-up run_command_on_shards(). (#2564)
We were establishing connections synchronously. Establishing
connections asynchronously results in some parallelization, saving
hundreds of milliseconds.

In a test I did, this decreased the query time from 150ms to 40ms.
2018-12-24 08:47:01 -05:00
Murat Tuncer 9671bc3cbb Make sure spinlock is not left unreleased when an exception is thrown
A spinlock is not released when an exception is thrown after
spinlock is acquired. This has caused infinite wait and eventual
crash in maintenance daemon.

This work moves the code than can fail to the outside of spinlock
scope so that in the case of failure spinlock is not left locked
since it was not locked in the first place.
2018-12-24 15:47:21 +03:00
Hanefi Onaldi fb497ddad1
Bump 8.2devel on master (#2567) 2018-12-24 13:49:50 +03:00
Onder Kalaci 9fff7d28a7
Revert 4925521 2018-12-21 15:36:40 -07:00
Marco Slot 2e4029973c
Remove sequential create index concurrently test 2018-12-21 14:03:00 -07:00
Marco Slot 1b1c6374f7
Execute CREATE INDEX CONCURRENTLY concurrently 2018-12-21 14:02:59 -07:00
Marco Slot 3ff2b47366 Restrict visibility of get_*_active_transactions functions to pg_monitor 2018-12-19 18:32:42 +01:00
Dimitri Fontaine 6a1a2b8458 Move an assert-only array-bound check to run-time.
When the bound-check fails at run-time, better abort with an error message
rather than trying to user memory we did not allocate.
2018-12-19 06:12:05 +01:00
Marco Slot 13f4a0ac9f Stabilize failure test shard IDs 2018-12-19 04:26:46 +01:00
Marco Slot 5b9376a7f8 Check ownership before taking locks in distributed table creation 2018-12-18 15:32:07 +01:00
Nils Dijk 694992e946
upgrade default ssl_ciphers to more restrictive on extension creation
Show ssl_ciphers in ssl_by_default_test
2018-12-12 15:33:15 +01:00
Jason Petersen 92893e9601
Fix control file version 2018-12-11 18:50:20 -07:00
Jason Petersen bd0d1f05e7
Bump SQL version
Should have been done when the release-8.0 branch was created…
2018-12-11 10:40:15 -07:00
velioglu 90704d9a52 Fix getting function oid to get hll_add_agg id 2018-12-10 14:16:19 +03:00
velioglu 3e0cff94a6 Add FunctionOidExtended function 2018-12-10 11:59:41 +03:00
Nils Dijk 4af40eee76 Enable SSL by default during installation of citus 2018-12-07 11:23:19 -07:00
velioglu 8764a19464 Adds support for disabling hash agg with hll functions on coordinator query 2018-12-07 18:49:25 +03:00
Marco Slot 9cf91c438b Only allow transmit from pgsql_job_cache directory 2018-12-05 10:18:27 +01:00
Marco Slot 70fb9c851b Remove odd memcpy usag in BuildCachedShardList 2018-12-04 14:09:10 +01:00
Marco Slot 0388324fbe Expand planner readme 2018-12-04 09:55:19 +01:00
Dimitri Fontaine d1b182de7d Replace calls to unsafe functions like memcpy and sscanf
In answer to a security audit, we double check buffer sizes and avoid
known-dangerous operations such as sscanf.
2018-12-04 08:54:43 +01:00
Onder Kalaci 621ccf3946 Ensure to use initialized MaxBackends
Postgresql loads shared libraries before calculating MaxBackends.
However, Citus relies on MaxBackends being set. Thus, with this
commit we use the same steps to calculate MaxBackends while
Citus is being loaded (e.g., PG_Init is called).

Note that this is safe since all the elements that are used to
calculate MaxBackends are PGC_POSTMASTER gucs and a constant
value.
2018-12-03 13:25:51 +03:00
Onder Kalaci b6ebd791a6 Sort task list for multi-task explain outputs
This is purely for ensuring that regression tests do not randomly fail.
2018-11-30 11:19:37 -07:00
Onder Kalaci 18c9badff5 Make sure the explain output for partition wise join is stable
We disable bunch of planning options on the workers. This might be
risky if any concurrent test relies on EXPLAIN OUTPUT as well. Still,
we want to keep this test, so we should try to not parallelize this
test with such test.
2018-11-30 16:44:57 +03:00
Marco Slot 8893cc141d Support INSERT...SELECT with ON CONFLICT or RETURNING via coordinator
Before this commit, Citus supported INSERT...SELECT queries with
ON CONFLICT or RETURNING clauses only for pushdownable ones, since
queries supported via coordinator were utilizing COPY infrastructure
of PG to send selected tuples to the target worker nodes.

After this PR, INSERT...SELECT queries with ON CONFLICT or RETURNING
clauses will be performed in two phases via coordinator. In the first
phase selected tuples will be saved to the intermediate table which
is colocated with target table of the INSERT...SELECT query. Note that,
a utility function to save results to the colocated intermediate result
also implemented as a part of this commit. In the second phase, INSERT..
SELECT query is directly run on the worker node using the intermediate
table as the source table.
2018-11-30 15:29:12 +03:00
Hanefi Onaldi 088a2ef66a throw an error when a subquery has grouping set clause 2018-11-30 13:11:32 +03:00
Onder Kalaci a15f168ce4 Ensure that citus_dist_activity test outputs do not change
Since there is no lock ordering among the query that is executed
and the select from the view, we prefer to add a timeout before
priting the activity.
2018-11-30 11:46:17 +03:00
Nils Dijk 9309e63156
create_distributed_table as user, change table ownership during create 2018-11-29 14:20:42 +01:00
Nils Dijk 6aa191f72c
remove table_ddl_command_array and test master_get_table_ddl_events 2018-11-29 14:20:42 +01:00
Murat Tuncer fd868ec268 Fix citus_stat_statements view
Join between pg_stat_statements and citus_query_stats should
include queryid, dbid, userid instead of just queryid.
2018-11-29 14:49:16 +03:00
Dimitri Fontaine 5ae2d03881 Refrain from having a strong opinion on maxGroupId.
When initializing a Citus formation automatically from an external piece of
software such as Citus-HA, the following process process may be used:

  - decide on the groupId in the external software
  - SELECT * FROM master_add_inactive_node('localhost', 9701, groupid => X)

When Citus checks for maxGroupId, it forbids other software to pick their
own group Ids to ues with the master_add_inactive_node() API.

This patch removes the extra testing around maxGroupId.
2018-11-28 04:29:15 +01:00
Marco Slot 0393910c65 Shard IDs in isolation_citus_dist_stat_activity output changed 2018-11-28 02:59:50 +01:00
Marco Slot aff37cf1bc Control multi-shard modify locks with enable_deadlock_prevention 2018-11-28 02:59:50 +01:00
Marco Slot 1ec5b6c890 Remove old worker_hash_partition_table API 2018-11-26 14:40:37 +01:00
Marco Slot 5a63deab2e Clean up UDFs and remove unnecessary permissions 2018-11-26 14:40:37 +01:00
Hanefi Onaldi 448b241ab4 validate query isolation tests 2018-11-26 14:04:51 +03:00
Hanefi Onaldi 4edb193f25 make the tests parallelizeable
helper view table_fkeys_in_workers now allows filtering by schema so that a test case can print out foreign keys in its schema only
2018-11-26 14:04:51 +03:00
Hanefi Onaldi b3d897039a constraint validation regression tests 2018-11-26 14:04:51 +03:00
Hanefi Onaldi 7db6991dc0 propagate validate queries to workers 2018-11-26 14:04:51 +03:00
Marco Slot e8e956aa9f Require superuser when using non-existent job schema in worker_merge_files_into_table 2018-11-24 02:57:16 +01:00
Marco Slot c4ad899dd8 Check schema ownership in worker_merge_* functions 2018-11-23 11:05:09 +01:00
Marco Slot e9a7295ead Add multi-user tests for task-tracker protocol functions 2018-11-23 11:05:09 +01:00
Marco Slot 8e93fe5870 Check schema owner in task_tracker_assign_task 2018-11-23 11:05:09 +01:00
Marco Slot ec957a833a Check permission in task_tracker_task_status 2018-11-23 11:04:58 +01:00
Marco Slot 4245032849 Add user ID suffixes to filenames in check-worker tests 2018-11-23 08:36:12 +01:00
Marco Slot 6aa5592e52 Add user ID suffix to intermediate files in re-partition jobs 2018-11-23 08:36:11 +01:00
Marco Slot a59bf31c76 Use worker_execute_sql_task UDF in task-tracker executor 2018-11-22 18:15:33 +01:00
Marco Slot 30bad7e66f Add worker_execute_sql_task UDF 2018-11-22 18:15:33 +01:00
Marco Slot e3521ce320 Test current user in task-tracker queries 2018-11-22 18:15:33 +01:00
Marco Slot caf402d506 COPY to a task file no longer switches to superuser 2018-11-22 18:15:33 +01:00
Marco Slot e17025e1d4 Check table ownership in mark_tables_colocated 2018-11-18 00:11:38 +01:00
Marco Slot 18acd00553 Check permissions in lock_relation_if_exists 2018-11-18 00:11:38 +01:00
Marco Slot aab9f623eb Check table ownership in upgrade_to_reference_table 2018-11-16 23:27:34 +01:00
Onder Kalaci 052ba21b19 Make sure to prevent unauthorized users to drop sequences in Citus MX 2018-11-15 18:08:04 +03:00
Onder Kalaci 7f0a57a153 Make sure to prevent unauthorized users to drop tables in Citus MX 2018-11-15 18:07:03 +03:00
Nils Dijk f9520be011
Round robin queries to reference tables with task_assignment_policy set to `round-robin` (#2472)
Description: Support round-robin `task_assignment_policy` for queries to reference tables.

This PR allows users to query multiple placements of shards in a round robin fashion. When `citus.task_assignment_policy` is set to `'round-robin'` the planner will use a round robin scheduling feature when multiple shard placements are available.

The primary use-case is spreading the load of reference table queries to all the nodes in the cluster instead of hammering only the first placement of the reference table. Since reference tables share the same path for selecting the shards with single shard queries that have multiple placements (`citus.shard_replication_factor > 1`) this setting also allows users to spread the query load on these shards.

For modifying queries we do not apply a round-robin strategy. This would be negated by an extra reordering step in the executor for such queries where a `first-replica` strategy is enforced.
2018-11-15 15:11:15 +01:00
Marco Slot 2de8ef29c3 Revoke function permissions for node metadata functions 2018-11-15 06:25:07 +01:00
Marco Slot f383e4f307
Description: Refactor code that handles DDL commands from one file into a module
The file handling the utility functions (DDL) for citus organically grew over time and became unreasonably large. This refactor takes that file and refactored the functionality into separate files per command. Initially modeled after the directory and file layout that can be found in postgres.

Although the size of the change is quite big there are barely any code changes. Only one two functions have been added for readability purposes:

- PostProcessIndexStmt which is extracted from PostProcessUtility
- PostProcessAlterTableStmt which is extracted from multi_ProcessUtility

A README.md has been added to `src/backend/distributed/commands` describing the contents of the module and every file in the module.
We need more documentation around the overloading of the COPY command, for now the boilerplate has been added for people with better knowledge to fill out.
2018-11-14 13:36:27 +01:00
Burak Yucesoy f8e0d37ba1 Fix crashes caused by stack size increase under high memory load
Each PostgreSQL backend starts with a predefined amount of stack and this stack
size can be increased if there is a need. However, stack size increase during
high memory load may cause unexpected crashes, because if there is not enough
memory for stack size increase, there is nothing to do for process apart from
crashing. An interesting thing is; the process would get OOM error instead of
crash, if the process had an explicit memory request (with palloc) for example.
However, in the case of stack size increase, there is no system call to get OOM
error, so the process simply crashes.

With this change, we are increasing the stack size explicitly by requesting extra
memory from the stack, so that, even if there is not memory, we can at least get
an OOM instead of a crash.
2018-11-14 01:27:53 +03:00
Nils Dijk 97da44558b
Description: Fix failures of tests on recent postgres builds
In recent postgres builds you cannot set client_min_messages to
values higher then ERROR, if will silently set it to ERROR if so.

During some tests we would set it to fatal to hide random values
(eg. pid's of processes) from the test output. This patch will use
different tactics for hiding these values.
2018-11-13 16:53:05 +01:00
Murat Tuncer cc401a2616 Create function_utils for pg function call related utilities 2018-11-07 15:29:38 +03:00
Hadi Moshayedi d3e284dcd6
Use heap_deform_tuple() instead of calling heap_getattr(). (#2464)
After Fast ALTER TABLE ADD COLUMN with a non-NULL default in PG11, physical heaps might not contain all attributes after a ALTER TABLE ADD COLUMN happens. heap_getattr() returns NULL when the physical tuple doesn't contain an attribute. So we should use heap_deform_tuple() in these cases, which fills in the missing attributes.

Our catalog tables evolve over time, and an upgrade might involve some ALTER TABLE ADD COLUMN commands.

Note that we don't need to worry about postgres catalog tables and we can use heap_getattr() for them, because they only change between major versions.

This also fixes #2453.
2018-11-05 15:11:01 -05:00
Onder Kalaci 7aa2af8975 Add failure and cancellation tests for multi row inserts 2018-10-29 11:36:02 +03:00
Onder Kalaci 7b4d912904 Add cancellation tests for VACUUM/ANALYZE 2018-10-26 16:25:11 +03:00
Onder Kalaci 85d7d074c3 Add cancellation tests for multi shard modification queries 2018-10-26 15:07:52 +03:00
Onder Kalaci 18eee6d9c8 Add cancellation tests for router selects 2018-10-26 14:29:56 +03:00
Jason Petersen a37a809d49
Add savepoint failure tests
Tests at each significant point (i.e. SAVEPOINT, ROLLBACK, RELEASE)
that correct semantics are preserved (using both no and statement replication).
2018-10-26 11:12:40 +01:00
Onder Kalaci 9e2e2a7300 Make sure to access PARAM_EXTERN accurately in PG 11
PG 11 has change the way that PARAM_EXTERN is processed.
This commit ensures that Citus follows the same pattern.

For details see the related Postgres commit:
6719b238e8
2018-10-25 21:55:03 +03:00
Onder Kalaci 6e05921736 Processes that are blocked on advisory locks show up in wait edges
Assign the distributed transaction id before trying to acquire the
executor advisory locks. This is useful to show this backend in citus
lock graphs (e.g., dump_global_wait_edges() and citus_lock_waits).
2018-10-24 13:32:13 +03:00
Jason Petersen 98c8267a37
Add single-shard modification failure tests
I'm pretty sure a lot of this test functionality may be covered in some
of our existing regression tests, but I've included them to ensure we
put all failure-based tests under our new testing method for that kind
of test.

Didn't include lower replication factor, as (for a single-shard mod.),
it's indistinguishable from modifying a reference table. So these all
test modifications which hit a single, replicated shard.
2018-10-23 23:31:40 +01:00
Hadi Moshayedi 3e00bf1c0d Don't throw error for DROP DATABASE IF EXISTS 2018-10-23 09:45:03 -04:00
Murat Tuncer 081594ad03 Don't allow PG11 travis failures anymore
We made PG11 builds optional when we had an issue
with mx isolation test that we could not solve back then.

This commit solves the issue with a workaround by running
start_metadata_sync_to_node outside the transaction block.
2018-10-19 15:20:53 +03:00
Jason Petersen ae9a98c2d1
Attempt to address planner context crashes
Both of these are a bit of a shot in the dark. In one case, we noticed
a stack trace where a caller received a null pointer and attempted to
dereference the memory context field (at 0x010). In the other, I saw
that any error thrown from within AdjustParseTree could keep the stack
from being cleaned up (presumably if we push we should always pop).

Both stack traces were collected during times of high memory pressure
and locally reproducing the problem locally or otherwise has been very
tricky (i.e. it hasn't been reproduced reliably at all).
2018-10-18 08:41:51 -06:00
Murat Tuncer c7efd8aff0 Add failure test for insert/select pushdown 2018-10-18 09:09:26 +03:00
Hadi Moshayedi 431ac80563
Keep track of cached entries in case of interruption. (#2433)
* Keep track of cached entries in case of interruption.

Previously we set DistTableCacheEntry->sortedShardIntervalArray
and DistTableCacheEntry->shardIntervalArrayLength after we entered
all related shard entries into DistShardCacheHash. The drawback was
that if populating DistShardCacheHash was interrupted,
ResetDistTableCacheEntry() didn't see the shard hash entries created,
so was unable to clean them up.

This patch fixes that by setting sortedShardIntervalArray earlier,
and incrementing shardIntervalArrayLength as we enter shards into
the cache.
2018-10-15 14:06:56 -04:00
Jason Petersen 9fb951c312
Fix user-facing typos
Lintian found these (presumably by looking in the text section and
running them through e.g. aspell).
2018-10-09 16:54:03 -07:00
velioglu 5713019058 Add failure tests for real time select queries 2018-10-09 14:12:02 -07:00
Onder Kalaci 73696a03e4 Make sure not to leak intermediate result folders on the workers 2018-10-09 22:47:56 +03:00
Hadi Moshayedi 7509c6c8fb Add tests which check we disallow writes to local tables. 2018-10-06 10:54:44 +02:00
Marco Slot d56baefe3d Allow simple DML commands from hot standby 2018-10-06 10:54:44 +02:00
Jason Petersen 1cb48416eb
Add reference table failure tests
Fairly straightforward; verified that modifications fail atomically if
a worker is down or fails mid-transaction (i.e. all workers need to ack
modifications to reference tables in order to persist changes).
2018-10-09 09:39:30 -07:00
Jason Petersen 9bcf2873a7
Add single-shard router select failure tests
Including several examples from #1926. I couldn't understand why the
recover_prepared_transactions "should be an error", and EXPLAIN has
changed since the original bug (so that it runs EXPLAINs in txns, I
think for EXPLAIN ANALYZE to not have side effects); other than that,
most of the reported bugs now error out rather than crash or return
an empty result set.
2018-10-09 08:51:10 -07:00
Jason Petersen 8f2aa00951
Add failure tests for VACUUM/ANALYZE
VACUUM runs outside of a transaction, so the failure modes for it are
somewhat straightforward, though ANALYZE runs in a 1pc transaction and
multi-table VACUUM can fail between statements (PG 11 and higher).
2018-10-09 08:50:37 -07:00
Jason Petersen ee4114bc7a Failure tests for modifying multiple shards in txn
Tests various failure points during a multi-shard modification within
a transaction with multiple statements. Verifies three cases:

  * Reference tables (single shard, many placements)
  * Normal table with replication factor two
  * Multi-shard table with no replication

In the replication-factor case, we expect shard health to be affected
in some transactions; most others fail the transaction entirely and
all we need verify is that no effects of the transaction are visible.

Had trouble testing the final PREPARE/COMMIT/ROLLBACK phase of the 2pc,
in particular because the error message produced includes the PID of
the backend, which is unpredictable.
2018-10-09 09:17:32 -06:00
Murat Tuncer 4f8042085c Fix drop schema in mx with partitioned tables
Drop schema command fails in mx mode if there
is a partitioned table with active partitions.

This is due to fact that sql drop trigger receives
all the dropped objects including partitions. When
we call drop table on parent partition, it also drops
the partitions on the mx node. This causes the drop
table command on partitions to fail on mx node because
they are already dropped when the partition parent was
dropped.

With this work we did not require the table to exist on
worker_drop_distributed_table.
2018-10-08 17:01:54 -07:00
Murat Tuncer 71a910d2fa Add failure tests for insert/select via coordinator 2018-10-04 18:01:19 +03:00
Murat Tuncer 0a987e9c0e Fix cte subquery failure test 2018-10-03 15:43:48 +03:00
Murat Tuncer d26b312cad Add failure test for coordinator pull/push for cte 2018-10-03 15:43:48 +03:00
Murat Tuncer 6c66033455 Add failure tests for multi-shard update/delete
Failure tests for update/delete  on hash distributed tables
using 1PC and 2PC
2018-10-03 15:43:48 +03:00
velioglu 512d23934f Show router modify,select and real-time queries on MX views 2018-10-02 13:59:38 +03:00
Murat Tuncer 9bdef67bab Do not create inherited constraints on worker shards
PG now allows foreign keys on partitioned tables.
Each foreign key constraint on partitioned table
is propagated down to partitions.

We used to create all constraints on shards when we are creating
a new shard, or when just simply moving a shard from one worker
to another. We also used the same logic when creating a copy of
coordinator table in mx node.

With this change we create the constraint on worker node only if
it is not an inherited constraint.
2018-09-28 14:14:51 +03:00
Murat Tuncer 653c7e4ae0 Fix memory leak in FinishRemoteTransactionPrepare 2018-09-28 11:13:21 +03:00
Onder Kalaci cdc0d1491c Make sure to use correct execution mode for TRUNCATE
We used to set the execution mode in the truncate trigger. However,
when multiple tables are truncated with a single command, we could
set the execution mode very late. Instead, now set the execution mode
on the utility hook.
2018-09-25 15:35:27 +03:00
Marco Slot 1ca9a5b867 Do not allow unresolved parameters in INSERT...SELECT 2018-09-24 14:12:04 +02:00
Jason Petersen d7f10b0896 Rewrite parallel ID test to avoid costly JITting
By setting the CPU tuple cost so high, we were triggering JIT. Instead,
we should use parallel_tuple_cost.

See: rhaas.blogspot.com/2018/06/using-forceparallelmode-correctly.html
2018-09-24 09:29:53 +03:00
Jason Petersen e62a1ab43d Revert "Disable JIT during PostgreSQL 11 test runs"
This reverts commit a2fb5a84f1.

JIT wasn't actually interfering with the operation of Citus, a test was
just written in a way which caused JIT to run for a function on every
row in a 150k-row table.
2018-09-24 09:29:53 +03:00
Marco Slot 877d703ac5
Evaluate functions (and when applicable, parameters) anywhere in query 2018-09-21 12:57:50 -06:00
Onder Kalaci abc443d7fa Make sure that shard repair considers replication factor 2018-09-21 15:24:49 +03:00
Onder Kalaci 8520a5b432 worker_append_table_to_shard becomes aware of partitioned tables 2018-09-21 14:40:42 +03:00
Onder Kalaci c1b5a04f6e Allow partitioned tables with replication factor > 1
With this commit, we all partitioned distributed tables with
replication factor > 1. However, we also have many restrictions.

In summary, we disallow all kinds of modifications (including DDLs)
on the partition tables. Instead, the user is allowed to run the
modifications over the parent table.

The necessity for such a restriction have two aspects:
   - We need to acquire shard resource locks appropriately
   - We need to handle marking partitions INVALID in case
     of any failures. Note that, in theory, the parent table
     should also become INVALID, which is too aggressive.
2018-09-21 14:40:41 +03:00
Murat Tuncer b6930e3db9 Add distributed locking to truncated mx tables
We acquire distributed lock on all mx nodes for truncated
tables before actually doing truncate operation.

This is needed for distributed serialization of the truncate
command without causing a deadlock.
2018-09-21 14:23:19 +03:00
velioglu d7f75e5b48 Add citus_lock_waits to show locked distributed queries 2018-09-20 14:13:51 +03:00
Murat Tuncer 0f6e514bfb Fixes a bug on not being able to drop index on a partitioned table.
Reason for the failure is that PG11 introduced a new relation kind
RELKIND_PARTITIONED_INDEX to be used for partitioned indices.

We expanded our check to cover that case.
2018-09-19 13:15:05 +03:00
Marco Slot f34ab55389 Fix bug preventing rollback in stored procedure 2018-08-31 20:49:20 +02:00
Onder Kalaci 41d606b575 Use tree walker instad of mutator in relation visibility
This commit uses *_walker instead of *_mutator for performance reasons.
Given that we're only updating a functionId in the tree, the approach
seems fine.
2018-09-18 09:33:01 +03:00
Onder Kalaci 4cae856846 Relax assertion on transaction abort on PREPARE step
In case a failure happens when a transaction is failed on PREPARE,
we used to hit an assertion for ensuring there is no pending
activity on the connection. However, that's not true after the
changes in #2031. Thus, we've replaced the assertion with a more
generic function call to consume any pending activity, if exists.
2018-09-17 18:09:16 +03:00
Onder Kalaci a94184fff8 Prevent overflow of memory accesses during deadlock detection
In the distributed deadlock detection design, we concluded that prepared transactions
cannot be part of a distributed deadlock. The idea is that (a) when the transaction
is prepared it already acquires all the locks, so cannot be part of a deadlock
(b) even if some other processes blocked on the prepared transaction,  prepared transactions
 would eventually be committed (or rollbacked) and the system will continue operating.

With the above in mind, we probably had a mistake in terms of memory allocations. For each
backend initialized, we keep a `BackendData` struct. The bug we've introduced is that, we
assumed there would only be `MaxBackend` number of backends. However, `MaxBackends` doesn't
include the prepared transactions and axuliary processes. When you check Postgres' InitProcGlobal`
you'd see that `TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;`

This commit aligns with total procs processed with that.
2018-09-17 16:23:29 +03:00
Marco Slot 55f46acedf Support TABLESAMPLE in router queries 2018-08-31 13:22:38 +02:00
Brian Cloutier 2fae06056a
Attempt to stabilize packet dumps and add them back it 2018-09-12 22:10:39 -06:00
Brian Cloutier 5bde8626c5
Travis uses Pipfile instead of re-specifying deps 2018-09-12 17:37:14 -06:00
Brian Cloutier e61e5d4980
Update mitmproxy version to remove vulnerability warnings 2018-09-12 17:17:22 -06:00
Murat Tuncer ae0032dff8 Add regression tests for procedure calls
PG11 introduced PROCEDURE concept similar to FUNCTION
Procedure's allow committing/rolling back behavior.

This commmit adds regression tests for procedure calls.
2018-09-12 10:28:50 +03:00
velioglu d1f005daac Adds UDFs for testing MX functionalities with isolation tests 2018-09-12 07:04:16 +03:00
Murat Tuncer 470ee0b4d9 Revert multi_partition test back to being required
Test was marked as optional (ignore) by previous
commit. Reverting that change to make test required
2018-09-11 12:39:44 -06:00
Onder Kalaci d657759c97 Views to Provide some insight about the distributed transactions on Citus MX
With this commit, we implement two views that are very similar
to pg_stat_activity, but showing queries that are involved in
distributed queries:

    - citus_dist_stat_activity: Shows all the distributed queries
    - citus_worker_stat_activity: Shows all the queries on the shards
                                  that are initiated by distributed queries.

Both views have the same columns in the outputs. In very basic terms, both of the views
are meant to provide some useful insights about the distributed
transactions within the cluster. As the names reveal, both views are similar to pg_stat_activity.
Also note that these views can be pretty useful on Citus MX clusters.

Note that when the views are queried from the worker nodes, they'd not show the distributed
transactions that are initiated from the coordinator node. The reason is that the worker
nodes do not know the host/port of the coordinator. Thus, it is advisable to query the
views from the coordinator.

If we bucket the columns that the views returns, we'd end up with the following:

- Hostnames and ports:
   - query_hostname, query_hostport: The node that the query is running
   - master_query_host_name, master_query_host_port: The node in the cluster
                                                   initiated the query.
    Note that for citus_dist_stat_activity view, the query_hostname-query_hostport
    is always the same with master_query_host_name-master_query_host_port. The
    distinction is mostly relevant for citus_worker_stat_activity. For example,
    on Citus MX, a users starts a transaction on Node-A, which starts worker
    transactions on Node-B and Node-C. In that case, the query hostnames would be
    Node-B and Node-C whereas the master_query_host_name would Node-A.

- Distributed transaction related things:
    This is mostly the process_id, distributed transactionId and distributed transaction
    number.

- pg_stat_activity columns:
    These two views get all the columns from pg_stat_activity. We're basically joining
    pg_stat_activity with get_all_active_transactions on process_id.
2018-09-10 21:33:27 +03:00
Onder Kalaci 7de5e30432 Change flaky explain test to non-explain
This test's output changes depending on which worker is
picked for explain (e.g., worker port in the output changes).

Given that the test is only aiming to ensure that CTEs inside
CTEs work fine in DML queries, it should be fine to get rid of
the EXPLAIN. The output is verified to be correct as well.
2018-09-10 16:01:30 +03:00
Onder Kalaci 76aa6951c2 Properly send commands to other nodes
We previously implemented OTHER_WORKERS_WITH_METADATA tag. However,
that was wrong. See the related discussion:
     https://github.com/citusdata/citus/issues/2320

Instead, we switched using OTHER_WORKER_NODES and make the command
that we're running optional such that even if the node is not a
metadata node, we won't be in trouble.
2018-09-10 16:01:30 +03:00
Onder Kalaci 5cf8fbe7b6 Add infrastructure to relation if exists 2018-09-07 14:49:36 +03:00
Onder Kalaci bf28dd0cff Do not recover wrong distributed transactions in MX 2018-09-07 09:52:46 +03:00
Murat Tuncer 65276311f7 Reflect changed index for constraint scans in PG11 2018-09-07 08:07:01 +03:00
Murat Tuncer d8279569b8 Add support for INCLUDE option in index creation
INCLUDE is a new feature in index creation in PG11.
Included column/expression paramameters are now forwarded to shards
2018-09-06 19:41:06 +03:00
Murat Tuncer 7d3f7c2bf4 Add regression tests related to new PG11 partitioning features 2018-09-06 19:06:28 +03:00
Murat Tuncer 55cf3e321c Add regression tests for new PG11 window functions
- <offset> preceding/following
- exclude
2018-09-04 10:48:04 +03:00
Onder Kalaci 1b3257816e Make sure that table is dropped before shards are dropped
This commit fixes a bug where a concurrent DROP TABLE deadlocks
with SELECT (or DML) when the SELECT is executed from the workers.

The problem was that Citus used to remove the metadata before
droping the table on the workers. That creates a time window
where the SELECT starts running on some of the nodes and DROP
table on some of the other nodes.
2018-09-04 08:57:20 +03:00
Onder Kalaci 2ab0e63b30 Fix flaky test 2018-09-03 14:06:32 +03:00
Onder Kalaci 26e308bf2a Support TRUNCATE from the MX worker nodes
This commit enables support for TRUNCATE on both
distributed table and reference tables.

The basic idea is to acquire lock on the relation by sending
the TRUNCATE command to all metedata worker nodes. We only
skip sending the TRUNCATE command to the node that actually
executus the command to prevent a self-distributed-deadlock.
2018-09-03 14:06:31 +03:00
Onder Kalaci 97ba7bf2eb Add the option to skip the node that is executing the node 2018-09-03 14:01:24 +03:00
velioglu bd30e3e908 Add support for writing to reference tables from MX nodes 2018-08-27 18:15:04 +03:00
velioglu 2639149bd8 Enterprise functions about metadata/resource locks 2018-08-27 16:32:20 +03:00
Onder Kalaci b8af8c359b Make sure that modifying CTEs always use the correct execution mode 2018-08-23 14:53:55 +03:00
Onder Kalaci 910ea392f5 Prevent multiple placements of a single shard to lead huge memory allocations 2018-08-22 19:25:01 +03:00
Onder Kalaci cb481f55cf Prevent excessive number of unnecessary range table traversal 2018-08-22 11:45:00 +03:00
mehmet furkan şahin ef9f38b68d ApplyLogRedaction noop func is added 2018-08-17 14:48:54 -07:00
Jason Petersen c3c0d62ca6
Add test showing poolinfo validation works
In other words, that it errors out.
2018-08-16 20:14:18 -06:00
Jason Petersen c4e2349b80
Mark failing PostgreSQL 11 test as ignored
This commit should be reverted once a new PostgreSQL 11 beta is
available: it's due to a bug in the partitioning code which has been
fixed in REL_11_STABLE but (not yet) a released tag.
2018-08-16 19:37:37 -06:00
Jason Petersen a2fb5a84f1
Disable JIT during PostgreSQL 11 test runs
It's causing problems with one of our tests.
2018-08-16 19:37:14 -06:00
Nils Dijk 6cf4516fdb
fix \d change for indexes in pg11 2018-08-15 23:27:31 -06:00
Nils Dijk 2a9d47e1a6
fix pg11 tests 2018-08-15 23:27:31 -06:00
mehmet furkan şahin 1a3b9f731e Make master_disable/activate_node runnable when superuser 2018-08-15 00:43:35 -07:00
Onder Kalaci 85d418412d Fix DDL execution problem on MX when search_path is used
Make sure that the coordinator sends the commands when the search
path synchronised with the coordinator's search_path. This is only
important when Citus sends the commands that are directly relayed
to the worker nodes. For example, the deparsed DLL commands or
queries always adds schema qualifications to the queries. So, they
do not require this change.
2018-08-13 16:34:50 +03:00
velioglu 44fc9f46fc Add create_distributed_table (without data) failure tests 2018-08-13 09:31:15 +03:00
Onder Kalaci 974cbf11a5 Hide shard names on MX worker nodes
This commit by default enables hiding shard names on MX workers
by simple replacing `pg_table_is_visible()` calls with
`citus_table_is_visible()` calls on the MX worker nodes. The latter
function filters out tables that are known to be shards.

The main motivation of this change is a better UX. The functionality
can be opted out via a GUC.

We also added two views, namely citus_shards_on_worker and
citus_shard_indexes_on_worker such that users can query
them to see the shards and their corresponding indexes.

We also added debug messages such that the filtered tables can
be interactively seen by setting the level to DEBUG1.
2018-08-07 14:21:45 +03:00
Onder Kalaci e13da6a343 Add infrastructure to hide shards on MX worker nodes
Add ability to understand whether a table is a
known shard on MX workers. Note that this is only useful
and applicable for hiding shards on MX worker nodes given
that we can have metadata only there.
2018-08-04 09:03:37 +03:00
mehmet furkan şahin c1f7631f98 failure tests on create_distributed_table nonempty 2018-08-03 12:41:25 -07:00
velioglu b21bd2d1a0 Add create_reference_table failure tests 2018-08-03 17:49:57 +03:00
velioglu bc27651dd9 Add failure test for copy on hash distributed table 2018-08-03 17:11:09 +03:00
Brian Cloutier 82fa85fa5b
Add tests for 1PC COPY on append and hash-distributed tables
Add tests for 1PC COPY on append and hash-distributed tables
2018-07-31 15:17:59 -07:00
Brian Cloutier f0f7a691a3
Prevent failure tests from hanging by using a port outside the ephemeral port range
- mitmdump now listens on port 9060
- Add some logging to fluent.py, making issues like this easier to debug in the future
- Fail the tests if something is already running on the port mitmProxy tries to use
- check-failure now works with VPATH builds
2018-07-31 14:30:56 -07:00
mehmet furkan şahin dde86cb731 Copy to reference table failure tests are added 2018-07-30 11:48:12 +03:00
mehmet furkan şahin bc757845eb Citus versioning fix 2018-07-26 10:56:34 +03:00
Brian Cloutier ace248d13c Remove unnecessary calls to 'conn.allow()' 2018-07-25 17:45:00 -07:00
mehmet furkan şahin 887aa8150d Bump citus version to 8.0devel 2018-07-25 12:03:47 +03:00
velioglu e23625bf5e Use contype to check for FK constraint instead of reading catalog table 2018-07-24 15:53:05 +03:00
mehmet furkan şahin 6d0fbbace7 ALTER TABLE %s ADD COLUMN constraint check is added 2018-07-24 15:53:05 +03:00
Marco Slot 625816242a
Don't try to check unopened connection in EXEC_TASK_FAILED state 2018-07-23 11:41:02 -06:00
Nils Dijk 2d13900230
error on unsupported changing of distirbution column in ON CONFLICT for INSERT ... SELECT 2018-07-23 15:18:21 +02:00
Nils Dijk 6a15e1c9fc
extract ErrorIfOnConflictNotSupported function for reuse 2018-07-23 12:20:10 +02:00
Nils Dijk df98900f80
fix missing space for tablein in error 2018-07-20 15:05:13 +02:00
Marco Slot 69a3ebea5f Ensure StartPlacementListConnection connects with username supplied by the caller 2018-07-19 20:10:11 +02:00
Murat Tuncer a837dde1a0 Add failure tests for master add/remove/disable/active node 2018-07-13 18:06:24 +03:00
mehmet furkan şahin f854420079 truncate failure tests are added 2018-07-13 13:20:50 +03:00
Murat Tuncer 2795494758 Added failure test for create index concurrently 2018-07-13 11:53:49 +03:00
Onder Kalaci a446e71ee7 Add failure testing for DDL commands
This commit adds an extensive failure testing, which covers quite
a bit of things and their combinations:
   - 1PC vs 2PC
   - Replication factor 1 and Replication factor 2
   - Network failures and query cancellations
   - Sequential vs Parallel query execution mode
2018-07-12 13:05:29 +03:00
Jason Petersen 318119910b
Add pg_dist_poolinfo table
For storing nodes' pool host/port overrides.
2018-07-10 09:30:22 -07:00
mehmet furkan şahin 3afa7f425d Topn aggregates are supported 2018-07-10 14:33:42 +03:00
Murat Tuncer a7277526fd Make citus_stat_statements_reset() super user function 2018-07-10 11:21:20 +03:00
Marco Slot 89870e76ce Add a select_opens_transaction_block GUC 2018-07-08 03:50:39 +02:00
Brian Cloutier a54f9a6d2c network proxy-based failure testing
- Lots of detail is in src/test/regress/mitmscripts/README
- Create a new target, make check-failure, which runs tests
- Tells travis how to install everything and run the tests
2018-07-06 12:38:53 -07:00
Murat Tuncer f20258ef10 Expand count distinct support
We can now support more complex count distinct operations by
pulling necessary columns to coordinator and evalutating the
aggreage at coordinator.

It supports broad range of expression with the restriction that
the expression must contain a column.
2018-07-06 09:44:20 +03:00
Onder Kalaci 7fb529aab9 Some stylistic improvements in the foreign keys to reference table
changes.
2018-07-05 23:23:34 +03:00
Brian Cloutier 735218ee5d Remove sslmode structs and add more helpful description 2018-07-05 14:12:36 +02:00
Nils Dijk c1c8c38dc9 create placeholder for policy ddl 2018-07-05 11:07:01 +02:00
mehmet furkan şahin df11dda750 hll aggregates are tested 2018-07-05 08:19:01 +03:00
mehmet furkan şahin 06217be326 hll aggregate functions are supported natively 2018-07-04 16:41:09 +03:00
Murat Tuncer 901066a421 Move partition key logging related code from enterprise 2018-07-04 13:11:34 +03:00
mehmet furkan şahin f7b901e3fd CopyShardForeignConstraintCommandList API change for grouped constraints 2018-07-03 17:05:55 +03:00
mehmet furkan şahin 35eac2318d lock referenced reference table metadata is added
For certain operations in enterprise, we need to lock the
referenced reference table shard distribution metadata
2018-07-03 17:05:55 +03:00
Onder Kalaci d83be3a33f Enforce foreign key restrictions inside transaction blocks
When a hash distributed table have a foreign key to a reference
table, there are few restrictions we have to apply in order to
prevent distributed deadlocks or reading wrong results.

The necessity to apply the restrictions arise from cascading
nature of foreign keys. When a foreign key on a reference table
cascades to a distributed table, a single operation over a single
connection can acquire locks on multiple shards of the distributed
table. Thus, any parallel operation on that distributed table, in the
same transaction should not open parallel connections to the shards.
Otherwise, we'd either end-up with a self-distributed deadlock or
read wrong results.

As briefly described above, the restrictions that we apply is done
by tracking the distributed/reference relation accesses inside
transaction blocks, and act accordingly when necessary.

The two main rules are as follows:
   - Whenever a parallel distributed relation access conflicts
     with a consecutive reference relation access, Citus errors
     out
   - Whenever a reference relation access is followed by a
     conflicting parallel relation access, the execution mode
     is switched to sequential mode.

There are also some other notes to mention:
   - If the user does SET LOCAL citus.multi_shard_modify_mode
     TO 'sequential';, all the queries should simply work with
     using one connection per worker and sequentially executing
     the commands. That's obviously a slower approach than Citus'
     usual parallel execution. However, we've at least have a way
     to run all commands successfully.

   - If an unrelated parallel query executed on any distributed
     table, we cannot switch to sequential mode. Because, the essense
     of sequential mode is using one connection per worker. However,
     in the presence of a parallel connection, the connection manager
     picks those connections to execute the commands. That contradicts
     with our purpose, thus we error out.

   - COPY to a distributed table cannot be executed in sequential mode.
     Thus, if we switch to sequential mode and COPY is executed, the
     operation fails and there is currently no way of implementing that.
     Note that, when the local table is not empty and create_distributed_table
     is used, citus uses COPY internally. Thus, in those cases,
     create_distributed_table() will also fail.

   - There is a GUC called citus.enforce_foreign_key_restrictions
     to disable all the checks. We added that GUC since the restrictions
     we apply is sometimes a bit more restrictive than its necessary.
     The user might want to relax those. Similarly, if you don't have
     CASCADEing reference tables, you might consider disabling all the
     checks.
2018-07-03 17:05:55 +03:00
velioglu 6be6911ed9 Create foreign key relation graph and functions to query on it 2018-07-03 17:05:55 +03:00
mehmet furkan şahin 89a8d6ab95 FK from dist to ref is tested for partitioning, MX 2018-07-03 17:05:55 +03:00
mehmet furkan şahin 4db72c99f6 Specific DDLs are sequentialized when there is FK
-[x] drop constraint
-[x] drop column
-[x] alter column type
-[x] truncate

are sequentialized if there is a foreign constraint from
a distributed table to a reference table on the affected relations
by the above commands.
2018-07-03 17:05:55 +03:00
mehmet furkan şahin e37f76c276 tests are added 2018-07-03 17:05:01 +03:00
mehmet furkan şahin 2c5d59f3a8 create_distributed_table in transaction is fixed 2018-07-03 17:05:01 +03:00
mehmet furkan şahin 45f8017f42 create_distributed_table with fk to ref table is implemented 2018-07-03 17:05:01 +03:00
mehmet furkan şahin 2fa4e38841 FK from dist to ref can be added with alter table 2018-07-03 17:05:01 +03:00
Murat Tuncer 3fc98e8225 Add pg_stat_statements to shared_preload_libraries if installed 2018-07-03 16:33:15 +03:00
Murat Tuncer 23800f50f1 Update citus_stat_statements view and regression tests 2018-07-03 16:14:13 +03:00
Murat Tuncer e532755a6e Fix bug in partition column extraction
added strip_implicit_coercion prior to
checking if the expression is Const.
This is important to find values for types
like bigint.
2018-07-02 18:08:16 +03:00
Murat Tuncer 3fc7cdfe6d Apply master_stage_protocol refactoring changes 2018-06-28 11:24:57 +03:00
Murat Tuncer 4d35b92016 Add groundwork for citus_stat_statements api 2018-06-27 14:20:03 +03:00
Brian Cloutier 5ce18327a7 Don't spinloop when trying to cleanup a failed connection 2018-06-26 13:13:34 -07:00
Onder Kalaci 4ccabf9544 Increase timeout to keep appveyor happy 2018-06-25 18:40:40 +03:00
Onder Kalaci 7d0f7835e7 Improve relation accesses association to do less job 2018-06-25 18:40:40 +03:00
Onder Kalaci 8ccb8b679e Real-time executor marks multi shard relation accesses before opening connections 2018-06-25 18:40:31 +03:00
Onder Kalaci 2890154420 Make sure that TRUNCATE always opens a DDL access 2018-06-25 18:40:31 +03:00
Onder Kalaci 21038f0d0e Make sure that inter-shard DDL commands are always covers both tables 2018-06-25 18:40:30 +03:00
Onder Kalaci 2f01894589 Track relation accesses using the connection management infrastructure 2018-06-25 18:40:30 +03:00
Onder Kalaci d5472614df Use non-data connection for intermediate results
Make sure that intermediate results use a connection that is
not associated with any placement. That is useful in two ways:
    - More complex queries can be executed with CTEs
    - Safely use the same connections when there is a foreign key
      to reference table from a distributed table, which needs to
      use the same connection for modifications since the reference
      table might cascade to the distributed table.
2018-06-21 13:26:13 +03:00
Onder Kalaci 7762d81cba Move test UDF under test folder 2018-06-21 08:42:44 +03:00
Jason Petersen 7a75c2ed31 Add connparam invalidation trigger creation logic
This needs to live in Community, since we haven't yet added the com-
plication of having divergent upgrade scripts in Enterprise.
2018-06-20 14:13:18 -06:00
mehmet furkan şahin 2b2ce036eb create_distributed_table honors sequential mode 2018-06-19 17:33:45 +03:00
Onder Kalaci 8f5821493a Implement C interface for setting GUC
We need the ability to switch to sequential mode (e.g.,
 SET LOCAL citus.multi_shard_modify_mode = 'sequential'). This
commit enables that.
2018-06-19 10:23:43 +03:00
Marco Slot f3f2805978
Fix use-after-free that may occur for INSERT..SELECT in prepared statements 2018-06-18 22:55:06 -06:00
velioglu 53b2e81d01 Adds SELECT ... FOR UPDATE support for router plannable queries 2018-06-18 13:55:17 +03:00
Marco Slot 28860b2469 Remove volatile explain plan from regression tests 2018-06-15 00:21:52 +02:00
Marco Slot 04da0cf9b1 Remove costs from explain plans in window_functions tests 2018-06-14 23:51:46 +02:00
Marco Slot 0bbe778760 Rename failOnError to alwaysThrowErrorOnFailure 2018-06-14 23:37:47 +02:00
Marco Slot 0feb1f2eb1 Do not call CheckRemoteTransactionsHealth from commit handler 2018-06-14 23:33:07 +02:00
Marco Slot bc1cc419e1 Fix could not receive query results error in regression test ouput 2018-06-14 23:33:07 +02:00
Marco Slot 4ab8e87090 Always throw errors on failure on critical connection in router executor 2018-06-14 23:33:07 +02:00
Nils Dijk 73efcb22c4 Extract RoleSpecString and resolve role references 2018-06-14 11:38:42 +02:00
Jason Petersen 5bf7bc64ba Add pg_dist_authinfo schema and validation
This table will be used by Citus Enterprise to populate authentication-
related fields in outbound connections; Citus Community lacks support
for this functionality.
2018-06-13 11:16:26 -06:00
Jason Petersen 57b3f253c5
Add node_conninfo GUC and related logic
To support more flexible (i.e. not at compile-time) specification of
libpq connection parameters, this change adds a new GUC, node_conninfo,
which must be a space-separated string of key-value pairs suitable for
parsing by libpq's connection establishment methods.

To avoid rebuilding and parsing these values at connection time, this
change also adds a cache in front of the configuration params to permit
immediate use of any previously-calculated parameters.
2018-06-12 20:23:47 -06:00
mehmet furkan şahin d1a3b20115 foreign_constraint_utils is created 2018-06-07 18:19:24 +03:00
Onder Kalaci a5370f5bb0 Realtime executor honours multi_shard_modify_mode
We're relying on multi_shard_modify_mode GUC for real-time SELECTs.
The name of the GUC is unfortunate, but, adding one more GUC
(or renaming the GUC) would make the UX even worse. Given that this
mode is mostly important for transaction blocks that involve modification
/DDL queries along with real-time SELECTs, we can live with the confusion.
2018-06-06 14:59:54 +03:00
Onder Kalaci d918556dca INSERT .. SELECT pushdown honors multi_shard_modification_mode 2018-06-06 12:42:23 +03:00
Onder Kalaci 336044f2a8 master_modify_multiple_shards() and TRUNCATE honors multi_shard_modification_mode 2018-06-06 12:29:05 +03:00
Onder Kalaci 51cb24b39c Increase timeout to make the appveyor tests happy 2018-06-05 17:52:18 +03:00
Onder Kalaci df44956dc3 Make sure that sequential DDL opens a single connection to each node
After this commit DDL commands honour `citus.multi_shard_modify_mode`.

We preferred using the code-path that executes single task router
queries (e.g., ExecuteSingleModifyTask()) in order not to invent
a new executor that is only applicable for DDL commands that require
sequential execution.
2018-06-05 17:52:17 +03:00
Marco Slot fd4ff29f2f Add a debug message with distribution column value 2018-06-05 15:09:17 +03:00
Murat Tuncer ba50e3f33e Add handling for grant/revoke all tables in schema 2018-05-31 13:47:02 +03:00
velioglu 20acee2cd4
Bump citus version to 7.5devel 2018-05-28 17:25:21 -06:00
Brian Cloutier a7e09d777b Increase deadlock timeout so we get fewer signals 2018-05-16 17:07:24 -07:00
Brian Cloutier 9667ee5ac9 Alleviate OOM failures in COMMIT callback
Previously those failures caused us to crash, postgres abort()s when it
notices a failure in the COMMIT callback.
2018-05-15 16:39:33 -07:00
Marco Slot 61d2c0f618 Stabilise output of multi_shard_update_delete test 2018-05-11 08:33:23 +02:00
Onder Kalaci ed47e4e6b9 Remove placementId from the ORDER BY to make results consistent 2018-05-11 17:04:50 +03:00
Onder Kalaci 12e50d96dc Fix tests where hll is not installed 2018-05-11 16:01:47 +03:00
Brian Cloutier 4c2bf5d2d6 Move call to RemoveIntermediateResultsDirectory
Errors thrown in the COMMIT handler will cause Postgres to segfault,
there's nothing it can do it abort the transaction by the time that
handler is called!

RemoveIntermediateResultsDirectory is problematic for two reasons:
- It has calls to ereport(ERROR which have been known to trigger
- It makes memory allocations which raise ERRORs when they fail

Once the COMMIT process has begun we don't use the intermediate results,
so it's safe to remove them a little earlier in the process. A failure
here will abort the transaction. That's pretty unnecessary, it's not
that important that we remove the results, but it's still better than a
crash.
2018-05-10 19:28:41 -07:00
Brian Cloutier b3e85e4f71 Fix the huge appveyor diffs 2018-05-10 18:18:43 -07:00
mehmet furkan şahin b8c3197399 enterprise test fixes 2018-05-10 13:06:54 +03:00
Onder Kalaci 04d9e886fe Run concurrent modification queries in tests sequentially 2018-05-10 11:59:18 +03:00
mehmet furkan şahin 785a86ed0a Tests are updated to use create_distributed_table 2018-05-10 11:18:59 +03:00
mehmet furkan şahin d35f2725bf valgrind tests fix 2018-05-10 10:20:14 +03:00
Dimitri Fontaine 8b258cbdb0 Lock reads and writes only to the node being updated in master_update_node
Rather than locking out all the writes in the cluster, the function now only
locks out writes that target shards hosted by the node we're updating.
2018-05-09 15:14:20 +02:00
Marco Slot 5f5f7b4fe0 Throw an error if placements cannot be found in router executor 2018-05-08 22:39:18 -04:00
Marco Slot a7e6689890 Run recursive_dml_queries_mx test on its own 2018-05-06 17:12:53 +02:00
Marco Slot 9438e5bde9 Ensure single-shard modifying CTEs are part of distributed transaction 2018-05-06 12:49:40 +02:00
velioglu caa27161ca Check volatile functions in modify queries 2018-05-08 11:16:40 +03:00
Hadi Moshayedi 86b12bc2d0
Always prefix operators with their namespace. (#2147)
Previously we checked if an operator is in pg_catalog, and if it wasn't we prefixed it with namespace in worker queries. This can have a huge impact on performance of physical planner when using custom data types.

This happened regardless of current search_path config, because Citus overrides the search path in get_query_def_extended(). When we do so, the check for existence of the operator in current search path in generate_operator_name() fails for any operators outside pg_catalog. This means that nothing gets cached, and in the following calls we will again recheck the system tables for existence of the operators, which took an additional 40-50ms for some of the usecases we were seeing.

In this change we skip the pg_catalog check, and always prefix the operator with its namespace.
2018-05-05 13:27:26 -04:00
Marco Slot 2f9c8c6af0 Allow DML commands with unreferenced SELECT CTEs 2018-05-03 14:53:26 +02:00
Marco Slot f8cfe07fd1 Support intermediate results in distributed INSERT..SELECT 2018-05-03 14:42:28 +02:00
Marco Slot 90cdfff602 Implement recursive planning for DML statements 2018-05-03 14:42:28 +02:00
Murat Tuncer 42a8082721 PG11 compatibility refresh
adds a shim for a changed function api
2018-05-03 13:21:15 -06:00
mehmet furkan şahin ef90122cd3 shard count for some of the tests are increased 2018-05-03 10:44:43 +03:00
Onder Kalaci 317dd02a2f Implement single repartitioning on hash distributed tables
* Change worker_hash_partition_table() such that the
     divergence between Citus planner's hashing and
     worker_hash_partition_table() becomes the same.

   * Rename single partitioning to single range partitioning.

   * Add single hash repartitioning. Basically, logical planner
     treats single hash and range partitioning almost equally.
     Physical planner, on the other hand, treats single hash and
     dual hash repartitioning almost equally (except for JoinPruning).

   * Add a new GUC to enable this feature
2018-05-02 18:50:55 +03:00
velioglu 32bcd610c1 Support modify queries with multiple tables
With this commit we begin to support modify queries with multiple
tables if these queries are pushdownable.
2018-05-02 16:22:26 +03:00
velioglu d9fa69c031 Refactor query pushdown related logic 2018-05-02 15:03:09 +03:00
Brian Cloutier f8fb7a27fb Don't copyObject into the wrong memory context
utilityStmt sometimes (such as when it's inside of a plpgsql function)
comes from a cached plan, which is kept in a child of the
CacheMemoryContext. When we naively call copyObject we're copying it into
a statement-local context, which corrupts the cached plan when it's
thrown away.
2018-05-01 15:34:32 -07:00
Marco Slot 2559b84049 Drop shards as current user instead of super user 2018-05-01 09:57:20 +02:00
velioglu 121ff39b26 Removes large_table_shard_count GUC 2018-04-29 10:34:50 +02:00
Onder Kalaci 832c91e28c Move processing each part of the query into its own functions
This commit doesn't change any of the logic at all.

Instead, the goal is to:

 * Get rid of any code duplication
 * Incremental changes to the optimizer made it slightly hard
   to follow the code, improve that and make it easier to
   implement new features
 * Simplify the code by moving each part of query processing (e.g.,
   DISTINCT, LIMIT etc) into its own function
 * Make the interaction between each part of the query more
   obvious (e.g., How DISTINCT affects LIMIT etc)
2018-04-27 17:32:38 +03:00
mehmet furkan şahin f2555317b6 ProcessVacuumStmt update on names 2018-04-27 14:37:01 +03:00
mehmet furkan şahin a4153c6ab1 notice handler is implemented 2018-04-27 14:37:01 +03:00
Marco Slot 304b3a41ba Cache the partition column Var 2018-04-26 14:58:16 -06:00
Marco Slot 3d3c19a717
Improve messages for essential connection failures 2018-04-26 12:58:47 -06:00
Marco Slot 88f64d22db
Prevent connection pointer is NULL details 2018-04-26 12:49:57 -06:00
Marco Slot 394732b6be
Add a connection failure error code 2018-04-26 12:49:57 -06:00
Önder Kalacı ebb8f902c8 Relax assertion on transaction rollback failure (#2052)
In case a failure happens when a transaction is rollbacked,
we used to hit an assertion for ensuring there is no pending
activity on the connection. However, that's not true after the
changes in #2031. Thus, we've replaced the assertion with a more
generic function call to consume any pending activity, if exists.
2018-04-26 13:39:03 -04:00
Hadi Moshayedi 24659a97dc
Fail task in real-time executor if no placements found. (#2133) 2018-04-26 12:05:24 -04:00
Murat Tuncer a6fe5ca183 PG11 compatibility update
- changes in ruleutils_11.c is reflected
- vacuum statement api change is handled. We now allow
  multi-table vacuum commands.
- some other function header changes are reflected
- api conflicts between PG11 and earlier versions
  are handled by adding shims in version_compat.h
- various regression tests are fixed due output and
  functionality in PG1
- no change is made to support new features in PG11
  they need to be handled by new commit
2018-04-26 11:29:43 +03:00
Brian Cloutier 49255213d4 Configure appveyor to run regression tests
- Add install.pl to instal .sql files on Windows
- Remove a hack to PGDLLIMPORT some variables
- Add citus_version.o to the Makefile
- Fix pg_regress_multi's PATH generation on Windows
- Output regression.diffs when the tests fail
- Fix permissions in data directory, make sure postgres can play with it
2018-04-25 18:02:07 -07:00
Onder Kalaci ac8f2f1e6d Eliminate code duplication in WorkerExtendedOpNode()
Before this commit, we had code duplication in the
WorkerExtendedOpNode(). The duplication was
noticeable and any change is prone to bugs.

The PR consists of 4 commits. Each commit incrementally
fixes the problem by moving certain parts of the duplicated
code into smaller, better-documented functions.
2018-04-25 08:54:59 +03:00
Brian Cloutier 8d4c4d5c58 Close all files before trying to remove them 2018-04-24 14:35:20 -07:00
Brian Cloutier c5f1235090 Turn the crashes on Windows into WARNINGs 2018-04-24 14:35:20 -07:00
Onder Kalaci ee748d9140 Unify extendedOpNode Processing
Before this commit, we had a divergence among
the creation of master/worker extended op nodes.

This commit moves the related parts into a single place
and allows the creation of master/extended op nodes to
share a common data structure.
2018-04-24 11:56:38 +03:00
Hadi Moshayedi 966f01fad3
Fix write and copy functions for TaskExecution. (#2120)
We were missing criticalErrorOccurred from CopyNodeTaskExecution() and OutTaskExecution(). This PR fixes it.
2018-04-23 09:07:52 -04:00
Onder Kalaci 814f0e3acc Ensure Citus never try to access a not planned subquery
PostgreSQL might remove some of the subqueries when they do not
contribute to the query result at all. Citus should not try to
access such subqueries during planning.
2018-04-20 13:52:00 +03:00
Brian Cloutier b0b130f064 Fix Windows crash in multi_copy test
Without this change we crash on Windows with COPYing into a table with
62 shards, and we ERROR when COPYing into a table with >62 shards:

ERROR: WaitForMutipleObjects() failed: error code 87
2018-04-17 15:48:02 -07:00
Brian Cloutier d02f761d8e Change intermediate_results test to not crash 2018-04-17 15:14:02 -07:00
Brian Cloutier 0104790385 Fix hard-coded temp directory in multi_copy
/tmp does not exist on Windows, use :temp_dir instead
2018-04-17 15:01:22 -07:00
Brian Cloutier a59c1c634e Fix cancellation of real time queries
Without this change multi_real_time_transaction blocks forever (on
Windows) in the block where it repeatedly calls pg_advisory_lock(15).
This happens because the deadlock detector tries to cancel the backend
but the backend never processes that signal.
2018-04-17 14:26:22 -07:00
mehmet furkan şahin 00e786af00 Capital named schema support is added 2018-04-17 17:17:42 +03:00
mehmet furkan şahin e5a5502b16 Adds support for multiple ANDs in Having
This PR adds support for multiple AND expressions in Having
for pushdown planner. We simply make a call to make_ands_explicit
from MultiLogicalPlanOptimize for the having qual in
workerExtendedOpNode.
2018-04-16 14:14:48 +03:00
Brian Cloutier 42ddfa176d Fix crash on Windows where there is no detail 2018-04-13 12:54:22 -07:00
velioglu 82b2d21b0c Convert broadcast join to reference join
After this commit large_table_shard_count wont be used to
check whether broadcast join, which is renamed as reference
join, can be applied. Reference join can only be applied over
reference tables.
2018-04-13 12:58:14 +03:00
velioglu 1b92812be2 Add co-placement check to CoPartition function 2018-04-13 12:13:08 +03:00
Marco Slot 9318aeee6b Allow multiple size function calls per query 2018-04-12 14:16:17 +02:00
Marco Slot ee132c5ead Prune shards once per relation in subquery pushdown 2018-04-10 20:33:07 +02:00
Burak Yucesoy b33b282030 Fix bug while DROPping partitioned table from worker
We recently added partitionin support to Citus MX. We should not execute
DROP table commands from MX workers but at the moment we try to execute
such commands for partitioned tables. This PR fixes that problem by
adding check.
2018-04-09 13:50:21 +03:00
Burak Yucesoy 0c283fa8a3 Add partitioning support to MX tables
Previously, we prevented creation of partitioned tables on Citus MX.
We decided to not focus on this feature until there is a need. Since
now there are requests for this feature, we are implementing support
for partitioned tables on Citus MX.
2018-04-06 12:47:06 +03:00
velioglu f01daa0c83 Bump citus version to 7.4devel 2018-04-05 20:38:47 +03:00
velioglu 72dfe4a289 Adds colocation check to local join 2018-04-04 22:49:27 +03:00
velioglu 82a864308a Remove SHARD_STORAGE_RELAY type 2018-03-30 11:45:19 +03:00
velioglu 698d585fb5 Remove broadcast join logic
After this change all the logic related to shard data fetch logic
will be removed. Planner won't plan any ShardFetchTask anymore.
Shard fetch related steps in real time executor and task-tracker
executor have been removed.
2018-03-30 11:45:19 +03:00
Matthew Wozniczka 4582a4b398 Fixed a typo 2018-03-27 22:51:36 -06:00
Murat Tuncer 1cb8e5b4bf Fix isolation tests for windows echo command 2018-03-27 14:18:48 -07:00
Brian Cloutier 9aff4384a1 Make tests platform independent
- Force all platforms to use the same collation
- Force all platforms to use the same locale
- Use /dev/null or NUL, depending on platform
- Use /tmp or %TEMP%, dpeending on platform
2018-03-27 14:18:48 -07:00
Brian Cloutier 2140b5d82d Make pg_regress_multi.pl platform independent
- don't hardcode path names
- replace system calls for rm/mkdir/rm -rf with perl equivalents
- force utf-8 encoding
- the Windows shell uses different quoting and escape rules
2018-03-27 14:18:48 -07:00
Brian Cloutier f8f0d4aedc Add Windows replacement for uname 2018-03-21 20:35:56 -07:00
Brian Cloutier 98ffafe16e Fix error handling in connection_management 2018-03-21 20:05:00 -07:00
Murat Tuncer 224b0a8c14 Replace poll with select/poll
Windows does not have poll(), so fall back to select()
2018-03-21 20:05:00 -07:00
Metin Doslu 3b7b64a8b6 Remove skip_jsonb_validation_in_copy GUC 2018-03-13 10:33:27 +02:00
Murat Tuncer 1440caeef2
Fix incorrect limit pushdown when distinct clause is not superset of group by (#2035)
Pushing down limit and order by into workers may produce
wrong output when distinct on() clause has expressions,
aggregates, or window functions.

This checking allows pushing down of limits only if
distinct clause is a superset of group by clause. i.e. it contains all clauses in group by.
2018-03-07 13:24:56 +03:00
Metin Doslu e86d34256c Change default to false for citus.skip_jsonb_validation_in_copy 2018-03-06 13:19:47 +02:00
Onder Kalaci 40b898b59f Improve error messages for INSERT queries that have subqueries 2018-03-05 14:46:47 +02:00
Onder Kalaci 7dc9589b56 Handle failures during I/O
This commit checks the connection status right after any IO happens
on the socket.

This is necessary since before this commit we didn't pass any information
to the higher level functions whether we're done with the connection
(e.g., no IO required anymore) or an errors happened during the IO.
2018-03-02 08:33:53 +02:00
Onder Kalaci da0048e0b7 ForgetResults() becomes a wrapper for ClearResults()
ClearResults() is able to handle failures properly by
checking the result status. So, relying on it makes
error handling more generic in Citus.
2018-03-02 08:33:53 +02:00
Murat Tuncer 76f6883d5d
Add support for window functions that can be pushed down to worker (#2008)
This is the first of series of window function work.

We can now support window functions that can be pushed down to workers.
Window function must have distribution column in the partition clause
 to be pushed down.
2018-03-01 19:07:07 +03:00
Marco Slot dc7213a11c Use expressions in the ORDER BY in bool_agg 2018-02-27 23:52:44 +01:00
Marco Slot e79db17b91 Update comment in WorkerAggregateExpressionList 2018-02-27 23:48:25 +01:00
Marco Slot ef5ff7eb12 Add bit_ and bool_ aggregates to AggregateType 2018-02-27 23:48:25 +01:00
Marco Slot c723a1fa32 Add support for bool and bit aggregates 2018-02-27 23:48:25 +01:00
Murat Tuncer e13c5beced
Fix worker query when order by avg aggregate is used (#2024)
We push down order by to worker query when limit is specified
(with some other additional checks). If the query has an expression
on an aggregate or avg aggregate by itself, and there is an order
by on this particular target we may send wrong order by to worker
query with potential to affect query result.

The fix creates a auxilary target entry in the worker query and
uses that target entry for sorting.
2018-02-28 12:12:54 +03:00
Metin Doslu bcf660475a Add support for modifying CTEs 2018-02-27 15:08:32 +02:00
Metin Doslu 91165c4140 Add a temporary file to pass Travis tests 2018-02-27 13:50:36 +02:00
Metin Doslu 53bb0b6aee Fix regression for changes on REL_10_STABLE 2018-02-27 12:52:56 +02:00
velioglu 78e6d990a2 Fix master plan of the query with distinct, aggregate and group by clauses.
Before this PR, we were trusting on the columns of group by about
guaranteeing the uniqueness of the results. However, this assumption
is correct only if the columns in the group by is subset of columns
in the distinct clause. It can be wrong if we have part of group by
columns and some aggregation columns in the distinct clause. With
this PR, we add distinct plan on top of aggregate plan when necessary.
2018-02-26 15:30:15 +03:00
Onder Kalaci 1c930c96a3 Support non-co-located joins between subqueries
With #1804 (and related PRs), Citus gained the ability to
plan subqueries that are not safe to pushdown.

There are two high-level requirements for pushing down subqueries:

   * Individual subqueries that require a merge step (i.e., GROUP BY
     on non-distribution key, or LIMIT in the subquery etc). We've
     handled such subqueries via #1876.

    * Combination of subqueries that are not joined on distribution keys.
      This commit aims to recursively plan some of such subqueries to make
      the whole query safe to pushdown.

The main logic behind non colocated subquery joins is that we pick
an anchor range table entry and check for distribution key equality
of any  other subqueries in the given query. If for a given subquery,
we cannot find distribution key equality with the anchor rte, we
recursively plan that subquery.

We also used a hacky solution for picking relations as the anchor range
table entries. The hack is that we wrap them into a subquery. This is only
necessary since some of the attribute equivalance checks are based on
queries rather than range table entries.
2018-02-26 13:50:37 +02:00
Onder Kalaci 7b57e0562a Add infrastructure for detecting non-colocated subqueries 2018-02-26 13:28:25 +02:00
Onder Kalaci cdb8d429a7 Add regression tests for non-colocated leaf subqueries 2018-02-26 13:28:24 +02:00
Onder Kalaci 4d4648aabd Change single shard mx test tables to reference tables 2018-02-26 13:28:24 +02:00
Onder Kalaci 4d70c86645 Leaf level recursive planning for non colocated subqueries
With this commit, we enable recursive planning for the subqueries
that are not joined on the distribution keys.
2018-02-26 13:28:24 +02:00
Onder Kalaci e998703ff8 Enable restriction eq. checks for top level set operations
We used to only support pushdownable set operations inside a
subquery, however, we could easily expand the restriction
checks to cover top level set operations as well.
2018-02-26 13:28:24 +02:00
Onder Kalaci e8aa532a90 Refactor checks for distribution key equality
Change some function names, ensure we stick to Citus'
function order rules etc.
2018-02-26 13:28:24 +02:00
Marco Slot 1e9186a3b5 Do not use new connection in table size functions 2018-02-23 07:07:55 +01:00
Markus Sintonen 6202e80d06 Implemented jsonb_agg, json_agg, jsonb_object_agg, json_object_agg 2018-02-18 00:19:18 +02:00
velioglu 195ac948d2 Recursively plan subqueries in WHERE clause when FROM recurs 2018-02-13 19:52:12 +03:00
Marco Slot 0cba4ab588 Refactor worker node hash initialisation 2018-02-12 23:36:43 +01:00
Marco Slot 40d715d494 Cache worker node array for faster iteration 2018-02-12 23:36:43 +01:00
Marco Slot 6e79a34c97 Do not check for cancellation in ClearResultsIfReady 2018-02-12 16:45:02 +01:00
Marco Slot 6051aae56e Handle errors that are discovered during abort 2018-02-12 16:45:02 +01:00
Marco Slot ee6a751798 Only copy distributed plan when modifying it 2018-02-12 16:30:55 +01:00
Onder Kalaci 94c5ac6ebb Remove duplicate join restrictions
We use PostgreSQL hooks to accumulate the join restrictions
and PostgreSQL gives us all the join paths it tries while
deciding on the join order. Thus, for queries that have many
joins, this function is likely to remove lots of duplicate join
restrictions. This becomes relevant for Citus on query pushdown
check peformance.
2018-02-12 18:35:05 +02:00
Onder Kalaci c228d8ff3d Refactor equivalance generation related codes
This commit changes the APIs for restriction generation to make future
changes simpler.
2018-02-12 18:35:04 +02:00
Onder Kalaci 2f2d350924 Refactor relation restriction related codes
This commit moves some of the functions to a more relevant
source file.
2018-02-12 18:35:04 +02:00
Murat Tuncer 678223224b Update regression test output expectation based on recent PG10 change 2018-02-06 14:44:55 +03:00
Murat Tuncer 901b543e20 Fix count distinct using field select on top level query
We were allowing count distict queries even if they were
not directly on columns if the query is grouped on
distribution column.

When performing these checks we were skipping subqueries
because they also perform this check in a more concise manner.
We relied on oid SUBQUERY_RELATION_ID (10000) to decide if
a given RTE relation id denotes a subquery, however, we also
use SUBQUERY_PUSHDOWN_RELATION_ID (10001) for some subqueries.

We skip both type of subqueries with this change.
2018-02-06 13:16:10 +03:00
metdos 35f864bcaf Respect enable_hashagg in the master planner 2018-02-05 15:06:00 +02:00
metdos 3d540d961c Fix typo in grouping_is_sortable() 2018-02-05 12:10:19 +02:00
Marco Slot 6f7c3bd73b Skip JSON validation on coordinator during COPY 2018-02-02 15:33:27 +01:00
Brian Cloutier 15511f6ba1 Dynamically allocate connection metadata in WaitForAllConnections 2018-02-01 10:30:41 -08:00
Brian Cloutier e6ebfc1f53 Remove VLA from UpdateNodeLocation 2018-02-01 10:30:41 -08:00
Brian Cloutier a2ed45e206 Remove variable length arrays
VLAs aren't supported by Visual Studio.

- Remove all existing instances of VLAs.
- Add a flag, -Werror=vla, which makes gcc refuse to compile if we add
  VLAs in the future.
2018-02-01 10:30:41 -08:00
Brian Cloutier 2efe80ce55 CheckForDistributedDeadlocks no longer uses a VLA
- variable length arrays (VLAs) do not work with Visual Studio
- fix an off-by-one error. We incorrectly assumed there would always at
  least as many edges as there were nodes.
- refactor: reduce scope of transactionNodeStack by moving it into the
  function which uses it.
- refactor: break up the distinct uses of currentStackDepth into
  separate variables.
2018-02-01 10:30:41 -08:00
Brian Cloutier 097fd15a89 small refactor, CheckDeadlockForTransactionNode builds it's own array 2018-02-01 10:30:41 -08:00
Brian Cloutier 457f570b77 Small refactor, we were using incompatible types 2018-01-31 11:05:59 -08:00
Brian Cloutier b864d014ab
GetNextNodeId() incorrectly called PG_RETURN_DATUM
- Also stabilize the output of a multi_router_planner test
2018-01-29 15:32:36 -08:00
Brian Cloutier 61a6b846b9 Refactor: use a temporary timestamp variable
It's against our coding convention to call functions inside parameter
lists; when single-stepping with a debugger it's difficult to determine
what the function returned.

That wouldn't be good enough reason to change this code but while
porting Citus to Windows I ran into this line of code.
assign_distributed_transaction_id was called with a weird timestamp and
I wasn't able to find the problem without first making this change.
2018-01-29 11:20:13 -08:00
Marco Slot bd0ebac865 Skip call to ActiveReadableNodeList when there are no subplans 2018-01-29 16:05:10 +01:00
Hadi Moshayedi ff26bcd5a5
Include sys/stat.h for S_IRUSR and S_IWUSR. (#1977) 2018-01-26 16:21:48 -05:00
Marco Slot 4762503c34 Add base schedule for only running specific regression tests 2018-01-25 18:51:22 +01:00
Brian Cloutier 76d1edc3fd
Don't rely on gcc-specific features (#1963)
* Don't use expressions inside compound statements
* Don't depend on __builtin_constant_p
* Remove reliance on S_ISLNK
* Replace use of __func__: older mcvs doesn't support this builtin
2018-01-23 17:03:29 -08:00
Onder Kalaci fbde87d2d0 Allocate enough space for transaction nodes
This fix prevents any potential memory access that might occur
while forming the deadlock path.
2018-01-22 08:45:48 +02:00
Onder Kalaci 9a89c0b425 Fix bug while traversing the distributed deadlock graph
With this fix, we traverse the graph with DFS which was originally
intended. Note that, before the fix, we traverse the graph with BFS
which might lead to killing some unrelated backend that is not
involved in the distributed deadlock.
2018-01-22 08:45:48 +02:00
Dimitri Fontaine c9760fbb64 Fix CREATE INDEX with storage options on distributed tables.
By sharing the implementation of the function AppendOptionListToString on
three call sites, we would expand an extra OPTIONS keyword in a create index
statement, and omit other bits of the specific syntax here.

This patch introduces an AppendStorageParametersToString() function that is
very similar to AppendOptionListToString() but handles WITH(a="foo",...)
syntax that is used in reloptions (aka Storage Parameters).

Fixes #1747.
2018-01-17 21:56:40 +01:00
Dimitri Fontaine 952da72c55 Implement ALTER TABLE|INDEX ... SET|RESET ().
PostgreSQL implements support for several relation kinds in a single
statement, such as in the AlterTableStmt case, which supports both tables
and indexes and more (see ATExecSetRelOptions in PostgreSQL source code file
src/backend/commands/tablecmds.c for an example of that).

As a consequence, this patch implements support for setting and resetting
storage parameters on both relation kinds.
2018-01-17 21:56:40 +01:00
Dimitri Fontaine 17266e3301 Implement ALTER INDEX ... RENAME TO ...
The command is now distributed among the shards when the table is
distributed. To that effect, we fill in the DDLJob's targetRelationId with
the OID of the table for which the index is defined, rather than the OID of
the index itself.
2018-01-17 21:56:40 +01:00
velioglu d357d2fccd Bump citus version to 7.3devel 2018-01-16 11:50:28 +03:00
Marco Slot 6fb8cfc104 Add test that checks whether distributed transaction ID survives pg_dist_partition invalidation 2018-01-11 16:14:39 +02:00
Dimitri Fontaine 1f088791bd Add DDL tests with non-public schema.
Citus sometimes have regressions around non-default schema support, meaning
not public and not in the search_path, per @marcocitus. This patch changes
some regression tests to use a non-default schema in order to cover more
cases.
2018-01-11 13:21:24 +01:00
Dimitri Fontaine e010238280 Implement ALTER TABLE ... RENAME TO ...
The implementation was already mostly in place, but the code was protected
by a principled check against the operation. Turns out there's a nasty
concurrency bug though with long identifier names, much as in #1664.

To prevent deadlocks from happening, we could either review the DDL
transaction management in shards and placements, or we can simply reject
names with (NAMEDATALEN - 1) chars or more — that's because of the
PostgreSQL array types being created with a one-char prefix: '_'.
2018-01-11 13:21:24 +01:00
Hadi Moshayedi 5d7c52ffa6
Don't return in PG_TRY() block when cancellations happen in WaitForConnections(). (#1923)
We shouldn't return in middle of a PG_TRY() block because if we do, we won't reset PG_exception_stack, and later when a re-throw tries to jump to the jump-point which was active in this PG_TRY() block, it seg-faults.

We used to return in middle of PG_TRY() block in WaitForConnections() where we checked for cancellations. Whenever cancellations were caught here, Citus crashed. And example was reported by @onderkalaci at #1903.
2018-01-03 09:54:03 -05:00
Marco Slot 8f69973411 Fix cancellation issues in the real-time executor (#1905) 2018-01-01 23:10:29 -05:00
Marco Slot 3fd65cb91b Do not raise errors in the real-time executor (#1903) 2018-01-01 22:26:31 -05:00
Onder Kalaci a1bbdf2d44 Outer joins should also use subquery pushdown planner if join
clause is not supported

This change allows unsupported clauses to go through query pushdown
planner instead of erroring out as we already do for non-outer joins.
2017-12-29 16:40:47 +02:00
Marco Slot 09c09f650f Recursively plan set operations when leaf nodes recur 2017-12-26 13:46:55 +02:00
Onder Kalaci eb929e9001 Add some more basic regression tests, mostly for documentation purposes 2017-12-25 15:03:45 +02:00
mehmet furkan şahin 446893234a unsupported subquery error messages are fixed 2017-12-25 15:10:59 +03:00
mehmet furkan şahin 57bc86e23d new debug output for subplans 2017-12-25 09:50:51 +03:00
Marco Slot fa7fa2734b Log remote commands sent via MultiClientSendQuery 2017-12-22 16:18:40 +01:00
Murat Tuncer 87c6f306f1
Fix join clause eq restrictions (#1884)
We used to error out if the join clause includes filters like
t1.a < t2.a even if other filter like t1.key = t2.key exists.

Recently we lifted that restriction in subquery planning by
not lifting that restriction and focusing on equivalance classes
provided by postgres.

This checkin forwards previously erroring out real-time queries
due to join clauses to subquery planner and let it handle the
join even if the query does not have a subquery.

We are now pushing down queries that do not have any
subqueries in it. Error message looked misleading, changed to a more descriptive one.
2017-12-22 12:16:14 +03:00
metdos 32b7e152a3 Get shard resource locks for only DMLs 2017-12-22 10:30:41 +02:00
Murat Tuncer a9cf0c3e66
Fix CTE column alias issue (#1893)
We were creating intermediate query result's target
names from subquery target list. Now we also check
if cte re-defines its column name aliases, and create
intermediate result query accordingly.
2017-12-22 09:39:40 +03:00
Brian Cloutier 377b31dcf7 Remove enable_deadlock_prevention prevention warning 2017-12-21 14:47:52 +01:00
Brian Cloutier fb7b86fa14 Replace strtoull with pg_strtouint64
The macro we were using to detect strtoull isn't set on Windows, and
just in case there are differences use a portable function from PG
instead of calling strtoull directly.
2017-12-21 14:28:51 +01:00
mehmet furkan şahin fd546cf322 Intermediate result size limitation
This commit introduces a new GUC to limit the intermediate
result size which we handle when we use read_intermediate_result
function for CTEs and complex subqueries.
2017-12-21 14:26:56 +03:00
Onder Kalaci e2a5124830 Add regression tests for recursive subquery planning 2017-12-21 08:37:40 +02:00
Onder Kalaci 0d5a4b9c72 Recursively plan subqueries that are not safe to pushdown
With this commit, Citus recursively plans subqueries that
are not safe to pushdown, in other words, requires a merge
step.

The algorithm is simple: Recursively traverse the query from bottom
up (i.e., bottom meaning the leaf queries). On each level, check
whether the query is safe to pushdown (or a single repartition
subquery). If the answer is yes, do not touch that subquery. If the
answer is no, plan the subquery seperately (i.e., create a subPlan
for it) and replace the subquery with a call to
`read_intermediate_results(planId, subPlanId)`. During the the
execution, run the subPlans first, and make them avaliable to the
next query executions.

Some of the queries hat this change allows us:

   * Subqueries with LIMIT
   * Subqueries with GROUP BY/DISTINCT on non-partition keys
   * Subqueries involving re-partition joins, router queries
   * Mixed usage of subqueries and CTEs (i.e., use CTEs in
     subqueries as well). Nested subqueries as long as we
     support the subquery inside the nested subquery.
   * Subqueries with local tables (i.e., those subqueries
     has the limitation that they have to be leaf subqueries)

   * VIEWs on the distributed tables just works (i.e., the
     limitations mentioned below still applies to views)

Some of the queries that is still NOT supported:

  * Corrolated subqueries that are not safe to pushdown
  * Window function on non-partition keys
  * Recursively planned subqueries or CTEs on the outer
    side of an outer join
  * Only recursively planned subqueries and CTEs in the FROM
    (i.e., not any distributed tables in the FROM) and subqueries
    in WHERE clause
  * Subquery joins that are not on the partition columns (i.e., each
    subquery is individually joined on partition keys but not the upper
    level subquery.)
  * Any limitation that logical planner applies such as aggregate
    distincts (except for count) when GROUP BY is on non-partition key,
    or array_agg with ORDER BY
2017-12-21 08:37:40 +02:00
Onder Kalaci e12ea914b9 Refactor ErrorIfQueryNotSupported to defer errors 2017-12-20 09:03:49 +02:00
Onder Kalaci 71ce42b936 Refactor RecursivelyPlanSubqueriesAndCTEs() to make it ready
to work with subqueries
2017-12-20 09:03:47 +02:00
Marco Slot 6a6e986c2b Add EXPLAIN regression test with subplans 2017-12-19 16:34:56 +01:00
Marco Slot 5e0539efa3 Plan CTEs when subquery pushdown is on 2017-12-19 16:34:56 +01:00
Marco Slot 44a1ea631a Show distributed subplan ID in EXPLAIN output 2017-12-19 16:34:56 +01:00
Marco Slot 35dbacdb69 Do not reinitialise MyBackendData 2017-12-19 15:56:26 +01:00
Marco Slot 9b520ae194 Add test for using transaction ID in parallel worker 2017-12-19 09:30:29 +01:00
Marco Slot af201a2f6d Allow intermediate results to be used in parallel workers 2017-12-18 19:05:08 +01:00
Marco Slot 7dab078e67 Set cost estimates for read_intermediate_result 2017-12-18 16:23:44 +01:00
Marco Slot e49254f876 Revert "Add EXPLAIN regression test with subplans"
This reverts commit 8b6d641227.
2017-12-17 22:34:31 +01:00
Marco Slot 74bd33d0cc Revert "Plan CTEs when subquery pushdown is on"
This reverts commit e3b953b8e3.
2017-12-17 22:34:20 +01:00
Marco Slot aca5f35ab9 Revert "Show distributed subplan ID in EXPLAIN output"
This reverts commit 686b079272.
2017-12-17 22:34:04 +01:00
Marco Slot 8b6d641227 Add EXPLAIN regression test with subplans 2017-12-17 22:00:25 +01:00
Marco Slot e3b953b8e3 Plan CTEs when subquery pushdown is on 2017-12-17 21:49:36 +01:00
Marco Slot 686b079272 Show distributed subplan ID in EXPLAIN output 2017-12-16 11:32:01 +01:00
Marco Slot ea6b98fda4 Allow count(distinct) in queries with a subquery 2017-12-15 15:24:26 +01:00
Marco Slot 9ee0e68882 Do not take extra access exclusive lock partitioned tables 2017-12-15 13:02:31 +01:00
Marco Slot 5a69fc1b17 Relax checks on recurring tuples in FROM with sublinks 2017-12-15 11:56:06 +01:00
Marco Slot a64f0060ba Reduce the frequency of FinishConnectionIO calls during COPY (#1864) 2017-12-14 13:21:59 -05:00
Marco Slot a811aad264 Deparallelise multi_modifying_xacts tests 2017-12-14 10:27:17 +01:00
mehmet furkan şahin 5851f71bfb Add CTE regression tests 2017-12-14 09:32:55 +01:00
Marco Slot fa73abe6d4 Regression test output changes after CTE support 2017-12-14 09:32:55 +01:00
Marco Slot 2e2b4e81fa Add support for CTEs in distributed queries 2017-12-14 09:32:55 +01:00
Marco Slot d0335ec818 Send BEGIN for SELECTs in the router executor 2017-12-14 09:32:55 +01:00
Marco Slot cbbd418af2 Add citus.copy_format OIDs to metadata cache 2017-12-14 09:32:55 +01:00
Marco Slot 66f9f1d6cd Make some intermediate results functions public 2017-12-14 09:32:55 +01:00
Marco Slot 36ee21c323 Make CanUseBinaryCopyFormatForType public 2017-12-14 09:32:55 +01:00
Marco Slot 7d1191954d Add DistributedSubPlan node 2017-12-14 09:32:55 +01:00
Onder Kalaci 86b2d9420c Treat recurring tuples as reference table for GROUP BY checks
read_intermediate_results() and immutable functions are implemented.
Empty join trees seems not applicable here.
2017-12-13 14:55:42 +02:00
Marco Slot d1a470a52e Fix issue with multiple ANALYZE in transaction block 2017-12-12 10:28:48 +01:00
mehmet furkan şahin 3c941aedf1 adds citus.enable_repartition_joins GUC
The new GUC allows Citus to switch between task executors
when necessary
2017-12-11 09:36:37 +03:00
Marco Slot 5895c88552 Add materialized view regression tests 2017-12-07 16:20:23 +01:00
Marco Slot 60a1e31671 Allow queries with local tables in NeedsDistributedPlanning 2017-12-07 16:20:23 +01:00
Marco Slot f8550b8c85 Fix issues with read_intermediate_result signature 2017-12-07 13:47:56 +01:00
Marco Slot d8fea4efb8 Revert "Allow queries with local tables in NeedsDistributedPlanning"
This reverts commit d2bac081e8.
2017-12-07 11:19:11 +01:00
Marco Slot d2bac081e8 Allow queries with local tables in NeedsDistributedPlanning 2017-12-07 11:02:16 +01:00
Onder Kalaci c42a92afd2 Fix bug related to incrementing an index not properly 2017-12-07 08:50:57 +02:00
Marco Slot eab15aa035 Avoid deadlock in ColocatedTableId 2017-12-06 11:49:34 +01:00
metdos 12d5974d97 Increase sleep time in a regression test to give Valgrind tests enough time 2017-12-05 14:59:37 +02:00
Marco Slot 7279d42849 Treat read_intermediate_result as recurring tuples 2017-12-04 14:50:11 +01:00
Marco Slot 716448ddef Add regression tests for intermediate results 2017-12-04 14:50:11 +01:00
Marco Slot 4cdadfcab6 Add intermediate results infrastructure 2017-12-04 14:50:11 +01:00
Marco Slot bfcc76df69 Make several COPY-related functions public 2017-12-04 13:12:03 +01:00
Marco Slot 73989b07eb Refactor query execution functions 2017-12-04 13:12:03 +01:00
Murat Tuncer 2d66bf5f16
Fix hard coded formatting strings for 64 bit numbers (#1831)
Postgres provides OS agnosting formatting macros for
formatting 64 bit numbers. Replaced %ld %lu with
INT64_FORMAT and UINT64_FORMAT respectively.

Also found some incorrect usages of formatting
flags and fixed them.
2017-12-04 14:11:06 +03:00
Hadi Moshayedi ff706cf556 Test that COPY blocks UPDATE/DELETE/INSERT...SELECT when rep factor 2. 2017-11-30 14:52:29 -05:00
Marco Slot acbc0fe0de Use RowExclusiveLock shard resource lock in COPY 2017-11-30 09:15:45 -05:00
Onder Kalaci a273711500 The common attribute equivalance class always includes the input relations
We added the ability to filter out the planner restriction information
for specific parts of the query. This might lead to situations where
the common restriction includes some other relations that we're searching
for. The reason is that while filtering for join restrictions, we add the
restriction as soon as we find the relation.

With this commit we make sure that the common attribute
equivalance class always includes the input relations.
2017-11-30 16:00:26 +02:00
Marco Slot 0d6a7f5884 Add real-time BEGIN regression tests 2017-11-30 12:59:09 +01:00
Marco Slot d6dd0b3a81 Send BEGIN in the real-time executor when in a transaction 2017-11-30 12:59:09 +01:00
Marco Slot 3a4d5f8182 Remove filter checks on leaf queries 2017-11-30 12:25:14 +01:00
Marco Slot 3f03cb6a6a Support UNION with joins in the subqueries 2017-11-30 10:37:56 +01:00
Marco Slot a9933deac6 Make real time executor work in transactions 2017-11-30 09:59:32 +03:00
mehmet furkan şahin 6041f85b70
Add tests for non-propagated VACUUM/ANALYZE 2017-11-29 16:06:50 -07:00
Jason Petersen 0eacf6bd95
Refactor VacuumStmt checker to be single-return
Decided this would be safer for the future (defaults to unsupported).
2017-11-29 16:06:50 -07:00
Jason Petersen b12e77ab0e
Ensure unsupported VACUUMs don't go to workers
Apparently these two blocks have been incorrect for nearly a year…
2017-11-29 16:06:50 -07:00
Marco Slot 7ea718fd8d Round-robin over worker nodes for 0-shard router queries 2017-11-29 15:52:22 +01:00
Marco Slot ae67fa0e52 Do not run multi_mx_modifications in parallel with multi_mx_transaction_recovery 2017-11-29 15:35:21 +01:00
mehmet furkan şahin b6eb0c2823 multi_subquery_behavioral_analytics.sql query fix by adding proper order by 2017-11-28 14:15:46 +03:00
mehmet furkan şahin 1b06b2b306 The data used in regression tests is reduced
This commit reduces the size of the data in users_table.data
and events_table.data from 10K rows to 100 rows.
2017-11-28 14:15:46 +03:00
Onder Kalaci 05fb0dd020 Add infrastructure for filtering restriction contexts based on the input query
In subquery pushdown, we first ensure that each relation is joined with at least
on another relation on the partition keys. That's fine given that the decision
is binary: pushdown the query at all or not.

With recursive planning, we'd want to check whether any specific part
of the query can be pushded down or not. Thus, we need the ability to
understand which part(s) of the subquery is safe to pushdown. This commit
adds the infrastructure for doing that.
2017-11-28 09:58:21 +02:00
Onder Kalaci 26d9b58e9e Make sure that ExtractRangeTableRelationWalker never misses RTE_RELATION 2017-11-28 09:27:34 +02:00
Onder Kalaci 32def06ebd Split assigning RTE identities and partitioning related query modifications
Note that we used to iterate over the RTEs once for performance reasons.
However, keeping an extra copy of original query seems more costly and
hard to maintain/explain.
2017-11-28 09:27:34 +02:00
Marco Slot feffe86440 Subqueries containing functions go through subquery pushdown 2017-11-27 22:13:02 +01:00
Onder Kalaci 48f96bf3e5 Enable non equi joins in subquery pushdown
Subquery pushdown planning is based on relation restriction
equivalnce. This brings us the opportuneatly to allow any
other joins as long as there is an already equi join between
the distributed tables.

We already allow that for joins with reference tables and
this commit allows that for joins among distributed tables.
2017-11-23 16:13:46 +02:00
mehmet furkan şahin 032b34ea52 some more parallelization 2017-11-23 14:10:42 +03:00
Onder Kalaci 16421f089f Register citus custom scan nodes 2017-11-23 11:38:33 +02:00
Onder Kalaci 83c1143505 Refactor custom scan related codes
In this commit, we don't change any codes, only create a new
file and move the related functions and types there.
2017-11-23 11:38:12 +02:00
Marco Slot 20a526d5c4 Fix memory leak in ListToHashSet 2017-11-22 11:26:58 +01:00
Marco Slot f4ceea5a3d Enable 2PC by default 2017-11-22 11:26:58 +01:00
Marco Slot 8486f76e15 Auto-recover 2PC transactions 2017-11-22 11:26:58 +01:00
Marco Slot 6ba3f42d23 Rename MultiPlan to DistributedPlan 2017-11-22 09:36:24 +01:00
Marco Slot 0ad39b36fe Treat immutable table functions and constant subqueries as reference tables 2017-11-21 14:15:22 +01:00
Onder Kalaci d558ebb923 Relax the checks on ensuring distribution columns for target entries
With this commit, we allow pushing down subqueries with only
reference tables where GROUP BY or DISTINCT clause or Window
functions include only columns from reference tables.
2017-11-21 12:28:14 +02:00
Andres Freund d063658d6d Protect some initializations from being called during backend startup.
On EXEC_BACKEND builds these functions shouldn't be called at every
backend start.
2017-11-20 15:29:51 -08:00
Brian Cloutier d267e0f9fa EXEC_BACKEND: don't put pointers to shared hashes into shared memory
Store pointers to shared hashes in process-local variables. Previously
pointers to shared hashes were put into shared memory. This causes
problems on EXEC_BACKEND because everybody calls execve and receives a
brand new address space; the shared hash will be in a different place
for every backend. (normally we call fork, which gives you a copy of the
address space, so these pointers remain constant)
2017-11-20 15:29:51 -08:00
Brian Cloutier 30a2365d81 Rename CreateDirectory to CitusCreateDirectory 2017-11-20 14:38:26 -08:00
Brian Cloutier aa2ab023a2 Rename RemoveDirectory -> CitusRemoveDirectory 2017-11-20 14:21:52 -08:00
Brian Cloutier 06f756b0a1 Rename DeleteFile -> CitusDeleteFile 2017-11-20 13:30:11 -08:00
mehmet furkan şahin 34709c2a16 Regression tests parallelization PART-1 2017-11-20 18:03:37 +03:00
Marco Slot 9793218122 Do not commit already-committed prepared transactions in recovery 2017-11-20 13:18:48 +01:00
Marco Slot fe798cf0f9 Add recovery vs. recovery isolation test 2017-11-20 12:26:25 +01:00
Marco Slot ae47df01ea Observe prepared xacts twice in RecoverWorkerTransactions to avoid race condition 2017-11-20 11:44:08 +01:00
Marco Slot 2410c2e450 Rewrite recover_prepared_transactions to be fast, non-blocking 2017-11-20 11:27:40 +01:00
mehmet furkan şahin 314fc09d90 regression test shard_count is changed from 32 to 4 2017-11-20 12:47:49 +03:00
mehmet furkan şahin 8d55754b4d the tests are separated and some more added 2017-11-20 11:45:48 +03:00
mehmet furkan şahin 636faadc47 create_distributed_table vs create_distributed_table, master_append_table_to_shard vs master_apply_delete_command, master_apply_delete_command vs master_apply_delete_command are added 2017-11-20 11:45:48 +03:00
mehmet furkan şahin 0722334e50 concurrent master_append_table test is added 2017-11-20 11:45:48 +03:00
mehmet furkan şahin f45988962f multi-shard update affecting the same/different rows 2017-11-20 11:45:48 +03:00
Onder Kalaci 5bea95009b Skip autovacuum processes for distributed deadlock detection
Autovacuum process cancels itself if any modification starts
on the table in order to avoid blocking your regular Postgres
sessions. That's normal and expected. Thus, any locks held by
autovacuum process cannot involve in a distributed deadlock
since it'll be released if needed.
2017-11-15 14:32:16 +02:00
Onder Kalaci c65c153a46 Skip speculative locks for distributed deadlock detection
These locks are held for a very short duration time and cannot
contribute to a deadlock. Speculative locks are used by Postgres
for internal notification mechanism among transactions.
2017-11-15 12:43:45 +02:00
Marco Slot bbbadd6d1b Bump Citus version to 7.2devel 2017-11-15 10:32:49 +01:00
Marco Slot ea306c6cfe Use citus.next_placement_id where practical in regression tests 2017-11-15 10:12:06 +01:00
Marco Slot d3b634b301 Allow generating placement IDs without using the sequence 2017-11-15 10:12:06 +01:00
Marco Slot 89eb833375 Use citus.next_shard_id where practical in regression tests 2017-11-15 10:12:05 +01:00
Marco Slot c24a0875a5 Allow generating shard IDs without using the sequence 2017-11-15 10:12:05 +01:00
Brian Cloutier 0f3230170f Pull in INT32_MAXINT and INT32_MININT 2017-11-14 14:03:46 -08:00
Brian Cloutier 0db8277266 remove unused errno import 2017-11-14 13:09:34 -08:00
Brian Cloutier 5d9f3ae7fd Remove unused poll import from multi_real_time_executor 2017-11-14 13:09:34 -08:00
Marco Slot 533a533565 Only drop sequences on workers with metadata 2017-11-14 16:01:56 +01:00
velioglu be28ba8e70 Add stub UDF to run pg_upgrade flawlessly 2017-11-13 16:14:45 +02:00
metdos 111c04c2bd Warn on CLUSTER command for distributed tables 2017-11-10 12:14:45 +02:00
Burak Yücesoy 863df0b874
Merge branch 'master' into fix_partitioning_in_schema 2017-11-09 12:49:35 +02:00
Burak Yucesoy 17229ed7bd Fix attaching partition to a distributed table in schema
While attaching a partition to a distributed table in schema, we mistakenly
used unqualified name to find partitioned table's oid. This caused problems
while using partitioned tables with schemas. We are fixing this issue in
this PR.
2017-11-09 13:20:29 +03:00
Onder Kalaci 94921a2be1 Skip page-level locks on distributed deadlock detection
Short-term share/exclusive page-level locks are used for
read/write access. Locks are released immediately after
each index row is fetched or inserted.

Since those locks may not lead to any deadlocks, it's safe
to ignore them in the distributed deadlock detection.
2017-11-09 10:37:23 +02:00
Marco Slot f71728f634 Add GUC for specifying sslmode in connections to workers 2017-11-08 14:15:58 +01:00
Murat Tuncer 4e3d633ebf
Add check for connection failures during multishard update (#1765) 2017-11-07 12:33:25 +02:00
Hadi Moshayedi 6d79d25101 Fix a relcache reference leak in stats collection.
In DistributedTablesSize() we didn't close the relations that had
replication factor > 2. This caused relcache reference leaks, and
warning messages like following in logs:

    WARNING:  relcache reference leak: relation "researchers" not closed
2017-11-06 23:16:43 -05:00
metdos c83edc36b5 Check connection status before using it 2017-11-06 14:53:35 +02:00
Murat Tuncer 2332678793
Fix intermittent test failures on count distinct tests (#1761)
Added analyze to test which seem to force planner to use index.
2017-11-06 10:58:46 +02:00
Brian Cloutier 7be1545843 Support implicit casts during INSERT/SELECT
It's possible to build INSERT SELECT queries which include implicit
casts, currently we attempt to support these by adding explicit casts to
the SELECT query, but this sometimes crashes because we don't update all
nodes with the new types. (SortClauses, for instance)

This commit removes those explicit casts and passes an unmodified SELECT
query to the COPY executor (how we implement INSERT SELECT under the
scenes). In lieu of those cases, COPY has been given some extra logic to
inspect queries, notice that the types don't line up with the table it's
supposed to be inserting into, and "manually" casting every tuple before
sending them to workers.
2017-11-03 22:27:15 -07:00
Marco Slot 6883a09cdd Allow distributed partitioned table creation in Cloud 2017-11-03 10:09:18 +01:00
Marco Slot 6219186683 Allow distributed INSERT...SELECT via worker nodes in MX 2017-11-02 14:38:39 +01:00
Hadi Moshayedi 7280774cf4 Use list_length() != 1 in SingleReplicatedTable().
ShardPlacementList's implementation can return NIL. In previous implementation
we got a segmentation fault in this case. The relation can be dropped after
getting distributed table list but before calling SingleReplicatedTable().
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 7691991cb5 Do PG_TRY() inside a subtransaction block.
If we don't propagate the errors we are catching in PG_CATCH(), database's
internal state might not be clean. So we do PG_TRY() inside a subtransaction
so we can rollback to it after catching errors.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 9bfbbf8a04 Make reports hostname configurable and enable stats collection in tests.
This patch adds --with-reports-host configure option, which sets the
REPORTS_BASE_URL constant. The default is reports.citusdata.com.

It also enables stats collection in tests.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi acaf085a80 Add callback function for request by CollectBasicUsageStatistics().
Curl writes the received response to stdout if we don't specify a response
callback or an output file. This can pollute the PostgreSQL log. In this change
we add a callback function so the response messages aren't added to the log file.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 747e439601 Limit number of stats collection retries to once a day. 2017-10-31 21:51:43 -04:00
Hadi Moshayedi 78a2cd9052 Check for Citus updates.
Sends a request to /v1/releases/latest?flavor=$CITUS_EDITION once a day,
which returns a response similar to {"version": "7.1.0", "major": 7,
"minor": 1, "patch": 0}. Then compares it with current Citus version,
and if the latest release is newer, logs a LOG message.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 34f3ec0961 Call FlushDistTableCache() before stats collection. 2017-10-31 21:51:43 -04:00
Hadi Moshayedi c18c6625d9 Lock relations before calling citus_table_size().
This is to make sure they don't get dropped.
2017-10-31 21:51:43 -04:00
Hadi Moshayedi 97d544b75c Follow the patterns used in Deadlock Detection in Stats Collection.
This includes:

(1) Wrap everything inside a StartTransactionCommand()/CommitTransactionCommand().
This is so we can access the database. This also switches to a new memory context
and releases it, so we don't have to do our own memory management.

(2) LockCitusExtension() so the extension cannot be dropped or created concurrently.

(3) Check CitusHasBeenLoaded() && CheckCitusVersion() before doing any work.

(4) Do not PG_TRY() inside a loop.
2017-10-31 21:51:43 -04:00
Marco Slot 100aaeb3f5 Fix typo in distributed deadlock error message 2017-10-31 19:39:32 +01:00
metdos 8c356b2bc8 Don't try to add restrictions for reference tables in insert into select 2017-10-31 19:44:10 +02:00
mehmet furkan şahin 32fb19911c Add Constraint %s Add Primary Key Using index %s support
This commit makes a change in relay_event_utility.c to check if the
Alter Table command adds a constraint using index. If this is the
case, it appends the shard id to the index name.
2017-10-31 16:03:56 +03:00
Marco Slot 7e34348334 Add shard transfer mode parameter to shard copy functions 2017-10-31 13:30:48 +01:00
Marco Slot 2bb46bb5ee Reset connectionReady flag after moving a connection in WaitForAllConnections 2017-10-31 12:06:53 +01:00
Marco Slot e6e6897499 Defer initial PQflush to main loop in WaitForAllConnections 2017-10-31 12:06:53 +01:00
Marco Slot d6dadb1b25 Use correct index for ModifyWaitEvent in WaitForAllConnections 2017-10-31 12:06:53 +01:00
Furkan Sahin 2b39c52f0b Replica identity on create_distributed_table
By this commit, citus minds the replica identity of the table when
we distribute the table. So the shards of the distributed table
have the same replica identity with the local table.
2017-10-31 13:08:36 +03:00
Marco Slot 7f68f78ee9 Omit public schema from shard_name output 2017-10-31 00:22:07 +01:00
Murat Tuncer e16805215d
Support count(distinct) for non-partition columns (#1692)
Expands count distinct coverage by allowing more cases. We used to support
count distinct only if we can push down distinct aggregate to worker query
i.e. the count distinct clause was on the partition column of the table,
or there was a grouping on the partition column.

Now we can support
- non-partition columns, with or without grouping on partition column
- partition, and non partition column in the same query
- having clause
- single table subqueries
- insert into select queries
- join queries where count distinct is on partition, or non-partition column
- filters on count distinct clauses (extends existing support)

We first try to push down aggregate to worker query (original case), if we
can't then we modify worker query to return distinct columns to coordinator
node. We do that by adding distinct column targets to group by clauses. Then
we perform count distinct operation on the coordinator node.

This work should reduce the cases where HLL is used as it can address anything
that HLL can. However, if we start having performance issues due to very large
number rows, then we can recommend hll use.
2017-10-30 13:12:24 +02:00
Marco Slot be46661bf7 Block only 2PCs instead of all writes in citus_create_restore_point 2017-10-27 00:07:32 +02:00
mehmet furkan şahin 83ac84d594 order by and unnest are added to multi_colocation_utils tests 2017-10-26 13:44:28 +03:00
mehmet furkan şahin 61ae33dc7f ALTER TABLE .. REPLICA IDENTITY support is implemented 2017-10-26 13:44:28 +03:00
Brian Cloutier 4a17d12d74 Replace uint with uint32 2017-10-25 19:32:12 -07:00
velioglu 0b5db5d826 Support multi shard update/delete queries 2017-10-25 15:52:38 +03:00
Marco Slot 4bde83e1d2 Relay error message if DML fails on worker 2017-10-25 14:23:21 +02:00
Hadi Moshayedi 9a04b78980 Send server_id for statistics reports. (#1698)
This change introduces the `pg_dist_node_metadata` which has a single jsonb value. When creating
the extension, a random server id is generated and stored in there. Everything in the metadata table
is added as a nested objected to the json payload that is sent to the reports server.
2017-10-18 21:20:32 -04:00
Hadi Moshayedi 86bcd93a4a Don't collect stats when there is a version mismatch. (#1712)
The following scenario can cause an Assert() crash if we don't do this:
- Install Citus v7.0-15
- Restart server & run a query to start maintenanced.
- Install Citus v7.1
- Restart server & run a query. This will tell user to upgrade.
- Type "UPDATE EXTENSION c" & press tab. maintenanced will start and crash
  with Assert(CitusHasBeenLoaded() && CheckCitusVersion(WARNING));

This change checks Citus version before calling metadata functions so the
crash doesn't happen.
2017-10-17 14:01:14 -04:00
Jason Petersen f2c593b25c
Add CITUS_NAME and CITUS_EDITION
Unambiguous places to check whether we're running simply Citus or Citus
Enterprise, or to check for 'community' or 'enterprise'.
2017-10-16 18:09:29 -06:00
Jason Petersen 8544878c4b
Add citus_version(), analogous to PG's version()
This will provide the full project name (i.e. Citus/Citus Enterprise),
and the host system, compiler, and architecture word size.

I wanted to limit the number of copied files in 'config', so I added
only config.guess and call it manually, rather than using the macro
AC_CANONICAL_HOST, which requires several other files.
2017-10-16 18:09:29 -06:00
Brian Cloutier 91ff8cd2d5 {*,}create_distributed_table doesn't emit OID (#1710) 2017-10-16 18:08:51 -06:00
Brian Cloutier ebcb2b65e9 Add master_move_node function 2017-10-16 10:51:28 -07:00
Brian Cloutier 58cf15ceca DistributedTableSize doesn't emit oid when erring out 2017-10-14 02:42:57 +03:00
Hadi Moshayedi 2aec6eda49 Properly use #ifdef HAVE_LIBCURL. 2017-10-13 12:04:36 -06:00
Jason Petersen 01353cb7cb Use header define rather than -D flag
Eclipse apparently doesn't scan build output looking for -D flags, so
having the value actually appear in a header is nicer for those of us
using IDEs.
2017-10-13 11:00:09 -04:00
Hadi Moshayedi 946659aebe Delete StatsCollection memory context after we are done with stats reporting.
Previously we left the memory context untouched, which overtime leaked memory.
2017-10-13 11:00:09 -04:00
Hadi Moshayedi 873fd1e7ff Fix compiling --without-libcurl.
Previously <curl/curl.h> was included even if compiled --without-libcurl.
This can fail when libcurl headers are not there. This commit guards this
include by checks for HAVE_LIBCURL.
2017-10-13 11:00:09 -04:00
Murat Tuncer 4832abc7cb Make multi_master_planner.c coding convention compliant
Changed order of function definitions and added
declarations in the beginning of the file
2017-10-13 14:59:48 +03:00
Murat Tuncer f7ab901766 Add select distinct, and distinct on support
Distinct, and distinct on() clauses are supported
in simple selects, joins, subqueries, and insert into select
queries.
2017-10-13 14:59:48 +03:00
Hadi Moshayedi 6879f92e23 Fix out of bound memeory access when getting HTTP response code. (#1699) 2017-10-12 12:51:42 -04:00
Hadi Moshayedi a1387f4aa8 Basic usage statistics collection. (#1656)
Adds ```citus.enable_statistics_collection``` GUC variable, which ```true``` by default, unless built without libcurl. If statistics collection is enabled, sends basic usage data to Citus servers every 24 hours.

The data that is collected consists of:
- Citus version
- OS name & release
- Hardware Id
- Number of tables, rounded to next power of 2
- Size of data, rounded to next power of 2
- Number of workers
2017-10-11 09:55:15 -04:00
Onder Kalaci 498ac80d8b Add window function support for SUBQUERY PUSHDOWN and INSERT INTO SELECT
This commit provides the support for window functions in subquery and insert
into select queries. Note that our support for window functions is still limited
because it must have a partition by clause on the distribution key. This commit
makes changes in the files insert_select_planner and multi_logical_planner. The
required tests are also added with files multi_subquery_window_functions.out
and multi_insert_select_window.out.
2017-10-04 15:33:07 +03:00
Marco Slot 24915779d1 Remove separate citus.binary_worker_copy_format regression tests 2017-10-03 17:44:50 +02:00
Marco Slot 9e516513fc Use local group ID when querying for prepared transactions 2017-10-03 16:36:53 +02:00
Hadi Moshayedi 11adb9b034 Push down LIMIT and HAVING when grouped by partition key. (#1641)
We can do this because all rows belonging to a group are in the same shard when grouping by distribution column on a range/hash distributed table.
2017-10-02 20:17:51 -04:00
Marco Slot 394918f9d0 Invalidate worker and group ID cache in maintenance daemon 2017-10-02 18:14:29 +02:00
Marco Slot bb50fc9cb5 Add multi-user re-partitioning regression tests 2017-09-28 15:27:26 +02:00
Marco Slot 43d5e79eaa Execute transmit commands as superuser during task-tracker queries 2017-09-28 15:27:25 +02:00
Marco Slot 306c58d59b Check for absolute paths in COPY with format transmit 2017-09-28 15:27:11 +02:00
Marco Slot cb6b0e820c Allow read-only users to run task-tracker queries 2017-09-28 13:52:36 +02:00
Marco Slot da6b42a3e2 Use unique constraint index for transaction record deletion 2017-09-28 12:04:56 +02:00
Onder Kalaci 68ca8cb7f0 Skip relation extension locks
We should skip if the process blocked on the relation
extension since those locks are hold for a short duration
while the relation is actually extended on the disk and
released as soon as the extension is done. Thus, recording
such waits on our lock graphs could yield detecting wrong
distributed deadlocks.
2017-09-28 10:09:09 +03:00
Murat Tuncer 4676c4f7a5 Prevent crash when remote transaction start fails (#1662)
We sent multiple commands to worker when starting a transaction.
Previously we only checked the result of the first command that
is transaction 'BEGIN' which always succeeds. Any failure on
following commands were not checked.

With this commit, we make sure all command results are checked.
If there is any error we report the first error found.
2017-09-26 17:25:46 -07:00
Jason Petersen b4d53423fa
Add adapter functions for OpenFile changes 2017-09-25 17:20:24 -07:00
Jason Petersen d686123dae
Omit now-public Explain methods from PG11 build
This copy-pasted code is no longer needed in PG11.
2017-09-25 17:20:24 -07:00
Jason Petersen 89d02c6115
Add ruleutils file for PostgreSQL 11 2017-09-25 17:20:24 -07:00
Jason Petersen bbc15e0598
Handle HASHPROC changes
PostgreSQL 11 now has "standard" and "extended" (64-bit) versions of
hash functions.
2017-09-25 17:20:24 -07:00
Jason Petersen b4474fc0b0
Modify version-output tests for PostgreSQL 11
Basically we just care whether the running version is before or after
PostgreSQL 10, so testing the major version against 9 and printing a
boolean is sufficient.
2017-09-25 17:20:24 -07:00
Jason Petersen 6c9b19a954
Add version-compat header
For polyfill macros, etc.
2017-09-25 17:20:23 -07:00
Jason Petersen fbeaa2f9d0
Remove direct access to tupleDesc->attrs
A level of indirection was removed from this field for PostgreSQL 11.
By using the handy provided macro, we can be version agnostic.
2017-09-25 17:20:23 -07:00
Jason Petersen 6a020b5adc
Update CopyGetAttnums with latest from PostgreSQL
This function was recently modified to use the TupleDescAttr wrapper,
which abstracts away recent changes to TupleDesc.
2017-09-25 17:20:23 -07:00
Andres Freund 78716e5546 Fix possible shard cache incoherency.
When a table and it's shards are dropped, and afterwards the same
shard identifiers are reused, e.g. due to a DROP & CREATE EXTENSION,
the old entry in the shard cache and the required entry in the shard
cache might be for different tables.

Force invalidation for both old and new table to fix.
2017-09-25 13:05:09 -07:00
velioglu 0a56ed910b Change error message of queries with distributed and local table
Citus can handle INSERT INTO ... SELECT queries if the query inserts
into local table by reading data from distributed table. The opposite
way is not correct. With this commit we warn the user if the latter
option is used.
2017-09-22 13:46:19 -07:00
Onder Kalaci 867224bdd7 Make the tests produce more consistent outputs 2017-09-22 20:38:56 +03:00
Onder Kalaci 4782f9f98a Properly copy and trim the error messages that come from pg_conn
When a NULL connection is provided to PQerrorMessage(), the
returned error message is a static text. Modifying that static
text, which doesn't necessarly be in a writeable memory, is
dangreous and might cause a segfault.
2017-09-22 19:43:09 +03:00
Onder Kalaci 6736fd1682 Remove two obsolete functions
Namely GetConnectionFromPGconn() and CloseConnectionByPGconn()
2017-09-21 00:36:23 -06:00
Onder Kalaci 33ec33c5b3 Ensure schema exists on reference table creation
If the schema doesn't exists on the workers, create it.
2017-09-18 23:50:47 +03:00
Onder Kalaci 6116c8e93d Allow pushing down GROUP BYs when at least there is one distribution
column in the target list
2017-09-15 19:15:06 +03:00
Onder Kalaci a5b66912d4 Expand reference table support in subquery pushdown
With this commit, we relax the restrictions put on the reference
tables with subquery pushdown.

We did three notable improvements:

1) Relax equi-join restrictions

 Previously, we always expected that the non-reference tables are
 equi joined with reference tables on the partition key of the
 non-reference table.

 With this commit, we allow any column of non-reference tables
 joined using non-equi joins as well.

2) Relax OUTER JOIN restrictions

 Previously Citus errored out if any reference table exists at
 any point of the outer part of an outer join. For instance,
 See the below sketch where (h) denotes a hash distributed relation,
 (r) denotes a reference table, (L) denotes LEFT JOIN and
 (I) denotes INNER JOIN.

             (L)
             /  \
           (I)     h
          /  \
        r      h

 Before this commit Citus would error out since a reference table
 appears on the left most part of an left join. However, that was
 too restrictive so that we only error out if the reference table
 is directly below and in the outer part of an outer join.

3) Bug fixes

 We've done some minor bugfixes in the existing implementation.
2017-09-14 20:59:22 +03:00
Marco Slot d1befa4df9 Wait for I/O to finish after PQputCopyData 2017-09-12 16:18:42 -07:00
Marco Slot cbe16169b4 Free per-tuple COPY memory in INSERT...SELECT 2017-09-12 15:35:53 -07:00
Marco Slot 27da0a29d7 Add volatile function in prepared statement regression test 2017-09-12 13:09:31 -07:00
Marco Slot 5fe0845d7e Always copy MultiPlan in GetMultiPlan 2017-09-12 11:38:52 -07:00
Jason Petersen 8b2c3fcc15
Add clarifying comment to RngVarCallbackForDropIdx
We don't need the PARTITION-related logic recently added in PostgreSQL.
2017-09-01 15:57:30 -06:00
Jason Petersen ec30ad38ba
Update ruleutils_10 with latest PostgreSQL changes
See:
	postgres/postgres@21d304dfed
	postgres/postgres@bb5d6e80b1
	postgres/postgres@d363d42bb9
	postgres/postgres@eb145fdfea
	postgres/postgres@decb08ebdf
	postgres/postgres@a3ca72ae9a
	postgres/postgres@bc2d716ad0
	postgres/postgres@382ceffdf7
	postgres/postgres@c7b8998ebb
	postgres/postgres@e3860ffa4d
	postgres/postgres@76a3df6e5e
2017-09-01 14:26:59 -06:00
Jason Petersen ebecde8f6e
Update ruleutils_96 with latest PostgreSQL changes
See:
	postgres/postgres@41ada83774
	postgres/postgres@3b0c2dbed0
	postgres/postgres@ff2d537223
2017-09-01 14:26:53 -06:00
Burak Yucesoy 273b034720 Bump Citus version 2017-08-28 17:56:39 +03:00
Marco Slot 0aadbb1760 Convert multi-row INSERT target list to Vars 2017-08-25 10:55:56 +02:00
Marco Slot 1920390688 Multi-row INSERTs no longer throw errors in isolation tests 2017-08-25 10:55:56 +02:00
Marco Slot ae00795dab Allow default columns in multi-row INSERTs 2017-08-25 10:55:56 +02:00
Marco Slot c97692f382 Fix multi-row INSERT with RETURNING on reference tables 2017-08-24 10:42:12 +02:00
Marco Slot dbf18df995 Don't error out if BuildGlobalWaitGraph fails to connect 2017-08-23 19:08:03 +02:00
Burak Yucesoy 5be6eb9ef6 Increase coverage of isolation tests - Part 2
With this PR we add isolation tests for

COPY to reference table vs. other operations
COPY to partitioned table vs. other operations
Multi row INSERTs vs other operations
INSERT/SELECT vs. other operations
UPSERT vs. other operations
DELETE vs. other operations
TRUNCATE vs. other operations
DROP vs. other operations
DDL vs. other operations

other operations consist of basic SQL operations (like SELECT,
INSERT, DELETE, UPSERT, COPY TRUNCATE, CREATE INDEX) as well
as some Citus functionalities (like master_modify_multiple_shards,
master_apply_delete_command, citus_total_relation_size etc.)
2017-08-23 18:23:36 +03:00
Onder Kalaci c7bb29b69e Prevent maintanince deamon crashes due to dead processes
If after the distributed deadlock detection decides to cancel
a backend, the backend has been terminated/killed/cancelled
externally, we might be accessing to a NULL pointer. This commit
prevents that case by ignoring the current distributed deadlock.
2017-08-23 15:44:09 +03:00
Marco Slot 641420d79f Remove source node argument from dump_local_wait_edges 2017-08-23 13:14:00 +02:00
Jason Petersen 8cb69e3a14 Add alias for target in multi-row INSERTs
This is necessary for multi-row INSERTs for the same reasons we use it
in e.g. UPSERTs: if the range table list has more than one entry, then
PostgreSQL's deparse logic requires that vars be prefixed by the name
of their corresponding range table entry. This of course doesn't affect
single-row INSERTs, but since multi-row INSERTs have a VALUE RTE, they
were affected.

The piece of ruleutils which builds range table names wasn't modified
to handle shard extension; instead UPSERT/INSERT INTO ... SELECT added
an alias to the RTE. When present, this alias is favored. Doing the
same in the multi-row INSERT case fixes RETURNING for such commands.
2017-08-23 10:24:00 +02:00
Marco Slot 4d7927b672 Execute multi-row INSERTs sequentially 2017-08-23 10:04:57 +02:00
Marco Slot cf375d6a66 Consider dropped columns that precede the partition column in COPY 2017-08-22 13:02:35 +02:00
Marco Slot bd6bf29983 Don't add procs multiple times in BuildWaitGraphForSourceNode 2017-08-21 16:48:30 +02:00
Onder Kalaci 6532b69873 Kill the maintenance daemon on DROP DATABASE 2017-08-18 16:03:08 +03:00
Metin Doslu 0d052e9864 Fix a crash on zero-shard tables 2017-08-18 13:53:59 +03:00
Önder Kalacı b82f886ad3 Merge branch 'master' into improve_deadlock_detection 2017-08-18 13:07:18 +03:00
Marco Slot 7523753a73 Clear metadata OID cache prior to deadlock detection 2017-08-18 11:20:24 +02:00
Andres Freund b936bde936 Take AccessShareLock on the extension prior to running deadlock detection 2017-08-18 11:20:24 +02:00
Onder Kalaci 20679c9e8b Relax assertion on deadlock detection considering
self deadlocks.
2017-08-18 11:16:38 +03:00
Onder Kalaci 550a5578d8 Skip deadlock detection on the workers
Do not run distributed deadlock detection
on the worker nodes to prevent errornous
decisions to kill the deadlocks.
2017-08-17 19:43:38 +03:00
Burak Yucesoy ae32d786cf Add new isolation tests 2017-08-17 17:46:03 +03:00
Marco Slot 1eca53ad40 Exit maintenanced on database crash 2017-08-16 18:29:44 +02:00
Marco Slot 9e7b1fb858 Return readable nodes in master_get_active_worker_nodes 2017-08-16 11:28:47 +02:00
Hadi Moshayedi e5fbcf37dd Add Savepoint Support (#1539)
This change adds support for SAVEPOINT, ROLLBACK TO SAVEPOINT, and RELEASE SAVEPOINT.

When transaction connections are not established yet, savepoints are kept in a stack and sent to the worker when the connection is later established. After establishing connections, savepoint commands are sent as they arrive.

This change fixes #1493 .
2017-08-15 13:02:28 -04:00
Onder Kalaci 205501532a Add version check to the maintenance daemon
We should prevent running the deadlock detection if
there is a major version change. Otherwise, the daemon
may access to obsolete metadata catalog tables.
2017-08-15 18:47:13 +03:00
Marco Slot 3ff46245b3 Make sure we don't use 2PC in copy from worker 2017-08-15 13:44:20 +02:00
Marco Slot 4614814de1 Enable 2PC for INSERT...SELECT via coordinator 2017-08-15 13:44:20 +02:00
Marco Slot fa70089766 Enable 2PC during distributed table creation 2017-08-15 13:44:20 +02:00
Marco Slot 9232823070 Abort on failure on master connection during copy from worker 2017-08-15 13:44:20 +02:00
Marco Slot df7723cde5 Should not commit on aborted non-critical connections 2017-08-15 13:44:20 +02:00
Eren Başak 77626c4238 Fix NULL nodeClusterString crush on pg_worker_list.conf migrations 2017-08-14 18:13:53 +03:00
Eren Başak b3d2f9ba71 Fix pg_worker_list use-after-free bug
This change fixes a use-after-free bug while renaming obsolete
`pg_worker_list.conf` file, which causes Citus to crash during upgrade
(or even extension creation) if `pg_worker_list.conf` exists.
2017-08-14 18:13:53 +03:00
Burak Yucesoy 45b273321f Add tests for locking operations on partitioned tables 2017-08-14 14:55:45 +03:00
Burak Yucesoy dfdfb44ebf Acquire shard resource locks on parent tables while operating on partitions 2017-08-14 14:44:30 +03:00
Burak Yucesoy a321e750c0 Acquire relation locks on partitions while operation on parent table 2017-08-14 14:44:30 +03:00
Burak Yucesoy 52b9e35d50 Add relationIdList field to the Job struct 2017-08-14 14:06:22 +03:00
Onder Kalaci 4f668ad38b Make the test outputs consistent
by using VACUUM ANALYZE on the tables.
2017-08-12 13:29:25 +03:00
Onder Kalaci 0ba2f9e4e4 Add regression tests for distributed deadlock detection 2017-08-12 13:29:25 +03:00
Onder Kalaci 5b48de7430 Improve deadlock detection for MX
We added a new field to the transaction id that is set to true only
for the transactions initialized on the coordinator. This is only
useful for MX in order to distinguish the transaction that started
the distributed transaction on the coordinator where we could
 have the same transactions' worker queries on the same node.
2017-08-12 13:28:37 +03:00
Onder Kalaci 59133415b0 Add logging infrasture for distributed deadlock detection
We added a new GUC citus.log_distributed_deadlock_detection
which is off by default. When set to on, we log some debug messages
related to the distributed deadlock to the server logs.
2017-08-12 13:28:37 +03:00
Onder Kalaci e5d5bdff51 Enable distributed deadlock detection on the maintenance deamon
With this commit, the maintenance deamon starts to check for
distributed deadlocks.

We also introduced a GUC variable (distributed_deadlock_detection_factor)
whose value is multiplied with Postgres' deadlock_timeout. Setting
it to -1 disables the distributed deadlock detection.
2017-08-12 13:28:37 +03:00
Onder Kalaci 66936053a0 Improve error messages when a backend is cancelled by deadlock detection
We send SIGINT to a backend that is cancelled due to a deadlock. That
approach ends up being a very confusing error message.

With this commit we intercept the error messages and show a more
meaningful error message to the user.
2017-08-12 13:28:37 +03:00
Onder Kalaci be4fc45c03 Deprecate enable_deadlock_prevention flag
Now that we already have the necessary infrastructure for detecting
distributed deadlocks. Thus, we don't need enable_deadlock_prevention
which is purely intended for preventing some forms of distributed
deadlocks.
2017-08-12 13:28:37 +03:00
Onder Kalaci a333c9f16c Add infrastructure for distributed deadlock detection
This commit adds all the necessary pieces to do the distributed
deadlock detection.

Each distributed transaction is already assigned with distributed
transaction ids introduced with
3369f3486f. The dependency among the
distributed transactions are gathered with
80ea233ec1.

With this commit, we implement a DFS (depth first seach) on the
dependency graph and search for cycles. Finding a cycle reveals
a distributed deadlock.

Once we find the deadlock, we examine the path that the cycle exists
and cancel the youngest distributed transaction.

Note that, we're not yet enabling the deadlock detection by default
with this commit.
2017-08-12 13:28:37 +03:00
Marco Slot 59e626d158 Add regression tests for follower clusters 2017-08-12 12:05:56 +02:00
Marco Slot 55992d4bc0 Disallow task-tracker queries on follower clusters 2017-08-12 11:47:31 +02:00
velioglu 100739f62a Change citus subversion 2017-08-11 11:57:57 +03:00
Marco Slot 53584affa8 Fix locking in create_distributed_table 2017-08-11 11:34:33 +03:00
velioglu 7c65001e23 Do not delete row from colocation table within drop table 2017-08-11 11:34:33 +03:00
velioglu b0efffae1c Correct planner and add more tests 2017-08-11 10:16:13 +03:00
velioglu 7550b8ad52 Fix anchor shard id selection when reference table exists 2017-08-11 10:09:47 +03:00
velioglu ceba81ce35 Move physical planner checks to logical planner 2017-08-11 10:09:47 +03:00
velioglu 0359d03530 Add set operation check for reference tables 2017-08-11 10:09:47 +03:00
velioglu c4e3b8b5e1 Add planner changes and tests for subquery on reference tables 2017-08-11 10:09:47 +03:00
velioglu 45717dd013 Check equivalence on reference tables for subquery pushdown 2017-08-11 10:09:47 +03:00
Marco Slot 0ae265c436 Add citus_create_restore_point for distributed snapshots 2017-08-11 07:36:20 +02:00
Marco Slot fdff210ef7 Wait for commit/abort/prepare results asynchronously 2017-08-11 00:03:06 +02:00
Marco Slot fca986f214 Add API for waiting for multiple connections 2017-08-11 00:03:06 +02:00
Brian Cloutier 9d93fb5551 Create citus.use_secondary_nodes GUC
This GUC has two settings, 'always' and 'never'. When it's set to
'never' all behavior stays exactly as it was prior to this commit. When
it's set to 'always' only SELECT queries are allowed to run, and only
secondary nodes are used when processing those queries.

Add some helper functions:
- WorkerNodeIsSecondary(), checks the noderole of the worker node
- WorkerNodeIsReadable(), returns whether we're currently allowed to
  read from this node
- ActiveReadableNodeList(), some functions (namely, the ones on the
  SELECT path) don't require working with Primary Nodes. They should call
  this function instead of ActivePrimaryNodeList(), because the latter
  will error out in contexts where we're not allowed to write to nodes.
- ActiveReadableNodeCount(), like the above, replaces
  ActivePrimaryNodeCount().
- EnsureModificationsCanRun(), error out if we're not currently allowed
  to run queries which modify data. (Either we're in read-only mode or
  use_secondary_nodes is set)

Some parts of the code were switched over to use readable nodes instead
of primary nodes:
- Deadlock detection
- DistributedTableSize,
- the router, real-time, and task tracker executors
- ShardPlacement resolution
2017-08-10 17:37:17 +03:00
Brian Cloutier c854d51cd8 make multi_reference_table test more stable 2017-08-10 17:37:17 +03:00
Brian Cloutier 3fc87a7a29 Metadata sync also syncs nodes in other clusters 2017-08-10 16:55:55 +03:00
Brian Cloutier 0dee4f8418 Metadata sync syncs all nodes, not just primaries 2017-08-10 16:55:55 +03:00
Eren Başak deb89cb9ce Delete tesh_helper_functions.h 2017-08-10 14:00:44 +03:00
Eren Başak f9470329e5 Remove test_helper_functions.h inclusions 2017-08-10 12:42:46 +03:00
Eren Başak 3061737712 Define Some Utility Functions
This change declares two new functions:

`master_update_table_statistics` updates the statistics of shards belong
to the given table as well as its colocated tables.

`get_colocated_shard_array` returns the ids of colocated shards of a
given shard.
2017-08-10 12:42:46 +03:00
Brian Cloutier 1961add6f9 Improve error message when there are no nodes for a placement 2017-08-10 12:38:51 +03:00
Jason Petersen dee66e3959
Final review feedback 2017-08-10 01:10:09 -07:00
Jason Petersen a578506718
Add multi-row isolation tests 2017-08-10 01:10:09 -07:00
Jason Petersen addde54464
Add some tests 2017-08-10 00:32:46 -07:00
Jason Petersen 6a35c2937c
Enable multi-row INSERTs
This is a pretty substantial refactoring of the existing modify path
within the router executor and planner. In particular, we now hunt for
all VALUES range table entries in INSERT statements and group the rows
contained therein by shard identifier. These rows are stashed away for
later in "ModifyRoute" elements. During deparse, the appropriate RTE
is extracted from the Query and its values list is replaced by these
rows before any SQL is generated.

In this way, we can create multiple Tasks, but only one per shard, to
piecemeal execute a multi-row INSERT. The execution of jobs containing
such tasks now exclusively go through the "multi-router executor" which
was previously used for e.g. INSERT INTO ... SELECT.

By piggybacking onto that executor, we participate in ongoing trans-
actions, get rollback-ability, etc. In short order, the only remaining
use of the "single modify" router executor will be for bare single-
row INSERT statements (i.e. those not in a transaction).

This change appropriately handles deferred pruning as well as master-
evaluated functions.
2017-08-10 00:32:46 -07:00
velioglu 7e436c0277 Add bool expression to pruning instance with a function 2017-08-10 08:56:36 +03:00
Andres Freund e8b793c454 Support for IN (const, list) and = ANY(const, b, c) pruning. 2017-08-10 08:56:36 +03:00
Onder Kalaci b5ea3ab6a3 Improve locking semantics for backend management
We use the backend shared memory lock for preventing
new backends to be part of a new distributed transaction
or an existing backend to leave a distributed transaction
while we're reading the all backends' data.

The primary goal is to provide consistent view of the
current distributed transactions while doing the
deadlock detection.
2017-08-09 17:17:12 +03:00
Brian Cloutier 2e0916e15a Add master_add_secondary_node() UDF 2017-08-09 17:10:48 +03:00
Marco Slot 08ed6d8269 Prevent pg_dist_node changes during master_create_empty_shard 2017-08-09 14:22:09 +02:00
Murat Tuncer 5cb9466255 Rebase node metadata isolation tests 2017-08-09 14:22:09 +02:00
Marco Slot 3a0571e69b Remove LockMetadataSnapshot 2017-08-09 14:09:54 +02:00
Marco Slot ad0fdf57ca Add add/remove node rollback isolation tests 2017-08-09 14:09:54 +02:00
Marco Slot c2f8bafa05 Fix shard creation vs. pg_dist_node change locking 2017-08-09 14:09:54 +02:00
Marco Slot 868ee6be83 Fix and simplify pg_dist_node locking 2017-08-09 14:09:54 +02:00
Burak Yucesoy ab5f97861b Add regression tests for distributed partitioned tables 2017-08-09 10:01:35 +03:00
Burak Yucesoy 8455d1a4ef Ensure we are allowing partitioned tables at all appropriate places 2017-08-09 10:01:35 +03:00
Burak Yucesoy 2eee556738 Add distributed partitioned table support for COPY
For partitioned tables, PostgreSQL opens partition and its partitions
in BeginCopyFrom and it expects its caller to close those relations.
However, we do not have quick access to opened relations and performing
special operations for partitioned tables isn't necessary in coordinator
node. Therefore before calling BeginCopyFrom, we change relkind of those
partitioned tables to RELKIND_RELATION. This prevents PostgreSQL to open
its partitions as well.
2017-08-09 10:01:35 +03:00
Burak Yucesoy 31f3221342 Add distributed partitioned table support to router plannable queries
In standart_planner, PostgreSQL expands partitioned tables to their
partitions and call our restriction hook for each partition. It also,
for some queries, skips the partitioned table itself completely. This
behaviour makes it difficult to prune shards and decide whether query
is router plannable or not. To prevent this behaviour, we change inh
flag of partitioned tables to false in the query tree. In this case,
PostgreSQL treats those partitioned tables as regular relations and
does not expand them.

This behaviour is inline with our expectations, because we do not want
to treat partitioned tables differently on coordinator. Although we are
not entirely comfortable with modifying query tree, other solutions to
this problem is overly complicated.
2017-08-09 10:01:35 +03:00
Burak Yucesoy fddf9b3fcc Add distributed partitioned table support distributed table creation
With this PR, Citus starts to support all possible ways to create
distributed partitioned tables. These are;

- Distributing already created partitioning hierarchy
- CREATE TABLE ... PARTITION OF a distributed_table
- ALTER TABLE distributed_table ATTACH PARTITION non_distributed_table
- ALTER TABLE distributed_table ATTACH PARTITION distributed_table

We also support DETACHing partitions from partitioned tables and propogating
TRUNCATE and DDL commands to distributed partitioned tables.

This PR also refactors some parts of distributed table creation logic.
2017-08-09 10:01:35 +03:00
Metin Doslu b8a9e7c1bf Add support for UPDATE/DELETE with subqueries 2017-08-08 21:35:08 +03:00
Marco Slot d3e9746236 Avoid connections that accessed non-colocated placements in multi-shard commands 2017-08-08 18:32:34 +02:00
Brian Cloutier 7060ade6fe GetNodeTuple returns NULL it node does not exist
It never throws an error.
2017-08-08 13:12:06 +03:00
Brian Cloutier a3e9bef685 All users of WorkerNodeHash take an AccessShareLock
The metadata cache simulates a SELECT on pg_dist_node. Now the locks it
takes also simulate that SELECT.
2017-08-08 13:12:06 +03:00
Brian Cloutier 5914c992e6 cluster management UDFs see nodes in different clusters
- master_activate_node and master_disable_node correctly toggle
  isActive, without crashing
- master_add_node rejects duplicate nodes, even if they're in different
  clusters
- master_remove_node allows removing nodes in different clusters
2017-08-08 13:12:06 +03:00
Brian Cloutier 3151b52a0b Add citus.cluster_name GUC
- Nodes with a nodecluster which does not match citus.cluster_name
  are excluded from the metadata cache and never seen by another part of
  Citus.
2017-08-08 13:12:06 +03:00
Brian Cloutier 94947c0d54 Refactor: ReplicateShardToAllWorkers more explicitly locks pg_dist_node 2017-08-08 13:12:06 +03:00
Brian Cloutier f87fefa323 Refactor: DistributedTableSize more explicitly only locks pg_dist_node 2017-08-08 13:12:06 +03:00
Brian Cloutier 3769381366 Fix inaccurate comment on SetNodeState 2017-08-08 13:12:06 +03:00
Brian Cloutier bf197e9f0c Add test for super-long cluster names 2017-08-08 11:18:31 +03:00
Brian Cloutier fbecf48a03 Disallow adding primary nodes to non-default clusters 2017-08-08 11:18:31 +03:00
Brian Cloutier 5618e69386 Add pg_dist_node.nodecluster 2017-08-08 11:18:31 +03:00
Brian Cloutier 74ce4faab5 Make multi_cluster_management test more stable 2017-08-08 11:18:31 +03:00
Brian Cloutier e7846ba7d1 Allow metadata sync functions on secondaries
{start,stop}_metadata_sync_to_node now toggle the hasMetadata flag when
run on secondaries but don't attempt to actually sync any metadata.
2017-08-07 18:46:51 +03:00
Marco Slot 4cc7c36596 Simplify metadata lock acquisition for DML 2017-08-07 15:36:58 +02:00
Marco Slot aa7ca81548 Execute UPDATE/DELETE statements with 0 shards 2017-08-07 15:36:58 +02:00
Marco Slot bac60bb64f Function evaluation descends into expression trees 2017-08-06 19:53:05 +02:00
Brian Cloutier 37985de85e master_disable_node no longer crashes when given a non-existant node 2017-08-04 11:14:54 +03:00
Hadi Moshayedi 8229a64fe8 Remove distributed tables' dependency on distribution key columns. (#1527)
This change removes distributed tables' dependency on distribution key columns. We already check that we cannot drop distribution key columns in ErrorIfUnsupportedAlterTableStmt() at multi_utility.c, so we don't need to have distributed table to distribution key column dependency to avoid dropping of distribution key column.

Furthermore, having this dependency causes some warnings in pg_dump --schema-only (See #866), which are not desirable.

This change also adds check to disallow drop of distribution keys when citus.enable_ddl_propagation is set to false. Regression tests are updated accordingly.
2017-08-03 10:07:04 -04:00
Murat Tuncer fa18899cf9 Remove serialization/deserialization of multiplan node (#1477)
introduces copy functions for Citus MultiPlan nodes.
uses ExtensibleNode mechanism to store MultiPlan data
drops serialiazation of MultiPlans
2017-08-02 08:24:00 +03:00
Burak Yucesoy 37b200a52e Fix broken isolation tests
We try to run our isolation tests paralles as much as possible. In
some of those isolation tests we used same table name which causes
problem while running them in paralles. This commit changes table
names in those tests to ensure tests can run in parallel.
2017-07-31 11:11:49 +03:00
Burak Yucesoy 7769f1d012 Refactor distributed table creation logic
This commit is preperation for introducing distributed partitioned
table support. We want to clean and refactor some code in distributed
table creation logic so that we can handle partitioned tables in more
robust way.
2017-07-31 11:11:23 +03:00
Murat Tuncer 520d74b96d Add a regression test for citus.max_task_string_size (#1524) 2017-07-28 10:49:09 -07:00
Brian Cloutier 7d8bcb6a88 These tests sometimes deadlock on travis 2017-07-28 16:02:43 +03:00
Brian Cloutier b20a086a8f master_activate_node UDF also returns noderole 2017-07-28 16:02:43 +03:00
Murat Tuncer 26f020dc6e Make maxTaskStringSize configurable (#1501)
maxTaskStringSize determines the size of worker query string.
It was originally hard coded to a specific value. This has caused
issues at some users. Since it determines initial shared memory
allocation, we did not want to set it to an arbitrary higher number.
Instead made it configurable.

This commit introduces a new GUC variable max_task_string_size

Changes in this variable requires restart to be in effect.
2017-07-27 11:39:12 -07:00
Onder Kalaci 6132d17481 Convert global wait edges to adjacency list
In this commit, we add ability to convert global wait edges
into adjacency list with the following format:
 [transactionId] = [transactionNode->waitsFor {list of waiting transaction nodes}]
2017-07-27 19:53:51 +03:00
Murat Tuncer 8729b7d55a Use cstore_table_size function to determine cstore table size (#1521)
pg_table_size/pg_relation_size variants always return 0 for
cstore tables. We should be using cstore_table_size function
for cstore_tables.
2017-07-27 09:02:07 -07:00
Brian Cloutier 32e16ffe02 Give isolation tester ability to see locks on workers 2017-07-26 18:43:04 +03:00
Eren Başak a12f1980de Add Progress Tracking Infrastructure
This change adds a general purpose infrastructure to log and monitor
process about long running progresses. It uses
`pg_stat_get_progress_info` infrastructure, introduced with PostgreSQL
9.6 and used for tracking `VACUUM` commands.

This patch only handles the creation of a memory space in dynamic shared
memory, putting its info in `pg_stat_get_progress_info`, fetching the
progress monitors on demand and finalizing the progress tracking.
2017-07-26 14:12:15 +03:00
Marco Slot 80ea233ec1 Add function for dumping global wait edges 2017-07-25 16:52:32 +02:00
Marco Slot 81198a1d02 Add function for dumping local wait edges 2017-07-25 16:52:32 +02:00
Onder Kalaci 58faffa42b Fix bug on error check for assigning distributed transaction id
to a backend that has already been assigned a transaction.
2017-07-25 14:58:07 +03:00
Marco Slot 5923334114 Add transaction recovery regression tests 2017-07-24 20:44:38 +02:00
Marco Slot 3d7f79127d Do not release locks in LogTransactionRecord 2017-07-24 20:44:38 +02:00
Brian Cloutier 88702ca58a node_metadata takes out more sane locks
- Never release locks
- AddNodeMetadata takes ShareRowExclusiveLock so it'll conflict with the
  trigger which prevents multiple primary nodes.
- ActivateNode and SetNodeState used to take AccessShareLock, but they
  modify the table so they should take RowExclusiveLock.
- DeleteNodeRow and InsertNodeRow used to take AccessExclusiveLock but
  only need RowExclusiveLock.
2017-07-24 11:57:46 +03:00
Brian Cloutier ec99f8f983 Add nodeRole column
- master_add_node enforces that there is only one primary per group
- there's also a trigger on pg_dist_node to prevent multiple primaries
  per group
- functions in metadata cache only return primary nodes
- Rename ActiveWorkerNodeList -> ActivePrimaryNodeList
- Rename WorkerGetLive{Node->Group}Count()
- Refactor WorkerGetRandomCandidateNode
- master_remove_node only complains about active shard placements if the
  node being removed is a primary.
- master_remove_node only deletes all reference table placements in the
  group if the node being removed is the primary.
- Rename {Node->NodeGroup}HasShardPlacements, this reflects the behavior it
  already had.
- Rename DeleteAllReferenceTablePlacementsFrom{Node->NodeGroup}. This also
  reflects the behavior it already had, but the new signature forces the
  caller to pass in a groupId
- Rename {WorkerGetLiveGroup->ActivePrimaryNode}Count
2017-07-24 11:57:46 +03:00
Brian Cloutier e6c375eb81 Tiny refactor to master_create_empty_shard 2017-07-24 11:57:46 +03:00
Brian Cloutier ee270b65d7 make WorkerGetNodeWithName a static function 2017-07-24 11:57:46 +03:00
Marco Slot 601b17d544 Use distributed transaction number in 2PC identifiers 2017-07-21 17:36:33 +02:00
Marco Slot 18a6e478af Fix typo in GetCurrentDistributedTransctionId 2017-07-21 17:36:33 +02:00
Brian Cloutier 7f1343103e Fix PG 10 build, UNBOUNDED partitions now have different syntax
Update code and tests to match the changes made in pg's d363d42
2017-07-21 14:30:11 +03:00
Brian Cloutier 74dd5bb281 Fix crash when removing an inactive node 2017-07-20 18:55:40 +03:00
Hadi Moshayedi 953df34d22 Explicit switch/case fall-throughs to avoid compiler warnings.
GCC 7 added `-Wimplicit-fallthrough` to warn for not explicitly specified switch/case fall-throughs.

According to https://gcc.gnu.org/gcc-7/changes.html, to suppress that warning we could either use `__attribute__(fallthrough)`, which didn't seem to work for earlier GCC versions, or a `/* fallthrough */` comment just before the following `case`.

Previously Citus code had the fall-through comments inside the brackets, which didn't seem to suppress the warning. Putting a `/* fallthrough */` comment outside the brackets and right before the `case` fixes the problem.
2017-07-19 11:41:59 -04:00
Onder Kalaci 3369f3486f Introduce distributed transaction ids
This commit adds distributed transaction id infrastructure in
the scope of distributed deadlock detection.

In general, the distributed transaction id consists of a tuple
in the form of: `(databaseId, initiatorNodeIdentifier, transactionId,
timestamp)`.

Briefly, we add a shared memory block on each node, which holds some
information per backend (i.e., an array `BackendData backends[MaxBackends]`).
Later, on each coordinated transaction, Citus sends
`SELECT assign_distributed_transaction_id()` right after `BEGIN`.
For that backend on the worker, the distributed transaction id is set to
the values assigned via the function call.

The aim of the above is to correlate the transactions on the coordinator
to the transactions on the worker nodes.
2017-07-18 15:01:42 +03:00
velioglu 6ea15fbb25 Make create_distributed_table transactional 2017-07-18 12:35:40 +03:00
Marco Slot fd72cca6c8 Use predictable placement IDs in regression test output 2017-07-17 13:44:29 +03:00
Onder Kalaci ce8edd88f7 Apply regression test changes that are due to PostgreSQL 10 changes that have recently changed 2017-07-14 13:22:12 +03:00
Brian Cloutier 72d8d2429b Add a test for upgrading shard placements 2017-07-12 14:18:27 +02:00
Brian Cloutier ee4edc498f Don't release locks early in metadata functions 2017-07-12 14:18:27 +02:00
Brian Cloutier f40f03270a Fix locking in ReadWorkerNodes() 2017-07-12 14:18:27 +02:00
Brian Cloutier 7ad95b53d2 Rename pg_dist_shard_placement -> pg_dist_placement
Comes with a few changes:

- Change the signature of some functions to accept groupid
  - InsertShardPlacementRow
  - DeleteShardPlacementRow
  - UpdateShardPlacementState

- NodeHasActiveShardPlacements returns true if the group the node is a
  part of has any active shard placements

- TupleToShardPlacement now returns ShardPlacements which have NULL
  nodeName and nodePort.

- Populate (nodeName, nodePort) when creating ShardPlacements
- Disallow removing a node if it contains any shard placements

- DeleteAllReferenceTablePlacementsFromNode matches based on group. This
  doesn't change behavior for now (while there is only one node per
  group), but means in the future callers should be careful about
  calling it on a secondary node, it'll delete placements on the primary.

- Create concept of a GroupShardPlacement, which represents an actual
  tuple in pg_dist_placement and is distinct from a ShardPlacement,
  which has been resolved to a specific node. In the future
  ShardPlacement should be renamed to NodeShardPlacement.

- Create some triggers which allow existing code to continue to insert
  into and update pg_dist_shard_placement as if it still existed.
2017-07-12 14:17:31 +02:00
Brian Cloutier fe53fd4a8e Remove functions created just for unit testing
These functions are holdovers from pg_shard and were created for unit
testing c-level functions (like InsertShardPlacementRow) which our
regression tests already test quite effectively. Removing because it
makes refactoring the signatures of those c-level functions
unnecessarily difficult.

- create_healthy_local_shard_placement_row
- update_shard_placement_row_state
- delete_shard_placement_row
2017-07-12 14:16:24 +02:00
Brian Cloutier 0b64bb1092 Fix typo in comment in CachedRelationLookup 2017-07-12 14:16:24 +02:00
Brian Cloutier 385d9cbbb7 Ignore generated multi_behavioral_analytics_create_table test files 2017-07-12 14:16:24 +02:00
Marco Slot bf8377082c Use consistent placement IDs in mulity_modyfing_xactstest 2017-07-12 14:16:23 +02:00
Brian Cloutier fd8c142530 Remove unused line, @arguments was set but never used 2017-07-12 13:46:27 +02:00
Marco Slot 9f7e4769e2 Clarify placement connection error messages 2017-07-12 11:59:19 +02:00
Marco Slot d3785b97c0 Remove XactModificationLevel distinction between DML and multi-shard 2017-07-12 11:59:19 +02:00
Marco Slot 710fe8666b Use GetPlacementListConnection for router DML 2017-07-12 11:26:23 +02:00
Marco Slot 29f21fea59 Use GetPlacementListConnection for multi-shard commands 2017-07-12 11:26:22 +02:00
Marco Slot 01c9b1f921 Use GetPlacementListConnection for router SELECTs 2017-07-12 11:26:22 +02:00
Marco Slot 63676f5d65 Allow choosing a connection for multiple placements with GetPlacementListConnection 2017-07-12 11:26:22 +02:00
Jason Petersen 9018e698ec
Indentation cleanup
Uncrustify 0.65 appears to have changed some defaults, resulting in
breakages for those of us who have already upgraded; Travis still uses
Uncrustify 0.64, but these changes work with both versions (assuming
appropriately updated config), so this should permit use of either
version for the time being.
2017-07-11 15:59:28 -06:00
Jason Petersen d896fe7995
Add some test outputs to gitignore
These were bothering me.
2017-07-11 15:37:32 -06:00
Burak Yucesoy a15b3c6df2 Add tests for concurrent INSERT and VACUUM behaviour 2017-07-10 15:46:48 +03:00
Burak Yucesoy c8b9e4011b Remove LockRelationDistributionMetadata function 2017-07-10 15:46:37 +03:00
Burak Yucesoy cb6070c720 Use ShareUpdateExclusiveLock instead ShareLock in VACUUM
Before this change, we used ShareLock to acquire lock on distributed tables while
running VACUUM. This makes VACUUM and INSERT block each other. With this change we
changed lock mode from ShareLock to ShareUpdateExclusiveLock, which does not conflict
with the locks INSERT acquire.
2017-07-10 15:46:19 +03:00
Murat Tuncer 2a4eada150 Replace duplicate code and call check_functions_in_node (#1478)
MasterIrreducibleExpressionWalker has a copied code from
function check_functions_in_node() which was available with
PG 9.6+. Now PG 9.5 support is dropped we can remove
duplicate code and directly call check_functions_in_node().
2017-07-07 10:19:33 +03:00
Marco Slot 31debc96e3 Handle implicit casts in prepared INSERTs 2017-07-06 16:17:35 +02:00
Andres Freund d76b093185 Add tests for statement cancellation. 2017-07-04 14:46:03 -07:00
Andres Freund 3461244539 Don't wait for statement completion when aborting coordinated transaction.
Previously we used ForgetResults() in StartRemoteTransactionAbort() -
that's problematic because there might still be an ongoing statement,
and this causes us to wait for its completion.  That e.g. happens when
a statement running on the coordinator is cancelled.
2017-07-04 14:46:03 -07:00
Andres Freund 0d791f6740 Cancel statements when closing connection at transaction end.
That's important because the currently running statement on a worker
might continue to hold locks and consume resources, even after the
connection is closed.  Unfortunately postgres will only notice closed
connections when reading from / writing to the network. That might
only happen much later.
2017-07-04 14:46:03 -07:00
Andres Freund be8677f926 Add NonblockingForgetResults().
This is very similar to ForgetResults() except that no network IO is
performed. Primarily useful in error handling cases.
2017-07-04 14:46:03 -07:00
Andres Freund 24153fae5d Add ShutdownConnection() which cancels statement before closing connection.
That's primarily useful in error cases, where we want to make sure
locks etc held by commands running on workers are released promptly.
2017-07-04 14:46:03 -07:00
Andres Freund 75a7ddea0d Always use connections in non-blocking mode.
Now that there's no blocking libpq callers left, default to using
non-blocking mode in connection_management.c.  This has two
advantages:
1) Blockiness doesn't have to frequently be reset, simplifying code
2) Prevents accidental use of blocking libpq functions, since they'll
   frequently return 'need IO'
2017-07-04 14:46:03 -07:00
Andres Freund 90a2d13a64 Move multi_copy.c to interrupt aware libpq wrappers. 2017-07-04 14:46:03 -07:00
Andres Freund 21c25abbb1 Move multi_client_executor to interrupt aware libpq wrappers. 2017-07-04 12:38:52 -07:00
Andres Freund ddb0651967 Move citus tools to interrupt aware libpq wrappers. 2017-07-04 12:38:52 -07:00
Andres Freund c674bc8640 Add interrupt aware PQputCopy{End,Data} wrappers. 2017-07-04 12:38:52 -07:00
Andres Freund b7f9679ccc Move interrupt handling code from GetRemoteCommandResult to FinishConnectionIO.
Nearby commits will add additional interrupt handling functions, this
way we don't have to duplicate the code.
2017-07-04 12:38:52 -07:00
Andres Freund ec0ed677e3 Fix copy & pasto in WARNING message. 2017-07-04 12:38:52 -07:00
Andres Freund 3dedeadb5e Fix memory leak in RemoteFinalizedShardPlacementList(). 2017-07-04 12:38:52 -07:00
Andres Freund c161c2fbe3 Fix some trailing whitespace. 2017-07-04 12:38:52 -07:00
Marco Slot 04fe3f03f6 Change implementation of shard_name UDF to get schema-qualified shard name 2017-07-04 10:49:40 +03:00
Marco Slot da47a03b18 Move INSERT ... SELECT planning logic into one place 2017-06-29 15:03:14 +02:00
Onder Kalaci 5f3f1d75a3 Add some utility functions for partitioned tables
This commit is intended to be a base for supporting declarative partitioning
on distributed tables. Here we add the following utility functions and their
unit tests:

  * Very basic functions including differnentiating partitioned tables and
    partitions, listing the partitions
  * Generating the PARTITION BY (expr) and adding this to the DDL events
    of partitioned tables
  * Ability to generate text representations of the ranges for partitions
  * Ability to generate the `ALTER TABLE parent_table ATTACH PARTITION
    partition_table FOR VALUES value_range`
  * Ability to apply add shard ids to the above command using
    `worker_apply_inter_shard_ddl_command()`
  * Ability to generate `ALTER TABLE parent_table DETACH PARTITION`
2017-06-28 09:39:55 +03:00
Andres Freund 535416384c Remove version check from pg_regress_multi.pl
The check is not necessary anymore after f59cf2b818.
2017-06-26 18:07:43 -07:00
Andres Freund 9d7f33be2a Remove 9.5 references from comments in schedule files.
Replace with version-less reference, no point in repeating this for
every release.
2017-06-26 18:04:32 -07:00
Andres Freund 2dfd55070c Remove 9.5 regression test output files. 2017-06-26 12:17:46 -07:00
Andres Freund dc3997c3b8 Remove 9.5 related node wrappers.
Now that all branches support the extensible node infrastructure, we
don't need our wrappers anymore.
2017-06-26 08:46:32 -07:00
Andres Freund b96ba9b490 Fix code only enabled for 9.5.
There's still supporting wrappers used, a subsequent commit will
remove those.

This also removes the already unused tuplecount_t define.
2017-06-26 08:46:32 -07:00
Andres Freund 60c28ce7a6 Remove 9.5 specific C files. 2017-06-26 08:46:32 -07:00
Jason Petersen 2204da19f0 Support PostgreSQL 10 (#1379)
Adds support for PostgreSQL 10 by copying in the requisite ruleutils
and updating all API usages to conform with changes in PostgreSQL 10.
Most changes are fairly minor but they are numerous. One particular
obstacle was the change in \d behavior in PostgreSQL 10's psql; I had
to add SQL implementations (views, mostly) to mimic the pre-10 output.
2017-06-26 02:35:46 -06:00
Andres Freund 4a3b2de4c5 Add some tests checking that maintenance daemon gets started.
The 2nd database one is a bit slow, but also shows something
important, so we might want to keep it?
2017-06-23 11:53:39 -07:00
Andres Freund c3b7c5dc33 Introduce per-database maintenance process.
This will be used for deadlock detection, prepared transaction
recovery amongst others, but currently is just idling around.
2017-06-23 11:53:39 -07:00
Andres Freund 3483bb99eb Minimal infrastructure for per-backend citus initialization. 2017-06-23 11:20:10 -07:00
Andres Freund 1691f780fd Force cache invalidation machinery to be initialized earlier.
Previously it was not guaranteed that invalidations were registered
after creating the extension, only if the extension was used
afterwards.
2017-06-23 11:20:10 -07:00
Andres Freund f645dca593 Centralized metadata_cache cache variables into one struct, to avoid missing resets.
E.g. extensionOwner was already missed.
2017-06-23 11:20:10 -07:00
Marco Slot 04e4b7d82a Fix spuriously failing regression test 2017-06-23 10:06:15 +02:00
Marco Slot 6cafbf9b66 Add weird column name to create_distributed_table test 2017-06-22 16:27:39 +02:00
Marco Slot a6f42e4948 Clarify error message when copying NULL value into table 2017-06-22 15:48:24 +02:00
Marco Slot 2f8ac82660 Execute INSERT..SELECT via coordinator if it cannot be pushed down
Add a second implementation of INSERT INTO distributed_table SELECT ... that is used if
the query cannot be pushed down. The basic idea is to execute the SELECT query separately
and pass the results into the distributed table using a CopyDestReceiver, which is also
used for COPY and create_distributed_table. When planning the SELECT, we go through
planner hooks again, which means the SELECT can also be a distributed query.

EXPLAIN is supported, but EXPLAIN ANALYZE is not because preventing double execution was
a lot more complicated in this case.
2017-06-22 15:46:30 +02:00
Marco Slot 2cd358ad1a Support quoted column-names in COPY logic 2017-06-22 15:45:57 +02:00
Marco Slot 155db4d913 Simplify router planner call path 2017-06-22 15:45:57 +02:00
Murat Tuncer 0c4bf2d943 Remove fall back to select if poll is not available (#1466)
poll is supported on all relevant systems, there is no
need to have a fall back mechanism to use select()
2017-06-22 12:11:18 +03:00
Jason Petersen 294aeff2ed
Don't call PostProcessUtility for local commands
It is intended only to aid in processing of distributed DDL commands,
but as written could execute during local CREATE INDEX CONCURRENTLY
commands.
2017-06-19 15:56:03 -06:00
velioglu a17ab6408a Delete ExecuteRemoteCommand function 2017-06-15 17:11:19 +03:00
velioglu 173fe137af Convert DropShardsFromWorker to the new connection API 2017-06-15 15:24:06 +03:00
velioglu d7b68e5647 Convert TableDDLCommandList function to the new connection API 2017-06-14 17:29:58 +03:00
velioglu 0aa9572e18 Convert RemoteTableOwner function to the new connection API 2017-06-14 17:29:58 +03:00
velioglu 7fe29aad4c Convert worker_fetch_foreign_file to new connection API 2017-06-14 17:29:58 +03:00
velioglu 43d2cdbd35 Convert DistributedTableSizeOnWorker function to new connection API 2017-06-14 17:29:58 +03:00
Marco Slot 56876596d5 Add support for unlogged distributed tables 2017-06-14 13:50:00 +02:00
velioglu a1ea29ec2b Use placement connection to drop shards instead of node connection 2017-06-14 14:14:59 +03:00
Marco Slot 70abfd29d2 Allow COPY after a multi-shard command
This change removes the XactModificationLevel check at the start of COPY
that was made redundant by consistently using GetPlacementConnection.
2017-06-09 13:54:58 +02:00