Commit Graph

4710 Commits (bd5544fdc5829d2026073cd67babce6910844b50)

Author SHA1 Message Date
SaitTalhaNisanci ebda3eff61
read database name inside the function (#3730) 2020-04-09 13:11:13 +03:00
SaitTalhaNisanci 233e4a24d1
use local execution within transaction block (#3714)
* use local executon when in a transaction block

When we are inside a transaction block, there could be other methods
that need local execution, therefore we will use local execution in a
transaction block.

* update test outputs with transaction block local execution

* add a test to verify we dont leak intermediate schemas
2020-04-09 12:41:58 +03:00
SaitTalhaNisanci fa88046ce1
test that we don't leak intermediate schemas (#3737)
* test that we don't leak intermediate schemas

We have tests to make sure that we don't intermediate any intermediate
files, tables etc but we don't test if we are leaking schemas. It makes
sense to test this as well.

* remove all repartition schemas in case of error

This solution is not an ideal one but it seems to be doing the job.
We should have a more generic solution for the cleanup but it seems that
putting the cleanup in the abort handler is dangerous and it was
crashing.
2020-04-09 12:17:41 +03:00
SaitTalhaNisanci 362d72853c
return early in ExecuteTaskListExtended (#3738)
It is possible to return an error in ExecuteTaskListExtended after
performing local execution with the current structure. However there is
no point in execution the local tasks if we are going to return an error
later. So the local execution is moved after the error check.
2020-04-09 10:10:49 +03:00
Hadi Moshayedi 117233c1e0
Merge pull request #3736 from citusdata/remove_todo
Remove todo from reference_table_utils
2020-04-08 12:54:48 -07:00
Hadi Moshayedi cd877f3fdd
Merge pull request #3637 from citusdata/defer_reference_table_replication_copy
Defer reference table replication
2020-04-08 12:54:04 -07:00
Hadi Moshayedi 9b8802ba2d Remove todo from reference_table_utils 2020-04-08 12:46:55 -07:00
Hadi Moshayedi dda53a0bba GUC for replicate reference tables on activate. 2020-04-08 12:42:45 -07:00
Hadi Moshayedi c168a53ebc Tests for replicate_reference_tables 2020-04-08 12:41:36 -07:00
Hadi Moshayedi acfa850c38 Make multi_replicate_reference_table check-base friendly 2020-04-08 12:41:36 -07:00
Hadi Moshayedi 0758a81287 Prevent reference tables being dropped when replicating reference tables 2020-04-08 12:41:36 -07:00
Marco Slot 924cd7343a Defer reference table replication to shard creation time 2020-04-08 12:41:36 -07:00
Philip Dubé 76a8a3c7c9
Merge pull request #3719 from citusdata/stricter-trigger-checks
Verify trigger relation before reading old/new tuples
2020-04-07 16:18:36 +00:00
Philip Dubé 26797bfb94 Verify trigger relation before reading old/new tuples
master_dist_placement_cache_invalidate: bail when triggering on pg_dist_shard_placement
2020-04-07 15:39:31 +00:00
Önder Kalacı 9fb83d6e5d
Merge pull request #3703 from citusdata/get_rid_of_side_channel
Move connection establishment for intermediate results after query execution
2020-04-07 17:21:30 +02:00
Önder Kalacı 70012dfd33 Do not error when an intermediate file does not exit (#3707)
When the file does not exist, it could mean two different things.
First -- and a lot more common -- case is that a failure happened
in a concurrent backend on the same distributed transaction. And,
one of the backends in that transaction has already been roll
backed, which has already removed the file. If we throw an error
here, the user might see this error instead of the actual error
message. Instead, we prefer to WARN the user and pretend that the
file has no data in it. In the end, the user would see the actual
error message for the failure.

Second, in case of any bugs in intermediate result broadcasts,
we could try to read a non-existing file. That is most likely
to happen during development. Thus, when asserts enabled, we throw
an error instead of WARNING so that the developers cannot miss.
2020-04-07 17:06:55 +02:00
Onder Kalaci a695b44ce9 Add new regression tests 2020-04-07 17:06:55 +02:00
Onder Kalaci 4b3d17f466 Make sure that tests are not failing randomly 2020-04-07 17:06:55 +02:00
Onder Kalaci 4f7c902c6c Move connection establishment for intermediate results after query execution
When we have a query like the following:

```SQL
WITH a AS (SELECT * FROM foo LIMIT 10) SELECT max(x) FROM a JOIN bar 2 USING (y);
```

Citus currently opens side channels for doing the
	`COPY "1_1"` FROM STDIN (format 'result')

before starting the execution of
	`SELECT * FROM foo LIMIT 10`

Since we need at least 1 connection per worker to do
	`SELECT * FROM foo LIMIT 10`
We need to have 2 connections to worker in order to broadcast the results.

However, we don't actually send a single row over the side channel until the
execution of `SELECT * FROM foo LIMIT 10` is completely done (and connections
unclaimed) and the results are written to a tuple store. We could actually
reuse the same connection for doing the `COPY "1_1"` FROM STDIN (format 'result').

This also fixes the issue that Citus doesn't obey `citus.max_adaptive_executor_pool_size`
when the query includes an intermediate result.
2020-04-07 17:06:55 +02:00
Onder Kalaci 721daec9a5 Move the logic that initilize connections/local files into a function 2020-04-07 17:06:55 +02:00
Onder Kalaci 9b29a32d7a Remove all references for side channel connections
We don't need any side channel connections. That is actually
problematic in the sense that it creates extra connections.
Say, citus.max_adaptive_executor_pool_size equals to 1, Citus
ends up using one extra connection for the intermediate results.
Thus, not obeying citus.max_adaptive_executor_pool_size.

In this PR, we remove the following entities from the codebase
to allow further commits to implement not requiring extra connection
for the intermediate results:

- The connection flag REQUIRE_SIDECHANNEL
- The function GivePurposeToConnection
- The ConnectionPurpose struct and related fields
2020-04-07 17:06:55 +02:00
Hanefi Onaldi e31dcff178
Merge pull request #3666 from citusdata/size-functions-without-locks
Remove metadata locks from size functions
2020-04-07 18:02:39 +03:00
Hanefi Onaldi 1d22d0c2ff
Remove metadata locks from size functions 2020-04-07 17:37:15 +03:00
SaitTalhaNisanci 0430b568be
explicitly return false if transaction connected to local node (#3715)
* explicitly return false if transaction connected to local node

* not set TransactionConnectedToLocalGroup if we are writing to a file

We use TransactionConnectedToLocalGroup to prevent local execution from
happening as that might cause visibility problems. As files are visible
to all transactions, we shouldn't set this variable if we are writing to
a file.
2020-04-07 17:30:34 +03:00
Marco Slot 225adbc7ac
Merge pull request #3720 from citusdata/fix/intermediate_result_pruning
Simplify and fix issues in intermediate result pruning
2020-04-07 11:20:07 +02:00
Marco Slot 2632343f64 Fix intermediate result pruning for INSERT..SELECT 2020-04-07 11:07:49 +02:00
Marco Slot 84672c3dbd Simplify intermediate result pruning logic 2020-04-07 10:53:29 +02:00
SaitTalhaNisanci a710b3cdc5
fix null tupleStoreState case in ExecuteLocalTaskListExtended (#3711)
In case we don't care about the tupleStoreState in
ExecuteLocalTaskListExtended, it could be passed as null. In that case
we will get a seg error. This changes it so that a dummy tuple store
will be created when it is null.

Do not use local execution in ExecuteTaskListOutsideTransaction.
As we are going to run the tasks outside transaction, we shouldn't use local execution.
However, there is some problem when using local execution related to
repartition joins, when we solve that problem, we can execute the tasks
coming to this path with local execution.

Also logging the local command is simplified.

normalize job id in worker_hash_partition_table in test outputs.
2020-04-07 11:47:09 +03:00
SaitTalhaNisanci a369f9001d
fix incorrect groupid or nodeid (#3710)
For shardplacements, we were setting nodeid, nodename, nodeport and
nodegroup manually. This makes it very error prone, and it seems that we
already forgot to set some of them. This would mean that they would have
their default values, e.g group id would be 0 when its group id is not
0.

So the implication is that we would have inconsistent worker metadata.

A new method is introduced, and we call the method to set those fields
now, so that as long as we call this method, we won't be setting
inconsistent metadata.

It probably makes sense to have a struct for these fields. We already
have NodeMetadata but it doesn't have nodename or nodeport. So that
could be done over another refactor to make things simpler.
2020-04-07 11:14:14 +03:00
Philip Dubé ec734a643b
Merge pull request #3722 from citusdata/optimistic-duplicate-grouping
Duplicate grouping on worker whenever possible
2020-04-06 21:31:23 +00:00
Philip Dubé 4860e11561 Duplicate grouping on worker whenever possible
This is possible whenever we aren't pulling up intermediate rows

We want to do this because this was done in 9.2,
some queries rely on the performance of grouping causing distinct values

This change was introduced when implementing window functions on coordinator
2020-04-06 18:51:30 +00:00
Philip Dubé 6a6d5af8a3
Merge pull request #3403 from citusdata/fix-rollback-savepoint-hang
Check connections from connection_placement before polling
2020-04-06 18:04:03 +00:00
Philip Dubé b01bae5937 Check connections from connection_placement before polling 2020-04-06 17:45:44 +00:00
SaitTalhaNisanci cd3e499834
not log in debug level in null parameters (#3718)
The purpose of null_parameters is to make sure that citus doesn't crash
with null parameters. (The related issue is #3493.) The logs in this
file are not that important and they are flaky. The flakiness is related
to postgres part as well so it is hard to reproduce them. Therefore it
makes sense to decrease the log level.
2020-04-06 17:59:46 +03:00
SaitTalhaNisanci 3d3605be80
simplify vacuum test and fix the flakiness (#3704)
look at sent commands to simplify complex logic in vacuum test

also normalize connection id as that can differ when we don't have to
choose a specific connection.
2020-04-03 21:39:54 +03:00
Onur Tirtir a0f95c5b70
Merge pull request #3701 from citusdata/refactor/planner-ref-rte
Do not traverse query tree one more time in distributed planner
2020-04-03 18:40:10 +03:00
Onur Tirtir 4c95ad1579 do not traverse parse tree in distributed planner one more time 2020-04-03 18:24:48 +03:00
Onur Tirtir abdabbedb2 refactor distributed_planner.c 2020-04-03 18:24:41 +03:00
Onur Tirtir 13a35c6813 implement GetOnlyShardOidOfReferenceTable and some refactor in shard_uitls 2020-04-03 18:24:13 +03:00
Jelte Fennema 459a4829ae
Fix isolation tests on OSX (#3706)
* Don't print out comments in make output

* Remove empty lines with sed
2020-04-03 16:28:06 +02:00
SaitTalhaNisanci 32156dbf5c
fix flaky log statement in null_parameters (#3705)
It seems that sometimes the pruning is deferred and sometimes not with
this statement. What we care in this test is to see that it doesn't
crash. I think we don't care about the log statement for this line. So
it makes sense to not log this statement, and care about the result.
2020-04-03 17:01:59 +03:00
Hanefi Onaldi 7e682cd5e8
Merge pull request #3700 from citusdata/bump-migration-version
Remove migration paths to 9.3-1, introduce 9.3-2
2020-04-03 13:02:58 +03:00
Hanefi Önaldı d1223bd6cc
Remove migration paths to 9.3-1, introduce 9.3-2 2020-04-03 12:50:45 +03:00
SaitTalhaNisanci 710970407f
not wait forever in multi_extension test (#3702) 2020-04-03 12:21:02 +03:00
SaitTalhaNisanci 659283c9a7
fix multi utilities vacuum test (#3699) 2020-04-03 11:50:00 +03:00
Marco Slot 9b26dcaf31
Merge pull request #3680 from citusdata/fix/nextval
Evaluate nextval in the target list on the coordinator
2020-04-02 16:23:32 +02:00
Marco Slot fd8cdb92f4 Evaluate nextval in the target list on the coordinator 2020-04-02 02:53:19 +02:00
SaitTalhaNisanci d80baa3557
Merge pull request #3636 from citusdata/enh/localShardCreationExecution
add local shard creation support
2020-04-01 18:30:50 +03:00
SaitTalhaNisanci df88ab71b6 normalize assign_distributed_transaction_id in tests 2020-04-01 18:23:16 +03:00
SaitTalhaNisanci 0aebd78ea7 use localExecution in ExecuteTaskListExtended
ExecuteTaskListExtended is the common method for different codepaths,
and instead of writing separate local execution logics in different
codepaths, it makes more sense to have the logic here. We still need to
do some refactoring, this is an initial step.

After this commit, we can run create shard commands locally. There is a
special case with shard creation commands. A create shard command might
have a concatenated query string, however local execution did not know
how to execute a task with multiple query strings. This is also
implemented in this commit. We go over each query in the concatenated
query string and plan/execute them one by one.

A more clean solution to this would be to make sure that each task has a
single query. We currently cannot do that because we need to ensure the
task dependencies. However, it would make sense to do that at some point
and it would simplify the code a lot.
2020-04-01 18:23:16 +03:00