Commit Graph

6078 Commits (1b66403cb9d7b5eac7a1a153f1d23a41daa52420)

Author SHA1 Message Date
Marco Slot 5a9d31f136
Fix union (all) pushdown issue (#3306)
Fix union (all) pushdown issue
2020-01-02 13:56:06 +01:00
Marco Slot ba39d72fe1 Fix incorrect union all pushdown issue 2020-01-01 09:03:50 +01:00
Hanefi Onaldi 7a909fc807
Add changelog entry for 9.1.2 2019-12-30 11:33:10 +03:00
Jelte Fennema 0cd5d6ac49
Support any inner join on a reference table (#3323)
This PR works by doing two things:
1. Expand the notion of a join condition to any expression that contains
   columns from two or more tables.
2. Support cartesian products on reference tables.

Cartesian products on reference tables are considered in the join order planner
as the least desirable join (except for normal cartesian products). That way
they will be done at the end of the join. This is preferable since the
cartesian product multiplies the rows. By doing it at the end at least these
multiplications of rows will not be sent over the network when doing
repartitioning, only when sending to the master.

Fixes #3079
Fixes #3198
2019-12-27 15:14:50 +01:00
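The join-order rule described in 0cd5d6ac49 above can be illustrated with a minimal Python sketch. The rule names, their numeric ranks, and `order_joins` are hypothetical, chosen only for illustration; they are not the Citus planner's actual identifiers or algorithm. The sketch only shows the idea that a cartesian product on a reference table ranks below every ordinary join but above a normal cartesian product, so it is applied near the end of the join order.

```python
from enum import IntEnum


class JoinRule(IntEnum):
    """Hypothetical join rules; a lower value means a more desirable join."""
    LOCAL_OR_REFERENCE_JOIN = 1
    REPARTITION_JOIN = 2
    CARTESIAN_PRODUCT_REFERENCE_JOIN = 3  # least desirable of the supported joins
    CARTESIAN_PRODUCT = 4                 # normal cartesian product ranks last


def order_joins(candidates):
    """Order (table_name, JoinRule) pairs so the most desirable joins come first.

    sorted() is stable, so candidates with the same rule keep their input order.
    """
    return sorted(candidates, key=lambda candidate: candidate[1])


candidates = [
    ("ref_b", JoinRule.CARTESIAN_PRODUCT_REFERENCE_JOIN),
    ("dist_a", JoinRule.LOCAL_OR_REFERENCE_JOIN),
    ("dist_c", JoinRule.REPARTITION_JOIN),
]
ordered = order_joins(candidates)
```

Placing the row-multiplying join last means any repartitioning in the earlier joins moves the smaller, not yet multiplied row set over the network.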
Jelte Fennema cf88bdf833 Add tests for complex joins on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 3a042e4611 Allow cartesian products on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 61e2501645 Make any expression with two or more tables a join expression 2019-12-27 15:05:51 +01:00
Jelte Fennema 4233cd0d9d Allow non equi joins on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 7642928be1
Makefile fix DESTDIR together with cleanup (#3342)
This should fix this build issue: redmine.postgresql.org/issues/5032
2019-12-27 10:34:57 +01:00
Philip Dubé e91755f73c
Merge pull request #3307 from citusdata/group_by_speedup
Do not repeat GROUP BY distribution_column on coordinator
2019-12-25 01:39:56 +00:00
Marco Slot b21b6905ae Do not repeat GROUP BY distribution_column on coordinator
Allow arbitrary aggregates to be pushed down in these scenarios
2019-12-25 01:33:41 +00:00
Philip Dubé 11368451f4
Merge pull request #3344 from citusdata/fix-extension-already-exists-test
Fix tests when hll/topn installed
2019-12-24 21:45:36 +00:00
Philip Dubé a6ffcab59d CREATE EXTENSION is propagated now 2019-12-24 21:04:37 +00:00
Marco Slot ee71b24538
Fix inconsistent shard metadata issue (#3334)
Fix inconsistent shard metadata issue
2019-12-24 13:28:17 +01:00
Hadi Moshayedi 10605f8a26
Merge pull request #3329 from citusdata/predistribute
Partitioned intermediate results
2019-12-24 04:00:43 -08:00
Hadi Moshayedi d7aea7fa10 Implement partitioned intermediate results. 2019-12-24 03:53:39 -08:00
Marco Slot 1aef63abfb
Fix error in distributed queries when shards are on the coordin… (#3308)
Fix error in distributed queries when shards are on the coordinator
2019-12-24 12:14:55 +01:00
Marco Slot a2ddfecd86 Fix inconsistent shard metadata issue 2019-12-24 08:01:32 +01:00
Marco Slot b37ef0e394 Fix error in distributed queries when shards are on the coordinator 2019-12-24 06:36:43 +01:00
Philip Dubé 2349f838a1
Merge pull request #3316 from citusdata/fix-empty-agg-combine
Fix handling of empty intermediate results when distributing custom aggregates
2019-12-23 17:38:52 +00:00
Philip Dubé e9bbdb8f31 Fix handling of empty intermediate results when distributing custom aggregates 2019-12-23 17:27:52 +00:00
Hadi Moshayedi bb6ba89708
Merge pull request #3327 from citusdata/fix_reindent
Fix reindent version inconsistencies.
2019-12-20 08:38:15 -08:00
Philip Dubé f007b7f91d Also fix reindent inconsistencies with fake_fdw.c 2019-12-20 08:27:47 +00:00
Hadi Moshayedi 08eb0ade31 Fix reindent version inconsistencies.
Different versions of the reindent tool reformatted citus_custom_scan.c
and citus_copyfuncs.c differently, so some developers had to take
extra care not to commit these two files after reindent.

This PR tries to address this.
2019-12-19 23:10:34 -08:00
Jelte Fennema b655c02352
Add the necessary changes for rebalance strategies on enterprise (#3325)
This commit adds the SQL and C changes necessary to support custom rebalance
strategies in the Enterprise version of Citus.
2019-12-19 15:23:08 +01:00
Hadi Moshayedi c9ceff7d78
Merge pull request #3318 from citusdata/fetch_intermediate_results
Implement fetch_intermediate_results
2019-12-18 10:51:56 -08:00
Hadi Moshayedi ef487e0792 Implement fetch_intermediate_results 2019-12-18 10:46:35 -08:00
Onur TIRTIR eb3c1b4eb4
Add changelog entry for 9.1.1 (#3321) 2019-12-18 15:32:48 +03:00
Hadi Moshayedi e96201c609
Merge pull request #3304 from citusdata/read_intermediate_results
Implement read_intermediate_results
2019-12-17 14:08:39 -08:00
Hadi Moshayedi 249508d267 Estimate cost of read_intermediate_results() 2019-12-17 13:51:51 -08:00
Hadi Moshayedi 113bd1e5f1 Implement read_intermediate_results 2019-12-17 13:51:16 -08:00
SaitTalhaNisanci 7ff4ce2169
Add adaptive executor support for repartition joins (#3169)
* WIP

* wip

* add basic logic to run a single job with repartitioning joins with adaptive executor

* fix some warnings and return in ExecuteDependedTasks if there is none

* Add the logic to run dependent jobs in adaptive executor

The logic for executing dependent tasks is changed. With the current
logic:
- All tasks are created from the top level task list.
- At one iteration:
	- CurTasks whose dependencies are executed are found.
	- CurTasks are executed in parallel with adaptive executor main
logic.
- The iteration is repeated until all tasks are completed.

* Separate adaptive executor repartitioning logic

* Remove duplicate parts

* cleanup directories and schemas

* add basic repartition tests for adaptive executor

* Use the first placement to fetch data

In task tracker, when there are replicas, we try to fetch from a replica
for which a map task has succeeded. TaskExecution is used for this,
however TaskExecution is not used in adaptive executor. So we cannot use
the same thing as task tracker.

Since adaptive executor fails when a map task fails (there is no retry
logic yet), we know that if we try to execute a fetch task, all of its
map tasks already succeeded, so we can just use the first one to fetch
from.

* fix clean directories logic

* do not change the search path while creating a udf

* Enable repartition joins with adaptive executor with only enable_repartition_joins guc

* Add comments to adaptive_executor_repartition

* don't run adaptive executor repartition test in parallel with other tests

* execute cleanup only in the top level execution

* do cleanup only in the top level execution

* not begin a transaction if repartition query is used

* use new connections for repartition specific queries

New connections are opened to send repartition specific queries. The
opened connections will be closed at the FinishDistributedExecution.

While sending repartition queries no transaction is begun so that
we can see all changes.

* error if a modification was done prior to repartition execution

* not start a transaction if a repartition query and sql task, and clean temporary files and schemas at each subplan level

* fix cleanup logic

* update tests

* add missing function comments

* add test for transaction with DDL before repartition query

* do not close repartition connections in adaptive executor

* rollback instead of commit in repartition join test

* use close connection instead of shutdown connection

* remove unnecessary connection list, ensure schema owner before removing directory

* rename ExecuteTaskListRepartition

* put fetch query string in planner not executor as we currently support only replication factor = 1 with adaptive executor and repartition query and we know the query string in the planner phase in that case

* split adaptive executor repartition to DAG execution logic and repartition logic

* apply review items

* apply review items

* use an enum for remote transaction state and fix cleanup for repartition

* add outside transaction flag to find connections that are unclaimed instead of always opening a new transaction

* fix style

* wip

* rename removejobdir to partition cleanup

* do not close connections at the end of repartition queries

* do repartition cleanup in pg catch

* apply review items

* decide whether to use transaction or not at execution creation

* rename isOutsideTransaction and add missing comment

* not error in pg catch while doing cleanup

* use replication factor of the creation time, not current time to decide if task tracker should be chosen

* apply review items

* apply review items

* apply review item
2019-12-17 19:09:45 +03:00
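The iteration described in 7ff4ce2169 above (find the tasks whose dependencies have all executed, run them in parallel, repeat until every task is done) can be sketched in Python as follows. This is only an illustration of that loop under stated assumptions: `execute_dependent_tasks`, the `dependencies` mapping, and the thread pool are hypothetical stand-ins and do not reflect the actual C implementation in the adaptive executor.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_dependent_tasks(dependencies, run_task):
    """Run tasks in waves, each wave holding the tasks whose dependencies are done.

    dependencies: dict mapping each task name to the set of task names it depends on.
    run_task: callable invoked once per task; an exception aborts the whole run
              (there is no retry logic, mirroring the commit message).
    Returns the list of waves in execution order.
    """
    all_tasks = set(dependencies)
    completed = set()
    waves = []
    with ThreadPoolExecutor() as pool:
        while completed != all_tasks:
            # Tasks not yet run whose dependencies have all been executed.
            ready = sorted(
                task for task in all_tasks - completed
                if dependencies[task] <= completed
            )
            if not ready:
                raise ValueError("dependency cycle or unknown dependency")
            # Execute the current wave in parallel; list() surfaces any exception.
            list(pool.map(run_task, ready))
            completed.update(ready)
            waves.append(ready)
    return waves


executed = []
waves = execute_dependent_tasks(
    {"map_1": set(), "map_2": set(), "merge": {"map_1", "map_2"}, "sql": {"merge"}},
    executed.append,
)
```

Here the two map tasks form the first wave, the merge task the second, and the final SQL task the third.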
Marco Slot 8cea662f17
Use any available non-data connection for intermediate results (#3301)
Use any available non-data connection for intermediate results
2019-12-17 12:22:59 +01:00
Onur TIRTIR 8092529a2c
Split propagate extension test and add alternative output (#3314)
* Split extension name tests from propagate_extension_commands.sql

* Add alternative output for escape_extension_name.sql
2019-12-17 13:49:16 +03:00
Marco Slot 2f568ad5a5 Forbid using connections that sent intermediate results for data access and vice versa 2019-12-17 11:49:13 +01:00
Marco Slot 5aec71855a
Clean up transaction block usage logic in adaptive executor (#3288)
Clean up transaction block usage logic in adaptive executor
2019-12-17 11:35:57 +01:00
Marco Slot f4031dd477 Clean up transaction block usage logic in adaptive executor 2019-12-17 10:48:19 +01:00
Nils Dijk bfc3d2eb90
make sure to correctly decrement ExecutorLevel (#3311)
DESCRIPTION: Fix counter that keeps track of internal depth in executor

While reviewing #3302 I ran into the `ExecutorLevel` variable which used a variable to keep the original value to restore on successful exit. I haven't explored the full space and if it is possible to get into an inconsistent state. However using `PG_TRY`/`PG_CATCH` seems generally more correct.

Given very bad things will happen if this level is not reset, I kept the failsafe of setting the variable back to 0 on the `XactCallback` but I did add an assert to treat it as a developer bug.
2019-12-16 20:50:13 +01:00
Marco Slot f90bbc64f6
Fix a crash when calling a distributed function from PL/pgSQL (#3302)
Fix a crash when calling a distributed function from PL/pgSQL
2019-12-16 19:03:45 +01:00
SaitTalhaNisanci 97bfd0bba0
add circleci build status (#3310) (#3309) 2019-12-16 19:25:32 +03:00
Marco Slot 5f656e22db Fix issue in IsMultiStatementTransaction detection 2019-12-16 17:01:43 +01:00
SaitTalhaNisanci e3db433ec1
add circleci build status (#3310) 2019-12-16 17:46:36 +03:00
SaitTalhaNisanci 2829c601dd
replace Begin words in coordinated transactions with use (#3293) 2019-12-16 10:40:31 +03:00
SaitTalhaNisanci a2f2107e6a
refactor MapTaskList in multi physical planner (#3297) 2019-12-13 22:41:49 +03:00
Marco Slot 3b6b3f8c48
Fix crash in IN (.., NULL) queries (#3299)
Fix crash in IN (.., NULL) queries
2019-12-13 18:49:36 +01:00
Marco Slot 1633123d78 Fix crash in IN (NULL) queries 2019-12-13 08:35:54 +01:00
Hadi Moshayedi 7666d02537
Merge pull request #3294 from citusdata/fix_typos
Fix some typos from #3280
2019-12-12 13:35:38 -08:00
Hadi Moshayedi e7a6cc0801 Fix some typos from #3280 2019-12-12 13:29:26 -08:00
SaitTalhaNisanci 420e21919b
refactor extract distributed insert values rte (#3287) 2019-12-12 23:47:44 +03:00
Marco Slot 7447dfe156
Fix error in DML with NULL expression in where clause (#3262)
Fix error in DML with NULL expression in where clause
2019-12-12 17:25:23 +01:00