Commit Graph

99 Commits (25767dcf3c2d8c972ab89f2f17a04334d90605af)

Author SHA1 Message Date
naisila 25767dcf3c Handles EXPLAIN output diffs in PG15: enable_group_by_reordering
Relevant PG commit
db0d67db2401eb6238ccc04c6407a4fd4f985832
2022-08-22 16:54:38 +03:00
Marco Slot cff013a057 Fix issues with insert..select casts and column ordering 2022-07-28 13:23:57 +02:00
Naisila Puka 7d6410c838
Drop postgres 12 support (#6040)
* Remove if conditions with PG_VERSION_NUM < 13

* Remove server_above_twelve(&eleven) checks from tests

* Fix tests

* Remove pg12 and pg11 alternative test output files

* Remove pg12 specific normalization rules

* Some more if conditions in the code

* Change RemoteCollationIdExpression and some pg12/pg13 comments

* Remove some more normalization rules
2022-07-20 17:49:36 +03:00
Marco Slot c0827703ec Fix EXPLAIN ANALYZE JSON format for subplans 2022-04-07 11:38:20 +02:00
Marco Slot 544dce919a Handle user-defined type parameters in EXPLAIN ANALYZE 2022-04-07 11:14:32 +02:00
Marco Slot 49467e27e6
Ensure worker_save_query_explain_analyze always fully qualifies types (#5776)
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-03-10 07:30:11 -08:00
Philip Dubé cc50682158 Fix typos. Spurred spotting "connectios" in logs 2021-10-25 13:54:09 +00:00
Marco Slot 2206e64e42 Disable single-repartition joins for append tables 2021-10-18 21:11:16 +02:00
Marco Slot 386d2567d4 Reduce reliance on append tables in regression tests 2021-10-08 21:27:14 +02:00
Jelte Fennema bb5c494104 Enable binary encoding by default on PG14
Since PG14 we can now use binary encoding for arrays and composite types
that contain user defined types. This was fixed in this commit in
Postgres: 670c0a1d47

This change starts using that knowledge, by not necessarily falling back
to text encoding anymore for those types.

While doing this and testing a bit more I found various cases where
binary encoding would fail that our checks didn't cover. This fixes
those cases and adds tests for those. It also fixes EXPLAIN ANALYZE
never using binary encoding, which was a leftover of workaround that
was not necessary anymore.

Finally, it changes the default for both `citus.enable_binary_protocol`
and `citus.binary_worker_copy_format` to `true` for PG14 and up. In our
cloud offering `binary_worker_copy_format` already was true by default.
`enable_binary_protocol` had some bug with MX and user defined types,
this bug was fixed by the above mentioned fixes.
2021-09-06 10:27:29 +02:00
Sait Talha Nisanci dc81cae18f Turn off COSTS to avoid alternative output for pg14 2021-09-03 15:41:28 +03:00
Ahmet Gedemenli d530d79d73 Fix tests for public schema 2021-04-06 10:29:17 +03:00
SaitTalhaNisanci 03832f353c Drop postgres 11 support 2021-03-25 09:20:28 +03:00
Onur Tirtir 9f619a85d6
Fix EXPLAIN ANALYZE exec when query returns no cols (#4672)
We do not include dummy column if original task didn't return any
columns.
Otherwise, number of columns that original task returned wouldn't
match number of columns returned by worker_save_query_explain_analyze.
2021-02-10 17:59:47 +03:00
Marco Slot f2056e553f
Expose partition column of subqueries in optimizer (#4355)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2020-12-18 20:32:52 +01:00
Onur Tirtir 0556952607
Normalize partitioned table aliases in explain output (#4295)
Aliases that postgres choose for partitioned tables in explain output
might change in different pg versions, so normalize them and remove
the alternative test output
2020-11-06 10:44:01 +03:00
Hanefi Önaldı 6d8e83d24f
Replace worker_hash calls with partkey IS NOT NULL filters 2020-10-02 18:16:24 +03:00
Ahmet Gedemenli abfb79bda6 Sort explain analyze output by task time
Add sort method parameter for regression tests

Fix check-style

Change sorting method parameters to enum

Polish

Add task fields to OutTask

Add test into multi_explain

Fix isolation test
2020-09-24 11:38:40 +03:00
Marco Slot c9d46c618b Fix EXPLAIN ANALYZE truncation 2020-09-17 14:42:21 +02:00
Önder Kalacı 983206c5e1
Hide `citus.subquery_pushdown` flag and NOTICE when enabled (#4124)
* Hide citus.subquery_pushdown flag

This flag is dangerous and could likely to let queries
return wrong results.

The flag has a very specific purpose for a very specific
data distribution and query structure. In those cases, when
the flag is set, the user can skip recursive planning altogether
*at their own risk*.

The meaning of the flag is that "I know what I'm doing such that
the query structure/data distribution is on my control, so Citus
can skip many correctness checks".

For regular users, enabling this flag is discouraged. We have to
keep the support only for backward compatibility for some users.

In addition to that, give a NOTICE to discourage new users to
use it.
2020-08-28 14:53:09 +02:00
SaitTalhaNisanci 2459ba6eca
Update docker images (#4122)
* Update and separate test images

The build image was a single one and it would contain pg11, pg12 and
pg13. Now it is separated so that we can build each pg major
independently.

Tags are used as full postgres versions so that we can know which
version we use by looking at the tag. For example exttester:11.9 would
mean we are using pg11.9.

pg11 is updated from 11.5 to 11.9.
pg12 is updated from 12rc to 12.4.

* Ignore memory usage in pg13 explain

* Use citus instead of personal repo
2020-08-26 16:23:59 +03:00
Hadi Moshayedi 7b74eca22d Support EXPLAIN EXECUTE ANALYZE. 2020-08-10 13:44:30 -07:00
Sait Talha Nisanci fe4ac51d8c Normalize Output:.. since it changes with pg13
Fix indentation for better readability
2020-08-04 15:38:13 +03:00
SaitTalhaNisanci b3af63c8ce
Remove task tracker executor (#3850)
* use adaptive executor even if task-tracker is set

* Update check-multi-mx tests for adaptive executor

Basically repartition joins are enabled where necessary. For parallel
tests max adaptive executor pool size is decresed to 2, otherwise we
would get too many clients error.

* Update limit_intermediate_size test

It seems that when we use adaptive executor instead of task tracker, we
exceed the intermediate result size less in the test. Therefore updated
the tests accordingly.

* Update multi_router_planner

It seems that there is one problem with multi_router_planner when we use
adaptive executor, we should fix the following error:
+ERROR:  relation "authors_range_840010" does not exist
+CONTEXT:  while executing command on localhost:57637

* update repartition join tests for check-multi

* update isolation tests for repartitioning

* Error out if shard_replication_factor > 1 with repartitioning

As we are removing the task tracker, we cannot switch to it if
shard_replication_factor > 1. In that case, we simply error out.

* Remove MULTI_EXECUTOR_TASK_TRACKER

* Remove multi_task_tracker_executor

Some utility methods are moved to task_execution_utils.c.

* Remove task tracker protocol methods

* Remove task_tracker.c methods

* remove unused methods from multi_server_executor

* fix style

* remove task tracker specific tests from worker_schedule

* comment out task tracker udf calls in tests

We were using task tracker udfs to test permissions in
multi_multiuser.sql. We should find some other way to test them, then we
should remove the commented out task tracker calls.

* remove task tracker test from follower schedule

* remove task tracker tests from multi mx schedule

* Remove task-tracker specific functions from worker functions

* remove multi task tracker extra schedule

* Remove unused methods from multi physical planner

* remove task_executor_type related things in tests

* remove LoadTuplesIntoTupleStore

* Do initial cleanup for repartition leftovers

During startup, task tracker would call TrackerCleanupJobDirectories and
TrackerCleanupJobSchemas to clean up leftover directories and job
schemas. With adaptive executor, while doing repartitions it is possible
to leak these things as well. We don't retry cleanups, so it is possible
to have leftover in case of errors.

TrackerCleanupJobDirectories is renamed as
RepartitionCleanupJobDirectories since it is repartition specific now,
however TrackerCleanupJobSchemas cannot be used currently because it is
task tracker specific. The thing is that this function is a no-op
currently.

We should add cleaning up intermediate schemas to DoInitialCleanup
method when that problem is solved(We might want to solve it in this PR
as well)

* Revert "remove task tracker tests from multi mx schedule"

This reverts commit 03ecc0a681.

* update multi mx repartition parallel tests

* not error with task_tracker_conninfo_cache_invalidate

* not run 4 repartition queries in parallel

It seems that when we run 4 repartition queries in parallel we get too
many clients error on CI even though we don't get it locally. Our guess
is that, it is because we open/close many connections without doing some
work and postgres has some delay to close the connections. Hence even
though connections are removed from the pg_stat_activity, they might
still not be closed. If the above assumption is correct, it is unlikely
for it to happen in practice because:
- There is some network latency in clusters, so this leaves some times
for connections to be able to close
- Repartition joins return some data and that also leaves some time for
connections to be fully closed.

As we don't get this error in our local, we currently assume that it is
not a bug. Ideally this wouldn't happen when we get rid of the
task-tracker repartition methods because they don't do any pruning and
might be opening more connections than necessary.

If this still gives us "too many clients" error, we can try to increase
the max_connections in our test suite(which is 100 by default).

Also there are different places where this error is given in postgres,
but adding some backtrace it seems that we get this from
ProcessStartupPacket. The backtraces can be found in this link:
https://circleci.com/gh/citusdata/citus/138702

* Set distributePlan->relationIdList when it is needed

It seems that we were setting the distributedPlan->relationIdList after
JobExecutorType is called, which would choose task-tracker if
replication factor > 1 and there is a repartition query. However, it
uses relationIdList to decide if the query has a repartition query, and
since it was not set yet, it would always think it is not a repartition
query and would choose adaptive executor when it should choose
task-tracker.

* use adaptive executor even with shard_replication_factor > 1

It seems that we were already using adaptive executor when
replication_factor > 1. So this commit removes the check.

* remove multi_resowner.c and deprecate some settings

* remove TaskExecution related leftovers

* change deprecated API error message

* not recursively plan single relatition repartition subquery

* recursively plan single relation repartition subquery

* test depreceated task tracker functions

* fix overlapping shard intervals in range-distributed test

* fix error message for citus_metadata_container

* drop task-tracker deprecated functions

* put the implemantation back to worker_cleanup_job_schema_cachesince citus cloud uses it

* drop some functions, add downgrade script

Some deprecated functions are dropped.
Downgrade script is added.
Some gucs are deprecated.
A new guc for repartition joins bucket size is added.

* order by a test to fix flappiness
2020-07-18 13:11:36 +03:00
Hadi Moshayedi 4ed59d2db3 Move more from insert_select_executor to insert_select_planner 2020-06-26 08:08:26 -07:00
Jelte Fennema 64506143e4
Replace flaky repartition analyze test with a non flaky one (#3950)
The flaky test was introduced in #3941. This removes that flaky test and
adds a new one that fails in the same manner when removing the fix in #3941.

An example of a random failure can be found here:
https://app.circleci.com/pipelines/github/citusdata/citus/9558/workflows/de76e7a5-6558-46c9-97e7-8b1dae1f173b/jobs/135876/steps
2020-06-25 15:19:15 +02:00
Jelte Fennema 0259815d3a
Fix EXPLAIN ANALYZE received data counter issues (#3917)
In #3901 the "Data received from worker(s)" sections were added to EXPLAIN
ANALYZE. After merging @pykello posted some review comments. This addresses
those comments as well as fixing a other issues that I found while addressing 
them. The things this does:

1. Fix `EXPLAIN ANALYZE EXECUTE p1` to not increase received data on every
   execution
2. Fix `EXPLAIN ANALYZE EXECUTE p1(1)` to not return 0 bytes as received data
   allways.
3. Move `EXPLAIN ANALYZE` specific logic to `multi_explain.c` from
   `adaptive_executor.c`
4. Change naming of new explain sections to `Tuple data received from node(s)`.
   Firstly because a task can reference the coordinator too, so "worker(s)" was
   incorrect. Secondly to indicate that this is tuple data and not all network
   traffic that was performed.
5. Rename `totalReceivedData` in our codebase to `totalReceivedTupleData` to
   make it clearer that it's a tuple data counter, not all network traffic.
6. Actually add `binary_protocol` test to `multi_schedule` (woops)
7. Fix a randomly failing test in `local_shard_execution.sql`.
2020-06-17 11:33:38 +02:00
Jelte Fennema 927de6d187
Show amount of data received in EXPLAIN ANALYZE (#3901)
Sadly this does not actually work yet for binary protocol data, because
when doing EXPLAIN ANALYZE we send two commands at the same time. This
means we cannot use `SendRemoteCommandParams`, and thus cannot use the
binary protocol. This can still be useful though when using the text
protocol, to find out that a lot of data is being sent.
2020-06-15 16:01:05 +02:00
Hadi Moshayedi 0e3140c14d Include execution duration in worker_last_saved_explain_analyze 2020-06-11 02:54:54 -07:00
Hadi Moshayedi 7c52c6edb0 CTE statistics in EXPLAIN ANALYZE 2020-06-11 02:39:59 -07:00
Hadi Moshayedi 1f6d6ee4a5 Show query text in EXPLAIN output 2020-06-11 02:19:55 -07:00
Hadi Moshayedi bb96ef5047 Does the EXPLAIN ANALYZE at the same time as execution, so avoids executing twice.
We wrap worker tasks in worker_save_query_explain_analyze() so we can fetch
their explain output later by a call worker_last_saved_explain_analyze().

Fixes #3519
Fixes #2347
Fixes #2613
Fixes #621
2020-06-11 01:55:57 -07:00
Hadi Moshayedi 6ca621bd16 Test we don't support multi-shard EXPLAIN EXECUTE 2020-06-10 17:11:27 -07:00
Hadi Moshayedi 5cdfa9f571 Implement EXPLAIN ANALYZE udfs.
Implements worker_save_query_explain_analyze and worker_last_saved_explain_analyze.

worker_save_query_explain_analyze executes and returns results of query while
saving its EXPLAIN ANALYZE to be fetched later.

worker_last_saved_explain_analyze returns the saved EXPLAIN ANALYZE result.
2020-06-09 10:02:05 -07:00
Hadi Moshayedi 45a41e249f Test EXPLAIN ANALYZE doesn't show repartition join tasks 2020-06-06 23:24:45 -07:00
Philip Dubé 4860e11561 Duplicate grouping on worker whenever possible
This is possible whenever we aren't pulling up intermediate rows

We want to do this because this was done in 9.2,
some queries rely on the performance of grouping causing distinct values

This change was introduced when implementing window functions on coordinator
2020-04-06 18:51:30 +00:00
Philip Dubé 720525cfda Add support for window functions on coordinator
Some refactoring:
Consolidate expression which decides whether GROUP BY/HAVING are pushed down
Rename early pullUpIntermediateRows to hasNonDistributableAggregates
Create WorkerColumnName to handle formatting WORKER_COLUMN_FORMAT
Ignore NULL StringInfo pointers to SafeToPushdownWindowFunction
Fix bug where SubqueryPushdownMultiNodeTree mutates supplied Query,
	SafeToPushdownWindowFunction requires the original query as it relies on rtable
2020-03-25 15:31:20 +00:00
Marco Slot 5b1d1dd413 Remove unnecessary use of max_parallel_workers_per_gather 2020-03-06 13:18:58 +01:00
Nils Dijk a77ed9cd23
Refactor master query to be planned by postgres' planner (#3326)
DESCRIPTION: Replace the query planner for the coordinator part with the postgres planner

Closes #2761 

Citus had a simple rule based planner for the query executed on the query coordinator. This planner grew over time with the addigion of SQL support till it was getting close to the functionality of the postgres planner. Except the code was brittle and its complexity rose which made it hard to add new SQL support.

Given its resemblance with the postgres planner it was a long outstanding wish to replace our hand crafted planner with the well supported postgres planner. This patch replaces our planner with a call to postgres' planner.

Due to the functionality of the postgres planner we needed to support both projections and filters/quals on the citus custom scan node. When a sort operation is planned above the custom scan it might require fields to be reordered in the custom scan before returning the tuple (projection). The postgres planner assumes every custom scan node implements projections. Because we controlled the plan that was created we prevented reordering in the custom scan and never had implemented it before.

A same optimisation applies to having clauses that could have been where clauses. Instead of applying the filter as a having on the aggregate it will push it down into the plan which could reach a custom scan node.

For both filters and projections we have implemented them when tuples are read from the tuple store. If no projections or filters are required it will directly return the tuple from the tuple store. Otherwise it will loop tuples from the tuple store through the filter and projection until a tuple is found and returned.

Besides filters being pushed down a side effect of having quals that could have been a where clause is that a call to read intermediate result could be called before the first tuple is fetched from the custom scan. This failed because the intermediate result would only be pulled to the coordinator on the first tuple fetch. To overcome this problem we do run the distributed subplans now before we run the postgres executor. This ensures the intermediate result is present on the coordinator in time. We do account for total time instrumentation by removing the instrumentation before handing control to the psotgres executor and update the timings our self.

For future SQL support it is enough to create a valid query structure for the part of the query to be executed on the query coordinating node. As a utility we do serialise and print the query at debug level4 for engineers to inspect what kind of query is being planned on the query coordinator.
2020-02-25 14:39:56 +01:00
Hadi Moshayedi 97072c9eb1 INSERT/SELECT: show method in EXPLAIN output 2020-01-16 23:24:52 -08:00
Onder Kalaci 64560b07be Update regression tests-2
In this commit, we're introducing a way to prevent CTE inlining via a GUC.

The GUC is used in all the tests where PG 11 and PG 12 tests would diverge
otherwise.

Note that, in PG 12, the restriction information for CTEs are generated. It
means that for some queries involving CTEs, Citus planner (router planner/
pushdown planner) may behave differently. So, via the GUC, we prevent
tests to diverge on PG 11 vs PG 12.

When we drop PG 11 support, we should get rid of the GUC, and mark
relevant ctes as MATERIALIZED, which does the same thing.
2020-01-16 12:28:15 +01:00
Jelte Fennema 7c3e8e150e Normalize tests: s/Subplan [0-9]+\_/Subplan XXX\_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema acd12a6de5 Normalize tests: s/read_intermediate_result\('[0-9]+_/read_intermediate_result('XXX_/g 2020-01-06 09:32:03 +01:00
Jelte Fennema 7730bd449c Normalize tests: Remove trailing whitespace 2020-01-06 09:32:03 +01:00
Jelte Fennema f21f00544e Normalize tests: s/ port=[0-9]+ / port=xxxxx /g 2020-01-03 11:42:09 +01:00
Philip Dubé 5fcc169a3a Stray depended to dependent tidy up 2019-12-03 15:28:32 +00:00
Jelte Fennema 7abedc38b0
Support subqueries in HAVING (#3098)
Areas for further optimization:
- Don't save subquery results to a local file on the coordinator when the subquery is not in the having clause
- Push the the HAVING with subquery to the workers if there's a group by on the distribution column
- Don't push down the results to the workers when we don't push down the HAVING clause, only the coordinator needs it

Fixes #520
Fixes #756
Closes #2047
2019-10-16 16:40:14 +02:00
Philip Dubé 74cb168205 Remove Postgres 10 support 2019-10-11 21:56:56 +00:00
Jelte Fennema 82ec918b29
Add explain summary support (#3046)
Fixes #2922 and also adds explain analyze regression tests
2019-09-30 10:58:49 +02:00
Philip Dubé 871dabdc63 Force CTE materialization in pg12 2019-08-09 15:25:59 +00:00