citus

Commit Graph

Author	SHA1	Message	Date
Marco Slot	eeffbde8bd	Fix pushdown of constants in aggregate queries	2020-06-30 11:41:16 -07:00
Jelte Fennema	392c5e2c34	Fix wrong cancellation message about distributed deadlocks (#3956 )	2020-06-30 14:57:46 +02:00
Marco Slot	634d6cf9d7	Improve performance of metadata cache (#3924 ) #3866 removed the shard ID hash in metadata_cache.c to simplify cache management, but we observed a significant performance regression that was being masked by the performance improvement provided by #3654 in our benchmarks, but #3654 only applies to specific workloads. This PR brings back the shard ID cache as it existed before #3866 with some extra measures to handle invalidation. When we load a table entry, we overwrite ShardIdCacheEntry->tableEntry pointers for all the shards in that table, though it's possible that the table no longer contains the old shard ID or the table entry is never reloaded, which would leave a dangling pointer once the table entry is freed. To handle that case, we remove all shard ID cache entries that point exactly to that table entry when a table is freed (at the end of the transaction or any call to CitusTableCacheFlushInvalidatedEntries). Co-authored-by: SaitTalhaNisanci <s.talhanisanci@gmail.com> Co-authored-by: Marco Slot <marco.slot@gmail.com> Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2020-06-30 12:10:10 +02:00
Hadi Moshayedi	4ed59d2db3	Move more from insert_select_executor to insert_select_planner	2020-06-26 08:08:26 -07:00
Hadi Moshayedi	d34c21890f	Rename CoordinatorInsertSelect... to NonPushableInsertSelect	2020-06-25 08:55:48 -07:00
Hadi Moshayedi	4e8d79998e	Save INSERT/SELECT method in DistributedPlan. This is so we don't need to calculate it twice in insert_select_executor.c and multi_explain.c, which can cause discrepancy if an update in one of them is not reflected in the other site.	2020-06-25 08:55:48 -07:00
SaitTalhaNisanci	f458d1fd1c	Fix/task execution (#3941 ) * Not set TaskExecution with adaptive executor Adaptive executor is using a utility method from task tracker for repartition joins, however adaptive executor doesn't need taskExecution. It is only used by task tracker. This causes a problem when explain analyze is used because what taskExecution is pointing to might be random. We solve this by not setting taskExecution from adaptive executor. So it will stay NULL as set by CreateTask. * use same memory context as task for taskExecution Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2020-06-24 12:10:00 +03:00
Marco Slot	2a3234ca26	Rename masterQuery to combineQuery	2020-06-17 14:14:37 +02:00
Jelte Fennema	0259815d3a	Fix EXPLAIN ANALYZE received data counter issues (#3917 ) In #3901 the "Data received from worker(s)" sections were added to EXPLAIN ANALYZE. After merging @pykello posted some review comments. This addresses those comments as well as fixing a other issues that I found while addressing them. The things this does: 1. Fix `EXPLAIN ANALYZE EXECUTE p1` to not increase received data on every execution 2. Fix `EXPLAIN ANALYZE EXECUTE p1(1)` to not return 0 bytes as received data allways. 3. Move `EXPLAIN ANALYZE` specific logic to `multi_explain.c` from `adaptive_executor.c` 4. Change naming of new explain sections to `Tuple data received from node(s)`. Firstly because a task can reference the coordinator too, so "worker(s)" was incorrect. Secondly to indicate that this is tuple data and not all network traffic that was performed. 5. Rename `totalReceivedData` in our codebase to `totalReceivedTupleData` to make it clearer that it's a tuple data counter, not all network traffic. 6. Actually add `binary_protocol` test to `multi_schedule` (woops) 7. Fix a randomly failing test in `local_shard_execution.sql`.	2020-06-17 11:33:38 +02:00
Marco Slot	d1bab78d79	Remove master from file hierarchy	2020-06-16 17:49:09 +02:00
Philip Dubé	39400319e6	Defer freeing CitusTableCacheEntry, as there were memory safety issues before Shard id to index mapping stored in cache entry as there may now be multiple entries alive for a given relation insert_select_executor: revert copying cache entry, which was a hack added to avoid memory safety issues	2020-06-15 16:20:50 +00:00
Jelte Fennema	927de6d187	Show amount of data received in EXPLAIN ANALYZE (#3901 ) Sadly this does not actually work yet for binary protocol data, because when doing EXPLAIN ANALYZE we send two commands at the same time. This means we cannot use `SendRemoteCommandParams`, and thus cannot use the binary protocol. This can still be useful though when using the text protocol, to find out that a lot of data is being sent.	2020-06-15 16:01:05 +02:00
Marco Slot	4f7989ad8e	Rename WorkersContainingAllShards to PlacementsForWorkersContainingAllShards	2020-06-12 18:36:02 -07:00
Marco Slot	d953f084db	Rename FindRouterWorkerList to CreateTaskPlacementListForShardIntervals	2020-06-12 18:36:01 -07:00
Marco Slot	24feadc230	Handle joins between local/reference/cte via router planner	2020-06-12 18:36:01 -07:00
Halil Ozan Akgül	8c5eb6b7ea	Insert Select Into Local Table (#3870 ) * Insert select with master query * Use relid to set custom_scan_tlist varno * Reviews * Fixes null check Co-authored-by: Marco Slot <marco.slot@gmail.com>	2020-06-12 17:06:31 +03:00
Jelte Fennema	0e12d045b1	Support use of binary protocol in between nodes (#3877 ) This can save a lot of data to be sent in some cases, thus improving performance for which inter query bandwidth is the bottleneck. There's some issues with enabling this as default, so that's currently not done.	2020-06-12 15:02:51 +02:00
Nils Dijk	da8f2b0134	Feature: tdigest aggregate (#3897 ) DESCRIPTION: Adds support to partially push down tdigest aggregates tdigest extensions: https://github.com/tvondra/tdigest This PR implements the partial pushdown of tdigest calculations when possible. The extension adds a tdigest type which can be combined into the same structure. There are several aggregate functions that can be used to get; - a quantile - a list of quantiles - the quantile of a hypothetical value - a list of quantiles for a list of hypothetical values These function can work both on values or tdigest types. Since we can create tdigest values either by combining them, or based on a group of values we can rewrite the aggregates in such a way that most of the computation gets delegated to the compute on the shards. This both speeds up the percentile calculations because the values don't have to be sorted while at the same time making the transfer size from the shards to the coordinator significantly less.	2020-06-12 13:50:28 +02:00
Philip Dubé	1722d8ac8b	Allow routing modifying CTEs We still recursively plan some cases, eg: - INSERTs - SELECT FOR UPDATE when reference tables in query - Everything must be same single shard & replication model	2020-06-11 15:14:06 +00:00
Hadi Moshayedi	7c52c6edb0	CTE statistics in EXPLAIN ANALYZE	2020-06-11 02:39:59 -07:00
Hadi Moshayedi	bb96ef5047	Does the EXPLAIN ANALYZE at the same time as execution, so avoids executing twice. We wrap worker tasks in worker_save_query_explain_analyze() so we can fetch their explain output later by a call worker_last_saved_explain_analyze(). Fixes #3519 Fixes #2347 Fixes #2613 Fixes #621	2020-06-11 01:55:57 -07:00
Marco Slot	1243b6a948	Execute shard creation as utility tasks	2020-06-10 11:29:49 +02:00
Onder Kalaci	06461ca55f	Coerce types properly for INSERT Also, unify similar code-paths to rely on more accurate function.	2020-06-10 10:40:28 +02:00
Hadi Moshayedi	5cdfa9f571	Implement EXPLAIN ANALYZE udfs. Implements worker_save_query_explain_analyze and worker_last_saved_explain_analyze. worker_save_query_explain_analyze executes and returns results of query while saving its EXPLAIN ANALYZE to be fetched later. worker_last_saved_explain_analyze returns the saved EXPLAIN ANALYZE result.	2020-06-09 10:02:05 -07:00
Onur Tirtir	a4f1c41391	Implement GetQueryLockMode helper (#3860 ) If we want to get necessary lockmode for a relation RangeVar within a query, we can get the lockmode easily from the RangeVar itself (if pg version >= 12). However, if we want to decide the lockmode appropriate for the "query", we can derive this information by using GetQueryLockMode according to the code comment from RangeTblEntry->rellockmode.	2020-06-09 13:08:44 +03:00
Hadi Moshayedi	198d5d8b0f	typedef TupleDestination once	2020-06-08 20:38:28 -07:00
Hadi Moshayedi	0bfd39ea52	Implement TupleDestination intereface. Implements a new `TupleDestination` interface to allow custom tuple processing per task. This can be specially useful if a task contains multiple queries. An example of this EXPLAIN ANALYZE, where it needs to add some UDF calls to the query to fetch the explain output from worker after fetching the actual query results.	2020-06-05 17:47:40 -07:00
SaitTalhaNisanci	d0f47eb338	Check the removeType in IsDropCitusStmt (#3859 ) We should check the remove type in IsDropCitusStmt because if the remove type is not OBJECT_EXTENSION then the stored objects in dropStmt->objects may not be of type Value. This was crashing PG-13. Also rename the method as IsDropCitusExtensionStmt.	2020-06-05 20:49:54 +03:00
Onur Tirtir	f7224a12f2	Implement PushOverrideEmptySearchPath (#3874 ) To reduce code duplication, implement function that pushes search_path to be NIL and sets addCatalog to true so that all objects outside of pg_catalog will be schema-prefixed.	2020-06-05 19:23:59 +03:00
Onur Tirtir	f3f711e097	Implement IndexIsImpliedByAConstraint	2020-06-05 15:33:54 +03:00
Onur Tirtir	dfcc18468c	Error out for unsupported trigger objects Error out if creating a citus table from a table having triggers. Error out for CREATE TRIGGER commands that are run on citus tables.	2020-05-31 23:10:01 +03:00
Onur Tirtir	6e6bc155a9	Implement methods to process & recreate triggers on citus tables	2020-05-31 15:28:17 +03:00
Philip Dubé	c0515dcd67	This prepares for routing modifying CTEs, where modLevel should not be used to infer whether a plan is a select or not SELECT_TASK is renamed to READ_TASK as a SELECT with modifying CTEs will be a MODIFYING_TASK RouterInsertJob: Assert originalQuery->commandType == CMD_INSERT CreateModifyPlan: Assert originalQuery->commandType != CMD_SELECT Remove unused function IsModifyDistributedPlan DistributedExecution, ExecutionParams, DistributedPlan: Rename hasReturning to expectResults SELECTs set expectResults to true Rename CreateSingleTaskRouterPlan to CreateSingleTaskRouterSelectPlan	2020-05-20 17:26:12 +00:00
Onur Tirtir	79a688ffe0	Refactor the methods accessing to pg_constraint Implement internal functions to accces to pg_contraint and utilize them in existing foreign key checks.	2020-05-20 17:27:17 +03:00
SaitTalhaNisanci	22c903b151	remove ExecuteUtilityTaskListWithoutResults (#3696 ) This PR removes ExecuteUtilityTaskListWithoutResults and uses the same path for local execution via ExecuteTaskListExtended. ExecuteUtilityTaskList is added. ExecuteLocalTaskListExtended now has a parameter for utility commands so that it can call the right method. In order not to change the existing calls, ExecuteTaskListExtendedInternal is added, which is the main method that runs the execution, via local and remote execution.	2020-05-07 13:30:50 +03:00
Marco Slot	6ce2803777	Make sure we don't wrap GROUP BY expressions in any_value	2020-05-05 05:12:45 +02:00
Onder Kalaci	0cb7ab2d05	Explicitly mark queries in physical planner for [not] having parameters Physical planner doesn't support parameters. If the parameters have already been resolved when the physical planner handling the queries, mark it. The reason is that the executor is unaware of this, and sends the parameters along with the worker queries, which fails for composite types. (See `DissuadePlannerFromUsingPlan()` for the details of paramater resolving)	2020-04-24 12:49:43 +02:00
Onur Tirtir	b8dd8f50d1	Fix build issue in GCC 10 (#3790 ) As reported in #3787, we were having issues while building citus with "GCC Red Hat 10" (maybe in some other versions of gcc as well). Fixes "multiple definition of 'CitusNodeTagNames'" error by explicitly specifying storage of CitusNodeTagNames to be extern.	2020-04-22 16:41:34 +03:00
Philip Dubé	c0a95a3adb	Copy data from CitusTableCacheEntry more often This copies over fixes from reference counting branch, all CitusTableCacheEntry data may be freed when a GetCitusTableCacheEntry call occurs for its relationId This fix is not complete, but reference counting is being deferred until 9.4 CopyShardInterval: remove dest parameter, always return newly allocated object	2020-04-17 14:17:18 +00:00
Önder Kalacı	a919f09c96	Remove the entries from the shared connection counter hash when no connections remain (#3775 ) We initially considered removing entries just before any change to pg_dist_node. However, that ended-up being very complex and making MX even more complex. Instead, we're switching to a simpler solution, where we remove entries when the counter gets to 0. With certain workloads, this may have some performance penalty. But, two notes on that: - When counter == 0, it implies that the cluster is not busy - With cached connections, that's not possible	2020-04-17 17:14:58 +03:00
SaitTalhaNisanci	a9a3be15cc	introduce TASK_QUERY_NULL task type (#3774 ) When we call SetTaskQueryString we would set the task type to TASK_QUERY_TEXT, and some parts of the codebase rely on the fact that if TASK_QUERY_TEXT is set, the data can be read safely. However if SetTaskQueryString is called with a NULL taskQueryString this can cause crashes. In that case taskQueryType will simply be set to TASK_QUERY_NULL.	2020-04-17 14:59:22 +03:00
Nils Dijk	1d6ba1d09e	Refactor alter role to work on distributed roles (#3739 ) DESCRIPTION: Alter role only works for citus managed roles Alter role was implemented before we implemented good role management that hooks into the object propagation framework. This is a refactor of all alter role commands that have been implemented to - be on by default - only work for supported roles - make the citus extension owner a supported role Instead of distributing the alter role commands for roles at the beginning of the node activation role it now _only_ executes the alter role commands for all users in all databases and in the current database. In preparation of full role support small refactors have been done in the deparser. Earlier tests targeting other roles than the citus extension owner have been either slightly changed or removed to be put back where we have full role support. Fixes #2549	2020-04-16 12:23:27 +02:00
Marco Slot	8b83306a27	Issue worker messages with the same log level	2020-04-14 21:08:25 +02:00
SaitTalhaNisanci	132efdbc56	add execution params struct (#3747 ) We had 9+ parameters in some of the functions related to execution. Execution params is created to simplify this a bit so that we can set only the fields that we are interested in and it is easier to read.	2020-04-14 14:32:40 +03:00
Onder Kalaci	aa6b641828	Throttle connections to the worker nodes With this commit, we're introducing a new infrastructure to throttle connections to the worker nodes. This infrastructure is useful for multi-shard queries, router queries are have not been affected by this. The goal is to prevent establishing more than citus.max_shared_pool_size number of connections per worker node in total, across sessions. To do that, we've introduced a new connection flag OPTIONAL_CONNECTION. The idea is that some connections are optional such as the second (and further connections) for the adaptive executor. A single connection is enough to finish the distributed execution, the others are useful to execute the query faster. Thus, they can be consider as optional connections. When an optional connection is not allowed to the adaptive executor, it simply skips it and continues the execution with the already established connections. However, it'll keep retrying to establish optional connections, in case some slots are open again.	2020-04-14 10:27:48 +02:00
Onder Kalaci	0dbfbe0c37	Add the necessary shared memory infrastructure - The hashmap in the shared memory - The lock to access the hashmap - The GUC to control the size	2020-04-14 10:03:26 +02:00
Philip Dubé	30f10984e1	Defer get_agg_clause_costs, it happens later & avoids errors	2020-04-10 13:26:05 +00:00
SaitTalhaNisanci	3dc7cad754	use an enum for local execution status (#3733 ) We have two variables that are related to local execution status. TransactionAccessedLocalPlacement and TransactionConnectedToLocalGroup. Only one of these fields should be set, however we didn't have any check for this contraint and it was error prone. What those two variables are used is that we are trying to understand if we should use local execution, the current session, or if we should be using a connection to execute the current query, therefore the tasks. In the enum, now it is more clear what these variables mean. Also, now we have a method to change the local execution status. The method will error if we are trying to transition from a state to a wrong state. This will help us avoid problems.	2020-04-09 19:11:04 +03:00
Hadi Moshayedi	dda53a0bba	GUC for replicate reference tables on activate.	2020-04-08 12:42:45 -07:00
Hadi Moshayedi	0758a81287	Prevent reference tables being dropped when replicating reference tables	2020-04-08 12:41:36 -07:00

1 2 3 4 5 ...

764 Commits (test/adaptive_executor_repartition)