citus

Commit Graph

Author	SHA1	Message	Date
Philip Dubé	cd0b2ad5b5	citus_evaluate_expression: call expand_function_arguments beforehand to avoid segfaulting on implicit parameters	2020-06-23 18:06:46 +00:00
Jelte Fennema	a98226842d	Use rename to make sure no files are inserted while deleting (#3912 ) As suggested by @marcocitus in https://github.com/citusdata/citus/pull/3911#issuecomment-643978531, there was a regression in #3893. If another backend would write a file during deletion of the intermediate results directory, this file would not necessarily be deleted. The approach used in `CitusRemoveDirectory` is to try recursive removal of the directory again if it has failed. This does not work here, since when a file can not be removed for other reasons (e.g. `EPERM`) it will not throw an error anymore. So then we would get into an infinite removal loop. Instead I now `rename` the directory before removing it. That way other backends will not write files to it anymore.	2020-06-23 10:38:44 +02:00
Onder Kalaci	88c473e007	Sort WorkerPool in executions We sort the workerList because adaptive connection management (e.g., OPTIONAL_CONNECTION) requires any concurrent executions to wait for the connections in the same order to prevent any starvation. If we don't sort, we might end up with: Execution 1: Get connection for worker 1, wait for worker 2 Execution 2: Get connection for worker 2, wait for worker 1 and, none could proceed. Instead, we enforce every execution establish the required connections to workers in the same order.	2020-06-22 16:39:27 +02:00
Hanefi Önaldı	618453a2ba	Disallow C-style comments in migration files	2020-06-22 12:51:16 +03:00
Jelte Fennema	b3ec6fbe7a	Make check_enterprise_merge script stricter (#3918 ) We've had two issues with merge conflicts to enterprise in the last week, that suddenly happened. Because of this CI check this actually blocks all community PRs from being merged. This PR tries to improve on the previous script we had, by putting tougher constraints on when a merge is allowed. Previously the check would pass in two cases: 1. This PR can be merged without conflicts into `enterprise-master` 2. A branch exists with the same name as this PR on enterprise and that can be merged into `enterprise-master`. The first case stays the same, but I've changed the second case to require the following instead: 1. A branch exists on enterprise with the same name as this PR 2. NEW: This branch contains the last commit of the community PR branch 3. This branch can be merged into enterprise-master This makes sure the enterprise branch is actually up to date and not forgotten about. If we still get problems with this change, future improvements could be: 1. Check that the PR on enterprise passes CI 2. Check that the PR on enterprise has been approved 3. Require the enterprise PR branch to be merged before merging community.	2020-06-19 12:45:36 +02:00
SaitTalhaNisanci	3a789352b6	rename citus hammerdb branch prefix as citus_github_push (#3925 ) When we are using hammerdb jobs, the job creates a branch on test automation, since that branch should be deleted, it would have `delete_me` prefix, however since the result branch on release-test-results will have the test automation branch as prefix, it will also have `delete_me` prefix, which seems a bit confusing. This PR updates it as citus_github_push	2020-06-18 21:11:58 +03:00
Marco Slot	2a3234ca26	Rename masterQuery to combineQuery	2020-06-17 14:14:37 +02:00
Jelte Fennema	0259815d3a	Fix EXPLAIN ANALYZE received data counter issues (#3917 ) In #3901 the "Data received from worker(s)" sections were added to EXPLAIN ANALYZE. After merging @pykello posted some review comments. This addresses those comments as well as fixing other issues that I found while addressing them. The things this does: 1. Fix `EXPLAIN ANALYZE EXECUTE p1` to not increase received data on every execution 2. Fix `EXPLAIN ANALYZE EXECUTE p1(1)` to not always return 0 bytes as received data. 3. Move `EXPLAIN ANALYZE` specific logic to `multi_explain.c` from `adaptive_executor.c` 4. Change naming of new explain sections to `Tuple data received from node(s)`. Firstly because a task can reference the coordinator too, so "worker(s)" was incorrect. Secondly to indicate that this is tuple data and not all network traffic that was performed. 5. Rename `totalReceivedData` in our codebase to `totalReceivedTupleData` to make it clearer that it's a tuple data counter, not all network traffic. 6. Actually add `binary_protocol` test to `multi_schedule` (whoops) 7. Fix a randomly failing test in `local_shard_execution.sql`.	2020-06-17 11:33:38 +02:00
Marco Slot	d1bab78d79	Remove master from file hierarchy	2020-06-16 17:49:09 +02:00
Jelte Fennema	b71f82b31e	Use 5 second isolation test timeout (#3907 ) Sometimes isolation tests get stuck in CI and we cannot see why, because the job is killed by the CI runner. This will instead make the test suite continue, but mark it as a failure like this in the diff output: ```diff +isolationtester: canceling step s2-ddl-create-index-concurrently after 5 seconds step s2-ddl-create-index-concurrently: CREATE INDEX CONCURRENTLY select_append_index ON select_append(id); +ERROR: CONCURRENTLY-enabled index command failed ``` We should detect blockages very quickly and the queries we run are also very fast, so 5 seconds should be more than enough to catch any random slowness. The default from Postgres is 5 minutes, which is way too much for us.	2020-06-16 14:57:49 +02:00
Jelte Fennema	799bfdab56	Temporarily disable connection leak tests that fail a lot (#3911 ) MX connection leak failures: 1. https://app.circleci.com/pipelines/github/citusdata/citus/9296/workflows/e36d1088-662a-4f60-acec-293132632c2f/jobs/131908/steps 2. https://app.circleci.com/pipelines/github/citusdata/citus/9258/workflows/37659d82-2c5b-495e-b0e7-905811e30444/jobs/131299 Failure connection leak failures: 1. https://app.circleci.com/pipelines/github/citusdata/citus/9297/workflows/c0ebc326-8c93-468f-8b70-f470bd492fb9/jobs/131920 2. https://app.circleci.com/pipelines/github/citusdata/citus/9283/workflows/9af154d0-ff96-4c5d-ae19-81faae1e0c18/jobs/131668	2020-06-16 13:48:48 +02:00
Philip Dubé	39400319e6	Defer freeing CitusTableCacheEntry, as there were memory safety issues before Shard id to index mapping stored in cache entry as there may now be multiple entries alive for a given relation insert_select_executor: revert copying cache entry, which was a hack added to avoid memory safety issues	2020-06-15 16:20:50 +00:00
Jelte Fennema	927de6d187	Show amount of data received in EXPLAIN ANALYZE (#3901 ) Sadly this does not actually work yet for binary protocol data, because when doing EXPLAIN ANALYZE we send two commands at the same time. This means we cannot use `SendRemoteCommandParams`, and thus cannot use the binary protocol. This can still be useful though when using the text protocol, to find out that a lot of data is being sent.	2020-06-15 16:01:05 +02:00
SaitTalhaNisanci	077c784fe9	Create EnsureTableCanBeCreated for some checks (#3839 )	2020-06-14 14:25:58 +03:00
Hadi Moshayedi	ef778c1cd7	address feedback from Sait Talha & Hadi	2020-06-12 18:36:02 -07:00
Marco Slot	4f7989ad8e	Rename WorkersContainingAllShards to PlacementsForWorkersContainingAllShards	2020-06-12 18:36:02 -07:00
Marco Slot	080f711e62	Remove useless debug message in router planner	2020-06-12 18:36:02 -07:00
Marco Slot	d953f084db	Rename FindRouterWorkerList to CreateTaskPlacementListForShardIntervals	2020-06-12 18:36:01 -07:00
Marco Slot	24feadc230	Handle joins between local/reference/cte via router planner	2020-06-12 18:36:01 -07:00
Nils Dijk	f57711b3d2	fix test output for tdigest (#3909 ) Due to the problem described in #3908 we don't cover the tdigest integration (and other extensions) on CI. Due to this a bug got in the patch due to a change in `EXPLAIN VERBOSE` being merged concurrently with the tdigest integration. This PR fixes the test output that missed the newly added information.	2020-06-12 20:54:27 +02:00
Halil Ozan Akgül	8c5eb6b7ea	Insert Select Into Local Table (#3870 ) * Insert select with master query * Use relid to set custom_scan_tlist varno * Reviews * Fixes null check Co-authored-by: Marco Slot <marco.slot@gmail.com>	2020-06-12 17:06:31 +03:00
Jelte Fennema	0e12d045b1	Support use of binary protocol in between nodes (#3877 ) This can save a lot of data to be sent in some cases, thus improving performance for which inter query bandwidth is the bottleneck. There's some issues with enabling this as default, so that's currently not done.	2020-06-12 15:02:51 +02:00
Nils Dijk	da8f2b0134	Feature: tdigest aggregate (#3897 ) DESCRIPTION: Adds support to partially push down tdigest aggregates tdigest extension: https://github.com/tvondra/tdigest This PR implements the partial pushdown of tdigest calculations when possible. The extension adds a tdigest type which can be combined into the same structure. There are several aggregate functions that can be used to get: - a quantile - a list of quantiles - the quantile of a hypothetical value - a list of quantiles for a list of hypothetical values These functions can work on both values and tdigest types. Since we can create tdigest values either by combining them, or based on a group of values, we can rewrite the aggregates in such a way that most of the computation gets delegated to the compute on the shards. This both speeds up the percentile calculations, because the values don't have to be sorted, and makes the transfer size from the shards to the coordinator significantly smaller.	2020-06-12 13:50:28 +02:00
Philip Dubé	8faaaee6a5	IsReferenceTable, ShardIntervalCount: remove misleading isCitusTable check GetCitusTableCacheEntry raises an error if relationId is not distributed	2020-06-11 15:35:02 +00:00
Philip Dubé	1722d8ac8b	Allow routing modifying CTEs We still recursively plan some cases, eg: - INSERTs - SELECT FOR UPDATE when reference tables in query - Everything must be same single shard & replication model	2020-06-11 15:14:06 +00:00
Hadi Moshayedi	0e3140c14d	Include execution duration in worker_last_saved_explain_analyze	2020-06-11 02:54:54 -07:00
Hadi Moshayedi	7c52c6edb0	CTE statistics in EXPLAIN ANALYZE	2020-06-11 02:39:59 -07:00
Hadi Moshayedi	1f6d6ee4a5	Show query text in EXPLAIN output	2020-06-11 02:19:55 -07:00
Hadi Moshayedi	bb96ef5047	Does the EXPLAIN ANALYZE at the same time as execution, which avoids executing the query twice. We wrap worker tasks in worker_save_query_explain_analyze() so we can fetch their explain output later by calling worker_last_saved_explain_analyze(). Fixes #3519 Fixes #2347 Fixes #2613 Fixes #621	2020-06-11 01:55:57 -07:00
Hadi Moshayedi	6ca621bd16	Test we don't support multi-shard EXPLAIN EXECUTE	2020-06-10 17:11:27 -07:00
Jelte Fennema	6f2eb4cdb6	Remove FlattenJoinVars (#3880 ) This code is not needed anymore since #3668 was merged. It's actually causing some issues when using the binary Postgres protocol, because postgres thinks it gets a `bigint` from the worker, but actually gets a normal `int`. The query in question that fails is this: ```sql CREATE TABLE test_table_1(id int, val1 int); CREATE TABLE test_table_2(id int, val1 bigint); SELECT create_distributed_table('test_table_1', 'id'); SELECT create_distributed_table('test_table_2', 'id'); INSERT INTO test_table_1 VALUES(1,1),(2,2),(3,3); INSERT INTO test_table_2 VALUES(1,1),(3,3),(4,5); SELECT val1 FROM test_table_1 LEFT JOIN test_table_2 USING(id, val1) ORDER BY 1; ``` For this query, the difference in the queries sent to the workers after this change is: ```diff --- query_old.sql 2020-06-09 09:51:21.460000000 +0200 +++ query_new.sql 2020-06-09 09:51:39.500000000 +0200 @@ -1 +1 @@ -SELECT worker_column_1 AS val1 FROM (SELECT test_table_1.val1 AS worker_column_1 FROM (public.test_table_1_102015 test_table_1(id, val1) LEFT JOIN public.test_table_2_102019 test_table_2(id, val1) USING (id, val1))) worker_subquery +SELECT worker_column_1 AS val1 FROM (SELECT val1 AS worker_column_1 FROM (public.test_table_1_102015 test_table_1(id, val1) LEFT JOIN public.test_table_2_102019 test_table_2(id, val1) USING (id, val1))) worker_subquery ```	2020-06-10 17:24:53 +02:00
Jelte Fennema	f4791fcb10	Remove SwallowErrors by using PathNameDeleteTemporaryDir (#3893 ) This is a different version of #3634. It also removes SwallowErrors, but instead of modifying our own functions to not throw errors, it uses the postgres built in `PathNameDeleteTemporaryDir` function. This function does not throw errors. Since this change is for a bugfix, I tried to minimize the changes. PRs with the following changes would be good to do separately from this PR: 1. Use PathName(Create\|Open\|Delete)Temporary(File\|Dir) to open and remove all files/dirs instead of our own custom file functions. 2. Prefix our outmost files/directories with `PG_TEMP_FILE_PREFIX` so that they are identified by Postgres as temporary files, which will be removed at postmaster start. This way we do not have to do this cleanup ourselves. 3. Store the files in the temporary table space if it exists. Fixes #3634 Fixes #3618	2020-06-10 17:04:07 +02:00
Onder Kalaci	640717bea2	Copy doesn't use more than MaxAdaptiveExecutor Co-authored-by: Hanefi Önaldı <Hanefi.Onaldi@Microsoft.com>	2020-06-10 16:46:21 +03:00
Jelte Fennema	b87bae71bb	Error out when using different users in the same transaction (#3869 ) Fixes #3867 As described in the issue above we return incorrect results when changing user within a transaction. This causes us to error out instead.	2020-06-10 14:07:40 +02:00
Marco Slot	1243b6a948	Execute shard creation as utility tasks	2020-06-10 11:29:49 +02:00
Onder Kalaci	06461ca55f	Coerce types properly for INSERT Also, unify similar code-paths to rely on more accurate function.	2020-06-10 10:40:28 +02:00
Hadi Moshayedi	5cdfa9f571	Implement EXPLAIN ANALYZE udfs. Implements worker_save_query_explain_analyze and worker_last_saved_explain_analyze. worker_save_query_explain_analyze executes and returns results of query while saving its EXPLAIN ANALYZE to be fetched later. worker_last_saved_explain_analyze returns the saved EXPLAIN ANALYZE result.	2020-06-09 10:02:05 -07:00
Onur Tirtir	a4f1c41391	Implement GetQueryLockMode helper (#3860 ) If we want to get necessary lockmode for a relation RangeVar within a query, we can get the lockmode easily from the RangeVar itself (if pg version >= 12). However, if we want to decide the lockmode appropriate for the "query", we can derive this information by using GetQueryLockMode according to the code comment from RangeTblEntry->rellockmode.	2020-06-09 13:08:44 +03:00
Hadi Moshayedi	198d5d8b0f	typedef TupleDestination once	2020-06-08 20:38:28 -07:00
Hadi Moshayedi	45a41e249f	Test EXPLAIN ANALYZE doesn't show repartition join tasks	2020-06-06 23:24:45 -07:00
Hadi Moshayedi	02cff1a7c6	Test that EXPLAIN ANALYZE is not supported for some forms of INSERT/SELECT	2020-06-06 23:24:45 -07:00
Hadi Moshayedi	f54a8e53c0	Remove unused consts from multi_explain.c	2020-06-06 23:24:45 -07:00
Hadi Moshayedi	0bfd39ea52	Implement TupleDestination interface. Implements a new `TupleDestination` interface to allow custom tuple processing per task. This can be especially useful if a task contains multiple queries. An example of this is EXPLAIN ANALYZE, where it needs to add some UDF calls to the query to fetch the explain output from the worker after fetching the actual query results.	2020-06-05 17:47:40 -07:00
SaitTalhaNisanci	d0f47eb338	Check the removeType in IsDropCitusStmt (#3859 ) We should check the remove type in IsDropCitusStmt because if the remove type is not OBJECT_EXTENSION then the stored objects in dropStmt->objects may not be of type Value. This was crashing PG-13. Also rename the method as IsDropCitusExtensionStmt.	2020-06-05 20:49:54 +03:00
Onur Tirtir	f7224a12f2	Implement PushOverrideEmptySearchPath (#3874 ) To reduce code duplication, implement function that pushes search_path to be NIL and sets addCatalog to true so that all objects outside of pg_catalog will be schema-prefixed.	2020-06-05 19:23:59 +03:00
Onur Tirtir	8b39d12846	Append IF NOT EXISTS to deparsed CREATE SERVER commands (#3875 ) Append IF NOT EXISTS to CREATE SERVER commands generated by pg_get_serverdef_string function when deparsing an existing server object that a foreign table depends.	2020-06-05 18:04:33 +03:00
Onur Tirtir	f3f711e097	Implement IndexIsImpliedByAConstraint	2020-06-05 15:33:54 +03:00
Philip Dubé	25f86bca3f	multi_router_planner: Remove NULL check which would've segfaulted earlier	2020-06-02 13:08:38 +00:00
Philip Dubé	2623aefe38	multi_router_planner: replace GetUpdateOrDeleteRTE with ExtractResultRelationRTE	2020-06-02 00:22:30 +00:00
Onur Tirtir	dfcc18468c	Error out for unsupported trigger objects Error out if creating a citus table from a table having triggers. Error out for CREATE TRIGGER commands that are run on citus tables.	2020-05-31 23:10:01 +03:00
Onur Tirtir	6e6bc155a9	Implement methods to process & recreate triggers on citus tables	2020-05-31 15:28:17 +03:00
Onur Tirtir	5af64084ea	Copy & paste pg_get_triggerdef_worker from Postgres	2020-05-31 15:25:07 +03:00
Sait Talha Nisanci	dec2b28d49	use RelationGetPartitionDesc to be more safe For getting the partition desc, we should use RelationGetPartitionDesc method so that even if it is NULL, it will be created in the method.	2020-05-29 10:55:52 +03:00
Philip Dubé	c0515dcd67	This prepares for routing modifying CTEs, where modLevel should not be used to infer whether a plan is a select or not SELECT_TASK is renamed to READ_TASK as a SELECT with modifying CTEs will be a MODIFYING_TASK RouterInsertJob: Assert originalQuery->commandType == CMD_INSERT CreateModifyPlan: Assert originalQuery->commandType != CMD_SELECT Remove unused function IsModifyDistributedPlan DistributedExecution, ExecutionParams, DistributedPlan: Rename hasReturning to expectResults SELECTs set expectResults to true Rename CreateSingleTaskRouterPlan to CreateSingleTaskRouterSelectPlan	2020-05-20 17:26:12 +00:00
Onur Tirtir	98a660d0b7	Don't release lock on pg_constraint until the xact ends Do not release AccessShareLock when closing pg_constraint to prevent modifications to be done on pg_constraint to make sure that caller will process valid foreign key constraints through the transaction.	2020-05-20 17:27:17 +03:00
Onur Tirtir	79a688ffe0	Refactor the methods accessing pg_constraint Implement internal functions to access pg_constraint and utilize them in existing foreign key checks.	2020-05-20 17:27:17 +03:00
SaitTalhaNisanci	80e34382cf	Rename AppropriateReplicationModel -> DecideReplicationModel (#3842 )	2020-05-17 10:24:14 +03:00
Onur Tirtir	8f9ef63e8a	Implement get_relation_constraint_oid_compat helper (#3836 )	2020-05-15 17:36:59 +03:00
MoYi	9e1f198155	Fix composite create type deparsing to preserve typmod	2020-05-15 13:12:54 +00:00
Onur Tirtir	249550b815	Refactor EnsureLocalTableEmptyIfNecessary (#3830 )	2020-05-15 14:20:33 +03:00
Onur Tirtir	8f3373c702	Remove unused parameter from RecordDistributedRelationDependencies (#3831 )	2020-05-15 10:34:35 +03:00
Sait Talha Nisanci	41fceb7849	Add optional ch_benchmark and tpcc_benchmark job With this commit: You can trigger two types of hammerdb benchmark jobs: -ch_benchmark (analytical and transactional queries) -tpcc_benchmark (only transactional queries) Your branch will be run against `master` branch. In order to trigger the jobs prepend `ch_benchmark/` or `tpcc_benchmark/` to your branch and push it. For example if you were running on a feature/improvement branch with name `improve/adaptive_executor`. In order to trigger a tpcc benchmark, you can do the following: ```bash git checkout improve/adaptive_executor git checkout -b tpcc_benchmark/improve/adaptive_executor git push origin tpcc_benchmark/improve/adaptive_executor # the tpcc benchmark job will be triggered. ``` You will see the results in a branch in [https://github.com/citusdata/release-test-results](https://github.com/citusdata/release-test-results). The branch name will be something like: `delete_me/citusbot_tpcc_benchmark_rg/<date>/<date>`. The resource groups will be deleted automatically but if the benchmark fails, they won't be deleted(If you don't see the results after a reasonable time, it might mean it failed, you can check the resource usage from portal, if it is almost 0 and you didn't see the results, it means it probably failed). In that case, you will need to delete the resource groups manually from portal, the resource groups are `citusbot_ch_benchmark_rg` and `citusbot_tpcc_benchmark_rg`.	2020-05-14 16:01:48 +03:00
SaitTalhaNisanci	cf98b9d6d5	not wait forever for metadata sync in tests (#3760 ) We shouldn't wait forever for metadata sync in tests, otherwise when a test gets stuck, we don't know which line causes the problem.	2020-05-14 10:51:24 +03:00
SaitTalhaNisanci	22c903b151	remove ExecuteUtilityTaskListWithoutResults (#3696 ) This PR removes ExecuteUtilityTaskListWithoutResults and uses the same path for local execution via ExecuteTaskListExtended. ExecuteUtilityTaskList is added. ExecuteLocalTaskListExtended now has a parameter for utility commands so that it can call the right method. In order not to change the existing calls, ExecuteTaskListExtendedInternal is added, which is the main method that runs the execution, via local and remote execution.	2020-05-07 13:30:50 +03:00
Nils Dijk	105de7beb8	Fix for pruned target list entries (#3818 ) DESCRIPTION: Ignore pruned target list entries in coordinator plan The postgres planner has the ability to prune target list entries that are proven not used in the output relation. When this happens at the `CitusCustomScan` boundary we need to _not_ return these pruned columns to not upset the rest of the planner. By using the target list the planner asks us to return we fix issues that lead to Assertion failures, and potentially could be runtime errors when they hit in a production build. Fixes #3809	2020-05-06 13:56:02 +02:00
Marco Slot	6ce2803777	Make sure we don't wrap GROUP BY expressions in any_value	2020-05-05 05:12:45 +02:00
Hadi Moshayedi	dbf509bbdd	Don't error out when cannot create maintenanced	2020-05-04 09:53:52 -07:00
SaitTalhaNisanci	4a9d516f1b	Add a job to check if merge to enterprise master would fail (#3777 ) * add a job to check if merge to enterprise master would fail Add a job to check if merge to enterprise master would fail. The job does the following: - It checks if there is already a branch with the same name on enterprise, if so it tries to merge it to enterprise master, if the merge fails the job fails. - If the branch doesn't exist on the enterprise, it tries to merge the current branch to enterprise master, it fails if there is any conflict while merging. The motivation is that if a branch on community would create a conflict on enterprise-master, until we create a PR on enterprise that would solve this conflict, we won't be able to merge the PR on community. This way we won't have many conflicts when merging to enterprise master and the author, who has the most context will be responsible for resolving the conflict when he has the most context, not after 1 month. * Improve test suite to be able to easily run locally * Add documentation on how to resolve conflicts to enterprise master * Improve enterprise merge script * Improve merge conflict job README * Improve merge conflict job README * Improve merge conflict job README * Improve merge conflict job README Co-authored-by: Nils Dijk <nils@citusdata.com>	2020-05-04 17:08:17 +03:00
Onder Kalaci	f9d4a9cf38	Remove assertion for subqueries in WHERE clause ANDed with FALSE In the code, we had the assumption that if restriction information is NULL, it means that we cannot have any distributed tables in the subquery. However, for subqueries in WHERE clause, that is not the case when the subquery is ANDed with FALSE. In that case, Citus operates on the originalQuery (which doesn't go through the standard_planner()), and relies on the restriction information generated by standard_planner(). As Postgres is smart enough to not generate restriction information for subqueries ANDed with FALSE, we hit the assertion.	2020-05-04 10:52:15 +02:00
Onder Kalaci	891d99efaf	add order by to some tests to make the output consistent	2020-05-01 12:41:51 +02:00
Onder Kalaci	77c397e9ae	Rebuild wait event sets after PQconnectPoll() if socket changes The reason is that PQconnectPoll() may change the underlying socket. If we don't rebuild the wait event set, the low level APIs (such as epoll_ctl()) may fail due to invalid sockets. Instead, rebuilding ensures that we'll use accurate/active sockets.	2020-05-01 09:44:21 +02:00
Jelte Fennema	c6f5d5fe88	Add some asserts to pass static analysis (#3805 )	2020-04-29 11:19:11 +02:00
SaitTalhaNisanci	cbda951395	Fix task copy and appending empty task in ExtractLocalAndRemoteTasks (#3802 ) * Not append empty task in ExtractLocalAndRemoteTasks ExtractLocalAndRemoteTasks extracts the local and remote tasks. If we do not have a local task the localTaskPlacementList will be NIL, in this case we should not append anything to local tasks. Previously we would first check if a task contains a single placement or not, now we first check if there is any local task before doing anything. * fix copy of node task Task node has task query, which might contain a list of strings in its fields. We were using postgres copyObject for these lists. Postgres assumes that each element of list will be a node type. If it is not a node type it will error. As a solution to that, a new macro is introduced to copy a list of strings.	2020-04-29 11:05:34 +03:00
Philip Dubé	b6b3c1bc17	Fix COPY TO's COPY (SELECT) with distributed table having generated columns It's necessary to omit generated columns from output	2020-04-28 14:40:47 +00:00
SaitTalhaNisanci	164c00cf08	Fix typo: longer visible -> no longer visible (#3803 )	2020-04-27 16:32:46 +03:00
Onder Kalaci	bc54c5125f	Increase the default value of citus.node_connection_timeout The previous default was 5 seconds, and we change it to 30 seconds. The main motivation for this is that for busy clusters, 5 seconds can be too aggressive. Especially with connection throttling, the servers might be kept busy for a really long time, and users may see the connection errors more frequently. We've done some sanity checks, for really quick queries (like `SELECT count(*) from table`), 30 seconds is a decent value even if users execute 300 distributed queries on the coordinator. We've verified this on Hyperscale(Citus).	2020-04-24 15:16:42 +02:00
Onder Kalaci	0cb7ab2d05	Explicitly mark queries in physical planner for [not] having parameters Physical planner doesn't support parameters. If the parameters have already been resolved when the physical planner handles the queries, mark it. The reason is that the executor is unaware of this, and sends the parameters along with the worker queries, which fails for composite types. (See `DissuadePlannerFromUsingPlan()` for the details of parameter resolving)	2020-04-24 12:49:43 +02:00
Onder Kalaci	f517fa2e2a	Re-enable isolation test for reference tables + distributed deadlock detection	2020-04-24 11:53:03 +02:00
SaitTalhaNisanci	07cbd84631	Add base isolation schedule (#3784 ) We should do some setup steps in check-isolation-base target. This PR adds base_isolation_schedule which will set up the cluster.	2020-04-24 12:38:37 +03:00
Onur Tirtir	b8dd8f50d1	Fix build issue in GCC 10 (#3790 ) As reported in #3787, we were having issues while building citus with "GCC Red Hat 10" (maybe in some other versions of gcc as well). Fixes "multiple definition of 'CitusNodeTagNames'" error by explicitly specifying storage of CitusNodeTagNames to be extern.	2020-04-22 16:41:34 +03:00
Onur Tirtir	2e927bd6b7	Bump Citus to 9.4devel (#3788 )	2020-04-22 12:50:00 +03:00
Hanefi Önaldı	e85b835065	Skip dependency setup on coordinator node	2020-04-21 12:06:31 +03:00
Philip Dubé	9093d51a22	maintenanced: handle before_shmem_exit, assert workerPid == 0 on start	2020-04-20 14:41:40 +00:00
Jelte Fennema	1423433531	Fix running check-isolation-base (#3782 )	2020-04-20 15:36:09 +02:00
Onder Kalaci	e182215d96	Improve connection error message from the worker nodes We currently put the actual error message to the detail part. However, many drivers don't show the detail part. As connection errors are somewhat common and hard to trace back, we added the detail to the message itself. In addition to that, we changed the "connection error" message, as it was confusing to the users who think that the error was happening while connecting to the coordinator. In fact, this error shows up when the coordinator fails to connect to remote nodes.	2020-04-20 13:32:55 +02:00
Hadi Moshayedi	1250d691d3	Replicate reference tables before master_create_empty_shard	2020-04-17 16:47:03 -07:00
Philip Dubé	8e79672839	Try copying shard intervals out of cache for long lived borrow	2020-04-17 22:00:41 +00:00
Philip Dubé	c00d57a955	CreateDistributedInsertSelectPlan: avoid calling GetCitusTableCacheEntry in a way that would invalidate live ShardInterval pointers	2020-04-17 14:44:23 +00:00
SaitTalhaNisanci	1d0f4bdcd2	invalidate plan cache in master_update_node (#3758 ) * invalidate plan cache in master_update_node If a plan is cached by postgres but a user uses master_update_node, then when the plan cache is used for the updated node, they will get the old nodename/nodeport in the plan. This is because the plan cache doesn't know about the master_update_node. This could be a problem in prepared statements or anything that goes into plancache. As a solution the plan cache is invalidated inside master_update_node. * add invalidate_inactive_shared_connections test function We introduce invalidate_inactive_shared_connections udf to be used in testing. It is possible that a connection count for an inactive node will be greater than 0 and in that case it will not be removed at the time of invalidation. However, later we don't have a mechanism to remove it, which means that it will stay in the hash. For this not to cause a problem, we use this udf in testing. * move invalidate_inactive_shared_connections to udfs from test as it will be used in mx * remove the test udf * remove the IsInactive check	2020-04-17 17:43:48 +03:00
Philip Dubé	c0a95a3adb	Copy data from CitusTableCacheEntry more often This copies over fixes from reference counting branch, all CitusTableCacheEntry data may be freed when a GetCitusTableCacheEntry call occurs for its relationId This fix is not complete, but reference counting is being deferred until 9.4 CopyShardInterval: remove dest parameter, always return newly allocated object	2020-04-17 14:17:18 +00:00
Önder Kalacı	a919f09c96	Remove the entries from the shared connection counter hash when no connections remain (#3775 ) We initially considered removing entries just before any change to pg_dist_node. However, that ended-up being very complex and making MX even more complex. Instead, we're switching to a simpler solution, where we remove entries when the counter gets to 0. With certain workloads, this may have some performance penalty. But, two notes on that: - When counter == 0, it implies that the cluster is not busy - With cached connections, that's not possible	2020-04-17 17:14:58 +03:00
Philip Dubé	e4a4707f4a	Avoid setting hasWindowFuncs true after window functions have been optimized out of query	2020-04-17 12:22:48 +00:00
SaitTalhaNisanci	a9a3be15cc	introduce TASK_QUERY_NULL task type (#3774 ) When we call SetTaskQueryString we would set the task type to TASK_QUERY_TEXT, and some parts of the codebase rely on the fact that if TASK_QUERY_TEXT is set, the data can be read safely. However if SetTaskQueryString is called with a NULL taskQueryString this can cause crashes. In that case taskQueryType will simply be set to TASK_QUERY_NULL.	2020-04-17 14:59:22 +03:00
Hanefi Önaldı	0c5d0cfee9	Notice message to help truncate local data after distribution	2020-04-17 13:21:34 +03:00
Hanefi Önaldı	d535121f8d	Introduce truncate_local_data_after_distributing_table()	2020-04-17 13:21:34 +03:00
Hadi Moshayedi	61198251fd	Use block_writes for replicate_reference_tables	2020-04-16 19:25:41 -07:00
Nils Dijk	1d6ba1d09e	Refactor alter role to work on distributed roles (#3739 ) DESCRIPTION: Alter role only works for citus managed roles Alter role was implemented before we implemented good role management that hooks into the object propagation framework. This is a refactor of all alter role commands that have been implemented to - be on by default - only work for supported roles - make the citus extension owner a supported role Instead of distributing the alter role commands for roles at the beginning of the node activation role it now _only_ executes the alter role commands for all users in all databases and in the current database. In preparation of full role support small refactors have been done in the deparser. Earlier tests targeting other roles than the citus extension owner have been either slightly changed or removed to be put back where we have full role support. Fixes #2549	2020-04-16 12:23:27 +02:00
Hadi Moshayedi	59b9a4e5a1	Detect deadlocks in replicate_reference_tables()	2020-04-15 11:06:18 -07:00
SaitTalhaNisanci	df9048ebaa	update outdated comments related to local_execution (#3759 )	2020-04-15 16:15:43 +03:00
Marco Slot	8b83306a27	Issue worker messages with the same log level	2020-04-14 21:08:25 +02:00
SaitTalhaNisanci	132efdbc56	add execution params struct (#3747 ) We had 9+ parameters in some of the functions related to execution. Execution params is created to simplify this a bit so that we can set only the fields that we are interested in and it is easier to read.	2020-04-14 14:32:40 +03:00
SaitTalhaNisanci	d58b5e67c1	not run multi_router_planner_fast_path in parallel (#3744 )	2020-04-14 13:14:23 +03:00
Onder Kalaci	aa6b641828	Throttle connections to the worker nodes With this commit, we're introducing a new infrastructure to throttle connections to the worker nodes. This infrastructure is useful for multi-shard queries; router queries are not affected by this. The goal is to prevent establishing more than citus.max_shared_pool_size number of connections per worker node in total, across sessions. To do that, we've introduced a new connection flag OPTIONAL_CONNECTION. The idea is that some connections are optional, such as the second (and further) connections for the adaptive executor. A single connection is enough to finish the distributed execution; the others are useful to execute the query faster. Thus, they can be considered optional connections. When an optional connection is not allowed to the adaptive executor, it simply skips it and continues the execution with the already established connections. However, it'll keep retrying to establish optional connections, in case some slots are open again.	2020-04-14 10:27:48 +02:00
Onder Kalaci	38b8a9ad62	Add citus_remote_connection_stats() function This function is intended to be used for monitoring the remote connections.	2020-04-14 10:03:27 +02:00
Onder Kalaci	0dbfbe0c37	Add the necessary shared memory infrastructure - The hashmap in the shared memory - The lock to access the hashmap - The GUC to control the size	2020-04-14 10:03:26 +02:00
Hadi Moshayedi	2639a9a19d	Test master_copy_shard_placement errors on foreign constraints	2020-04-13 12:45:27 -07:00
Hadi Moshayedi	f9de734329	Ensure metadata is synced on ReplicateColocatedShardPlacement	2020-04-13 11:45:21 -07:00
Hadi Moshayedi	2218b7e38d	Refactor ReplicateColocatedShardPlacement	2020-04-13 11:07:26 -07:00
SaitTalhaNisanci	2b2a146af4	update gitignores with new files in test folder (#3749 )	2020-04-13 17:09:18 +03:00
SaitTalhaNisanci	2438e80a58	use CURSOR_OPT_PARALLEL_OK flag in local execution (#3745 ) We currently don't use any cursor flags in local execution, but we can use CURSOR_OPT_PARALLEL_OK flag to potentially benefit from parallelism when possible.	2020-04-12 19:49:22 +03:00
Philip Dubé	30f10984e1	Defer get_agg_clause_costs, it happens later & avoids errors	2020-04-10 13:26:05 +00:00
Philip Dubé	ab0b59ad3b	GetConnParams: Set runtimeParamStart before setting keywords/values to avoid out of bounds access	2020-04-10 13:14:06 +00:00
Halil Ozan Akgul	34c2b7e056	Fixes the psql connection bug	2020-04-10 15:54:47 +03:00
Halil Ozan Akgul	56e814a333	Adds public host to only hyperscale tests	2020-04-10 15:54:47 +03:00
Halil Ozan Akgul	d574ac33a8	Adds next shard ids to multi_create_table tests	2020-04-10 15:54:47 +03:00
Halil Ozan Akgul	a701fc774a	Adds multi_schedule_hyperscale schedule	2020-04-10 15:54:47 +03:00
Halil Ozan Akgul	5bf350faf9	Removes failing tests This task just removes the failing tests. It doesn't mean this tests cannot be saved. It's just a starting point	2020-04-10 15:54:47 +03:00
Halil Ozan Akgul	1aa1f55d8e	Adds check_multi_hyperscale_superuser schedule	2020-04-10 13:05:07 +03:00
Halil Ozan Akgul	c2edf989cf	Adds public host parameters	2020-04-10 13:04:24 +03:00
Halil Ozan Akgul	4b9705f714	Adds worker host parameters	2020-04-10 13:03:28 +03:00
Halil Ozan Akgul	119bf590c8	Creates normalize_modified.sed	2020-04-10 13:03:19 +03:00
Halil Ozan Akgul	c8a81ef1ce	Changes copy to \copy	2020-04-10 13:03:15 +03:00
Halil Ozan Akgul	93b97248b2	Adds a connection string to run tests on that connection	2020-04-10 13:03:03 +03:00
SaitTalhaNisanci	17373d51da	not wait forever in upgrade distributed function before (#3731 )	2020-04-10 09:43:42 +03:00
SaitTalhaNisanci	07f9a442b0	Refactor CopyLocalDataIntoShards (#3693 ) This PR: - Declares variables when they are needed. - Creates DoCopyFromLocalTableIntoShards for better readability. - Doesn't use a hardcoded value, instead use a variable for better readability.	2020-04-10 09:25:26 +03:00
Marco Slot	a4b2197450	Correctly handle non-constant LIMIT/OFFSET clauses	2020-04-09 19:59:50 +00:00
SaitTalhaNisanci	3dc7cad754	use an enum for local execution status (#3733 ) We have two variables that are related to local execution status: TransactionAccessedLocalPlacement and TransactionConnectedToLocalGroup. Only one of these fields should be set, however we didn't have any check for this constraint and it was error prone. These two variables are used to decide whether we should use local execution (the current session) or a connection to execute the current query, and therefore its tasks. In the enum, it is now clearer what these variables mean. Also, now we have a method to change the local execution status. The method will error if we are trying to transition from a state to a wrong state. This will help us avoid problems.	2020-04-09 19:11:04 +03:00
SaitTalhaNisanci	24dcb02bca	enable local table join with reference table (#3697 ) * enable local table join with reference table * test different cases with local table and reference join	2020-04-09 15:25:54 +03:00
SaitTalhaNisanci	ebda3eff61	read database name inside the function (#3730 )	2020-04-09 13:11:13 +03:00
SaitTalhaNisanci	233e4a24d1	use local execution within transaction block (#3714 ) * use local execution when in a transaction block When we are inside a transaction block, there could be other methods that need local execution, therefore we will use local execution in a transaction block. * update test outputs with transaction block local execution * add a test to verify we dont leak intermediate schemas	2020-04-09 12:41:58 +03:00
SaitTalhaNisanci	fa88046ce1	test that we don't leak intermediate schemas (#3737 ) * test that we don't leak intermediate schemas We have tests to make sure that we don't leak any intermediate files, tables etc but we don't test if we are leaking schemas. It makes sense to test this as well. * remove all repartition schemas in case of error This solution is not an ideal one but it seems to be doing the job. We should have a more generic solution for the cleanup but it seems that putting the cleanup in the abort handler is dangerous and it was crashing.	2020-04-09 12:17:41 +03:00
SaitTalhaNisanci	362d72853c	return early in ExecuteTaskListExtended (#3738 ) It is possible to return an error in ExecuteTaskListExtended after performing local execution with the current structure. However there is no point in executing the local tasks if we are going to return an error later. So the local execution is moved after the error check.	2020-04-09 10:10:49 +03:00
Hadi Moshayedi	9b8802ba2d	Remove todo from reference_table_utils	2020-04-08 12:46:55 -07:00
Hadi Moshayedi	dda53a0bba	GUC for replicate reference tables on activate.	2020-04-08 12:42:45 -07:00
Hadi Moshayedi	c168a53ebc	Tests for replicate_reference_tables	2020-04-08 12:41:36 -07:00
Hadi Moshayedi	acfa850c38	Make multi_replicate_reference_table check-base friendly	2020-04-08 12:41:36 -07:00
Hadi Moshayedi	0758a81287	Prevent reference tables being dropped when replicating reference tables	2020-04-08 12:41:36 -07:00
Marco Slot	924cd7343a	Defer reference table replication to shard creation time	2020-04-08 12:41:36 -07:00
Philip Dubé	26797bfb94	Verify trigger relation before reading old/new tuples master_dist_placement_cache_invalidate: bail when triggering on pg_dist_shard_placement	2020-04-07 15:39:31 +00:00
Önder Kalacı	70012dfd33	Do not error when an intermediate file does not exist (#3707 ) When the file does not exist, it could mean two different things. First -- and a lot more common -- case is that a failure happened in a concurrent backend on the same distributed transaction. And, one of the backends in that transaction has already been rolled back, which has already removed the file. If we throw an error here, the user might see this error instead of the actual error message. Instead, we prefer to WARN the user and pretend that the file has no data in it. In the end, the user would see the actual error message for the failure. Second, in case of any bugs in intermediate result broadcasts, we could try to read a non-existing file. That is most likely to happen during development. Thus, when asserts are enabled, we throw an error instead of WARNING so that the developers cannot miss it.	2020-04-07 17:06:55 +02:00
Onder Kalaci	a695b44ce9	Add new regression tests	2020-04-07 17:06:55 +02:00
Onder Kalaci	4b3d17f466	Make sure that tests are not failing randomly	2020-04-07 17:06:55 +02:00
Onder Kalaci	4f7c902c6c	Move connection establishment for intermediate results after query execution When we have a query like the following: ```SQL WITH a AS (SELECT * FROM foo LIMIT 10) SELECT max(x) FROM a JOIN bar 2 USING (y); ``` Citus currently opens side channels for doing the `COPY "1_1"` FROM STDIN (format 'result') before starting the execution of `SELECT * FROM foo LIMIT 10` Since we need at least 1 connection per worker to do `SELECT * FROM foo LIMIT 10` We need to have 2 connections to worker in order to broadcast the results. However, we don't actually send a single row over the side channel until the execution of `SELECT * FROM foo LIMIT 10` is completely done (and connections unclaimed) and the results are written to a tuple store. We could actually reuse the same connection for doing the `COPY "1_1"` FROM STDIN (format 'result'). This also fixes the issue that Citus doesn't obey `citus.max_adaptive_executor_pool_size` when the query includes an intermediate result.	2020-04-07 17:06:55 +02:00
Onder Kalaci	721daec9a5	Move the logic that initializes connections/local files into a function	2020-04-07 17:06:55 +02:00
Onder Kalaci	9b29a32d7a	Remove all references for side channel connections We don't need any side channel connections. That is actually problematic in the sense that it creates extra connections. Say, citus.max_adaptive_executor_pool_size equals to 1, Citus ends up using one extra connection for the intermediate results. Thus, not obeying citus.max_adaptive_executor_pool_size. In this PR, we remove the following entities from the codebase to allow further commits to implement not requiring extra connection for the intermediate results: - The connection flag REQUIRE_SIDECHANNEL - The function GivePurposeToConnection - The ConnectionPurpose struct and related fields	2020-04-07 17:06:55 +02:00
Hanefi Onaldi	1d22d0c2ff	Remove metadata locks from size functions	2020-04-07 17:37:15 +03:00
SaitTalhaNisanci	0430b568be	explicitly return false if transaction connected to local node (#3715 ) * explicitly return false if transaction connected to local node * not set TransactionConnectedToLocalGroup if we are writing to a file We use TransactionConnectedToLocalGroup to prevent local execution from happening as that might cause visibility problems. As files are visible to all transactions, we shouldn't set this variable if we are writing to a file.	2020-04-07 17:30:34 +03:00
Marco Slot	2632343f64	Fix intermediate result pruning for INSERT..SELECT	2020-04-07 11:07:49 +02:00
Marco Slot	84672c3dbd	Simplify intermediate result pruning logic	2020-04-07 10:53:29 +02:00
SaitTalhaNisanci	a710b3cdc5	fix null tupleStoreState case in ExecuteLocalTaskListExtended (#3711 ) In case we don't care about the tupleStoreState in ExecuteLocalTaskListExtended, it could be passed as null. In that case we will get a seg error. This changes it so that a dummy tuple store will be created when it is null. Do not use local execution in ExecuteTaskListOutsideTransaction. As we are going to run the tasks outside transaction, we shouldn't use local execution. However, there is some problem when using local execution related to repartition joins, when we solve that problem, we can execute the tasks coming to this path with local execution. Also logging the local command is simplified. normalize job id in worker_hash_partition_table in test outputs.	2020-04-07 11:47:09 +03:00
SaitTalhaNisanci	a369f9001d	fix incorrect groupid or nodeid (#3710 ) For shardplacements, we were setting nodeid, nodename, nodeport and nodegroup manually. This makes it very error prone, and it seems that we already forgot to set some of them. This would mean that they would have their default values, e.g group id would be 0 when its group id is not 0. So the implication is that we would have inconsistent worker metadata. A new method is introduced, and we call the method to set those fields now, so that as long as we call this method, we won't be setting inconsistent metadata. It probably makes sense to have a struct for these fields. We already have NodeMetadata but it doesn't have nodename or nodeport. So that could be done over another refactor to make things simpler.	2020-04-07 11:14:14 +03:00
Philip Dubé	4860e11561	Duplicate grouping on worker whenever possible This is possible whenever we aren't pulling up intermediate rows We want to do this because this was done in 9.2, some queries rely on the performance of grouping causing distinct values This change was introduced when implementing window functions on coordinator	2020-04-06 18:51:30 +00:00
Philip Dubé	b01bae5937	Check connections from connection_placement before polling	2020-04-06 17:45:44 +00:00
SaitTalhaNisanci	cd3e499834	not log in debug level in null parameters (#3718 ) The purpose of null_parameters is to make sure that citus doesn't crash with null parameters. (The related issue is #3493.) The logs in this file are not that important and they are flaky. The flakiness is related to postgres part as well so it is hard to reproduce them. Therefore it makes sense to decrease the log level.	2020-04-06 17:59:46 +03:00
SaitTalhaNisanci	3d3605be80	simplify vacuum test and fix the flakiness (#3704 ) look at sent commands to simplify complex logic in vacuum test also normalize connection id as that can differ when we don't have to choose a specific connection.	2020-04-03 21:39:54 +03:00
Onur Tirtir	4c95ad1579	do not traverse parse tree in distributed planner one more time	2020-04-03 18:24:48 +03:00
Onur Tirtir	abdabbedb2	refactor distributed_planner.c	2020-04-03 18:24:41 +03:00
Onur Tirtir	13a35c6813	implement GetOnlyShardOidOfReferenceTable and some refactor in shard_utils	2020-04-03 18:24:13 +03:00
Jelte Fennema	459a4829ae	Fix isolation tests on OSX (#3706 ) * Don't print out comments in make output * Remove empty lines with sed	2020-04-03 16:28:06 +02:00
SaitTalhaNisanci	32156dbf5c	fix flaky log statement in null_parameters (#3705 ) It seems that sometimes the pruning is deferred and sometimes not with this statement. What we care in this test is to see that it doesn't crash. I think we don't care about the log statement for this line. So it makes sense to not log this statement, and care about the result.	2020-04-03 17:01:59 +03:00
Hanefi Önaldı	d1223bd6cc	Remove migration paths to 9.3-1, introduce 9.3-2	2020-04-03 12:50:45 +03:00
SaitTalhaNisanci	710970407f	not wait forever in multi_extension test (#3702 )	2020-04-03 12:21:02 +03:00
SaitTalhaNisanci	659283c9a7	fix multi utilities vacuum test (#3699 )	2020-04-03 11:50:00 +03:00
Marco Slot	fd8cdb92f4	Evaluate nextval in the target list on the coordinator	2020-04-02 02:53:19 +02:00
SaitTalhaNisanci	df88ab71b6	normalize assign_distributed_transaction_id in tests	2020-04-01 18:23:16 +03:00
SaitTalhaNisanci	0aebd78ea7	use localExecution in ExecuteTaskListExtended ExecuteTaskListExtended is the common method for different codepaths, and instead of writing separate local execution logics in different codepaths, it makes more sense to have the logic here. We still need to do some refactoring, this is an initial step. After this commit, we can run create shard commands locally. There is a special case with shard creation commands. A create shard command might have a concatenated query string, however local execution did not know how to execute a task with multiple query strings. This is also implemented in this commit. We go over each query in the concatenated query string and plan/execute them one by one. A more clean solution to this would be to make sure that each task has a single query. We currently cannot do that because we need to ensure the task dependencies. However, it would make sense to do that at some point and it would simplify the code a lot.	2020-04-01 18:23:16 +03:00
SaitTalhaNisanci	ba01f3457a	use macros for pg versions instead of hardcoded values (#3694 ) 3 Macros are defined for removing the hardcoded pg versions. PG_VERSION_11, PG_VERSION_12 and PG_VERSION_13.	2020-04-01 17:01:52 +03:00
Philip Dubé	3bb4f14efd	upgrade_type_after: ORDER BY	2020-04-01 01:07:21 +00:00
Philip Dubé	d155149c18	tests: remove stale comment, fix typo	2020-03-31 20:13:51 +00:00
Philip Dubé	ddc3377026	Assert bounds checks on two array reads which rely on data not being out of bounds	2020-03-31 18:58:35 +00:00
Marco Slot	252abcce16	Allow table type to be used in target list	2020-03-31 11:11:01 -07:00
SaitTalhaNisanci	5bf9f32dd3	disable one of deadlock detection test (#3682 ) It seems that one of the deadlock detection tests fails way too often in our CI. The difference is only ordering. Currently it seems that it is a good idea to disable this test for the sake of development.	2020-03-31 19:47:58 +03:00
SaitTalhaNisanci	6cd32b0db1	refactor ExecuteLocalTaskList (#3617 ) ExecuteLocalTaskList doesn't need scanState as it only uses paramListInfo, distributedPlan and tupleStoreState. It is better to pass only the variables that the function needs, so that we can call this function from other places when we dont have scanState.	2020-03-31 19:19:54 +03:00
SaitTalhaNisanci	b5591b1b28	use taskQuery as a struct to simplify the code	2020-03-31 15:47:55 +03:00
SaitTalhaNisanci	8806c4d697	move queryStringList into taskQuery Also allocate task query in the memory context of task.	2020-03-31 15:47:55 +03:00
SaitTalhaNisanci	c796ac335d	add TaskQuery struct to abstract query string related fields We had many fields in task related to query strings. It was kind of complex, and only one of them could be set at a time. Therefore it makes more sense to abstract this and use a union so that it is clear that only one of them should be set. We have three fields that could have query related strings: - queryForLocation - queryStringLazy - perPlacementQueryStrings Respectively, they can be set with: - SetTaskQueryString - SetTaskQueryIfShouldLazyDeparse - SetTaskPerPlacementQueryStrings The direct usage of the query related fields are also removed. Rename queryForLocalExecution Currently queryForLocalExecution is only used for deparsing purposes, therefore it makes sense to rename it to what it is doing.	2020-03-31 15:47:55 +03:00
SaitTalhaNisanci	98f95e2a5e	add TaskQueryStringForPlacement TaskQueryStringForPlacement simplifies how the executor gets the query string for a given placement. Task will use the necessary fields to return the correct query placement string. Executor doesn't need to know the details for this. rename TaskQueryString as TaskQueryStringAllPlacements TaskQueryString returns the query string that will be the same for all the placements. In INSERT..SELECT the query string can be different for each placement. Adaptive executor uses TaskQueryStringForPlacement, which returns the query string for a placement. It makes sense to rename TaskQueryString as TaskQueryStringAllPlacements as it is returning the query string for all placements. rename SetTaskQuery as SetTaskQueryIfShouldLazyDeparse SetTaskQuery does not always set the task query. It can set the query string as well. So it is clearer to name it SetTaskQueryIfShouldLazyDeparse, since it will set the query, not the query string, only when we should deparse the query in a lazy way.	2020-03-31 15:47:55 +03:00
SaitTalhaNisanci	982b5fbabf	add SetTaskPerPlacementStrings It is possible that a task will have different query string for each placement. This is the case in INSERT..SELECT via repartitioning. When we are setting task->perPlacementQueryString, we should set queryStringLazy to NULL. Therefore a method for that purpose is created.	2020-03-31 15:47:55 +03:00
Marco Slot	331b45348c	Fix error when using LEFT JOIN with GROUP BY on primary key	2020-03-30 16:42:22 +02:00
SaitTalhaNisanci	e1802c5c00	extract local plan cache related methods into a file (#3667 )	2020-03-31 11:11:34 +03:00
SaitTalhaNisanci	8dfc2cb122	not append ; if end of the list in StringJoin (#3672 )	2020-03-31 10:01:28 +03:00
Philip Dubé	67d2ad4e37	Fixes flaky test in multi_reference_table: ORDER BY (#3676 ) Fixes app.circleci.com/pipelines/github/citusdata/citus/7744/workflows/0848f36c-af9e-46b7-9dda-a421df54ba56/jobs/109503	2020-03-30 23:31:10 +02:00
Philip Dubé	4eb2c33f38	multi_copy.c: remove tableMetadata	2020-03-30 19:26:44 +00:00
Jelte Fennema	3be665269f	Reintroduce ForceSearchShardPlacementInList (#3664 ) This was added to silence static analysis errors. It was removed accidentally in #3591. This reintroduces it again.	2020-03-27 14:28:50 +01:00
Hanefi Onaldi	0e8103b101	Propagate ALTER ROLE .. SET statements In PostgreSQL, user defaults for config parameters can be changed by ALTER ROLE .. SET statements. We wish to propagate those defaults across the Citus cluster so that the behaviour will be similar in different workers. The defaults can either be set in a specific database, or the whole cluster, similarly they can be set for a single role or all roles. We propagate the ALTER ROLE .. SET if all the conditions below are met: - The query affects the current database, or all databases - The user is already created in worker nodes	2020-03-27 13:02:48 +03:00
Marco Slot	a65ffee266	Fixes a bug that causes some DML queries containing aggregates to fail	2020-03-26 16:08:34 +00:00
SaitTalhaNisanci	d3fdade2e8	add missing perPlacementQueryStrings to copy and out funcs (#3657 )	2020-03-26 17:16:29 +03:00
Marco Slot	b89e9dc158	Fix a bug which caused queries with SRFs and function evaluation to fail	2020-03-25 06:55:53 +01:00
SaitTalhaNisanci	dd1a456407	store query command list in task (#3649 ) Sometimes we have concatenated query strings for a task. However, when we want to find each query string, it is not a trivial task. Therefore, it makes sense to store this in task so that when we need each query string we can easily get it.	2020-03-26 12:04:08 +03:00
Philip Dubé	917cb6ae93	Don't segfault on queries using GROUPING GROUPING will always return 0 outside of GROUPING SETS, CUBE, or ROLLUP Since we don't support those, it makes sense to reject GROUPING in queries	2020-03-25 15:46:43 +00:00
Philip Dubé	720525cfda	Add support for window functions on coordinator Some refactoring: Consolidate expression which decides whether GROUP BY/HAVING are pushed down Rename early pullUpIntermediateRows to hasNonDistributableAggregates Create WorkerColumnName to handle formatting WORKER_COLUMN_FORMAT Ignore NULL StringInfo pointers to SafeToPushdownWindowFunction Fix bug where SubqueryPushdownMultiNodeTree mutates supplied Query, SafeToPushdownWindowFunction requires the original query as it relies on rtable	2020-03-25 15:31:20 +00:00
Nils Dijk	4e611cfc25	Refactor dependency resolution and resolve from pg_shdepend (#3633 ) DESCRIPTION: Refactor dependency resolution and resolve from pg_shdepend This PR refactors how dependencies are resolved by not assuming solely a `pg_depend` record describing the dependency. Instead we keep a definition of the dependency around which records how the dependency is resolved. This can be one of the following ways - `pg_depend`, data will contain a copy of the `pg_depend` record - `pg_shdepend`, data will contain a copy of the `pg_shdepend` record - `ObjectAddress`, data will contain only an `ObjectAddress` describing a dependency Regardless of the way the dependency was found, it will always be able to get to the address of the dependency as that is the most important property. For some checks we can inspect the source where the dependency was found and perform a deep inspection to decide if we want to follow the dependency. This is important to not distribute dependencies coming from extensions for example.	2020-03-25 13:38:25 +01:00
Onur Tirtir	52fd58d51f	move MakeNameListFromRangeVar function to a more appropriate file	2020-03-25 11:01:50 +03:00
Onur Tirtir	2396b66ac5	remove an outdated comment in local executor	2020-03-25 11:01:40 +03:00
Onur Tirtir	8ebb8ef31d	use PG_USED_FOR_ASSERTS_ONLY	2020-03-25 11:01:33 +03:00
Onur Tirtir	81d48d3466	fix some typos	2020-03-25 11:01:26 +03:00
Jelte Fennema	149f0b2122	Use Microsoft approved cipher string (#3639 ) This cipher string is approved by the Microsoft security team and only enables TLSv1.2 ciphers.	2020-03-24 15:51:44 +01:00
Jelte Fennema	2aabe3e2ef	Mark all connections for shutdown when citus.node_conninfo chan… (#3642 ) We cache connections between nodes in our connection management code. This is good for speed. For security this can be a problem though. If the user changes settings related to TLS encryption they want those to be applied to future queries. This is especially important when they did not have TLS enabled before and now they want to enable it. This can normally be achieved by changing citus.node_conninfo. However, because connections are not reopened there will still be old connections that might not be encrypted at all. This commit changes that by marking all connections to be shutdown at the end of their current transaction. This way running transactions will succeed, even if placement requires connections to be reused for this transaction. But after this transaction completes any future statements will use a connection created with the new connection options. If a connection is requested and a connection is found that is marked for shutdown, then we don't return this connection. Instead a new one is created. This is needed to make sure that if there are no running transactions, then the next statement will not use an old cached connection, since connections are only actually shutdown at the end of a transaction.	2020-03-24 15:31:41 +01:00
Hadi Moshayedi	b46b9a68ae	Tests for master_copy_shard_placement	2020-03-23 08:33:55 -07:00
Marco Slot	ede176d849	Implement shard placement copying	2020-03-23 08:33:08 -07:00

2492 Commits (e076d2a14e114ff6adc063bd699f38b95f5ca335)