citus

Commit Graph

Author	SHA1	Message	Date
SaitTalhaNisanci	1df9601e13	not use local copy if current transaction is connected to local group If current transaction is connected to local group we should not use local copy, because we might not see some of the changes that are made over the connection to the local group.	2020-03-18 09:28:59 +03:00
SaitTalhaNisanci	39bbec0f30	add tests for local copy execution	2020-03-18 09:28:59 +03:00
SaitTalhaNisanci	f9c4431885	add the support to execute copy locally A copy will be executed locally if - Local execution is enabled and current transaction accessed a local placement - Local execution is enabled and we are inside a transaction block. So even if local execution is enabled but we are not in a transaction block, the copy will not be run locally. This will not run locally: ``` COPY distributed_table FROM STDIN; .... ``` This will run locally: ``` SET citus.enable_local_execution to 'on'; BEGIN; COPY distributed_table FROM STDIN; COMMIT; .... ``` There are 3 ways to do a copy in postgres programmatically: - from a file - from a program - from a callback function I have chosen to implement it with a callback function, which means that we write the rows of copy from a callback function to the output buffer, which is used to insert tuples into the actual table. For each shard id, we have a buffer that keeps the current rows to be written; we perform the actual copy operation when either: - copy buffer for the given shard id reaches a threshold, which is currently 512KB - we reach the end of the copy The buffer size (512KB) is debatable. At a given time, we might allocate (local placement * buffer size) memory at most. The local copy uses the same copy format as remote copy, which means that we serialize the data in the same format as remote copy and send it locally. There was also the option to use ExecSimpleRelationInsert to insert slots one by one, which would avoid the extra serialization/deserialization, but based on some benchmarks it seems that using buffers is significantly better in terms of performance. You can see this comment for more details: https://github.com/citusdata/citus/pull/3557#discussion_r389499054	2020-03-18 09:28:59 +03:00
Jelte Fennema	99c5b0add7	Make building safestringlib on some distros easier (#3616 ) On some distros (e.g. Redhat 7) there is cmake version 2 and cmake version 3, safestringlib requires cmake version 3. On those distros the binary is called cmake3, so try to use that one before falling back to regular cmake binary.	2020-03-16 11:34:30 +01:00
Philip Dubé	7b382e43bc	multi_logical_optimizer: replace ListCopyDeep with copyObject, stack allocate WorkerAggregateWalkerContext	2020-03-13 15:46:01 +00:00
Nils Dijk	e5237b9e20	Fix left join shard pruning (#3569 ) DESCRIPTION: Fix left join shard pruning in pushdown planner Due to #2481 which moves outer join planning through the pushdown planner we caused a regression on the shard pruning behaviour for outer joins. In the pushdown planner we make a union of the placement groups for all shards accessed by a query based on the filters we see during planning. Unfortunately implicit filters for left joins are not available during this part. This causes the inner part of an outer join to not prune any shards away. When we take the union of the placement groups it shows the behaviour of not having any shards pruned. Since the inner part of an outer query will not return any rows if the outer part does not contain any rows we have observed we do not have to add the shard intervals of the inner part of an outer query to the list of shard intervals to query. Fixes: #3512	2020-03-13 15:20:45 +01:00
Onur Tirtir	a14739f808	Local execution of ddl/drop/truncate commands (#3514 ) * reimplement ExecuteUtilityTaskListWithoutResults for local utility command execution * introduce new functions for local execution of utility commands * change ErrorIfTransactionAccessedPlacementsLocally logic for local utility command execution * enable local execution for TRUNCATE command on distributed & reference tables * update existing tests for local utility command execution * enable local execution for DDL commands on distributed & reference tables * enable local execution for DROP command on distributed & reference tables * add normalization rules for cascaded commands * add new tests for local utility command execution	2020-03-13 15:39:32 +03:00
Jelte Fennema	ca8f7119fe	Semmle: Protect against theoretical race in recursive directory… (#3559 ) In between stat at the start of the loop and unlink/rmdir at the end the item that the filename references might have changed. In some cases this can be a security bug, but since we only delete the file/directory it should not be for us as far as I can tell. It could in theory still cause errors though if a file is changed into a directory by some other process. This commit makes the code robust against that, by not using stat and only relying on error codes and retries.	2020-03-13 10:37:13 +01:00
Jelte Fennema	c7aa6eddf3	Fix some bugs in string to int functions (#3602 ) This fixes 3 bugs: 1. `strtoul` never underflows, so that branch was useless 2. `strtoul` has ULONG_MAX instead of LONG_MAX when it overflows 3. `long` and `unsigned long` are not necessarily 64bit, they can be either more or less. So now `strtoll` and `strtoull` are used and 64 bit bounds are checked.	2020-03-11 23:03:02 +01:00
Jelte Fennema	c4cc26ed37	Semmle: Ensure stack memory is not leaked through uninitialized… (#3561 ) New stack memory can contain anything including passwords/private keys. In these functions we return structs that can have their padding bytes uninitialized. By first zeroing out the struct fully, we try to ensure that any data that is in these padding bytes is at least overwritten once. It might not be zero anymore after setting the fields, but at least it shouldn't be private data anymore.	2020-03-11 20:05:36 +01:00
Philip Dubé	11b968bc30	Add runtime type checking to AGGREGATE_CUSTOM_COMBINE helper functions	2020-03-11 17:20:30 +00:00
Jelte Fennema	e0bbe1ca38	Semmle: Actively check one possible NULL deref case (#3560 ) Calling ErrorIfUnsupportedConstraint was still giving errors on Semmle. This makes sure that we check for NULL at runtime. This way we can safely ignore all errors created by this function.	2020-03-11 18:11:56 +01:00
Philip Dubé	4b68ee12c6	Also check aggregates in havingQual when scanning for non pushdownable aggregates Came across this while coming up with test cases, 'result "68_1" does not exist' I'll seek to address in a future PR, for now avoid segfault	2020-03-11 15:47:04 +00:00
Onder Kalaci	7d787e3d5e	Prevent create_distributed_function() from the workers As this could cause weird edge cases.	2020-03-10 18:24:20 +01:00
Onur Tirtir	e902581cb6	implement DropTaskList before introducing local DROP table execution (#3603 )	2020-03-10 19:12:44 +03:00
Marco Slot	cb3d90bdc8	Simplify INSERT logic in router planner	2020-03-10 15:54:40 +01:00
Philip Dubé	2b4ea33a2b	maintenanced: Don't call proc_exit in SIGTERM handler Instead set got_SIGTERM to true to signal mainloop to exit	2020-03-09 23:22:19 +00:00
Philip Dubé	81cfa05d3d	First phase of addressing HAVING subquery issues Add failing tests, make changes to avoid crashes at least Fix HAVING subquery pushdown ignoring reference table only subqueries, also include HAVING in recursive planning Given that we have a function IsDistributedTable which includes reference tables, it seems best to have IsDistributedTableRTE & QueryContainsDistributedTableRTE reflect that they do not include reference tables in their check Similarly SublinkList's name should reflect that it only scans WHERE contain_agg_clause asserts that we don't have SubLinks, use contain_aggs_of_level as suggested by pg sourcecode	2020-03-09 17:58:30 +00:00
Onder Kalaci	2ed19181fe	Improve definition of RelationInfoContainsOnlyRecurringTuples Before this commit, we considered !ContainsRecurringRTE() enough for NotContainsOnlyRecurringTuples. However, instead, we can check for existence of any distributed table. DESCRIPTION: Fixes a bug that causes wrong results with complex outer joins	2020-03-09 17:28:33 +01:00
SaitTalhaNisanci	321d0152c1	add a utility to get shard oid from relation oid and shard id (#3596 )	2020-03-09 15:50:29 +03:00
SaitTalhaNisanci	4509d9a72b	Create a variable SLOW_START_DISABLED (#3593 ) When ExecutorSlowStartInterval is set to 0, it has a special meaning that we do not want to use slow start. Therefore, in the code we have checks such as ExecutorSlowStartInterval > 0 to understand if it is enabled or not. However, this is kind of subtle, and it creates an extra mapping in our mind. Therefore, I thought that using a variable for the special value removes the mapping and makes it easier to understand.	2020-03-09 14:54:01 +03:00
Hanefi Onaldi	2595b4864b	Remove all GetWorkerNodeCount() references As @onderkalaci suggested removing the definition of GetWorkerNodeCount() that can potentially cause misunderstandings. I can advise using ActiveReadableWorkerNodeCount() that returns the number of active primaries is a safer alternative than GetWorkerNodeCount() that returns the total number of workers containing inactives, primaries, and unavailable nodes. I introduced a bug #3556 and in the bugfix #3564 removed the single usage of said function	2020-03-09 13:35:18 +03:00
Philip Dubé	7cdfa1daab	Rename LookupCitusTableCacheEntry to GetCitusTableCacheEntry, LookupLookupCitusTableCacheEntry back to LookupCitusTableCacheEntry	2020-03-08 14:08:23 +00:00
Philip Dubé	a7cca1bcde	Rename DistTableCacheEntry to CitusTableCacheEntry	2020-03-07 14:08:03 +00:00
Philip Dubé	b514ab0f55	Fix typos, rename isDistributedRelation to isCitusRelation	2020-03-06 19:20:34 +00:00
Philip Dubé	bec58000d6	Given IsDistributedTableRTE, there's ambiguity in what DistributedTable means Elsewhere we used DistributedTable to include reference tables Marco suggested we use CitusTable for distributed & reference tables So renaming: - IsDistributedTable -> IsCitusTable - IsDistributedTableViaCatalog -> IsCitusTableViaCatalog - DistributedTableCacheEntry -> CitusTableCacheEntry - DistributedTableList -> CitusTableList - isDistributedTable -> isCitusTable - InsertSelectIntoDistributedTable -> InsertSelectIntoCitusTable - ExtractFirstDistributedTableId -> ExtractFirstCitusTableId	2020-03-06 18:57:55 +00:00
Onur Tirtir	bdce9acc30	some refactor around foreign key constraints	2020-03-05 20:20:41 +03:00
Onur Tirtir	88bfd2e4b7	refactor around local group id checks Mostly optimizes the calls made to GetLocalGroupId and refactors its usages	2020-03-05 20:20:41 +03:00
Onur Tirtir	1e128a6ee4	fix a potential infinite loop	2020-03-05 20:20:41 +03:00
SaitTalhaNisanci	a75436a54b	refactor CoordinatedTransactionCallback (#3571 )	2020-03-05 18:36:12 +03:00
Hanefi Onaldi	c0ad44f975	Fix early exit bug on intermediate result pruning There are 2 problems with our early exit strategy that this commit fixes: 1- When we decide that a subplan results are sent to all worker nodes, we used to skip traversing the whole distributed plan, instead of skipping only the subplan. 2- We used to consider all available nodes in the cluster (secondaries and inactive nodes as well as active primaries) when deciding on early exit strategy. This resulted in failures to early exit when there are secondaries or inactive nodes.	2020-03-05 16:41:44 +03:00
Marco Slot	dc4c0c032e	Refactor CitusBeginScan into separate DML / SELECT paths	2020-03-05 12:37:22 +01:00
Nils Dijk	268ad741a9	Refactor the deparsing of a CREATE EXTENSION to prevent NULL POINTER dereferences (#3518 ) DESCRIPTION: satisfy static analysis tool for a nullptr dereference During the static analysis project on the codebase this code has been flagged as having the potential for a null pointer dereference. Funnily enough the author had already made a comment of it in the code this was not possible due to us setting the schema name before we pass in the statement. If we want to reuse this code in a later setting this comment might not always apply and we could actually run into null pointer dereference. This patch changes a bit of the code around to first of all make sure there is no NULL pointer dereference in this code anymore. Secondly we allow for better deparsing by setting and adhering to the `if_not_exists` flag on the statement. And finally add support for all syntax described in the documentation of postgres (FROM was missing).	2020-03-04 16:47:07 +01:00
Onder Kalaci	087f6eb4c0	For composite types, add cast to the parameter to ease remote node detect the type.	2020-03-04 11:27:45 +01:00
Onur Tirtir	ff9c9d1808	make VacuumTaskList even with other taskList functions and some safety changes Makes VacuumTaskList function even with other TaskList creator functions. Also, previously we were generating per-shard vacuum command strings via unconventional usage of StringInfo struct (setting the stringInfo->len field manually) which could cause unexpected memory errors (that I cannot foresee now).	2020-03-02 10:25:28 +03:00
Onur Tirtir	cf718ffe77	safely error out in DistributedTableCacheEntry function	2020-03-02 10:25:12 +03:00
Onur Tirtir	17d9b934c3	refactor local_executor.c lines with >78 characters	2020-02-29 15:04:34 +03:00
Philip Dubé	34f241af16	Fix create_distributed_table on a table using GENERATED ALWAYS AS If the generated column does not come at the end of the column list, columnNameList doesn't line up with the column indexes. Seek past CREATE TABLE test_table ( test_id int PRIMARY KEY, gen_n int GENERATED ALWAYS AS (1) STORED, created_at TIMESTAMPTZ NOT NULL DEFAULT now() ); SELECT create_distributed_table('test_table', 'test_id'); Would raise ERROR: cannot cast 23 to 1184	2020-02-28 09:34:26 -08:00
Philip Dubé	2fae132e45	repartition_join_execution: Don't store 64 bit integers as poin… (#3551 ) Pointers are not necessarily 64bit	2020-02-28 15:06:06 +01:00
Philip Dubé	20abc4d2b5	Replace foreach with foreach_ptr/foreach_oid (#3544 )	2020-02-27 16:54:49 +01:00
Jelte Fennema	685b54b3de	Semmle: Check for NULL in some places where it might occur (#3509 ) Semmle reported quite some places where we use a value that could be NULL. Most of these are not actually a real issue, but better to be on the safe side with these things and make the static analysis happy.	2020-02-27 10:45:29 +01:00
Jelte Fennema	eb8e099f09	Fix Makefile so that it builds safestringlib correctly on OSX	2020-02-26 17:44:44 +01:00
Jelte Fennema	8e7eaaf949	Add clean-full to also clean full builds of vendored libraries	2020-02-26 17:44:44 +01:00
Hadi Moshayedi	e7cce40e6e	Address pykello's feedback	2020-02-26 07:17:32 -08:00
Hadi Moshayedi	1b3e58f0c3	Merge branch 'improve-shard-pruning' of https://github.com/MarkusSintonen/citus into MarkusSintonen-improve-shard-pruning	2020-02-26 07:13:33 -08:00
SaitTalhaNisanci	82d22b34fe	create temp schemas in parallel (#3540 )	2020-02-26 16:20:08 +03:00
SaitTalhaNisanci	d94c3fd43d	send repartition cleanup jobs in parallel to all workers (#3485 ) * send repartition cleanup jobs in parallel to all workers * add review items	2020-02-26 13:44:06 +03:00
Marco Slot	c7f123947e	Make merge tables during re-partitioning unlogged	2020-02-26 10:46:07 +01:00
Jelte Fennema	62bf571ced	Make SafeSnprintf work on PG11	2020-02-25 15:39:27 +01:00
Jelte Fennema	7d24cebc80	Add pg11 snprintf file to repo for use in pg11 when it's not compiled	2020-02-25 15:39:27 +01:00
Jelte Fennema	8de8b62669	Convert unsafe APIs to safe ones	2020-02-25 15:39:27 +01:00
Nils Dijk	a77ed9cd23	Refactor master query to be planned by postgres' planner (#3326 ) DESCRIPTION: Replace the query planner for the coordinator part with the postgres planner Closes #2761 Citus had a simple rule based planner for the query executed on the query coordinator. This planner grew over time with the addition of SQL support till it was getting close to the functionality of the postgres planner. Except the code was brittle and its complexity rose which made it hard to add new SQL support. Given its resemblance with the postgres planner it was a long outstanding wish to replace our hand crafted planner with the well supported postgres planner. This patch replaces our planner with a call to postgres' planner. Due to the functionality of the postgres planner we needed to support both projections and filters/quals on the citus custom scan node. When a sort operation is planned above the custom scan it might require fields to be reordered in the custom scan before returning the tuple (projection). The postgres planner assumes every custom scan node implements projections. Because we controlled the plan that was created we prevented reordering in the custom scan and never had implemented it before. A same optimisation applies to having clauses that could have been where clauses. Instead of applying the filter as a having on the aggregate it will push it down into the plan which could reach a custom scan node. For both filters and projections we have implemented them when tuples are read from the tuple store. If no projections or filters are required it will directly return the tuple from the tuple store. Otherwise it will loop tuples from the tuple store through the filter and projection until a tuple is found and returned. Besides filters being pushed down a side effect of having quals that could have been a where clause is that a call to read intermediate result could be called before the first tuple is fetched from the custom scan. 
This failed because the intermediate result would only be pulled to the coordinator on the first tuple fetch. To overcome this problem we do run the distributed subplans now before we run the postgres executor. This ensures the intermediate result is present on the coordinator in time. We do account for total time instrumentation by removing the instrumentation before handing control to the postgres executor and update the timings ourselves. For future SQL support it is enough to create a valid query structure for the part of the query to be executed on the query coordinating node. As a utility we do serialise and print the query at debug level4 for engineers to inspect what kind of query is being planned on the query coordinator.	2020-02-25 14:39:56 +01:00
Philip Dubé	025cb94159	Fix multi_task_string_size sometimes leaking intermediate files	2020-02-24 16:33:34 +00:00
Onur Tirtir	873e9fd604	Refactor DropShards before introducing local DROP execution	2020-02-24 17:52:20 +03:00
Onur Tirtir	3c99db40b9	Some small typos & cleanup	2020-02-24 16:37:55 +03:00
Jelte Fennema	2a9fccc7a0	Remove READFUNCs (#3536 ) We don't actually use these functions anymore since merging #1477. Advantages of removing: 1. They add work whenever we add a new node. 2. They contain some usage of stdlib APIs that are banned by Microsoft. Removing it means we don't have to replace those with safe ones.	2020-02-24 12:43:28 +01:00
Philip Dubé	bcf54c5014	Address a couple issues with maintenance daemon management: - Stop the daemon when citus extension is dropped - Bail on maintenance daemon startup if myDbData is started with a non-zero pid - Stop maintenance daemon from spawning itself - Don't use postgres die, just wrap proc_exit(0) - Assert(myDbData->workerPid == MyProcPid) The two issues were that multiple daemons could be running for a database, or that a daemon would be leftover after DROP EXTENSION citus	2020-02-21 16:49:01 +00:00
Nils Dijk	6ee82c381e	Add missing pieces for version bump of #3482 (#3523 )	2020-02-21 12:35:29 +01:00
Jelte Fennema	00d667c41d	Semmle: Fix obvious issues (#3502 ) Fixes some obvious issues found by the Semmle static analysis tool.	2020-02-21 10:16:00 +01:00
Onur Tirtir	926a1a61b9	change "relation" with "table" in error messages related with foreign keys on reference tables	2020-02-20 09:58:47 +03:00
Onur Tirtir	001089783c	Fix null relation name issue in CheckConflictingRelationAccesses	2020-02-19 19:10:35 +03:00
Philip Dubé	52042d4a00	Prefer instr_time to TimestampTz when we want CLOCK_MONOTONIC	2020-02-19 00:34:17 +00:00
Philip Dubé	08f6842d50	Fix typos Equivalance -> Equivalence utillity -> utility shorted lived one -> shortly lived one elegible -> eligible	2020-02-18 17:14:40 +00:00
Marco Slot	038e5999cb	Implement direct COPY table TO stdout	2020-02-17 15:15:10 +01:00
Jelte Fennema	3f7c5a5cf6	Semmle: Fix possible infinite loops caused by overflow (#3503 ) Comparison between differently sized integers in loop conditions can cause infinite loops. This can happen when doing something like this: ```c int64 very_big = MAX_INT32 + 1; for (int32 i = 0; i < very_big; i++) { // do something } // never reached because i overflows before it can reach the value of very_big ```	2020-02-17 14:35:10 +01:00
Jelte Fennema	15f1173b1d	Semmle: Ensure permissions of private keys are 0600 (#3506 ) When using --allow-group-access option from initdb our keys and certificates would be created with 0640 permissions. Which is a pretty serious security issue: This changes that. This would not be exploitable though, since postgres would not actually enable SSL and would output the following message in the logs: ``` DETAIL: File must have permissions u=rw (0600) or less if owned by the database user, or permissions u=rw,g=r (0640) or less if owned by root. ``` Since citus still expected the cluster to have SSL enabled handshakes between workers and coordinator would fail. So instead of a security issue the cluster would simply be unusable.	2020-02-17 12:58:40 +01:00
SaitTalhaNisanci	9302e6e699	apply review items	2020-02-17 14:16:49 +03:00
SaitTalhaNisanci	1b78045867	rename AssignTasksToConnections with AssignTasksToConnectionsOrWorkerPool	2020-02-17 14:16:20 +03:00
SaitTalhaNisanci	355805c7d8	create ProcessWaitEvents for separating the logic of handling events	2020-02-17 14:16:20 +03:00
SaitTalhaNisanci	c35981f9de	create UpdateWaitEventSet for better readability	2020-02-17 14:16:20 +03:00
SaitTalhaNisanci	a7e735a648	use a utility method to get event size	2020-02-17 14:16:20 +03:00
SaitTalhaNisanci	71f1aa48a3	remove unnecessary if check (#3500 )	2020-02-17 14:15:36 +03:00
Markus Sintonen	cf8319b992	Add comment, add subquery NOT tests	2020-02-16 01:21:10 +02:00
Markus Sintonen	3d3d615040	Add comment about NOT_EXPR. Treat it as invalid constraint for safety.	2020-02-15 16:54:38 +02:00
Philip Dubé	7382c8be00	Clean up from code review Only change to behavior is: - don't ignore array const's constcollid in SAORestrictions - don't end lines with commas in DebugLogPruningInstance	2020-02-14 17:58:23 +00:00
Markus Sintonen	cdedb98c54	Improve shard pruning logic to understand OR-conditions. Previously a limitation in the shard pruning logic caused multi distribution value queries to always go into all the shards/workers whenever query also used OR conditions in WHERE clause. Related to https://github.com/citusdata/citus/issues/2593 and https://github.com/citusdata/citus/issues/1537 There was no good workaround for this limitation. The limitation caused quite a bit of overhead with simple queries being sent to all workers/shards (especially with setups having lot of workers/shards). An example of a previous plan which was inadequately pruned: ``` EXPLAIN SELECT count() FROM orders_hash_partitioned WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22); QUERY PLAN --------------------------------------------------------------------- Aggregate (cost=0.00..0.00 rows=0 width=0) -> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=0 width=0) Task Count: 4 Tasks Shown: One of 4 -> Task Node: host=localhost port=xxxxx dbname=regression -> Aggregate (cost=13.68..13.69 rows=1 width=8) -> Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned (cost=0.00..13.68 rows=1 width=0) Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22))) (9 rows) ``` After this commit the task count is what one would expect from the query defining multiple distinct values for the distribution column: ``` EXPLAIN SELECT count() FROM orders_hash_partitioned WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22); QUERY PLAN --------------------------------------------------------------------- Aggregate (cost=0.00..0.00 rows=0 width=0) -> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=0 width=0) Task Count: 2 Tasks Shown: One of 2 -> Task Node: host=localhost port=xxxxx dbname=regression -> Aggregate (cost=13.68..13.69 rows=1 width=8) -> Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned (cost=0.00..13.68 rows=1 
width=0) Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22))) (9 rows) ``` "Core" of the pruning logic works as previously where it uses `PrunableInstances` to queue ORable valid constraints for shard pruning. The difference is that now we build a compact internal representation of the query expression tree with PruningTreeNodes before actual shard pruning is run. Pruning tree nodes represent boolean operators and their associated constraints. This internal format allows us to have compact representation of the query WHERE clauses which allows "core" pruning logic to work with OR-clauses correctly. For example query having `WHERE (o_orderkey IN (1,2)) AND (o_custkey=11 OR (o_shippriority > 1 AND o_shippriority < 10))` gets transformed into: 1. AND(o_orderkey IN (1,2), OR(X, AND(X, X))) 2. AND(o_orderkey IN (1,2), OR(X, X)) 3. AND(o_orderkey IN (1,2), X) Here X is any set of unknown condition(s) for shard pruning. This allows the final shard pruning to correctly recognize that shard pruning is done with the valid condition of `o_orderkey IN (1,2)`. Another example with unprunable condition in query `WHERE (o_orderkey IN (1,2)) OR (o_custkey=11 AND o_custkey=22)` gets transformed into: 1. OR(o_orderkey IN (1,2), AND(X, X)) 2. OR(o_orderkey IN (1,2), X) Which is recognized as unprunable due to the OR condition between distribution column and unknown constraint -> goes to all shards. Issue https://github.com/citusdata/citus/issues/1537 originally suggested transforming the query conditions into a full disjunctive normal form (DNF), but this process of transforming into DNF is quite a heavy operation. It may "blow up" into a really large DNF form with complex queries having non trivial `WHERE` clauses. I think the logic for shard pruning could be simplified further but I decided to leave the "core" of the shard pruning untouched.	2020-02-14 17:58:13 +00:00
SaitTalhaNisanci	72d1850b4e	enhance local executor description (#3499 )	2020-02-13 20:19:08 +03:00
Onder Kalaci	975c4c2264	Do not prune shards if the distribution key is NULL The root of the problem is that, standard_planner() converts the following qual ``` {OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 100 :args ( {VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varcollid 100 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 45 } {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull true :location 51 :constvalue <> } ) :location 49 } ``` To ``` ( {CONST :consttype 16 :consttypmod -1 :constcollid 0 :constlen 1 :constbyval true :constisnull true :location -1 :constvalue <> } ) ``` So, Citus doesn't deal with NULL values in real-time or non-fast path router queries. And, in the FastPathRouter planner, we check constisnull in DistKeyInSimpleOpExpression(). However, in deferred pruning case, we do not check for isnull for const. Thus, the fix consists of two parts: - Let PruneShards() not crash when NULL parameter is passed - For deferred shard pruning in fast-path queries, explicitly check that we have CONST which is not NULL	2020-02-13 15:00:31 +01:00
Onur Tirtir	cd8210d516	Bump citus version to 9.3devel (#3482 )	2020-02-13 16:22:05 +03:00
Philip Dubé	3a906b8210	Fix typos noticed while reading through code trying to understand HAVING	2020-02-11 19:55:10 +00:00
Onur Tirtir	ab0b49db82	fix uninitialized variable warning (#3483 )	2020-02-11 15:44:31 +01:00
Onur Tirtir	39df51e903	Introduce objects to dist. infrastructure when updating Citus (#3477 ) Mark existing objects that are not included in distributed object infrastructure in older versions of Citus (but now should be) as distributed, after updating Citus successfully.	2020-02-07 18:07:59 +03:00
Nils Dijk	d5433400f9	Fix: Unnecessary repartition on joins with more than 4 tables (#3473 ) DESCRIPTION: Fix unnecessary repartition on joins with more than 4 tables In 9.1 we have introduced support for all CH-benCHmark queries by widening our definitions of joins to include joins with expressions in them. This had the undesired side effect of Q5 regressing on its plan by implementing a repartition join. It turned out this regression was not directly related to widening of the join clause, nor the schema employed by CH-benCHmark. Instead it had to do with 4 or more tables being joined in a chain. A chain meaning: ```sql SELECT * FROM a,b,c,d WHERE a.part = b.part AND b.part = c.part AND .... ``` Due to how our join order planner was implemented it would only keep track of 1 of the partition columns when comparing if the join could be executed locally. This manifested in a join chain of 4 tables to _always_ be executed as a repartition join. 3 tables joined in a chain would have the middle table shared by the two outer tables causing the local join possibility to be found. With this patch we keep a unique list (or set) of all partition columns participating in the join. When a candidate table is checked for a possibility to execute a local join it will check if there is any partition column in that set that matches an equality join clause on the partition column of the candidate table. By taking into account all partition columns in the left relation it will now find the local join path on >= 4 tables joined in a chain. fixes: #3276	2020-02-06 15:07:07 +01:00
Philip Dubé	ecad4aa5e6	Fill in jobIdList field of DistributedExecution Pass down jobIdList from ExecuteTasksInDependencyOrder Also clean up comment for ExecuteTaskListOutsideTransaction	2020-02-05 17:32:22 +00:00
Philip Dubé	c252811884	dont: don't, wont: won't, acylic: acyclic	2020-02-05 17:32:22 +00:00
Halil Ozan Akgul	8ce4f20061	Fixes the bug of grants on public schema propagation	2020-02-05 18:05:58 +03:00
Hadi Moshayedi	9dd14fa90d	Rename discarded target list items in repartitioned INSERT/SELECT	2020-02-05 11:06:44 +01:00
Onder Kalaci	c7e2309f4c	Improve single hash-repartitioning with numeric (or non-int) types We used to treat the shard interval array that we passed as numeric[]. However, it should be int[], as the shard ranges are int[].	2020-02-04 20:30:04 +01:00
Hadi Moshayedi	bc1a800f70	Use current user for repartition join temp schemas. Otherwise when using a less privileged user we might get errors when trying to create the schema.	2020-02-04 09:48:20 -08:00
Hadi Moshayedi	264530311a	Don't use distributed insert/select for repartitioned joins	2020-02-03 13:13:30 -08:00
Marco Slot	be77d3304f	Fixup	2020-02-03 11:59:55 +01:00
Marco Slot	b0fd6aa006	If reference tables was read over multiple connections, do not assign connection	2020-02-03 11:54:29 +01:00
Onder Kalaci	2f274a4fce	Make sure to go deeper into the functions to search for PARAMs For example, a PARAM might reside inside a function just because of a casting of a type such as the following: ``` {FUNCEXPR :funcid 1740 :funcresulttype 1700 :funcretset false :funcvariadic false :funcformat 2 :funccollid 0 :inputcollid 0 :args ( {PARAM :paramkind 0 :paramid 15 :paramtype 23 :paramtypmod -1 :paramcollid 0 :location 356 } ) ``` We should recursively check the expression before bailing out.	2020-02-03 09:36:12 +01:00
Philip Dubé	d43c80d4d8	pullUpIntermediateRows should not be true when groupedByDisjointPartitionColumn is true This was causing 'SELECT id, stdev(y_int) FROM tbl GROUP BY id' to push down stddev without group by	2020-01-30 21:18:08 +00:00
Philip Dubé	84a500ffc6	CitusRemoveDirectory: loop when directory is not empty Sometimes during errors workers will create files while we're deleting intermediate directories example: DEBUG: could not remove file "base/pgsql_job_cache/10_0_431": Directory not empty DETAIL: WARNING from localhost:57637	2020-01-30 20:02:08 +00:00
Philip Dubé	5fccc56d3e	Expand the set of aggregates which cannot have LIMIT approximated Previously we only prevented AVG from being pushed down, but this is incorrect: - array_agg, while somewhat non sensical to order by, will potentially be missing values - combinefunc aggregation will raise errors about cstrings not being comparable (while we also can't know if the aggregate is commutative) This commit limits approximating LIMIT pushdown when ordering by aggregates to: min, max, sum, count, bit_and, bit_or, every, any Which means of those we previously supported, we now exclude: avg, array_agg, jsonb_agg, jsonb_object_agg, json_agg, json_object_agg, hll_add, hll_union, topn_add, topn_union	2020-01-30 17:45:18 +00:00
Önder Kalacı	8584cb005b	Do not evaluate functions on the coordinator for SELECT queries (#3440) Previously, the logic for evaluating functions and parameters was the same. That ended up evaluating the functions inaccurately on the coordinator. Instead, split the function evaluation logic from the parameter evaluation logic.	2020-01-30 08:47:28 +01:00
Önder Kalacı	412fe719f7	Hide citus.enable_ddl_propagation setting (#3437 ) As that is powerful and cause metadata inconsistency. See the following steps: (Note that we cannot use PGC_SUSET because on Citus MX we need this flag for non- superusers as well) ```SQL CREATE TABLE test_ref_table(key int); SELECT create_reference_table('test_ref_table'); SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition; ┌────────────────┬──────────────┐ │ logicalrelid │ logicalrelid │ ├────────────────┼──────────────┤ │ test_ref_table │ 16831 │ └────────────────┴──────────────┘ (1 row) Time: 0.929 ms SELECT relname FROM pg_class WHERE oid = 16831; ┌────────────────┐ │ relname │ ├────────────────┤ │ test_ref_table │ └────────────────┘ (1 row) Time: 0.785 ms SET citus.enable_ddl_propagation TO off; DROP TABLE test_ref_table ; SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition; ┌──────────────┬──────────────┐ │ logicalrelid │ logicalrelid │ ├──────────────┼──────────────┤ │ 16831 │ 16831 │ └──────────────┴──────────────┘ (1 row) Time: 0.972 ms SELECT relname FROM pg_class WHERE oid = 16831; ┌─────────┐ │ relname │ ├─────────┤ └─────────┘ (0 rows) Time: 0.908 ms SELECT master_add_node('localhost', 9703); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. Time: 5.028 ms !> ```	2020-01-29 10:17:53 +01:00
SaitTalhaNisanci	94bd563ff0	switch back to old memory context in cache local plan for task (#3428)	2020-01-27 13:00:46 +03:00
Önder Kalacı	4519d3411d	Improve the representation of used sub plans (#3411) Previously, we identified the usedSubPlans by only looking at the subPlanId. With this commit, we're expanding it to also include information on the location of the subPlan. This is useful to distinguish the cases where the subPlan is used either on only HAVING or both HAVING and any other part of the query.	2020-01-24 10:47:14 +01:00
Philip Dubé	50c5e814c8	CurrentDatabaseName: return const char* as we're borrowing from cache	2020-01-23 22:49:35 +00:00
Hadi Moshayedi	1dc19215eb	Don't error for ENOENT in CitusRemoveDirectory. For concurrency reasons, this can happen even if initial stat succeeded.	2020-01-23 10:07:54 -08:00
Hadi Moshayedi	3e1004c232	Change DistributedResultFragment::nodeId to uint32. This is to match the type of WorkerNode::nodeId.	2020-01-23 09:33:15 -08:00
Önder Kalacı	ef7d1ea91d	Locally execute queries that don't need any data access (#3410 ) * Update shardPlacement->nodeId to uint As the source of the shardPlacement->nodeId is always workerNode->nodeId, and that is uint32. We had this hack because of: `0ea4e52df5 (r266421409)` And, that is gone with: `90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)` So, we're safe to do it now. * Relax the restrictions on using the local execution Previously, whenever any local execution happens, we disabled further commands to do any remote queries. The basic motivation for doing that is to prevent any accesses in the same transaction block to access the same placements over multiple sessions: one is local session the other is remote session to the same placement. However, the current implementation does not distinguish local accesses being to a placement or not. For example, we could have local accesses that only touches intermediate results. In that case, we should not implement the same restrictions as they become useless. So, this is a pre-requisite for executing the intermediate result only queries locally. * Update the error messages As the underlying implementation has changed, reflect it in the error messages. * Keep track of connections to local node With this commit, we're adding infrastructure to track if any connection to the same local host is done or not. The main motivation for doing this is that we've previously were more conservative about not choosing local execution. Simply, we disallowed local execution if any connection to any remote node is done. However, if we want to use local execution for intermediate result only queries, this'd be annoying because we expect all queries to touch remote node before the final query. Note that this approach is still limiting in Citus MX case, but for now we can ignore that. 
* Formalize the concept of Local Node Also some minor refactoring while creating the dummy placement * Write intermediate results locally when the results are only needed locally Before this commit, Citus used to always broadcast all the intermediate results to remote nodes. However, it is possible to skip pushing the results to remote nodes always. There are two notable cases for doing that: (a) When the query consists of only intermediate results (b) When the query is a zero shard query In both of the above cases, we don't need to access any data on the shards. So, it is a valuable optimization to skip pushing the results to remote nodes. The pattern mentioned in (a) is actually a common patterns that Citus users use in practice. For example, if you have the following query: WITH cte_1 AS (...), cte_2 AS (....), ... cte_n (...) SELECT ... FROM cte_1 JOIN cte_2 .... JOIN cte_n ...; The final query could be operating only on intermediate results. With this patch, the intermediate results of the ctes are not unnecessarily pushed to remote nodes. * Add specific regression tests As there are edge cases in Citus MX and with round-robin policy, use the same queries on those cases as well. * Fix failure tests By forcing not to use local execution for intermediate results since all the tests expects the results to be pushed remotely. * Fix flaky test * Apply code-review feedback Mostly style changes * Limit the max value of pg_dist_node_seq to reserve for internal use	2020-01-23 18:28:34 +01:00
Onder Kalaci	a0dff301c7	Update shardPlacement->nodeId to uint As the source of the shardPlacement->nodeId is always workerNode->nodeId, and that is uint32. We had this hack because of: `0ea4e52df5 (r266421409)` And, that is gone with: `90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)` So, we're safe to do it now.	2020-01-23 13:00:24 +01:00
Jelte Fennema	c62b756f34	Fix new method of locking shard distribution metadata (#3407) In #3374 a new way of locking shard distribution metadata was implemented. However, this was only done in the function `LockShardDistributionMetadata` and not in `TryLockShardDistributionMetadata`. This is bad, since it causes these locks to not block each other in some cases. This commit fixes this issue by sharing the code that sets the locktag between the two functions.	2020-01-22 16:44:17 +01:00
Jelte Fennema	cd5259a25a	Do not place new shards with shards in TO_DELETE state (#3408) When creating a new distributed table, the shards would colocate with shards with SHARD_STATE_TO_DELETE (shardstate = 4). This means that if that state was because of a shard move, the new shard would be created on two nodes and it would not get deleted, since its shard state would be 1.	2020-01-22 14:52:12 +01:00
Onder Kalaci	4be69bbf6f	Fix reference table issue	2020-01-20 18:45:18 +00:00
Halil Ozan Akgul	b40f067d05	Adds propagation for grant on schema commands	2020-01-20 14:51:28 +03:00
Philip Dubé	fdcc413559	Code cleanup of adaptive_executor, connection_management, placement_connection adaptive_executor: sort includes, use foreach_ptr, remove lies from FinishDistributedExecution docs connection_management: rename msecs, which isn't milliseconds placement_connection: small typos	2020-01-17 17:44:47 +00:00
Onder Kalaci	2f0ef8bc36	Apply feedback 1	2020-01-17 16:06:04 +01:00
Onder Kalaci	0bf1e81e33	Cache local plans on BeginScan	2020-01-17 16:02:57 +01:00
Onder Kalaci	5dc454cdad	Exclude localPlannedStatements from copy distributedPlan	2020-01-17 16:02:57 +01:00
Onder Kalaci	ff12df411b	Add LocalPlannedStatement struct	2020-01-17 16:02:57 +01:00
Onder Kalaci	3833a7e686	Fix issues for CTE inlining on Postgres 11 Comment from code: /* * We had to implement this hack because on Postgres11 and below, the originalQuery * and the query would have significant differences in terms of CTEs where CTEs * would not be inlined on the query (as standard_planner() wouldn't inline CTEs * on PG 11 and below). * * Instead, we prefer to pass the inlined query to the distributed planning. We rely * on the fact that the query includes subqueries, and it'd definitely go through * query pushdown planning. During query pushdown planning, the only relevant query * tree is the original query. */	2020-01-17 11:59:02 +01:00
Jelte Fennema	246435be7e	Lazy query deparsing executable queries (#3350) Deparsing and parsing a query can be heavy on CPU. When locally executing the query we don't need to do this in theory most of the time. This PR is the first step in allowing to skip deparsing and parsing the query in these cases, by lazily creating the query string and storing the query in the task. Future commits will make use of this and not deparse and parse the query anymore, but use the one from the task directly.	2020-01-17 11:49:43 +01:00
Hadi Moshayedi	6cf1c01660	Don't use repartitioned INSERT/SELECT for repartition joins	2020-01-16 23:40:31 -08:00
Hadi Moshayedi	5eeb07124f	Repartitioned INSERT/SELECT: include job id in result id prefix	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	a079278b0c	Repartitioned INSERT/SELECT: Add a GUC to enable/disable it	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	ce5eea4885	INSERT/SELECT: make SELECT column names unique	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	8635396cea	Repartitioned INSERT/SELECT: Test rollback behaviour	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	97072c9eb1	INSERT/SELECT: show method in EXPLAIN output	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	fe548b762f	Repartitioned INSERT/SELECT: Test CTEs	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	494cc383cc	Repartitioned INSERT/SELECT: Enable RETURNING	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	44a2aede16	Don't start a coordinated transaction on workers. Otherwise transaction hooks of Citus kick in and might cause unwanted errors.	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	42c3c03b85	Handle extra columns added in ExpandWorkerTargetEntry() in repartitioned INSERT/SELECT	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	89463f9760	Repartitioned INSERT/SELECT: cast columns in SELECT targets	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	d67a384350	Enable repartitioned INSERT/SELECT ON CONFLICT.	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	b4e5f4b10a	Implement INSERT ... SELECT with repartitioning	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	ced876358d	INSERT/SELECT: Refactor out AddInsertSelectCasts	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	d449c1857c	INSERT/SELECT: Use ExecutePlan* instead of ExecuteSelect*	2020-01-16 23:24:52 -08:00
Jelte Fennema	0ee1eab070	Make tests fail with a useful error message	2020-01-16 18:30:30 +01:00
Marco Slot	82f1fffa28	Fix epoll_ctl() error message on connection error	2020-01-16 06:40:57 +01:00
Onder Kalaci	dc17c2658e	Defer shard pruning for fast-path router queries to execution This is purely to enable better performance with prepared statements. Before this commit, the fast path queries with prepared statements where the distribution key includes a parameter always went through distributed planning. After this change, we only go through distributed planning on the first 5 executions.	2020-01-16 16:59:36 +01:00
Onder Kalaci	933d666c0d	Do not forget to copy fastPathRouterPlan@DistributedPlan	2020-01-16 16:39:20 +01:00
Halil Ozan Akgul	c5539d20d9	Adds alter table schema propagation	2020-01-16 17:04:16 +03:00
Nils Dijk	b6e09eb691	Fix: distributed function with table reference in declare (#3384) DESCRIPTION: Fixes a problem when adding a new node due to tables referenced in a function's body Fixes #3378 It was reported that `master_add_node` would fail if a distributed function has a table name referenced in its declare section of the body. By default postgres validates the body of a function on creation. This is not a problem in the normal case as tables are replicated to the workers when we distribute functions. However when a new node is added we first create dependencies on the workers before we try to create any tables, and the original tables get created out of band when the metadata gets synced to the new node. This causes the function body validator to raise an error that the table is not on the worker. To mitigate this issue we set `check_function_bodies` to `off` right before we are creating the function. The added test shows this does resolve the issue. (issue can be reproduced on the commit without the fix)	2020-01-16 14:21:54 +01:00
Jelte Fennema	e76281500c	Replace shardId lock with lock on colocation+shardIntervalIndex (#3374) This new locking pattern makes sure that some deadlocks that could happen during rebalancing cannot occur anymore.	2020-01-16 13:14:01 +01:00
Onder Kalaci	81d8178625	Note that we'll drop the GUC after PG 11 support is dropped	2020-01-16 12:28:15 +01:00
Onder Kalaci	64560b07be	Update regression tests-2 In this commit, we're introducing a way to prevent CTE inlining via a GUC. The GUC is used in all the tests where PG 11 and PG 12 tests would diverge otherwise. Note that, in PG 12, the restriction information for CTEs is generated. It means that for some queries involving CTEs, Citus planner (router planner/ pushdown planner) may behave differently. So, via the GUC, we prevent tests from diverging on PG 11 vs PG 12. When we drop PG 11 support, we should get rid of the GUC, and mark relevant ctes as MATERIALIZED, which does the same thing.	2020-01-16 12:28:15 +01:00
Onder Kalaci	5cb203b276	Update regression tests-1 This set of tests has changed in both PG 11 and PG 12. The changes are only about CTE inlining kicking in on both versions, and yielding the exact same distributed planning.	2020-01-16 12:28:15 +01:00
Onder Kalaci	efb1577d06	Handle CTE aliases accurately Basically, make sure to update the column name with the CTE's alias if we need to do so.	2020-01-16 12:28:15 +01:00
Onder Kalaci	05d600dd8f	Call CTE inlining in Citus planner The idea is simple: Inline CTEs (if any), try distributed planning. If the planning yields a successful distributed plan, simply return it. If the planning fails, fall back to distributed planning on the query tree where CTEs are not inlined. In that case, if the planning failed just because of the CTE inlining, via recursive planning, the same query would yield a successful plan. A very basic set of examples: WITH cte_1 AS (SELECT * FROM test_table) SELECT *, row_number() OVER () FROM cte_1; or WITH a AS (SELECT * FROM test_table), b AS (SELECT * FROM test_table) SELECT * FROM a JOIN b ON (a.value> b.value);	2020-01-16 12:28:15 +01:00
Onder Kalaci	01a5800ee8	Add Citus' CTE inlining functions With this commit we add the necessary Citus function to inline CTEs in a queryTree. You might ask, why do we need to inline CTEs if Postgres is already going to do it? A few reasons behind this decision: - One technical note here is that Citus does the recursive CTE planning by checking the originalQuery which is the query that has not gone through the standard_planner(). CTEs in Citus are super powerful. They are practically key for full SQL coverage for multi-shard queries. With CTEs, you can always reduce any multi-shard query into a router query via recursive planning (thus full SQL coverage). We cannot let CTE inlining break that. The main idea is Citus should be able to retry planning if anything goes wrong after CTE inlining. So, by taking ownership of CTE inlining on the originalQuery, Citus can fall back to recursive planning of CTEs if the planning with the inlined query fails. It could have been a lot harder if we had relied on standard_planner() to have the inlined CTEs on the original query. - We want to have this feature in PostgreSQL 11 as well, but Postgres only inlines in version 12	2020-01-16 12:28:15 +01:00
Onder Kalaci	1856ab6cdd	Copy & paste code from Postgres source All the code in this commit is direct copy & paste from Postgres source code. We can classify the copy&paste code into two: - Copy paste from CTE inline patch from postgres (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=608b167f9f9c4553c35bb1ec0eab9ddae643989b) These include the functions inline_cte(), inline_cte_walker(), contain_dml(), contain_dml_walker(). It also includes the code in function PostgreSQLCTEInlineCondition(). We prefer to extract that code into a separate function, because (a) we'll re-use the logic later (b) we added one check for PG_11 Finally, the struct "inline_cte_walker_context" is also copied from the same Postgres commit. - Copy paste from the other parts of the Postgres code In order to implement CTE inlining in Postgres 12, the hackers modified the query_tree_walker()/range_table_walker() with the `18c0da88a5` Since Citus needs to support the same logic in PG 11, we copy & pasted those functions (and related flags) with the names pg_12_query_tree_walker() and pg_12_range_table_walker()	2020-01-16 12:28:15 +01:00
Philip Dubé	4d9a733c2f	Fix inserting multiple values with row expression partition column causing the insert to be ignored Raise an error instead of silently inserting nothing if we hit this condition in the future	2020-01-15 21:10:50 +00:00
Philip Dubé	4989c9a15c	PlacementExecutionDone: We may mark placements as failed multiple times, but should only act the first time.	2020-01-15 18:20:01 +00:00
Marco Slot	f1a0582973	Make ApplyLogRedaction a macro and redefine ereport	2020-01-13 18:24:36 +01:00
Marco Slot	06709ee108	Always use NOTICE in log_remote_commands and avoid redaction when possible	2020-01-13 18:24:36 +01:00
Marco Slot	90056f7d3c	Remove copy from worker for append-partitioned table	2020-01-13 23:03:40 -08:00
Philip Dubé	ccabf19090	Propagate DROP ROUTINE, ALTER ROUTINE In two places I've made code more straight forward by using ROUTINE in our own codegen Two changes which may seem extraneous: AppendFunctionName was updated to not use pg_get_function_identity_arguments. This is because that function includes ORDER BY when printing an aggregate like my_rank. While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres, ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not. Tests were updated to use macaddr over integer. Using integer is flaky, our logic could sometimes end up on tables like users_table. I originally wanted to use money, but money isn't hashable.	2020-01-13 15:37:46 +00:00
Philip Dubé	4b5d6c3ebe	Rename RelayFileState to ShardState Replace FILE_ prefix with SHARD_STATE_	2020-01-12 05:57:53 +00:00
Philip Dubé	e71386af33	Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT Also use stack allocation for walkerContext in multi_logical_optimizer	2020-01-10 16:54:00 +00:00
Hadi Moshayedi	40ba2cdd6e	Test RedistributeTaskListResult	2020-01-09 23:47:25 -08:00
Hadi Moshayedi	527d7d41c1	Implement RedistributeTaskListResult	2020-01-09 23:47:25 -08:00
Philip Dubé	281aacce9b	Fix row-gather for subqueries being handled by task-tracker task-tracker has specific logic for MultiPartition when GROUP BY is missing We were ending up in this code path because row-gather removes GROUP BY	2020-01-10 01:51:37 +00:00
Hadi Moshayedi	e1e383cb59	Don't override xact id assigned by coordinator on workers. We might need to send commands from workers to other workers. In these cases we shouldn't override the xact id assigned by coordinator, or otherwise we won't read the consistent set of result files across the nodes.	2020-01-09 11:09:11 -08:00
Hadi Moshayedi	c7c460e843	PartitionTasklistResults: Use different queries per placement We need to know which placement succeeded in executing the worker_partition_query_result() call. Otherwise we wouldn't know which node to fetch from. This change allows that by introducing Task::perPlacementQueryStrings.	2020-01-09 10:55:58 -08:00
Hadi Moshayedi	f38d0e5b3f	Partitioned task list results.	2020-01-09 10:32:58 -08:00
Philip Dubé	73c06fae3b	Introduce GetDistributeObjectOps to organize dispatch of logic dependent on node/object type	2020-01-09 18:24:29 +00:00
Philip Dubé	bf7d86a3e8	Fix typo: aggragate -> aggregate	2020-01-07 01:16:09 +00:00
Philip Dubé	863bf49507	Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default	2020-01-07 01:16:04 +00:00
Jelte Fennema	5b0baea72c	Refactor distributed_planner for better understandability	2020-01-06 14:23:38 +01:00
Onder Kalaci	5a1e752726	Apply feedback - add fastPath field to plan	2020-01-06 12:42:43 +01:00
Onder Kalaci	13a9b55695	Skip expensive checks when fast-path query The definition of fast-path query is very strict. So, we don't need to do some extra checks.	2020-01-06 12:42:43 +01:00
Onder Kalaci	7f3ab7892d	Skip shard pruning when possible We're already traversing the queryTree and finding the distribution key value, so pass it to the later stages of the planning.	2020-01-06 12:42:43 +01:00
Onder Kalaci	ca293116fa	Reduce calls to FastPathRouterQuery() Before this commit, we called it twice during planning. Instead, we save the information and pass it.	2020-01-06 12:42:43 +01:00
Onder Kalaci	c8f14c9f6c	Make sure to update shard states of partitions on failures Fixes #3331 In #2389, we've implemented support for partitioned tables with rep > 1. The implementation is limiting the use of modification queries on the partitions. In fact, we error out when any partition is modified via EnsurePartitionTableNotReplicated(). However, we seem to have forgotten an important case, where the parent table's partition is marked as INVALID. In that case, at least one of the partitions becomes INVALID. However, we do not mark partitions as INVALID ever. If the user queries the partition table directly, Citus could happily send the query to INVALID placements -- which are not marked as INVALID. This PR fixes it by marking the placements of the partitions as INVALID as well. The shard placement repair logic already re-creates all the partitions, so should be fine on that front.	2020-01-06 12:26:08 +01:00
Önder Kalacı	0c70a5470e	Allow RETURNING in fast-path queries (#3352) * Allow RETURNING in fast-path queries Because there is no specific reason for that.	2020-01-03 13:42:50 +00:00
Önder Kalacı	a174eb4f7b	Do not go through standard_planner() for INSERTs (#3348) That seems unnecessary. We already have the notion of FastPath queries, simply add it there.	2020-01-03 12:15:22 +00:00
Marco Slot	ba39d72fe1	Fix incorrect union all pushdown issue	2020-01-01 09:03:50 +01:00
Jelte Fennema	3a042e4611	Allow cartesian products on reference tables	2019-12-27 15:05:51 +01:00
Jelte Fennema	61e2501645	Make any expression with two or more tables a join expression	2019-12-27 15:05:51 +01:00
Jelte Fennema	4233cd0d9d	Allow non equi joins on reference tables	2019-12-27 15:05:51 +01:00
Jelte Fennema	7642928be1	Makefile fix DESTDIR together with cleanup (#3342) This should fix this build issue: redmine.postgresql.org/issues/5032	2019-12-27 10:34:57 +01:00
Marco Slot	b21b6905ae	Do not repeat GROUP BY distribution_column on coordinator Allow arbitrary aggregates to be pushed down in these scenarios	2019-12-25 01:33:41 +00:00
Marco Slot	a2ddfecd86	Fix inconsistent shard metadata issue	2019-12-24 08:01:32 +01:00
Hadi Moshayedi	d7aea7fa10	Implement partitioned intermediate results.	2019-12-24 03:53:39 -08:00
Marco Slot	b37ef0e394	Fix error in distributed queries when shards are on the coordinator	2019-12-24 06:36:43 +01:00
Philip Dubé	e9bbdb8f31	Fix handling of empty intermediate results when distributing custom aggregates	2019-12-23 17:27:52 +00:00
Philip Dubé	f007b7f91d	Also fix reindent inconsistencies with fake_fdw.c	2019-12-20 08:27:47 +00:00
Hadi Moshayedi	08eb0ade31	Fix reindent version inconsistencies. Different versions of reindent tool reformatted citus_custom_scan.c and citus_copyfuncs.c differently. So some developers spent some extra attention not to commit these two files after reindent. This PR tries to address this.	2019-12-19 23:10:34 -08:00
Jelte Fennema	b655c02352	Add the necessary changes for rebalance strategies on enterprise (#3325) This commit adds the SQL and C changes necessary to support custom rebalance strategies in the Enterprise version of Citus.	2019-12-19 15:23:08 +01:00
Hadi Moshayedi	ef487e0792	Implement fetch_intermediate_results	2019-12-18 10:46:35 -08:00
Hadi Moshayedi	249508d267	Estimate cost of read_intermediate_results()	2019-12-17 13:51:51 -08:00
Hadi Moshayedi	113bd1e5f1	Implement read_intermediate_results	2019-12-17 13:51:16 -08:00
SaitTalhaNisanci	7ff4ce2169	Add adaptive executor support for repartition joins (#3169 ) * WIP * wip * add basic logic to run a single job with repartioning joins with adaptive executor * fix some warnings and return in ExecuteDependedTasks if there is none * Add the logic to run depended jobs in adaptive executor The execution of depended tasks logic is changed. With the current logic: - All tasks are created from the top level task list. - At one iteration: - CurTasks whose dependencies are executed are found. - CurTasks are executed in parallel with adapter executor main logic. - The iteration is repeated until all tasks are completed. * Separate adaptive executor repartioning logic * Remove duplicate parts * cleanup directories and schemas * add basic repartion tests for adaptive executor * Use the first placement to fetch data In task tracker, when there are replicas, we try to fetch from a replica for which a map task is succeeded. TaskExecution is used for this, however TaskExecution is not used in adaptive executor. So we cannot use the same thing as task tracker. Since adaptive executor fails when a map task fails (There is no retry logic yet). We know that if we try to execute a fetch task, all of its map tasks already succeeded, so we can just use the first one to fetch from. * fix clean directories logic * do not change the search path while creating a udf * Enable repartition joins with adaptive executor with only enable_reparitition_joins guc * Add comments to adaptive_executor_repartition * dont run adaptive executor repartition test in paralle with other tests * execute cleanup only in the top level execution * do cleanup only in the top level ezecution * not begin a transaction if repartition query is used * use new connections for repartititon specific queries New connections are opened to send repartition specific queries. The opened connections will be closed at the FinishDistributedExecution. 
While sending repartition queries no transaction is begun so that we can see all changes. * error if a modification was done prior to repartition execution * not start a transaction if a repartition query and sql task, and clean temporary files and schemas at each subplan level * fix cleanup logic * update tests * add missing function comments * add test for transaction with DDL before repartition query * do not close repartition connections in adaptive executor * rollback instead of commit in repartition join test * use close connection instead of shutdown connection * remove unnecesary connection list, ensure schema owner before removing directory * rename ExecuteTaskListRepartition * put fetch query string in planner not executor as we currently support only replication factor = 1 with adaptive executor and repartition query and we know the query string in the planner phase in that case * split adaptive executor repartition to DAG execution logic and repartition logic * apply review items * apply review items * use an enum for remote transaction state and fix cleanup for repartition * add outside transaction flag to find connections that are unclaimed instead of always opening a new transaction * fix style * wip * rename removejobdir to partition cleanup * do not close connections at the end of repartition queries * do repartition cleanup in pg catch * apply review items * decide whether to use transaction or not at execution creation * rename isOutsideTransaction and add missing comment * not error in pg catch while doing cleanup * use replication factor of the creation time, not current time to decide if task tracker should be chosen * apply review items * apply review items * apply review item	2019-12-17 19:09:45 +03:00
Marco Slot	2f568ad5a5	Forbid using connections that sent intermediate results for data access and vice versa	2019-12-17 11:49:13 +01:00
Marco Slot	f4031dd477	Clean up transaction block usage logic in adaptive executor	2019-12-17 10:48:19 +01:00
Nils Dijk	bfc3d2eb90	make sure to correctly decrement ExecutorLevel (#3311) DESCRIPTION: Fix counter that keeps track of internal depth in executor While reviewing #3302 I ran into the `ExecutorLevel` variable which used a variable to keep the original value to restore on successful exit. I haven't explored the full space and if it is possible to get into an inconsistent state. However using `PG_TRY`/`PG_CATCH` seems generally more correct. Given very bad things will happen if this level is not reset, I kept the failsafe of setting the variable back to 0 on the `XactCallback` but I did add an assert to treat it as a developer bug.	2019-12-16 20:50:13 +01:00
Marco Slot	5f656e22db	Fix issue in IsMultiStatementTransaction detection	2019-12-16 17:01:43 +01:00
SaitTalhaNisanci	2829c601dd	replace Begin words in coordinated transactions with use (#3293)	2019-12-16 10:40:31 +03:00
SaitTalhaNisanci	a2f2107e6a	refactor MapTaskList in multi physical planner (#3297)	2019-12-13 22:41:49 +03:00
Marco Slot	1633123d78	Fix crash in IN (NULL) queries	2019-12-13 08:35:54 +01:00
Hadi Moshayedi	e7a6cc0801	Fix some typos from #3280	2019-12-12 13:29:26 -08:00
SaitTalhaNisanci	420e21919b	refactor extract distributed insert values rte (#3287)	2019-12-12 23:47:44 +03:00
Marco Slot	e7a8db5493	Fix issue with some zero-shard modifications	2019-12-12 07:19:10 +01:00
SaitTalhaNisanci	2c040d2c8f	use a function for duplicate code in connection state machine (#3209)	2019-12-12 17:55:38 +03:00
SaitTalhaNisanci	a0fe8646e0	add IsHoldOffCancellationReceived utility function (#3290)	2019-12-12 17:32:59 +03:00
SaitTalhaNisanci	053fe18404	not continue in sequential execution if a cancellation is received (#3289)	2019-12-12 17:22:30 +03:00

1775 Commits (24feadc23013016450a731614c15a970b777b39d)