Commit Graph

1953 Commits (5fcddfa2c63ae5d5c60ee7978b4aec63e9a63b80)

Author SHA1 Message Date
Philip Dubé 7cdfa1daab Rename LookupCitusTableCacheEntry to GetCitusTableCacheEntry, LookupLookupCitusTableCacheEntry back to LookupCitusTableCacheEntry 2020-03-08 14:08:23 +00:00
Philip Dubé a7cca1bcde Rename DistTableCacheEntry to CitusTableCacheEntry 2020-03-07 14:08:03 +00:00
Philip Dubé b514ab0f55 Fix typos, rename isDistributedRelation to isCitusRelation 2020-03-06 19:20:34 +00:00
Philip Dubé bec58000d6 Given IsDistributedTableRTE, there's ambiguity in what DistributedTable means
Elsewhere we used DistributedTable to include reference tables
Marco suggested we use CitusTable for distributed & reference tables

So renaming:
- IsDistributedTable -> IsCitusTable
- IsDistributedTableViaCatalog -> IsCitusTableViaCatalog
- DistributedTableCacheEntry -> CitusTableCacheEntry
- DistributedTableList -> CitusTableList
- isDistributedTable -> isCitusTable
- InsertSelectIntoDistributedTable -> InsertSelectIntoCitusTable
- ExtractFirstDistributedTableId -> ExtractFirstCitusTableId
2020-03-06 18:57:55 +00:00
Onur Tirtir bdce9acc30 some refactor around foreign key constraints 2020-03-05 20:20:41 +03:00
Onur Tirtir 88bfd2e4b7 refactor around local group id checks
Mostly optimizes the calls made to GetLocalGroupId and refactors
its usages
2020-03-05 20:20:41 +03:00
Onur Tirtir 1e128a6ee4 fix a potential infinite loop 2020-03-05 20:20:41 +03:00
SaitTalhaNisanci a75436a54b
refactor CoordinatedTransactionCallback (#3571) 2020-03-05 18:36:12 +03:00
Hanefi Onaldi c0ad44f975
Fix early exit bug on intermediate result pruning
There are 2 problems with our early exit strategy that this commit fixes:

1- When we decide that a subplan's results are sent to all worker nodes,
we used to skip traversing the whole distributed plan, instead of
skipping only the subplan.

2- We used to consider all available nodes in the cluster (secondaries
and inactive nodes as well as active primaries) when deciding on the early
exit strategy. This resulted in failures to exit early when there are
secondaries or inactive nodes.
2020-03-05 16:41:44 +03:00
Marco Slot dc4c0c032e Refactor CitusBeginScan into separate DML / SELECT paths 2020-03-05 12:37:22 +01:00
Nils Dijk 268ad741a9
Refactor the deparsing of a CREATE EXTENSION to prevent NULL POINTER dereferences (#3518)
DESCRIPTION: satisfy static analysis tool for a nullptr dereference

During the static analysis project on the codebase this code was flagged as having the potential for a null pointer dereference. Funnily enough the author had already noted in a code comment that this was not possible, because we set the schema name before passing in the statement. If we want to reuse this code in a later setting that comment might not always apply, and we could actually run into a null pointer dereference.

This patch changes a bit of the code around to first of all make sure there is no NULL pointer dereference in this code anymore.
Secondly we allow for better deparsing by setting and adhering to the `if_not_exists` flag on the statement.
And finally add support for all syntax described in the documentation of postgres (FROM was missing).
2020-03-04 16:47:07 +01:00
Onder Kalaci 087f6eb4c0 For composite types, add a cast to the parameter to help the remote node detect
the type.
2020-03-04 11:27:45 +01:00
Onur Tirtir ff9c9d1808 make VacuumTaskList even with other taskList functions and some safety changes
Makes the VacuumTaskList function consistent with the other TaskList creator functions.
Also, previously we were generating per-shard vacuum command strings via
unconventional usage of the StringInfo struct (setting the stringInfo->len field
manually), which could cause unexpected memory errors (that I cannot foresee now).
2020-03-02 10:25:28 +03:00
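For reference, a minimal sketch (not the actual Citus code; BuildVacuumCommands and its parameters are illustrative) of the conventional per-shard command construction this commit moves toward, using the StringInfo API instead of rewinding stringInfo->len by hand:
```c
/* Illustrative sketch only (not the actual Citus code): build each shard's
 * VACUUM command through the regular StringInfo API instead of rewinding
 * stringInfo->len by hand. */
#include "postgres.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"

static List *
BuildVacuumCommands(const char *schemaName, List *shardRelationNames)
{
	List *commandList = NIL;
	StringInfo vacuumCommand = makeStringInfo();
	ListCell *nameCell = NULL;

	foreach(nameCell, shardRelationNames)
	{
		const char *shardRelationName = lfirst(nameCell);

		/* start from an empty buffer for every shard */
		resetStringInfo(vacuumCommand);
		appendStringInfo(vacuumCommand, "VACUUM %s.%s",
						 schemaName, shardRelationName);

		commandList = lappend(commandList, pstrdup(vacuumCommand->data));
	}

	return commandList;
}
```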
Onur Tirtir cf718ffe77 safely error out in DistributedTableCacheEntry function 2020-03-02 10:25:12 +03:00
Onur Tirtir 17d9b934c3 refactor local_executor.c lines with >78 characters 2020-02-29 15:04:34 +03:00
Philip Dubé 34f241af16 Fix create_distributed_table on a table using GENERATED ALWAYS AS
If the generated column does not come at the end of the column list,
columnNameList doesn't line up with the column indexes. Seek past
generated columns so the names and indexes line up.

CREATE TABLE test_table (
    test_id int PRIMARY KEY,
    gen_n int GENERATED ALWAYS AS (1) STORED,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
SELECT create_distributed_table('test_table', 'test_id');

Would raise ERROR: cannot cast 23 to 1184
2020-02-28 09:34:26 -08:00
Philip Dubé 2fae132e45
repartition_join_execution: Don't store 64 bit integers as poin… (#3551)
Pointers are not necessarily 64bit
2020-02-28 15:06:06 +01:00
Philip Dubé 20abc4d2b5
Replace foreach with foreach_ptr/foreach_oid (#3544) 2020-02-27 16:54:49 +01:00
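To illustrate the change, a hedged before/after sketch (Task, taskList, and ProcessTask are illustrative names; foreach_ptr hides the ListCell bookkeeping that PostgreSQL's foreach requires):
```c
/* Before: PostgreSQL's foreach requires a ListCell and an explicit lfirst() */
ListCell *taskCell = NULL;
foreach(taskCell, taskList)
{
	Task *task = (Task *) lfirst(taskCell);
	ProcessTask(task);   /* ProcessTask is an illustrative helper */
}

/* After: the Citus foreach_ptr macro hides the ListCell bookkeeping */
Task *task = NULL;
foreach_ptr(task, taskList)
{
	ProcessTask(task);
}
```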
Jelte Fennema 685b54b3de
Semmle: Check for NULL in some places where it might occur (#3509)
Semmle reported quite a few places where we use a value that could be NULL. Most of these are not actually a real issue, but it's better to be on the safe side with these things and make the static analysis happy.
2020-02-27 10:45:29 +01:00
Jelte Fennema eb8e099f09 Fix Makefile so that it builds safestringlib correctly on OSX 2020-02-26 17:44:44 +01:00
Jelte Fennema 8e7eaaf949 Add clean-full to also clean full builds of vendored libraries 2020-02-26 17:44:44 +01:00
Hadi Moshayedi e7cce40e6e Address pykello's feedback 2020-02-26 07:17:32 -08:00
Hadi Moshayedi 1b3e58f0c3 Merge branch 'improve-shard-pruning' of https://github.com/MarkusSintonen/citus into MarkusSintonen-improve-shard-pruning 2020-02-26 07:13:33 -08:00
SaitTalhaNisanci 82d22b34fe
create temp schemas in parallel (#3540) 2020-02-26 16:20:08 +03:00
SaitTalhaNisanci d94c3fd43d
send repartition cleanup jobs in parallel to all workers (#3485)
* send repartition cleanup jobs in parallel to all workers

* add review items
2020-02-26 13:44:06 +03:00
Marco Slot c7f123947e Make merge tables during re-partitioning unlogged 2020-02-26 10:46:07 +01:00
Jelte Fennema 62bf571ced Make SafeSnprintf work on PG11 2020-02-25 15:39:27 +01:00
Jelte Fennema 7d24cebc80 Add pg11 snprintf file to repo for use in pg11 when it's not compiled 2020-02-25 15:39:27 +01:00
Jelte Fennema 8de8b62669 Convert unsafe APIs to safe ones 2020-02-25 15:39:27 +01:00
Nils Dijk a77ed9cd23
Refactor master query to be planned by postgres' planner (#3326)
DESCRIPTION: Replace the query planner for the coordinator part with the postgres planner

Closes #2761 

Citus had a simple rule-based planner for the query executed on the query coordinator. This planner grew over time with the addition of SQL support until it was getting close to the functionality of the postgres planner. However, the code was brittle and its complexity rose, which made it hard to add new SQL support.

Given its resemblance to the postgres planner, it was a long outstanding wish to replace our hand-crafted planner with the well supported postgres planner. This patch replaces our planner with a call to postgres' planner.

Due to the functionality of the postgres planner we needed to support both projections and filters/quals on the citus custom scan node. When a sort operation is planned above the custom scan it might require fields to be reordered in the custom scan before returning the tuple (projection). The postgres planner assumes every custom scan node implements projections. Because we controlled the plan that was created we prevented reordering in the custom scan and never had implemented it before.

The same optimisation applies to HAVING clauses that could have been WHERE clauses. Instead of applying the filter as a HAVING on the aggregate, the planner pushes it down into the plan, where it could reach a custom scan node.

For both filters and projections we have implemented them when tuples are read from the tuple store. If no projections or filters are required it will directly return the tuple from the tuple store. Otherwise it will loop tuples from the tuple store through the filter and projection until a tuple is found and returned.

Besides filters being pushed down, a side effect of having quals that could have been a where clause is that a call to read an intermediate result could happen before the first tuple is fetched from the custom scan. This failed because the intermediate result would only be pulled to the coordinator on the first tuple fetch. To overcome this problem we now run the distributed subplans before we run the postgres executor. This ensures the intermediate result is present on the coordinator in time. We do account for total time instrumentation by removing the instrumentation before handing control to the postgres executor and updating the timings ourselves.

For future SQL support it is enough to create a valid query structure for the part of the query to be executed on the query coordinating node. As a utility we serialise and print the query at debug level 4 for engineers to inspect what kind of query is being planned on the query coordinator.
2020-02-25 14:39:56 +01:00
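The filter-and-projection loop described above follows the standard executor pattern; a hedged sketch (assuming the qual and projection state were built earlier, and that the helper name ReturnTupleFromTuplestore is illustrative):
```c
#include "postgres.h"
#include "executor/executor.h"
#include "utils/tuplestore.h"

/* Hedged sketch: pull tuples from the tuple store and run them through the
 * custom scan's qual and projection, as described above. */
static TupleTableSlot *
ReturnTupleFromTuplestore(Tuplestorestate *tupleStore, TupleTableSlot *slot,
						  ExprState *qual, ProjectionInfo *projection,
						  ExprContext *econtext)
{
	while (tuplestore_gettupleslot(tupleStore, true, false, slot))
	{
		/* make the tuple visible to qual and projection evaluation */
		econtext->ecxt_scantuple = slot;

		/* skip tuples that do not satisfy the pushed-down filter */
		if (qual != NULL && !ExecQual(qual, econtext))
		{
			continue;
		}

		/* reorder/compute the output columns when a projection is required */
		if (projection != NULL)
		{
			return ExecProject(projection);
		}

		return slot;
	}

	return NULL;	/* no more tuples */
}
```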
Philip Dubé 025cb94159 Fix multi_task_string_size sometimes leaking intermediate files 2020-02-24 16:33:34 +00:00
Onur Tirtir 873e9fd604 Refactor DropShards before introducing local DROP execution 2020-02-24 17:52:20 +03:00
Onur Tirtir 3c99db40b9 Some small typos & cleanup 2020-02-24 16:37:55 +03:00
Jelte Fennema 2a9fccc7a0
Remove READFUNCs (#3536)
We don't actually use these functions anymore since merging #1477.

Advantages of removing:
1. They add work whenever we add a new node.
2. They contain some usage of stdlib APIs that are banned by Microsoft.
   Removing it means we don't have to replace those with safe ones.
2020-02-24 12:43:28 +01:00
Philip Dubé bcf54c5014 Address a couple of issues with maintenance daemon management:
- Stop the daemon when citus extension is dropped
- Bail on maintenance daemon startup if myDbData is started with a non-zero pid
- Stop maintenance daemon from spawning itself
- Don't use postgres die, just wrap proc_exit(0)
- Assert(myDbData->workerPid == MyProcPid)

The two issues were that multiple daemons could be running for a database,
or that a daemon would be leftover after DROP EXTENSION citus
2020-02-21 16:49:01 +00:00
Nils Dijk 6ee82c381e
Add missing pieces for version bump of #3482 (#3523) 2020-02-21 12:35:29 +01:00
Jelte Fennema 00d667c41d
Semmle: Fix obvious issues (#3502)
Fixes some obvious issues found by the Semmle static analysis tool.
2020-02-21 10:16:00 +01:00
Onur Tirtir 926a1a61b9 replace "relation" with "table" in error messages related to foreign keys on reference tables 2020-02-20 09:58:47 +03:00
Onur Tirtir 001089783c Fix null relation name issue in CheckConflictingRelationAccesses 2020-02-19 19:10:35 +03:00
Philip Dubé 52042d4a00 Prefer instr_time to TimestampTz when we want CLOCK_MONOTONIC 2020-02-19 00:34:17 +00:00
Philip Dubé 08f6842d50 Fix typos
Equivalance -> Equivalence
utillity -> utility
shorted lived one -> shortly lived one
elegible -> eligible
2020-02-18 17:14:40 +00:00
Marco Slot 038e5999cb Implement direct COPY table TO stdout 2020-02-17 15:15:10 +01:00
Jelte Fennema 3f7c5a5cf6
Semmle: Fix possible infinite loops caused by overflow (#3503)
Comparison between differently sized integers in loop conditions can cause
infinite loops. This can happen when doing something like this:

```c
int64 very_big = MAX_INT32 + 1;
for (int32 i = 0; i < very_big; i++) {
    // do something
}
// never reached because i overflows before it can reach the value of very_big
```
2020-02-17 14:35:10 +01:00
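The fix in such cases is to give the loop counter the same (or wider) width as the bound, e.g. (a sketch; PG_INT32_MAX is PostgreSQL's int32 maximum):
```c
int64 very_big = (int64) PG_INT32_MAX + 1;
for (int64 i = 0; i < very_big; i++) {
    // do something; i can now actually reach very_big, so the loop terminates
}
```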
Jelte Fennema 15f1173b1d
Semmle: Ensure permissions of private keys are 0600 (#3506)
When using the --allow-group-access option from initdb, our keys and
certificates would be created with 0640 permissions, which is a pretty
serious security issue. This changes that. It would not be exploitable
though, since postgres would not actually enable SSL and would output
the following message in the logs:

```
DETAIL:  File must have permissions u=rw (0600) or less if owned by the database user, or permissions u=rw,g=r (0640) or less if owned by root.
```

Since citus still expected the cluster to have SSL enabled, handshakes
between workers and coordinator would fail. So instead of a security
issue the cluster would simply be unusable.
2020-02-17 12:58:40 +01:00
SaitTalhaNisanci 9302e6e699 apply review items 2020-02-17 14:16:49 +03:00
SaitTalhaNisanci 1b78045867 rename AssignTasksToConnections with AssignTasksToConnectionsOrWorkerPool 2020-02-17 14:16:20 +03:00
SaitTalhaNisanci 355805c7d8 create ProcessWaitEvents for separating the logic of handling events 2020-02-17 14:16:20 +03:00
SaitTalhaNisanci c35981f9de create UpdateWaitEventSet for better readability 2020-02-17 14:16:20 +03:00
SaitTalhaNisanci a7e735a648 use a utility method to get event size 2020-02-17 14:16:20 +03:00
SaitTalhaNisanci 71f1aa48a3
remove unnecessary if check (#3500) 2020-02-17 14:15:36 +03:00
Markus Sintonen cf8319b992 Add comment, add subquery NOT tests 2020-02-16 01:21:10 +02:00
Markus Sintonen 3d3d615040 Add comment about NOT_EXPR. Treat it as invalid constraint for safety. 2020-02-15 16:54:38 +02:00
Philip Dubé 7382c8be00 Clean up from code review
Only change to behavior is:
- don't ignore array const's constcollid in SAORestrictions
- don't end lines with commas in DebugLogPruningInstance
2020-02-14 17:58:23 +00:00
Markus Sintonen cdedb98c54 Improve shard pruning logic to understand OR-conditions.
Previously a limitation in the shard pruning logic caused multi-distribution-value queries to always go to all the shards/workers whenever the query also used OR conditions in the WHERE clause.

Related to https://github.com/citusdata/citus/issues/2593 and https://github.com/citusdata/citus/issues/1537
There was no good workaround for this limitation. The limitation caused quite a bit of overhead with simple queries being sent to all workers/shards (especially with setups having a lot of workers/shards).

An example of a previous plan which was inadequately pruned:
```
EXPLAIN SELECT count(*) FROM orders_hash_partitioned
	WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22);
                                                          QUERY PLAN
---------------------------------------------------------------------
 Aggregate  (cost=0.00..0.00 rows=0 width=0)
   ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)
         Task Count: 4
         Tasks Shown: One of 4
         ->  Task
               Node: host=localhost port=xxxxx dbname=regression
               ->  Aggregate  (cost=13.68..13.69 rows=1 width=8)
                     ->  Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned  (cost=0.00..13.68 rows=1 width=0)
                           Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22)))
(9 rows)
```

After this commit the task count is what one would expect from the query defining multiple distinct values for the distribution column:
```
EXPLAIN SELECT count(*) FROM orders_hash_partitioned
	WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22);
                                                          QUERY PLAN
---------------------------------------------------------------------
 Aggregate  (cost=0.00..0.00 rows=0 width=0)
   ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)
         Task Count: 2
         Tasks Shown: One of 2
         ->  Task
               Node: host=localhost port=xxxxx dbname=regression
               ->  Aggregate  (cost=13.68..13.69 rows=1 width=8)
                     ->  Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned  (cost=0.00..13.68 rows=1 width=0)
                           Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22)))
(9 rows)
```

"Core" of the pruning logic works as previously where it uses `PrunableInstances` to queue ORable valid constraints for shard pruning.
The difference is that now we build a compact internal representation of the query expression tree with PruningTreeNodes before actual shard pruning is run.

Pruning tree nodes represent boolean operators and the associated constraints of it. This internal format allows us to have compact representation of the query WHERE clauses which allows "core" pruning logic to work with OR-clauses correctly.

For example query having
`WHERE (o_orderkey IN (1,2)) AND (o_custkey=11 OR (o_shippriority > 1 AND o_shippriority < 10))`
gets transformed into:
1. AND(o_orderkey IN (1,2), OR(X, AND(X, X)))
2. AND(o_orderkey IN (1,2), OR(X, X))
3. AND(o_orderkey IN (1,2), X)
Here X is any set of unknown condition(s) for shard pruning.

This allows the final shard pruning to correctly recognize that shard pruning is done with the valid condition of `o_orderkey IN (1,2)`.

Another example with unprunable condition in query
`WHERE (o_orderkey IN (1,2)) OR (o_custkey=11 AND o_custkey=22)`
gets transformed into:
1. OR(o_orderkey IN (1,2), AND(X, X))
2. OR(o_orderkey IN (1,2), X)

Which is recognized as unprunable due to the OR condition between distribution column and unknown constraint -> goes to all shards.

Issue https://github.com/citusdata/citus/issues/1537 originally suggested transforming the query conditions into a full disjunctive normal form (DNF),
but this process of transforming into DNF is quite a heavy operation. It may "blow up" into a really large DNF form with complex queries having non-trivial `WHERE` clauses.

I think the logic for shard pruning could be simplified further but I decided to leave the "core" of the shard pruning untouched.
2020-02-14 17:58:13 +00:00
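For orientation, a hedged sketch of the shape such a pruning tree node could take (field names are illustrative, not necessarily the actual PruningTreeNode definition):
```c
#include "postgres.h"
#include "nodes/primnodes.h"
#include "nodes/pg_list.h"

/* Illustrative shape only; the actual PruningTreeNode definition may differ. */
typedef struct PruningTreeNode
{
	BoolExprType boolop;          /* AND_EXPR / OR_EXPR this node represents */
	List *validConstraints;       /* constraints usable for shard pruning */
	bool hasInvalidConstraints;   /* the "X" above: conditions opaque to pruning */
	List *childBooleanNodes;      /* nested AND/OR subtrees */
} PruningTreeNode;
```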
SaitTalhaNisanci 72d1850b4e
enhance local executor description (#3499) 2020-02-13 20:19:08 +03:00
Onder Kalaci 975c4c2264 Do not prune shards if the distribution key is NULL
The root of the problem is that standard_planner() converts the following qual

```
   {OPEXPR
   :opno 98
   :opfuncid 67
   :opresulttype 16
   :opretset false
   :opcollid 0
   :inputcollid 100
   :args (
      {VAR
      :varno 1
      :varattno 1
      :vartype 25
      :vartypmod -1
      :varcollid 100
      :varlevelsup 0
      :varnoold 1
      :varoattno 1
      :location 45
      }
      {CONST
      :consttype 25
      :consttypmod -1
      :constcollid 100
      :constlen -1
      :constbyval false
      :constisnull true
      :location 51
      :constvalue <>
      }
   )
   :location 49
   }
```

To

```
(
   {CONST
   :consttype 16
   :consttypmod -1
   :constcollid 0
   :constlen 1
   :constbyval true
   :constisnull true
   :location -1
   :constvalue <>
   }
)
```

So, Citus doesn't deal with NULL values in real-time or non-fast path router queries.

And, in the FastPathRouter planner, we check constisnull in DistKeyInSimpleOpExpression().
However, in the deferred pruning case, we do not check isnull for the const.

Thus, the fix consists of two parts:
- Let PruneShards() not crash when a NULL parameter is passed
- For deferred shard pruning in fast-path queries, explicitly check that we have a CONST which is not NULL
2020-02-13 15:00:31 +01:00
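A hedged sketch of the second part of the fix (EvaluateDeferredParameter and the surrounding control flow are illustrative, not the exact Citus code): when pruning is deferred to execution in a fast-path query, bail out unless the distribution key resolved to a non-NULL Const:
```c
/* Illustrative only: deferred shard pruning for a fast-path query should
 * proceed only when the distribution key resolved to a non-NULL constant. */
Node *distributionKeyValue = EvaluateDeferredParameter(fastPathQuery);  /* hypothetical */

if (!IsA(distributionKeyValue, Const) ||
	((Const *) distributionKeyValue)->constisnull)
{
	/* a NULL distribution key matches no shards; don't call PruneShards() */
	return NIL;
}
```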
Onur Tirtir cd8210d516
Bump citus version to 9.3devel (#3482) 2020-02-13 16:22:05 +03:00
Philip Dubé 3a906b8210 Fix typos noticed while reading through code trying to understand HAVING 2020-02-11 19:55:10 +00:00
Onur Tirtir ab0b49db82
fix uninitialized variable warning (#3483) 2020-02-11 15:44:31 +01:00
Onur Tirtir 39df51e903
Introduce objects to dist. infrastructure when updating Citus (#3477)
Mark existing objects that are not included in distributed object infrastructure
in older versions of Citus (but now should be) as distributed, after updating
Citus successfully.
2020-02-07 18:07:59 +03:00
Nils Dijk d5433400f9
Fix: Unnecessary repartition on joins with more than 4 tables (#3473)
DESCRIPTION: Fix unnecessary repartition on joins with more than 4 tables

In 9.1 we have introduced support for all CH-benCHmark queries by widening our definitions of joins to include joins with expressions in them. This had the undesired side effect of Q5 regressing on its plan by implementing a repartition join.

It turned out this regression was not directly related to widening of the join clause, nor the schema employed by CH-benCHmark. Instead it had to do with 4 or more tables being joined in a chain. A chain meaning:

```sql
SELECT * FROM a,b,c,d WHERE a.part = b.part AND b.part = c.part AND ....
```

Due to how our join order planner was implemented, it would only keep track of 1 of the partition columns when checking whether the join could be executed locally. This caused a join chain of 4 tables to _always_ be executed as a repartition join. 3 tables joined in a chain would have the middle table shared by the two outer tables, causing the local join possibility to be found.

With this patch we keep a unique list (or set) of all partition columns participating in the join. When a candidate table is checked for the possibility to execute a local join, it will check if there is any partition column in that set that matches an equality join clause on the partition column of the candidate table.

By taking into account all partition columns in the left relation it will now find the local join path on >= 4 tables joined in a chain. 

fixes: #3276
2020-02-06 15:07:07 +01:00
Philip Dubé ecad4aa5e6 Fill in jobIdList field of DistributedExecution
Pass down jobIdList from ExecuteTasksInDependencyOrder

Also clean up comment for ExecuteTaskListOutsideTransaction
2020-02-05 17:32:22 +00:00
Philip Dubé c252811884 dont: don't, wont: won't, acylic: acyclic 2020-02-05 17:32:22 +00:00
Halil Ozan Akgul 8ce4f20061 Fixes the bug of grants on public schema propagation 2020-02-05 18:05:58 +03:00
Hadi Moshayedi 9dd14fa90d Rename discarded target list items in repartitioned INSERT/SELECT 2020-02-05 11:06:44 +01:00
Onder Kalaci c7e2309f4c Improve single hash-repartitioning with numeric (or non-int) types
We used to treat the shard interval array that we passed as numeric[].
However, it should be int[], as the shard ranges are int[].
2020-02-04 20:30:04 +01:00
Hadi Moshayedi bc1a800f70 Use current user for repartition join temp schemas.
Otherwise when using a less privileged user we might get
errors when trying to create the schema.
2020-02-04 09:48:20 -08:00
Hadi Moshayedi 264530311a Don't use distributed insert/select for repartitioned joins 2020-02-03 13:13:30 -08:00
Marco Slot be77d3304f Fixup 2020-02-03 11:59:55 +01:00
Marco Slot b0fd6aa006 If reference tables were read over multiple connections, do not assign connection 2020-02-03 11:54:29 +01:00
Onder Kalaci 2f274a4fce Make sure to go deeper into the functions to search for PARAMs
For example, a PARAM might reside inside a function just because
of a casting of a type such as the follows:

```
               {FUNCEXPR
               :funcid 1740
               :funcresulttype 1700
               :funcretset false
               :funcvariadic false
               :funcformat 2
               :funccollid 0
               :inputcollid 0
               :args (
                  {PARAM
                  :paramkind 0
                  :paramid 15
                  :paramtype 23
                  :paramtypmod -1
                  :paramcollid 0
                  :location 356
                  }
               )
```

We should recursively check the expression before bailing out.
2020-02-03 09:36:12 +01:00
Philip Dubé d43c80d4d8 pullUpIntermediateRows should not be true when groupedByDisjointPartitionColumn is true
This was causing 'SELECT id, stdev(y_int) FROM tbl GROUP BY id' to push down stddev without group by
2020-01-30 21:18:08 +00:00
Philip Dubé 84a500ffc6 CitusRemoveDirectory: loop when directory is not empty
Sometimes during errors workers will create files while we're deleting intermediate directories

example:
DEBUG:  could not remove file "base/pgsql_job_cache/10_0_431": Directory not empty
DETAIL:  WARNING from localhost:57637
2020-01-30 20:02:08 +00:00
Philip Dubé 5fccc56d3e Expand the set of aggregates which cannot have LIMIT approximated
Previously we only prevented AVG from being pushed down, but this is incorrect:
- array_agg, while somewhat nonsensical to order by, will potentially be missing values
- combinefunc aggregation will raise errors about cstrings not being comparable (while we also can't know if the aggregate is commutative)

This commit limits approximating LIMIT pushdown when ordering by aggregates to:
min, max, sum, count, bit_and, bit_or, every, any
Which means of those we previously supported, we now exclude:
avg, array_agg, jsonb_agg, jsonb_object_agg, json_agg, json_object_agg, hll_add, hll_union, topn_add, topn_union
2020-01-30 17:45:18 +00:00
Önder Kalacı 8584cb005b
Do not evaluate functions on the coordinator for SELECT queries (#3440)
Previously, the logic for evaluating the functions and the parameters
was the same. That ended up evaluating the functions inaccurately
on the coordinator. Instead, split the function evaluation logic
from the parameter evaluation logic.
2020-01-30 08:47:28 +01:00
Önder Kalacı 412fe719f7
Hide citus.enable_ddl_propagation setting (#3437)
As that is powerful and causes metadata inconsistency. See the following steps:

(Note that we cannot use PGC_SUSET because on Citus MX we need this flag for non-
superusers as well)

```SQL
CREATE TABLE test_ref_table(key int);
SELECT create_reference_table('test_ref_table');

SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition;
┌────────────────┬──────────────┐
│  logicalrelid  │ logicalrelid │
├────────────────┼──────────────┤
│ test_ref_table │        16831 │
└────────────────┴──────────────┘
(1 row)

Time: 0.929 ms

SELECT relname FROM pg_class WHERE oid = 16831;
┌────────────────┐
│    relname     │
├────────────────┤
│ test_ref_table │
└────────────────┘
(1 row)

Time: 0.785 ms

SET citus.enable_ddl_propagation TO off;

 DROP TABLE test_ref_table ;

SELECT logicalrelid, logicalrelid::oid FROM pg_dist_partition;
┌──────────────┬──────────────┐
│ logicalrelid │ logicalrelid │
├──────────────┼──────────────┤
│ 16831        │        16831 │
└──────────────┴──────────────┘
(1 row)
Time: 0.972 ms

SELECT relname FROM pg_class WHERE oid = 16831;
┌─────────┐
│ relname │
├─────────┤
└─────────┘
(0 rows)

Time: 0.908 ms

 SELECT master_add_node('localhost', 9703);
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 5.028 ms
!>

```
2020-01-29 10:17:53 +01:00
SaitTalhaNisanci 94bd563ff0
switch back to old memory context in cache local plan for task (#3428) 2020-01-27 13:00:46 +03:00
Önder Kalacı 4519d3411d
Improve the representation of used sub plans (#3411)
Previously, we've identified the usedSubPlans by only looking
at the subPlanId.

With this commit, we're expanding it to also include information
on the location of the subPlan.

This is useful to distinguish the cases where the subPlan is used
only in HAVING, or in both HAVING and other parts of the query.
2020-01-24 10:47:14 +01:00
Philip Dubé 50c5e814c8 CurrentDatabaseName: return const char* as we're borrowing from cache 2020-01-23 22:49:35 +00:00
Hadi Moshayedi 1dc19215eb Don't error for ENOENT in CitusRemoveDirectory.
For concurrency reasons, this can happen even if initial stat succeeded.
2020-01-23 10:07:54 -08:00
Hadi Moshayedi 3e1004c232 Change DistributedResultFragment::nodeId to uint32.
This is to match the type of WorkerNode::nodeId.
2020-01-23 09:33:15 -08:00
Önder Kalacı ef7d1ea91d
Locally execute queries that don't need any data access (#3410)
* Update shardPlacement->nodeId to uint

As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.

* Relax the restrictions on using the local execution

Previously, whenever any local execution happens, we disabled further
commands to do any remote queries. The basic motivation for doing that
is to prevent any accesses in the same transaction block to access the
same placements over multiple sessions: one is local session the other
is remote session to the same placement.

However, the current implementation does not distinguish whether local accesses
are to a placement or not. For example, we could have local accesses
that only touches intermediate results. In that case, we should not
implement the same restrictions as they become useless.

So, this is a pre-requisite for executing the intermediate result only
queries locally.

* Update the error messages

As the underlying implementation has changed, reflect it in the error
messages.

* Keep track of connections to local node

With this commit, we're adding infrastructure to track if any connection
to the same local host is done or not.

The main motivation for doing this is that we were previously more
conservative about not choosing local execution. Simply, we disallowed
local execution if any connection to any remote node is done. However,
if we want to use local execution for intermediate result only queries,
this'd be annoying because we expect all queries to touch remote node
before the final query.

Note that this approach is still limiting in Citus MX case, but for now
we can ignore that.

* Formalize the concept of Local Node

Also some minor refactoring while creating the dummy placement

* Write intermediate results locally when the results are only needed locally

Before this commit, Citus used to always broadcast all the intermediate
results to remote nodes. However, it is not always necessary to push
the results to remote nodes.

There are two notable cases for doing that:

   (a) When the query consists of only intermediate results
   (b) When the query is a zero shard query

In both of the above cases, we don't need to access any data on the shards. So,
it is a valuable optimization to skip pushing the results to remote nodes.

The pattern mentioned in (a) is actually a common pattern that Citus users
use in practice. For example, if you have the following query:

WITH cte_1 AS (...), cte_2 AS (....), ... cte_n (...)
SELECT ... FROM cte_1 JOIN cte_2 .... JOIN cte_n ...;

The final query could be operating only on intermediate results. With this patch,
the intermediate results of the ctes are not unnecessarily pushed to remote
nodes.

* Add specific regression tests

As there are edge cases in Citus MX and with round-robin policy,
use the same queries on those cases as well.

* Fix failure tests

By forcing not to use local execution for intermediate results, since
all the tests expect the results to be pushed remotely.

* Fix flaky test

* Apply code-review feedback

Mostly style changes

* Limit the max value of pg_dist_node_seq to reserve for internal use
2020-01-23 18:28:34 +01:00
Onder Kalaci a0dff301c7 Update shardPlacement->nodeId to uint
As the source of the shardPlacement->nodeId is always workerNode->nodeId,
and that is uint32.

We had this hack because of: 0ea4e52df5 (r266421409)

And, that is gone with: 90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)

So, we're safe to do it now.
2020-01-23 13:00:24 +01:00
Jelte Fennema c62b756f34
Fix new method of locking shard distribution metadata (#3407)
In #3374 a new way of locking shard distribution metadata was
implemented. However, this was only done in the function
`LockShardDistributionMetadata` and not in
`TryLockShardDistributionMetadata`. This is bad, since it causes these
locks to not block each other in some cases.

This commit fixes this issue by sharing the code that sets the locktag
between the two functions.
2020-01-22 16:44:17 +01:00
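A hedged sketch of the shared-locktag pattern the commit describes (SetShardDistributionMetadataLocktag and the advisory-lock keying are illustrative; LockAcquire is PostgreSQL's lock manager API):
```c
#include "postgres.h"
#include "miscadmin.h"
#include "storage/lock.h"

/* Illustrative only: derive the locktag in one shared helper so the blocking
 * and non-blocking variants always conflict with each other. */
static void
SetShardDistributionMetadataLocktag(LOCKTAG *tag, int64 shardId)
{
	SET_LOCKTAG_ADVISORY(*tag, MyDatabaseId,
						 (uint32) (shardId >> 32), (uint32) shardId, 0);
}

void
LockShardDistributionMetadata(int64 shardId, LOCKMODE lockMode)
{
	LOCKTAG tag;
	SetShardDistributionMetadataLocktag(&tag, shardId);

	(void) LockAcquire(&tag, lockMode, false, false);
}

bool
TryLockShardDistributionMetadata(int64 shardId, LOCKMODE lockMode)
{
	LOCKTAG tag;
	SetShardDistributionMetadataLocktag(&tag, shardId);

	return LockAcquire(&tag, lockMode, false, true) != LOCKACQUIRE_NOT_AVAIL;
}
```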
Jelte Fennema cd5259a25a
Do not place new shards with shards in TO_DELETE state (#3408)
When creating a new distributed table, the shards would colocate with shards
in SHARD_STATE_TO_DELETE (shardstate = 4). This means that if that state was
because of a shard move, the new shard would be created on two nodes and it
would not get deleted since its shard state would be 1.
2020-01-22 14:52:12 +01:00
Onder Kalaci 4be69bbf6f Fix reference table issue 2020-01-20 18:45:18 +00:00
Halil Ozan Akgul b40f067d05 Adds propagation for grant on schema commands 2020-01-20 14:51:28 +03:00
Philip Dubé fdcc413559 Code cleanup of adaptive_executor, connection_management, placement_connection
adaptive_executor: sort includes, use foreach_ptr, remove lies from FinishDistributedExecution docs
connection_management: rename msecs, which isn't milliseconds
placement_connection: small typos
2020-01-17 17:44:47 +00:00
Onder Kalaci 2f0ef8bc36 Apply feedback 1 2020-01-17 16:06:04 +01:00
Onder Kalaci 0bf1e81e33 Cache local plans on BeginScan 2020-01-17 16:02:57 +01:00
Onder Kalaci 5dc454cdad Exclude localPlannedStatements from copy distributedPlan 2020-01-17 16:02:57 +01:00
Onder Kalaci ff12df411b Add LocalPlannedStatement struct 2020-01-17 16:02:57 +01:00
Onder Kalaci 3833a7e686 Fix issues for CTE inlining on Postgres 11
Comment from code:

/*
 * We had to implement this hack because on Postgres11 and below, the originalQuery
 * and the query would have significant differences in terms of CTEs where CTEs
 * would not be inlined on the query (as standard_planner() wouldn't inline CTEs
 * on PG 11 and below).
 *
 * Instead, we prefer to pass the inlined query to the distributed planning. We rely
 * on the fact that the query includes subqueries, and it'd definitely go through
 * query pushdown planning. During query pushdown planning, the only relevant query
 * tree is the original query.
 */
2020-01-17 11:59:02 +01:00
Jelte Fennema 246435be7e
Lazy query deparsing executable queries (#3350)
Deparsing and parsing a query can be heavy on CPU. When locally executing
the query, in theory we don't need to do this most of the time.

This PR is the first step in allowing to skip deparsing and parsing
the query in these cases, by lazily creating the query string and
storing the query in the task. Future commits will make use of this and
not deparse and parse the query anymore, but use the one from the task
directly.
2020-01-17 11:49:43 +01:00
Hadi Moshayedi 6cf1c01660 Don't use repartitioned INSERT/SELECT for repartition joins 2020-01-16 23:40:31 -08:00
Hadi Moshayedi 5eeb07124f Repartitioned INSERT/SELECT: include job id in result id prefix 2020-01-16 23:24:52 -08:00
Hadi Moshayedi a079278b0c Repartitioned INSERT/SELECT: Add a GUC to enable/disable it 2020-01-16 23:24:52 -08:00
Hadi Moshayedi ce5eea4885 INSERT/SELECT: make SELECT column names unique 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 8635396cea Repartitioned INSERT/SELECT: Test rollback behaviour 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 97072c9eb1 INSERT/SELECT: show method in EXPLAIN output 2020-01-16 23:24:52 -08:00
Hadi Moshayedi fe548b762f Repartitioned INSERT/SELECT: Test CTEs 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 494cc383cc Repartitioned INSERT/SELECT: Enable RETURNING 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 44a2aede16 Don't start a coordinated transaction on workers.
Otherwise transaction hooks of Citus kick in and might cause unwanted errors.
2020-01-16 23:24:52 -08:00
Hadi Moshayedi 42c3c03b85 Handle extra columns added in ExpandWorkerTargetEntry() in repartitioned INSERT/SELECT 2020-01-16 23:24:52 -08:00
Hadi Moshayedi 89463f9760 Repartitioned INSERT/SELECT: cast columns in SELECT targets 2020-01-16 23:24:52 -08:00
Hadi Moshayedi d67a384350 Enable repartitioned INSERT/SELECT ON CONFLICT. 2020-01-16 23:24:52 -08:00
Hadi Moshayedi b4e5f4b10a Implement INSERT ... SELECT with repartitioning 2020-01-16 23:24:52 -08:00
Hadi Moshayedi ced876358d INSERT/SELECT: Refactor out AddInsertSelectCasts 2020-01-16 23:24:52 -08:00
Hadi Moshayedi d449c1857c INSERT/SELECT: Use ExecutePlan* instead of ExecuteSelect* 2020-01-16 23:24:52 -08:00
Jelte Fennema 0ee1eab070 Make tests fail with a useful error message 2020-01-16 18:30:30 +01:00
Marco Slot 82f1fffa28 Fix epoll_ctl() error message on connection error 2020-01-16 06:40:57 +01:00
Onder Kalaci dc17c2658e Defer shard pruning for fast-path router queries to execution
This is purely to enable better performance with prepared statements.
Before this commit, the fast path queries with prepared statements
where the distribution key includes a parameter always went through
distributed planning. After this change, we only go through distributed
planning on the first 5 executions.
2020-01-16 16:59:36 +01:00
Onder Kalaci 933d666c0d Do not forget to copy fastPathRouterPlan@DistributedPlan 2020-01-16 16:39:20 +01:00
Halil Ozan Akgul c5539d20d9 Adds alter table schema propagation 2020-01-16 17:04:16 +03:00
Nils Dijk b6e09eb691
Fix: distributed function with table reference in declare (#3384)
DESCRIPTION: Fixes a problem when adding a new node due to tables referenced in a function's body

Fixes #3378 

It was reported that `master_add_node` would fail if a distributed function has a table name referenced in its declare section of the body. By default postgres validates the body of a function on creation. This is not a problem in the normal case as tables are replicated to the workers when we distribute functions.

However when a new node is added we first create dependencies on the workers before we try to create any tables, and the original tables get created out of band when the metadata gets synced to the new node. This causes the function body validator to raise an error that the table is not on the worker.

To mitigate this issue we set `check_function_bodies` to `off` right before we are creating the function.

The added test shows this does resolve the issue. (issue can be reproduced on the commit without the fix)
2020-01-16 14:21:54 +01:00
Jelte Fennema e76281500c
Replace shardId lock with lock on colocation+shardIntervalIndex (#3374)
This new locking pattern makes sure that some deadlocks that could
happen during rebalancing cannot occur anymore.
2020-01-16 13:14:01 +01:00
Onder Kalaci 81d8178625 Note that we'll drop the GUC after PG 11 support dropped 2020-01-16 12:28:15 +01:00
Onder Kalaci 64560b07be Update regression tests-2
In this commit, we're introducing a way to prevent CTE inlining via a GUC.

The GUC is used in all the tests where PG 11 and PG 12 tests would diverge
otherwise.

Note that, in PG 12, the restriction information for CTEs is generated. It
means that for some queries involving CTEs, the Citus planner (router planner/
pushdown planner) may behave differently. So, via the GUC, we prevent
tests from diverging on PG 11 vs PG 12.

When we drop PG 11 support, we should get rid of the GUC, and mark
relevant ctes as MATERIALIZED, which does the same thing.
2020-01-16 12:28:15 +01:00
Onder Kalaci 5cb203b276 Update regression tests-1
These set of tests has changed in both PG 11 and PG 12.
The changes are only about CTE inlining kicking in both
versions, and yielding the exact same distributed planning.
2020-01-16 12:28:15 +01:00
Onder Kalaci efb1577d06 Handle CTE aliases accurately
Basically, make sure to update the column name with the CTE's alias
if we need to do so.
2020-01-16 12:28:15 +01:00
Onder Kalaci 05d600dd8f Call CTE inlining in Citus planner
The idea is simple: Inline CTEs(if any), try distributed planning.
If the planning yields a successful distributed plan, simply return
it.

If the planning fails, fall back to distributed planning on the query
tree where CTEs are not inlined. In that case, if the planning failed
just because of the CTE inlining, via recursive planning, the same
query would yield a successful plan.

A very basic set of examples:

WITH cte_1 AS (SELECT * FROM test_table)
SELECT
	*, row_number() OVER ()
FROM
	cte_1;

or

WITH a AS (SELECT * FROM test_table),
b AS (SELECT * FROM test_table)
SELECT * FROM  a JOIN b ON (a.value> b.value);
2020-01-16 12:28:15 +01:00
Onder Kalaci 01a5800ee8 Add Citus' CTE inlining functions
With this commit we add the necessary Citus function to inline CTEs
in a queryTree.

You might ask, why do we need to inline CTEs if Postgres is already
going to do it?

A few reasons behind this decision:

- One technical note here is that Citus does the recursive CTE planning
  by checking the originalQuery, which is the query that has not gone
  through standard_planner().

  CTEs in Citus are super powerful. They are practically key for full SQL
  coverage for multi-shard queries. With CTEs, you can always reduce
  any multi-shard query into a router query via recursive
  planning (thus full SQL coverage).
  We cannot let CTE inlining break that. The main idea is that Citus should
  be able to retry planning if anything goes wrong after CTE inlining.

  So, by taking ownership of CTE inlining on the originalQuery, Citus
  can fall back to recursive planning of CTEs if the planning with the
  inlined query fails. It could have been a lot harder if we had relied
  on standard_planner() to have the inlined CTEs on the original query.

- We want to have this feature in PostgreSQL 11 as well, but Postgres
  only inlines in version 12
2020-01-16 12:28:15 +01:00
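A minimal sketch of the fallback flow described above (InlineCtesInQueryTree and TryDistributedPlanning are hypothetical names; the point is that inlining happens on a copy, so recursive planning of the original query stays available):
```c
/* Illustrative only: try planning with CTEs inlined first, and fall back to
 * the original (non-inlined) query tree if that fails. */
Query *inlinedQuery = copyObject(originalQuery);
InlineCtesInQueryTree(inlinedQuery);                 /* hypothetical helper */

DistributedPlan *plan = TryDistributedPlanning(inlinedQuery);    /* hypothetical */
if (plan == NULL)
{
	/*
	 * Planning the inlined query failed; recursive planning on the query
	 * with its CTEs intact can still produce a working plan.
	 */
	plan = TryDistributedPlanning(originalQuery);
}
```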
Onder Kalaci 1856ab6cdd Copy & paste code from Postgres source
All the code in this commit is direct copy & paste from Postgres
source code.

We can classify the copy&paste code into two:

- Copy paste from CTE inline patch from postgres
  (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=608b167f9f9c4553c35bb1ec0eab9ddae643989b)
  These include the functions inline_cte(), inline_cte_walker(),
  contain_dml(), contain_dml_walker().
  It also include the code in function PostgreSQLCTEInlineCondition().
  We prefer to extract that code into a separate function, because
  (a) we'll re-use the logic later (b) we added one check for PG_11

  Finally, the struct "inline_cte_walker_context" is also copied from
  the same Postgres commit.

- Copy paste from the other parts of the Postgres code

  In order to implement CTE inlining in Postgres 12, the hackers
  modified query_tree_walker()/range_table_walker() in commit
  18c0da88a5

  Since Citus needs to support the same logic in PG 11, we copy & pasted
  those functions (and related flags) with the names pg_12_query_tree_walker()
  and pg_12_range_table_walker()
2020-01-16 12:28:15 +01:00
Philip Dubé 4d9a733c2f Fix inserting multiple values with row expression partition column causing the insert to be ignored
Raise an error instead of silently inserting nothing if we hit this condition in the future
2020-01-15 21:10:50 +00:00
Philip Dubé 4989c9a15c PlacementExecutionDone: We may mark placements as failed multiple times, but should only act the first time. 2020-01-15 18:20:01 +00:00
Marco Slot f1a0582973 Make ApplyLogRedaction a macro and redefine ereport 2020-01-13 18:24:36 +01:00
Marco Slot 06709ee108 Always use NOTICE in log_remote_commands and avoid redaction when possible 2020-01-13 18:24:36 +01:00
Marco Slot 90056f7d3c Remove copy from worker for append-partitioned table 2020-01-13 23:03:40 -08:00
Philip Dubé ccabf19090 Propagate DROP ROUTINE, ALTER ROUTINE
In two places I've made code more straightforward by using ROUTINE in our own codegen

Two changes which may seem extraneous:

AppendFunctionName was updated to not use pg_get_function_identity_arguments.
This is because that function includes ORDER BY when printing an aggregate like my_rank.
While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres,
ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not.

Tests were updated to use macaddr over integer. Using integer is flaky, our logic
could sometimes end up on tables like users_table. I originally wanted to use money,
but money isn't hashable.
2020-01-13 15:37:46 +00:00
Philip Dubé 4b5d6c3ebe Rename RelayFileState to ShardState
Replace FILE_ prefix with SHARD_STATE_
2020-01-12 05:57:53 +00:00
Philip Dubé e71386af33 Replace ARRAY_OUT_FUNC_ID with postgres's F_ARRAY_OUT
Also use stack allocation for walkerContext in multi_logical_optimizer
2020-01-10 16:54:00 +00:00
Hadi Moshayedi 40ba2cdd6e Test RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Hadi Moshayedi 527d7d41c1 Implement RedistributeTaskListResult 2020-01-09 23:47:25 -08:00
Philip Dubé 281aacce9b Fix row-gather for subqueries being handled by task-tracker
task-tracker has specific logic for MultiPartition when GROUP BY is missing

We were ending up in this code path because row-gather removes GROUP BY
2020-01-10 01:51:37 +00:00
Hadi Moshayedi e1e383cb59 Don't override xact id assigned by coordinator on workers.
We might need to send commands from workers to other workers. In
these cases we shouldn't override the xact id assigned by the coordinator,
or otherwise we won't read a consistent set of result files
across the nodes.
2020-01-09 11:09:11 -08:00
Hadi Moshayedi c7c460e843 PartitionTasklistResults: Use different queries per placement
We need to know which placement succeeded in executing the worker_partition_query_result() call. Otherwise we wouldn't know which node to fetch from. This change allows that by introducing Task::perPlacementQueryStrings.
2020-01-09 10:55:58 -08:00
Hadi Moshayedi f38d0e5b3f Partitioned task list results. 2020-01-09 10:32:58 -08:00
Philip Dubé 73c06fae3b Introduce GetDistributeObjectOps to organize dispatch of logic dependent on node/object type 2020-01-09 18:24:29 +00:00
Philip Dubé bf7d86a3e8 Fix typo: aggragate -> aggregate 2020-01-07 01:16:09 +00:00
Philip Dubé 863bf49507 Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default 2020-01-07 01:16:04 +00:00
Jelte Fennema 5b0baea72c Refactor distributed_planner for better understandability 2020-01-06 14:23:38 +01:00
Onder Kalaci 5a1e752726 Apply feedback - add fastPath field to plan 2020-01-06 12:42:43 +01:00
Onder Kalaci 13a9b55695 Skip expensive checks when fast-path query
The definition of fast-path query is very strict. So, we don't need
to do some extra checks.
2020-01-06 12:42:43 +01:00
Onder Kalaci 7f3ab7892d Skip shard pruning when possible
We're already traversing the queryTree and finding the distribution
key value, so pass it to the later stages of the planning.
2020-01-06 12:42:43 +01:00
Onder Kalaci ca293116fa Reduce calls to FastPathRouterQuery()
Before this commit, we called it twice during planning. Instead,
we save the information and pass it.
2020-01-06 12:42:43 +01:00
Onder Kalaci c8f14c9f6c Make sure to update shard states of partitions on failures
Fixes #3331

In #2389, we've implemented support for partitioned tables with rep > 1.
The implementation is limiting the use of modification queries on the
partitions. In fact, we error out when any partition is modified via
EnsurePartitionTableNotReplicated().

However, we seem to have forgotten an important case, where the parent table's
placement is marked as INVALID. In that case, at least one of the partition
placements becomes INVALID. However, we do not ever mark partition placements as INVALID.

If the user queries the partition table directly, Citus could happily send
the query to INVALID placements -- which are not marked as INVALID.

This PR fixes it by marking the placements of the partitions as INVALID
as well.

The shard placement repair logic already re-creates all the partitions,
so we should be fine on that front.
2020-01-06 12:26:08 +01:00
Önder Kalacı 0c70a5470e
Allow RETURNING in fast-path queries (#3352)
* Allow RETURNING in fast-path queries

Because there is no specific reason to disallow it.
2020-01-03 13:42:50 +00:00
Önder Kalacı a174eb4f7b
Do not go through standard_planner() for INSERTs (#3348)
That seems unnecessary. We already have the notion of FastPath queries,
simply add it there.
2020-01-03 12:15:22 +00:00
Marco Slot ba39d72fe1 Fix incorrect union all pushdown issue 2020-01-01 09:03:50 +01:00
Jelte Fennema 3a042e4611 Allow cartesian products on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 61e2501645 Make any expression with two or more tables a join expression 2019-12-27 15:05:51 +01:00
Jelte Fennema 4233cd0d9d Allow non equi joins on reference tables 2019-12-27 15:05:51 +01:00
Jelte Fennema 7642928be1
Makefile fix DESTDIR together with cleanup (#3342)
This should fix this build issue: redmine.postgresql.org/issues/5032
2019-12-27 10:34:57 +01:00
Marco Slot b21b6905ae Do not repeat GROUP BY distribution_column on coordinator
Allow arbitrary aggregates to be pushed down in these scenarios
2019-12-25 01:33:41 +00:00
Marco Slot a2ddfecd86 Fix inconsistent shard metadata issue 2019-12-24 08:01:32 +01:00
Hadi Moshayedi d7aea7fa10 Implement partitioned intermediate results. 2019-12-24 03:53:39 -08:00
Marco Slot b37ef0e394 Fix error in distributed queries when shards are on the coordinator 2019-12-24 06:36:43 +01:00
Philip Dubé e9bbdb8f31 Fix handling of empty intermediate results when distributing custom aggregates 2019-12-23 17:27:52 +00:00
Philip Dubé f007b7f91d Also fix reindent inconsistencies with fake_fdw.c 2019-12-20 08:27:47 +00:00
Hadi Moshayedi 08eb0ade31 Fix reindent version inconsistencies.
Different versions of reindent tool reformatted citus_custom_scan.c
and citus_copyfuncs.c differently. So some developers paid some
extra attention not to commit these two files after reindent.

This PR tries to address this.
2019-12-19 23:10:34 -08:00
Jelte Fennema b655c02352
Add the necessary changes for rebalance strategies on enterprise (#3325)
This commit adds the SQL and C changes necessary to support custom rebalance
strategies in the Enterprise version of Citus.
2019-12-19 15:23:08 +01:00
Hadi Moshayedi ef487e0792 Implement fetch_intermediate_results 2019-12-18 10:46:35 -08:00
Hadi Moshayedi 249508d267 Estimate cost of read_intermediate_results() 2019-12-17 13:51:51 -08:00
Hadi Moshayedi 113bd1e5f1 Implement read_intermediate_results 2019-12-17 13:51:16 -08:00
SaitTalhaNisanci 7ff4ce2169
Add adaptive executor support for repartition joins (#3169)
* WIP

* wip

* add basic logic to run a single job with repartitioning joins with adaptive executor

* fix some warnings and return in ExecuteDependedTasks if there is none

* Add the logic to run depended jobs in adaptive executor

The execution of depended tasks logic is changed. With the current
logic:
- All tasks are created from the top level task list.
- At one iteration:
	- CurTasks whose dependencies are executed are found.
	- CurTasks are executed in parallel with the adaptive executor main
logic.
- The iteration is repeated until all tasks are completed.

* Separate adaptive executor repartitioning logic

* Remove duplicate parts

* cleanup directories and schemas

* add basic repartition tests for adaptive executor

* Use the first placement to fetch data

In task tracker, when there are replicas, we try to fetch from a replica
for which a map task has succeeded. TaskExecution is used for this,
however TaskExecution is not used in adaptive executor. So we cannot use
the same thing as task tracker.

Since the adaptive executor fails when a map task fails (there is no retry
logic yet), we know that if we try to execute a fetch task, all of its
map tasks have already succeeded, so we can just use the first one to fetch
from.

* fix clean directories logic

* do not change the search path while creating a udf

* Enable repartition joins with adaptive executor with only enable_repartition_joins guc

* Add comments to adaptive_executor_repartition

* don't run adaptive executor repartition test in parallel with other tests

* execute cleanup only in the top level execution

* do cleanup only in the top level execution

* not begin a transaction if repartition query is used

* use new connections for repartition specific queries

New connections are opened to send repartition specific queries. The
opened connections will be closed at the FinishDistributedExecution.

While sending repartition queries no transaction is begun so that
we can see all changes.

* error if a modification was done prior to repartition execution

* not start a transaction if a repartition query and sql task, and clean temporary files and schemas at each subplan level

* fix cleanup logic

* update tests

* add missing function comments

* add test for transaction with DDL before repartition query

* do not close repartition connections in adaptive executor

* rollback instead of commit in repartition join test

* use close connection instead of shutdown connection

* remove unnecessary connection list, ensure schema owner before removing directory

* rename ExecuteTaskListRepartition

* put fetch query string in planner not executor as we currently support only replication factor = 1 with adaptive executor and repartition query and we know the query string in the planner phase in that case

* split adaptive executor repartition to DAG execution logic and repartition logic

* apply review items

* apply review items

* use an enum for remote transaction state and fix cleanup for repartition

* add outside transaction flag to find connections that are unclaimed instead of always opening a new transaction

* fix style

* wip

* rename removejobdir to partition cleanup

* do not close connections at the end of repartition queries

* do repartition cleanup in pg catch

* apply review items

* decide whether to use transaction or not at execution creation

* rename isOutsideTransaction and add missing comment

* not error in pg catch while doing cleanup

* use replication factor of the creation time, not current time to decide if task tracker should be chosen

* apply review items

* apply review items

* apply review item
2019-12-17 19:09:45 +03:00
Marco Slot 2f568ad5a5 Forbid using connections that sent intermediate results for data access and vice versa 2019-12-17 11:49:13 +01:00
Marco Slot f4031dd477 Clean up transaction block usage logic in adaptive executor 2019-12-17 10:48:19 +01:00
Nils Dijk bfc3d2eb90
make sure to correctly decrement ExecutorLevel (#3311)
DESCRIPTION: Fix counter that keeps track of internal depth in executor

While reviewing #3302 I ran into the `ExecutorLevel` variable, which used a local copy of the original value to restore on successful exit. I haven't explored the full space and whether it is possible to get into an inconsistent state. However, using `PG_TRY`/`PG_CATCH` seems generally more correct.

Given very bad things will happen if this level is not reset, I kept the failsafe of setting the variable back to 0 on the `XactCallback`, but I did add an assert to treat it as a developer bug.
2019-12-16 20:50:13 +01:00
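For reference, the PG_TRY/PG_CATCH pattern referred to above looks roughly like this (a sketch, not the exact patch):
```c
/* Sketch: keep ExecutorLevel balanced even when the wrapped executor errors out. */
ExecutorLevel++;

PG_TRY();
{
	standard_ExecutorRun(queryDesc, direction, count, execute_once);
	ExecutorLevel--;
}
PG_CATCH();
{
	ExecutorLevel--;
	PG_RE_THROW();
}
PG_END_TRY();
```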
Marco Slot 5f656e22db Fix issue in IsMultiStatementTransaction detection 2019-12-16 17:01:43 +01:00
SaitTalhaNisanci 2829c601dd
replace Begin words in coordinated transactions with use (#3293) 2019-12-16 10:40:31 +03:00
SaitTalhaNisanci a2f2107e6a
refactor MapTaskList in multi physical planner (#3297) 2019-12-13 22:41:49 +03:00
Marco Slot 1633123d78 Fix crash in IN (NULL) queries 2019-12-13 08:35:54 +01:00
Hadi Moshayedi e7a6cc0801 Fix some typos from #3280 2019-12-12 13:29:26 -08:00
SaitTalhaNisanci 420e21919b
refactor extract distributed insert values rte (#3287) 2019-12-12 23:47:44 +03:00
Marco Slot e7a8db5493 Fix issue with some zero-shard modifications 2019-12-12 07:19:10 +01:00
SaitTalhaNisanci 2c040d2c8f
use a function for duplicate code in connection state machine (#3209) 2019-12-12 17:55:38 +03:00
SaitTalhaNisanci a0fe8646e0
add IsHoldOffCancellationReceived utility function (#3290) 2019-12-12 17:32:59 +03:00
SaitTalhaNisanci 053fe18404
not continue in sequential execution if a cancellation is received (#3289) 2019-12-12 17:22:30 +03:00
Hadi Moshayedi 939d3c955b Don't plan function joins locally 2019-12-11 16:53:29 -08:00
Hadi Moshayedi 067d92a7f6 Don't plan joins between ref tables and views locally 2019-12-11 14:31:34 -08:00
Hadi Moshayedi e3e174f30f Fix the way we check for local/reference table joins in the executor 2019-12-11 12:50:20 -08:00
SaitTalhaNisanci 13204487e9
remove copyright years (#3286) 2019-12-11 21:14:08 +03:00
SaitTalhaNisanci d10f97998c rename REMOTE_TRANS_INVALID to REMOTE_TRANS_NOT_STARTED 2019-12-11 15:24:18 +03:00
Marco Slot 133b8e1e0e Move coordinator insert..select logic into executor 2019-12-10 11:21:35 -08:00
Marco Slot 486c620a3c Fix inserts into local tables with distributed subqueries 2019-12-10 10:17:18 +01:00
Philip Dubé fcf2fd819b Add distributioncolumncollation to pg_dist_colocation
Use partition column's collation for range distributed tables
Don't allow non deterministic collations for hash distributed tables
CoPartitionedTables: don't compare unequal types
2019-12-09 19:51:40 +00:00
Philip Dubé d138bb89bf Support creating collations as part of dependency resolution. Propagate ALTER/DROP on distributed collations
Propagate CREATE COLLATION when outside transaction
2019-12-09 04:42:51 +00:00
Alexander Pyhalov 6174a4d3d6 Fix build on illumos 2019-12-06 14:40:47 +01:00
Marco Slot 6a9c0ea7fe Fix errors in DML with sublinks hidden by null expressions 2019-12-06 14:25:04 +01:00
Hadi Moshayedi d28beb3711 Detect SQL UDF Calls. 2019-12-05 14:31:05 -08:00
Philip Dubé 5a17fd6d9d Test more reference/local cases, also ALTER ROLE
Test ALTER ROLE doesn't deadlock when coordinator added, or propagate from mx workers

Consolidate wait_until_metadata_sync & verify_metadata to multi_test_helpers
2019-12-03 22:23:14 +00:00
Philip Dubé 1597fbb369 aggregate_support test: test DISTINCT, ORDER BY, FILTER, & no intermediate results
Previously,
- we'd push down ORDER BY, but this doesn't order intermediate results between workers
- we'd keep FILTER on master aggregate, which would raise an error about unexpected cstrings
2019-12-03 15:46:01 +00:00
Philip Dubé 5fcc169a3a Stray depended to dependent tidy up 2019-12-03 15:28:32 +00:00
Marco Slot bb3bc10f0c Fix segfault in column_to_column_name 2019-12-01 23:57:25 +01:00
Marco Slot b1b13e394e Fix segfault when executing DDL via UDF 2019-12-01 22:54:41 +01:00
Marco Slot 4c8d43c5d0 Bump repo version to 9.2devel 2019-11-29 07:33:39 +01:00
Nils Dijk 1ef1667ddb
add gitref to the output of citus_version (#3246)
DESCRIPTION: add gitref to the output of citus_version

During debugging of custom builds it is hard to know the exact version of the citus build you are using. This patch will add a human readable/understandable git reference to the build of citus which can be retrieved by calling `citus_version();`.
2019-11-29 15:54:09 +01:00
Marco Slot 16d1ad3666 Remove distinction between SQL_TASK and ROUTER_TASK 2019-11-29 05:58:29 +01:00
SaitTalhaNisanci aeec3d1544
fix typo in dependent jobs and dependent task (#3244) 2019-11-28 23:47:28 +03:00
Philip Dubé 0d04ff1692 RECORD: Add support for more expression types
- OpExpr
- NullIfExpr
- MinMaxExpr
- CoalesceExpr
- CaseExpr

Also fix case where ARRAY[(1,2), NULL] was rejected
2019-11-27 17:07:22 +00:00
Philip Dubé 168e11cc9b Implement support for RECORD[] where we support RECORD
Support for ARRAY[] expressions is limited to having a consistent shape,
eg ARRAY[(int,text),(int,text)] as opposed to ARRAY[(int,text),(float,text)] or ARRAY[(int,text),(int,text,float)]
2019-11-27 15:02:43 +00:00
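A minimal sketch of the shape restriction described above; whether such expressions appear in a distributed query's target list or parameters, the element shapes must be consistent (the literal values here are only illustrative):

```sql
-- Accepted: every element has the same shape (int, text)
SELECT ARRAY[(1, 'a'), (2, 'b')];

-- Not supported: elements with differing shapes
-- SELECT ARRAY[(1, 'a'), (2.5, 'b')];      -- (float, text) mixed with (int, text)
-- SELECT ARRAY[(1, 'a'), (2, 'b', 3.0)];   -- extra field
```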
Hadi Moshayedi 2268a9cae6 Error for metadata commands if any metadata node is out-of-sync (#3226)
* Error for metadata commands if any metadata node is out-of-sync

* Make the functions have separate APIs for all workers/metadata workers
2019-11-27 09:52:57 +01:00
Önder Kalacı 1cfbeb89ec
Make NodeCanHaveDistTablePlacements() public (#3229)
Since it is required in rebalancer.
2019-11-26 12:15:38 +01:00
Marco Slot 60b741927f Add missing include to deparse_function_stmts.c 2019-11-24 06:04:22 +01:00
Philip Dubé 261a9de42d Fix typos:
VAR_SET_VALUE_KIND -> VAR_SET_VALUE kind
beginnig -> beginning
plannig -> planning
the the -> the
er then -> er than
2019-11-25 23:24:13 +00:00
Marco Slot 4b0ac4b0dd Properly escape ALTER FUNCTION .. SET deparsing. Also test 2019-11-25 23:01:30 +00:00
Philip Dubé 3c10c27b13 GetFunctionAlterOwnerCommand: use format_procedure_qualified
distributed_functions: test a function with a quote in name
AppendDefElemSet: quote variable names
2019-11-25 23:01:30 +00:00
Philip Dubé a81e6a81ab Fix distributed aggregation for non superuser roles
Moves support functions to pg_catalog for now. We'd prefer a different solution
for when we're creating these support functions dynamically
2019-11-25 20:46:25 +00:00
Khashayar Fereidani f81785ad14 Fix underflow initialization of default values
The memset initialization of queryWindowClause and queryOrderByLimit underflowed these variables.
Due to the invalid sizeof usage, it's possible that this part of the program could cause a buffer overflow and corrupt function return data in future changes.
2019-11-25 19:25:51 +00:00
Onur TIRTIR bef32624c3
Escape extension name in extension command propagation (#3218) 2019-11-24 12:16:10 +03:00
Philip Dubé 99164398bf Fix potential segfault from standard_planner inlining functions 2019-11-21 18:47:36 +00:00
Philip Dubé c563e0825c Strip trailing whitespace and add final newline (#3186)
This brings files in line with our editorconfig file
2019-11-21 14:25:37 +01:00
Jelte Fennema 1d8dde232f
Automatically convert useless declarations using regex replace (#3181)
* Add declaration removal to CI

* Convert declarations
2019-11-21 13:47:29 +01:00
Onur TIRTIR 9961297d7b Improve extension command propagation logic and tests
* Improve extension command propagation tests

* patch for hardcoded citus extension name

(cherry picked from commit 0bb3dbac0afabda10e8928f9c17eda048dc4361a)
2019-11-21 11:24:39 +03:00
Marco Slot e0cccf7f9a Move C files into the appropriate directory 2019-11-16 11:36:17 +01:00
Hanefi Onaldi d82f3e9406
Introduce intermediate result broadcasting
In plain words, each distributed plan pulls the necessary intermediate
results to the worker nodes that the plan hits. This is primarily useful
in three ways. 

(i) If the distributed plan that uses intermediate
result(s) is a router query, then the intermediate results are only
broadcasted to a single node.

(ii) If a distributed plan consists of only intermediate results, which
is not uncommon, the intermediate results are broadcasted to a single
node only.

(iii) If a distributed query hits a subset of the shards in multiple
workers, the intermediate results will be broadcasted to the relevant
node(s).

The final item (iii) becomes crucial for append/range distributed
tables where typically the distributed queries hit a small subset of
shards/workers.

To do this, for each query for which Citus creates a distributed plan, we keep
track of the subPlans used in the queryTree and save them in the distributed
plan. Just before Citus executes each subPlan, Citus first keeps track of
every worker node that the distributed plan hits, and marks every subPlan as
needing to be broadcasted to these nodes. Later, for each subPlan which is itself a
distributed plan, Citus does this operation recursively, since these
distributed plans may access different subPlans, and those have to be
recorded as well.
2019-11-20 15:26:36 +03:00
Philip Dubé b7fef5c31a Miscellaneous cleanup in prep for collation propagation 2019-11-19 17:28:59 +00:00
Onur TIRTIR 26c306d188
Add extensions to distributed object propagation infrastructure (#3185) 2019-11-19 17:56:28 +03:00
SaitTalhaNisanci 2cb82ae9bd
create a utility method to mark tasks as failed (#3150) 2019-11-19 16:35:56 +03:00
SaitTalhaNisanci 306d159072
refactor AfterXacthodtConnectionHandling (#3202) 2019-11-19 14:50:23 +03:00
Marco Slot 622462cad7 Return early in CitusHasBeenLoaded when creating a different extension 2019-11-15 03:00:20 +01:00
Önder Kalacı 40fa3862ce
Prevent Citus extension becoming distributed object (#3197)
Prevent Citus extension being distributed

Because that could prevent doing rolling upgrades, where users may
prefer to upgrade the version on the coordinator but not the workers.

There could be some other edge cases, so I'd prefer to keep Citus
extension outside the picture for now.
2019-11-18 16:57:10 +01:00
Halil Ozan Akgul 5ae7b219ff Create the ALTER ROLE propagation 2019-11-18 18:31:28 +03:00
Nils Dijk 217890af5f
Feature: Expression in reference join (#3180)
DESCRIPTION: Expression in reference join

Fixed: #2582

This patch allows arbitrary expressions in the join clause when joining to a reference table. An example of such joins could be found in CHbenCHmark queries 7, 8, 9 and 11; `mod((s_w_id * s_i_id),10000) = su_suppkey` and `ascii(substr(c_state,1,1)) = n2.n_nationkey`. Since the join is on a reference table these queries are able to be pushed down to the workers.

To implement these queries we widen the `IsJoinClause` predicate to not check whether the expressions are of type `Var` after stripping the implicit coercions. Instead, we define a join clause as one where the `Var`s in the clause come from more than one table.

This allows more clauses to pass into the logical planner's `MultiNodeTree(...)` planning function. To compensate for this we tighten down the `LocalJoin`, `SinglePartitionJoin` and `DualPartitionJoin` to check for direct column references when planning. This allows the planner to work with arbitrary join expressions on reference tables.
2019-11-18 16:25:46 +01:00
Önder Kalacı a4c90b6ee1
Make distributed object dependency logic follow upto extensions (#3195)
With this commit, we're slightly changing the dependency traversal
logic to enable extension propagation.

The main idea is to "follow" the extension dependencies, but do not
"apply" them.

Since some extension dependencies are base types, and base types
could have circular dependencies, we implement a logic to prevent
revisiting an already visited object.
2019-11-17 17:21:21 +01:00
Hadi Moshayedi d9dcba25e3 Plan reference/local table joins locally 2019-11-15 07:36:50 -08:00
Onder Kalaci 90943a6ce6 Do not include coordinator shards when round-robin is selected
When the user picks "round-robin" policy, the aim is that the load
is distributed across nodes. However, for reference tables on the
coordinator, since local execution kicks in immediately, round-robin
is ignored.

With this change, we're excluding the placement on the coordinator.
Although the approach seems a little bit invasive because of
modifications in the placement list, that sounds acceptable.

We could have done this in some other ways such as:

1) Add a field to "Task->roundRobinPlacement" (or such), which is
updated as the first element after RoundRobinPolicy is applied.
During the execution, if that placement is local to the coordinator,
skip it and try the other remote placements.

2) On TaskAccessesLocalNode()@local_execution.c, check
task_assignment_policy, if round-robin selected and there is local
placement on the coordinator, skip it. However, task assignment is done
at planning time, while this decision happens at execution time, which
could create weird edge cases.
2019-11-15 06:03:32 -08:00
Hadi Moshayedi 15af1637aa Replicate reference tables to coordinator. 2019-11-15 05:50:19 -08:00
Hadi Moshayedi cb011bb30f Propagate isactive to metadata nodes. 2019-11-15 05:48:42 -08:00
SaitTalhaNisanci b9b7fd7660
add IsLoggableLevel utility function (#3149)
* add IsLoggableLevel utility function

* add function comment for IsLoggableLevel

* put ApplyLogRedaction to logutils
2019-11-15 14:59:13 +03:00
Jelte Fennema 1b2c438e69
Rename variables to not shadow globals in RHEL6 (#3194)
Fixes #2839
2019-11-15 12:12:24 +01:00
Jelte Fennema a8bd2d58f5
Update SQL definitions to prepare for drain node functionality (#3179) 2019-11-15 10:11:56 +01:00
Jelte Fennema 4b9b4b0995
Don't warn for declaration-after-statement since we only support GNU99 (#3132)
This change was actually already intended in #3124. However, the
postgres Makefile manually enables this warning too. This way we undo
that.

To confirm that it works two functions were changed to make use of not
having the warning anymore.
2019-11-15 09:46:06 +01:00
Philip Dubé 495c0f5117 Phase 1 implementation of custom aggregates
Phase 1 seeks to implement minimal infrastructure, so does not include:
	- dynamic generation of support aggregates to handle multiple arguments
	- configuration methods to direct aggregation strategy,
		or mark an aggregate's serialize/deserialize as safe to operate across nodes

Aggregates can be distributed when:
	- they have a single argument
	- they have a combinefunc
	- their transition type is not a pseudotype
2019-11-14 19:01:24 +00:00
Philip Dubé edc7a2ee38 Improve RECORD support 2019-11-14 18:32:22 +00:00
Philip Dubé eb35743c3f Remove citus.worker_list_file & master_initialize_node_metadata 2019-11-13 00:49:58 +00:00
Philip Dubé 48552bfffe Call DestReceiver rDestroy before it goes out of scope
CitusCopyDestReceiverDestroy: call hash_destroy on shardStateHash & connectionStateHash
2019-11-12 15:03:07 +00:00
Jelte Fennema adc6ca6100
Make simple IN queries on unique columns work with repartition join (#3171)
This is necessary to support Q20 of the CHbenCHmark: #2582.

To summarize the fix: the subquery is converted into an INNER JOIN on a
table. This fixes the issue, since an INNER JOIN on a table is already
supported by the repartition planner.

The way this replacement happens:
1. Postgres replaces `col in (subquery)` with a SEMI JOIN (subquery) on col = subquery_result
2. If this subquery is simple enough Postgres will replace it with a
   regular read from a table
3. If the subquery returns unique results (e.g. a primary key) Postgres
   will convert the SEMI JOIN into an INNER JOIN during the planning. It
   will not change this in the rewritten query though.
4. We check if Postgres sends us any SEMI JOINs during its join order
   planning, if it doesn't we replace all SEMI JOINs in the rewritten
   query with INNER JOIN (which we already support).
2019-11-11 13:44:28 +01:00
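A sketch of the query shape this enables, using hypothetical hash-distributed tables `stock` and `item` whose distribution columns differ, so the join needs the repartition planner; `i_id` is assumed to be a primary key so Postgres turns the SEMI JOIN into an INNER JOIN:

```sql
SET citus.enable_repartition_joins TO on;

-- Postgres first plans "s_i_id IN (SELECT ...)" as a SEMI JOIN; because i_id is
-- unique, its join-order planning treats it as an INNER JOIN, and per step 4 above
-- the SEMI JOIN in the rewritten query is replaced with an INNER JOIN, which the
-- repartition planner already supports.
SELECT count(*)
FROM stock
WHERE s_i_id IN (SELECT i_id FROM item WHERE i_price > 100);
```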
SaitTalhaNisanci 57380fd668
remove duplicated method in multi_logical_optimizer (#3166) 2019-11-11 13:51:21 +03:00
Philip Dubé ad86c1b866 AcquireDistributedLockOnRelations: escape relation names 2019-11-08 21:23:01 +00:00
Philip Dubé e8ecbbfcb3 Escape transaction names 2019-11-08 21:23:01 +00:00
Jelte Fennema 9fb897a074
Fix queries with repartition joins and group by unique column (#3157)
Postgres doesn't require you to add all columns that are in the target list to
the GROUP BY when you group by a unique column (or columns). It even actively
removes these group by clauses when you do.

This is normally fine, but for repartition joins it is not. The reason for this
is that the temporary tables don't have these primary key columns. So when the
worker executes the query it will complain that it is missing columns in the
group by.

This PR fixes that by adding an ANY_VALUE aggregate around each variable in
the target list that is not contained in the GROUP BY or in an aggregate.
This is done only for repartition joins.

The ANY_VALUE aggregate chooses the value from an undefined row in the
group.
2019-11-08 15:36:18 +01:00
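A conceptual sketch of the rewrite described above, with hypothetical table and column names: grouping by the unique column `t1.id` lets Postgres drop `t1.name` from the GROUP BY, and for repartition joins the target-list column is wrapped in an ANY_VALUE aggregate so the worker-side query stays valid against the temporary tables:

```sql
-- What the user writes:
SELECT t1.id, t1.name, count(t2.value)
FROM t1 JOIN t2 ON t1.id = t2.t1_id
GROUP BY t1.id;

-- Roughly the form used for the repartition join on the workers:
SELECT t1.id, any_value(t1.name), count(t2.value)
FROM t1 JOIN t2 ON t1.id = t2.t1_id
GROUP BY t1.id;
```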
SaitTalhaNisanci 02b359623f
remove duplicate code in citus_dist_stat_activity (#3165) 2019-11-08 15:41:32 +03:00
Önder Kalacı 0b3d4e55d9
Local execution should not change hasReturning for distributed tables (#3160)
It looks like the logic that prevents RETURNING for reference tables from
having duplicate entries (coming from local and remote executions)
leads to missing tuples for distributed tables.

With this PR, we're ensuring to kick in the logic for reference tables
only.
2019-11-08 12:49:56 +01:00
Philip Dubé 72c3d64ead Rename OpenConnectionsToAllNodes to OpenConnectionsToAllWorkerNodes 2019-11-07 17:50:22 +00:00
Philip Dubé 2fc45e5897 create_distributed_function: accept aggregates
Adds support for OCLASS_PROC to worker_create_or_replace_object
2019-11-06 18:23:37 +00:00
Hadi Moshayedi e00d1546f3 Don't maintain replicationfactor of reference tables 2019-11-05 07:23:14 -08:00
Onder Kalaci 471703bfaf DEBUG only when the function is distributed
Otherwise, we're seeing this message way too often.
2019-11-05 15:08:35 +00:00
Önder Kalacı 960cd02c67
Remove real time router executors (#3142)
* Remove unused executor codes

All of the code of the real-time executor. Some functions
of the router executor still remain there because they
are common functions. We'll move them to the appropriate places
in follow-up commits.

* Move GUCs to transaction mngnt and remove unused struct

* Update test output

* Get rid of references of real-time executor from code

* Warn if real-time executor is picked

* Remove lots of unused connection codes

* Removed unused code for connection restrictions

Real-time and router executors cannot handle re-using of the existing
connections within a transaction block.

Adaptive executor and COPY can re-use the connections. So, there is no
reason to keep the code around for applying the restrictions in the
placement connection logic.
2019-11-05 12:48:10 +01:00
Jelte Fennema f0c35ad134 Include fmgr.h, don't duplicate FunctionCallInfo typedef 2019-11-04 17:10:33 +00:00
SaitTalhaNisanci 7c410e3cd7
pass CitusCustomState directly to adaptive executor (#3151) 2019-11-01 19:57:32 +03:00
Önder Kalacı ffd89e4e01
Include all relevant relations in the ExtractRangeTableRelationWalker (#3135)
We changed the logic for pulling RTE_RELATIONs in #3109, which caused issues
with non-colocated subquery joins and partitioned tables.
@onurctirtir found the failing steps, which I traced back to find the issues.

While looking into it in more detail, we decided to expand the list in a
way that the callers get all the relevant RTE_RELATIONs: RELKIND_RELATION,
RELKIND_PARTITIONED_TABLE, RELKIND_FOREIGN_TABLE and RELKIND_MATVIEW.
These are all relation kinds that Citus planner is aware of.
2019-11-01 16:06:58 +01:00
Onur TIRTIR d3f68bf44f
Fix view is not distributed error when view is used in modify statements (#3104) 2019-11-01 16:34:01 +03:00
SaitTalhaNisanci c7ceca3216
update outdated comment in JobExecutorType (#3148) 2019-11-01 11:36:56 +03:00
SaitTalhaNisanci 70e46703aa
Fix debug1 message in JobExecutorType (#3147)
When the citus.enable_repartition_joins GUC is set to on and we have the
adaptive executor, there was a typo in the debug message, which
said real-time executor instead of adaptive executor.
2019-11-01 11:14:19 +03:00
Marco Slot 51c64c70c9 Do not try to sync metadata on standby coordinator 2019-10-30 05:15:45 +01:00
SaitTalhaNisanci dadbe86af1
refactor some of hard coded values in citus gucs (#3137)
* refactor some of hard coded values in citus gucs

* rename GUC_ALLOW_ALL to GUC_STANDARD
2019-10-30 10:35:39 +03:00
Marco Slot 067657af26 Disallow distributed functions with distribution arguments unless replication_model is streaming 2019-10-26 23:57:59 +02:00
SaitTalhaNisanci 29d45bd1b9
Do not assign InvalidOid for local execution while extracting parameters (#3131)
* do not assign InvalidOid for local execution while extracting parameters

* rename functions

* rename parameter and replace function
2019-10-28 14:28:22 +03:00
Önder Kalacı dceaddbe4d
Remove real-time/router executors (step 1) (#3125)
See #3125 for details on each item.

* Remove real-time/router executor tests-1

These are the ones which don't have '_%d' in the test
output files.

* Remove real-time/router executor tests-2

These are the ones which do have '_%d' in the test
output files.

* Move the test outputs to the correct place

* Make sure that single shard commits use 2PC on adaptive executor

It looks like we messed up the tests in #2891. Fixing that back.

* Use adaptive executor for all router queries

This becomes important because when task-tracker is picked, we
used to pick router executor, which doesn't make sense.

* Remove explicit references to real-time/router executors in the tests

* JobExecutorType never picks real-time/router executors

* Make sure to go incremental in test output numbers

* Even users cannot pick real-time anymore

* Do not use real-time/router custom scans

* Get rid of unnecessary normalizations

* Reflect unneeded normalizations

* Get rid of unnecessary test output file
2019-10-25 10:54:54 +02:00
Marco Slot a1162b2023 Rename 9.1 upgrade script to upgrade from 9.0-2 2019-10-23 00:08:17 +02:00
Marco Slot 04040e0a37 Revoke usage from the citus schema 2019-10-23 00:08:17 +02:00
Jelte Fennema a5010e5b17
Add extra foreach convenience macros (#3117)
This completely hides `ListCell` to the user of the loop

Example usage:
```c
WorkerNode *workerNode = NULL;

foreach_ptr(workerNode, workerNodeList) {
	// Do stuff with workerNode
}
```

Instead of:
```c
ListCell *workerNodeCell = NULL;

foreach(workerNodeCell, workerNodeList) {
    WorkerNode *workerNode = lfirst(workerNodeCell);
	// Do stuff with workerNode
}
```
2019-10-23 16:49:12 +02:00
Philip Dubé b2f084d7f5 UnsetMetadataSyncedForAll: use CatalogTupleUpdateWithInfo 2019-10-23 00:45:11 +00:00
Onder Kalaci a208f8b151 Fix memory leak on ReceiveResults
It turns out that TupleDescGetAttInMetadata() allocates quite a lot
of memory. And, if the target list is long and too many rows are
returned, the leak becomes apparent.

You can reproduce the issue without the fix with the following commands:

```SQL

CREATE TABLE users_table (user_id int, time timestamp, value_1 int, value_2 int, value_3 float, value_4 bigint);
SELECT create_distributed_table('users_table', 'user_id');

insert into users_table SELECT i, now(), i, i, i, i FROM generate_series(0,99999)i;

-- load faster

-- 200,000
INSERT INTO users_table SELECT * FROM users_table;

-- 400,000
INSERT INTO users_table SELECT * FROM users_table;

-- 800,000
INSERT INTO users_table SELECT * FROM users_table;

-- 1,600,000
INSERT INTO users_table SELECT * FROM users_table;

-- 3,200,000
INSERT INTO users_table SELECT * FROM users_table;

-- 6,400,000
INSERT INTO users_table SELECT * FROM users_table;

-- 12,800,000
INSERT INTO users_table SELECT * FROM users_table;

-- making the target list wider makes the leak show up faster
 select *,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,* FROM users_table ;

 ```
2019-10-22 17:22:26 +02:00
Jelte Fennema 78e495e030
Add shouldhaveshards to pg_dist_node (#2960)
This is an improvement over #2512.

This adds the boolean shouldhaveshards column to pg_dist_node. When it's false, create_distributed_table for new colocation groups will not create shards on that node. Reference tables will still be created on nodes where it is false.
2019-10-22 16:47:16 +02:00
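A quick way to inspect the new column described above (a minimal sketch; the table and column names come from the commit message):

```sql
-- Nodes with shouldhaveshards = false will not receive shards for new
-- colocation groups, but still get reference table placements.
SELECT nodename, nodeport, shouldhaveshards FROM pg_dist_node;
```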
Hanefi Onaldi 7ebda04494
Update all c-style comments in migration files 2019-10-21 16:05:53 +03:00
Jelte Fennema 7abedc38b0
Support subqueries in HAVING (#3098)
Areas for further optimization:
- Don't save subquery results to a local file on the coordinator when the subquery is not in the having clause
- Push the HAVING with subquery to the workers if there's a group by on the distribution column
- Don't push down the results to the workers when we don't push down the HAVING clause, only the coordinator needs it

Fixes #520
Fixes #756
Closes #2047
2019-10-16 16:40:14 +02:00
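A sketch of the now-supported shape, using a hypothetical distributed `events` table; the subquery result becomes an intermediate result that is used to evaluate the HAVING clause:

```sql
SELECT user_id, count(*) AS event_count
FROM events
GROUP BY user_id
HAVING count(*) > (SELECT avg(per_user)
                   FROM (SELECT count(*) AS per_user FROM events GROUP BY user_id) s);
```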
Onur TIRTIR 3bfb2a078b
Make changes to the if-statement in ExtractRangeTableList for further walker types (#3110) 2019-10-16 15:50:09 +03:00
Onur TIRTIR d5f83dc110
Refactor range table walkers (#3109) 2019-10-16 01:20:49 +03:00
SaitTalhaNisanci 94a7e6475c
Remove copyright years (#2918)
* Update year as 2012-2019

* Remove copyright years
2019-10-15 17:44:30 +03:00
Philip Dubé 74cb168205 Remove Postgres 10 support 2019-10-11 21:56:56 +00:00
Hadi Moshayedi b50d216536 Fix a typo 2019-10-10 10:44:41 -07:00
Philip Dubé 4063e7ca67 CALL delegation: apply strip_implicit_coercions to distribution argument 2019-10-10 17:42:43 +00:00
Philip Dubé dd490b6376 Cache whether an object is in pg_dist_object. Avoids redundant lookups for non-distributed objects 2019-10-10 14:50:38 +00:00
Nils Dijk 4a4a220945
Fix enum add value order and pg12 (#3082)
DESCRIPTION: Fix order for enum values and correctly support pg12

PG 12 introduces `ALTER TYPE ... ADD VALUE ...` during transactions. Earlier versions would error out when called in a transaction, hence we connect to workers outside of the transaction which could cause inconsistencies on pg12 now that postgres doesn't error with this syntax anymore.

During the implementation of this fix it became apparent there was an error with the ordering of enum labels when the type was recreated. A patch and test have been included.
2019-10-07 17:16:19 +02:00
Jelte Fennema 01da11f264
Change citus truncate trigger to AFTER and add more upgrade tests (#3070)
* Add more upgrade tests

* Fix citus trigger generation after upgrade

citus_truncate_trigger runs before truncate when created by create_distributed_table:
492d1b2cba/src/backend/distributed/commands/create_distributed_table.c (L1163)

* Remove pg_dist_jobid_seq
2019-10-07 16:43:04 +02:00
Onder Kalaci 3be72ce42f Make sure that distributed functions always have the correct user
Objectives:

(a) both superusers and regular users should have the correct owner for the function on the worker
(b) the transactional semantics should work fine for both superusers and regular users
(c) a non-superuser who does not own the function should get a reasonable error message when trying to distribute it

Co-authored-by: @serprex
2019-10-04 21:38:49 +00:00
Marco Slot 1a3a174f67 Grant usage on schema citus to public 2019-10-04 12:26:08 +02:00
Marco Slot 89377ee578 Move RowExclusiveLock to start in SyncMetadataToNodes 2019-10-04 12:07:41 +02:00
Hadi Moshayedi 217db2a03e Don't block for locks in SyncMetadataToNodes() 2019-10-03 16:53:36 -07:00
Hadi Moshayedi ae915493e6 Don't send metadata commands to not-synced workers.
Otherwise some of the dependencies might not exist yet and
commands will error out.
2019-10-03 16:52:25 -07:00
Marco Slot 0b4b63e647 Drop the rebalancer before creating new UDFs 2019-10-03 16:08:58 +02:00
Marco Slot 2e50306cf8 Check command type in TryToDelegateFunctionCall 2019-10-03 15:37:15 +02:00
Hanefi Onaldi bd416ef68f Fix empty FROM clauses in PG12 2019-10-01 19:54:11 +00:00
Jelte Fennema ec4a165eec Improve isolation test block detection (#3055) 2019-10-01 14:10:15 +02:00
Jelte Fennema 40f785e6d8 Move citus_isolation_test_session_is_blocked to separate udf sql file 2019-10-01 14:10:15 +02:00
Philip Dubé 89d35e9692 Attempt to force custom plans for prepared statements when trying to delegate function calls
We discern between PARAM_EXEC & PARAM_EXTERN:
d52eaa0948/src/include/nodes/primnodes.h (L211)
According to primnodes.h we should only run into PARAM_EXEC or PARAM_EXTERN
2019-09-30 23:49:14 +00:00
Philip Dubé 29f1ea079b PG_VERSION_NUM > 110000 should be PG_VERSION_NUM >= 110000
Also fix a > 12000 typo
2019-09-30 23:37:43 +00:00
Hadi Moshayedi 5e97e5c98e Don't push down queries when in subqueries/ctes 2019-09-30 14:22:05 -07:00
Marco Slot 35bef0f3db Avoid caching connections from backends that service internal connections 2019-09-28 08:32:10 +02:00
Nils Dijk 01b26cf91a
Disallow distributed functions for functions depending on an extension (#3049)
DESCRIPTION: Disallow distributed functions for functions depending on an extension

Functions depending on an extension cannot (yet) be distributed by citus. If we allowed this it would cause issues with our dependency-following mechanism, as we stop following objects that depend on an extension.

By not allowing functions that depend on an extension to be distributed, and by not allowing distributed functions to be made dependent on an extension, we won't break the ability to add new nodes. Allowing functions depending on extensions to be distributed at the moment could cause problems in that area.
2019-09-30 15:19:47 +02:00
Nils Dijk 473cbc0115
Propagate CREATE OR REPLACE FUNCTION to workers for distributed functions (#3043)
DESCRIPTION: Propagate CREATE OR REPLACE FUNCTION

Distributed functions could be replaced, which should be propagated to the workers to keep the function in sync between all nodes.

Due to the complexity of deparsing the `CreateFunctionStmt` we actually produce the plan during the processing phase of our utility hook. Since the changes have already been made in the catalog tables, we can reuse `pg_get_functiondef` to get the generated `CREATE OR REPLACE` sql.
2019-09-30 12:41:17 +02:00
Jelte Fennema 82ec918b29
Add explain summary support (#3046)
Fixes #2922 and also adds explain analyze regression tests
2019-09-30 10:58:49 +02:00
Nils Dijk 9c2c50d875
Hookup function/procedure deparsing to our utility hook (#3041)
DESCRIPTION: Propagate ALTER FUNCTION statements for distributed functions

Using the implemented deparser for function statements to propagate changes to both functions and procedures that were previously distributed.
2019-09-27 22:06:49 +02:00
Philip Dubé 363409a0c2 Propagate REINDEX TABLE & REINDEX INDEX 2019-09-27 18:14:53 +00:00
Hanefi Onaldi 66b9f2e887 Deparsing and qualifiying for FUNCTION/PROCEDURE statements (#3014)
This PR aims to add all the necessary logic to qualify and deparse all possible `{ALTER|DROP} .. {FUNCTION|PROCEDURE}` queries.

Since procedures were introduced in PG11, the code contains many PG version checks. I tried my best to make it easy to clean up once we drop PG10 support.


Here are some caveats:
- I assumed that the parse tree is a valid one. There are some queries that are not allowed, but are still parsed successfully by the postgres planner. Such queries will result in errors at execution time. (e.g. `ALTER PROCEDURE p STRICT` -> the `STRICT` action is valid for functions but not procedures. Postgres decides to parse them nevertheless.)
2019-09-27 19:02:52 +02:00
Marco Slot 2868e02a3d Implement SELECT function call delegation.
When a function is marked as colocated with a distributed table,
we try delegating queries of kind "SELECT func(...)" to workers.

We currently only support this simple form, and don't delegate
forms like "SELECT f1(...), f2(...)", "SELECT f1(...) FROM ...",
or function calls inside transactions.

As a side effect, we also fix the transactional semantics of DO blocks.
Previously we didn't consider a DO block a multi-statement transaction.
Now we do.

Co-authored-by: Marco Slot <marco@citusdata.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
2019-09-27 09:13:25 -07:00
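A sketch of which call shapes are delegated, assuming a hypothetical `record_event(int, text)` function that is already distributed and colocated with a distributed table on its first argument:

```sql
-- Delegated: the bare "SELECT func(...)" form
SELECT record_event(42, 'login');

-- Not delegated, per the commit message above:
--   SELECT record_event(42, 'login'), record_event(43, 'logout');
--   SELECT record_event(user_id, 'login') FROM events;
--   BEGIN; SELECT record_event(42, 'login'); COMMIT;  -- inside a transaction block
```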
Jelte Fennema dab16be283
Set default threshold on get_rebalance_table_shards_plan to 0, like rebalance_table_shards (#3039)
In this PR the default `threshold` of `rebalance_table_shards` was set to 0: https://github.com/citusdata/shard_rebalancer/pull/73
However, the default for get_rebalance_table_shards_plan was not updated. This
can cause the confusing situation where the actual steps run by
`rebalance_table_shards` are not the same as the ones returned by
`get_rebalance_table_shards_plan`.
2019-09-27 17:21:36 +02:00
Marco Slot 32a11bdf6c Return early for common commands in the utility hook (#3031)
We started copying parse trees by default further on in `multi_ProcessUtility`. That's not a problem for maintenance commands, but it might matter for things like `PREPARE` and `EXECUTE`, which might happen thousands of times per second. Add a few common commands to the check at the start.
2019-09-26 11:43:35 +02:00
Philip Dubé 4f60e3a149 Feedback 2019-09-24 17:31:09 +00:00
Marco Slot ca478defeb Deparse CALL statement instead of using original query string 2019-09-24 17:31:09 +00:00
Philip Dubé 90e1f1442a Annotated tests for multi_mx_call.
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
2019-09-24 17:31:09 +00:00
Marco Slot e269d990c9 Cast the distribution argument value when possible 2019-09-24 17:31:09 +00:00
Philip Dubé 432a8ef85b Hadi's feedback
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
2019-09-24 17:31:09 +00:00
Philip Dubé bc1ad67eb5 Distribute CALL on distributed procedures to metadata workers
Lots taken from https://github.com/citusdata/citus/pull/2829
2019-09-24 17:31:09 +00:00
Onder Kalaci 18de78f386 Relax the colocation checks for distributed functions
As long as the types can be coerced, it is safe to push down
functions.
2019-09-24 16:31:08 +02:00
Marco Slot 42be8afd74 Swap pg_dist_node groupid and nodeid sequences 2019-09-24 12:03:44 +02:00
Hadi Moshayedi 48078a30e6 Fix wait_until_metadata_sync() for postgres 12.
Postgres 12 now has an assertion that the calls to WaitLatchOrSocket
handle postmaster death.
2019-09-23 14:15:35 -07:00
Philip Dubé 06faba91c0 Include ifdefs for pg12 API changes, update local_shard_execution test to avoid CTE inlining 2019-09-23 20:22:35 +00:00
Onder Kalaci d37745bfc7 Sync metadata to worker nodes after create_distributed_function
Since the distributed functions are useful when the workers have
metadata, we automatically sync it.

Also, after master_add_node(), we do it lazily and let the daemon
sync it. That's mainly because the metadata syncing cannot be done
in transaction blocks, and we don't want to add lots of transactional
limitations to master_add_node() and create_distributed_function().
2019-09-23 18:30:53 +02:00
Marco Slot 5f23b951c7 Support serial and smallserial when syncing metadata 2019-09-23 17:39:21 +02:00
Marco Slot e58d76c5f6 Fix assert failure in bare SELECT FROM reference table FOR UPDATE in MX 2019-09-23 17:00:09 +02:00
Marco Slot d85d77634d Handle anonymous composite types on the target list 2019-09-23 14:53:02 +02:00
Onder Kalaci d7e2968120 Add parameters to create_distributed_function()
With this commit, we're changing the API for create_distributed_function()
such that users can provide the distribution argument and the colocation
information.
2019-09-22 21:53:33 +02:00
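A sketch of the extended API described above; the exact parameter names (`distribution_arg_name`, `colocate_with`) are assumptions based on the commit message, and the function and table are hypothetical:

```sql
-- Distribute a function on its first argument and colocate it with the
-- hypothetical distributed table 'events'.
SELECT create_distributed_function(
    'record_event(int, text)',
    distribution_arg_name := '$1',
    colocate_with := 'events');
```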
Onder Kalaci e1fe8d60b4 Make sure that functions are also listed in SupportedDependencyByCitus
We've recently merged two commits, db5d03931d
and eccba1d4c3, which actually operate
on very similar places.

It turns out that we have an integration issue, where master_add_node()
fails to replicate the functions to a newly added node.
2019-09-20 11:02:50 +02:00
Hadi Moshayedi d24cefd055 Set active snapshot before SyncMetadataToNodes(). 2019-09-19 09:00:25 -07:00
Hanefi Onaldi ed11b9590c
Add distributed func creation queries in dependency replication logic 2019-09-18 20:07:45 +03:00
Hadi Moshayedi d2f2acc4b2 Make master_update_node citus-ha friendly. 2019-09-18 09:32:54 -07:00
Hadi Moshayedi 76f3933b05 Add metadatasynced, and sync on master_update_node()
Co-authored-by: pykello <hadi.moshayedi@microsoft.com>
Co-authored-by: serprex <serprex@users.noreply.github.com>
2019-09-18 09:32:54 -07:00
Nils Dijk db5d03931d
Feature disable object propagation (#2986)
DESCRIPTION: Provide a GUC to turn off the new dependency propagation functionality

In case the dependency propagation functionality introduced in 9.0 causes issues for a user's cluster, they can turn it off almost completely. The only dependency that will still be propagated and tracked is the schema, to emulate the old behaviour.

GUC to change is `citus.enable_object_propagation`. When set to `false` the functionality will be mostly turned off. Be aware that objects marked as distributed in `pg_dist_object` will still be kept in the catalog as a distributed object. Alter statements to these objects will not be propagated to workers and may cause desynchronisation.
2019-09-18 17:16:22 +02:00
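A minimal sketch of falling back to the old behaviour via the GUC named above:

```sql
-- Only schemas will still be propagated; objects already marked in
-- pg_dist_object stay recorded, but alters to them are no longer propagated.
SET citus.enable_object_propagation TO false;
```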
Nils Dijk 2b7f5552c8
Fix: rename remote type on conflict (#2983)
DESCRIPTION: Rename remote types during type propagation

To prevent data from being destroyed when a remote type differs from the type on the coordinator during type propagation, we wanted to rename the type instead of using `DROP CASCADE`.

This patch removes the `DROP` logic and adds the creation of a rename statement to a free name.
2019-09-17 18:54:10 +02:00
Nils Dijk 0a3152d09c
Add feature flag to turn off create type propagation (#2982)
DESCRIPTION: Add feature flag to turn off create type propagation

When `citus.enable_create_type_propagation` is set to `false` citus will not propagate `CREATE TYPE` statements to the workers. Types are still distributed when tables that depend on these types are distributed.
2019-09-17 15:50:06 +02:00
Onder Kalaci cde6b02858 Add columns to pg_dist_object for distributed functions
This PR simply adds the columns to pg_dist_object and
implements the necessary metadata changes to keep track of
distribution argument of the functions/procedures.
2019-09-16 17:28:04 +02:00
Jelte Fennema af9fb9f785
Fix depend arguments for OSX clang cpp (#2978)
A better fix for #2975. Apparently for OSX cpp -MF and -MT shouldn't have a
space in between the flag and their value. Without the space it still works for
gcc as well.
2019-09-16 15:22:07 +02:00
Jelte Fennema 31fac3b90e
Don't generate SQL files twice by not making directories a target (#2977) 2019-09-16 12:53:17 +02:00
Önder Kalacı 13947a63ce Don't use flags that mac clang doesn't support as it does on other platforms (#2975) 2019-09-16 11:44:06 +02:00
Hanefi Onaldi 8f2a3a0604
Introduce create_distributed_function(regproc) UDF (#2961)
This PR aims to add the minimal set of changes required to start
distributing functions. You can use create_distributed_function(regproc)
UDF to distribute a function.

    SELECT create_distributed_function('add(int,int)');

The function definition should include the param types to properly
identify the correct function that we wish to distribute
2019-09-13 23:27:46 +03:00
Philip Dubé 492d1b2cba ActivePrimaryNodeList: add lockMode parameter 2019-09-13 17:44:56 +00:00
Philip Dubé 5e5f4628a0 Fix pg12 compile 2019-09-13 17:25:30 +00:00
Jelte Fennema 4bbf65d913
Change SQL migration build process for easier reviews (#2951)
@thanodnl told me it was a bit of a problem that it's impossible to see
the history of a UDF in git. The only way to do so is by reading all the
sql migration files from new to old. Another problem is that it's also
hard to review the changed UDF during code review, because to find out
what changed you have to do the same. I thought of an IMHO better (but
not perfect) way to handle this.

We keep the definition of a UDF in sql/udfs/{name_of_udf}/latest.sql.
We change that file whenever we need to make a change to the UDF. On
top of that you also make a snapshot of the file in
sql/udfs/{name_of_udf}/{migration-version}.sql (e.g. 9.0-1.sql) by
copying the contents. This way you can easily view what the actual
changes were by looking at the latest.sql file.

There's still the question on how to use these files then. Sadly
postgres doesn't allow inclusion of other sql files in the migration sql
file (it does in psql using \i). So instead I used the C preprocessor+
make to compile a sql/xxx.sql to a build/sql/xxx.sql file. This final
build/sql/xxx.sql file has every occurrence of #include "somefile.sql" in
sql/xxx.sql replaced by the contents of somefile.sql.
2019-09-13 18:44:27 +02:00
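A sketch of the resulting layout, with hypothetical file and UDF names: the migration source references the per-version snapshot via a C-preprocessor include, and the cpp + make step expands it into the shipped build/sql file:

```sql
-- sql/citus--x.y-1--x.y-2.sql (source file, hypothetical names), checked into git:
#include "udfs/my_udf/x.y-2.sql"

-- build/sql/citus--x.y-1--x.y-2.sql (generated): the #include line above is
-- replaced by the full contents of sql/udfs/my_udf/x.y-2.sql, e.g. its
-- CREATE OR REPLACE FUNCTION definition.
```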
Nils Dijk 2879689441
Distribute Types to worker nodes (#2893)
DESCRIPTION: Distribute Types to worker nodes

When to propagate
==============

There are two logical moments that types could be distributed to the worker nodes
 - When they get used ( just in time distribution )
 - When they get created ( proactive distribution )

The just in time distribution follows the model of how schemas get created right before we are going to create a table in that schema; for types this would be when the table uses a type as one of its columns.

The proactive distribution is suitable for situations where it is beneficial to have the type on the worker nodes directly. They can later be used in queries where an intermediate result gets created with a cast to this type.

Just in time creation is always the last resort; you cannot create a distributed table before the type gets created. A good example use case: you have an existing postgres server that needs to scale out. You add the citus extension, add some nodes to the cluster, and distribute the table. The type got created before citus existed, so there was no moment where citus could have propagated the creation of the type.

Proactive is almost always a good option. Types are not resource intensive objects, there is no performance overhead of having 100's of types. If you want to use them in a query to represent an intermediate result (which happens in our test suite) they just work.

There is however a moment when proactive type distribution is not beneficial; in transactions where the type is used in a distributed table.

Lets assume the following transaction:

```sql
BEGIN;
CREATE TYPE tt1 AS (a int, b int);
CREATE TABLE t1 (a int PRIMARY KEY, b tt1);
SELECT create_distributed_table('t1', 'a');
\copy t1 FROM bigdata.csv
```

Types are node-scoped objects, meaning the type exists once per worker. Shards, however, have the best performance when they are created over their own connection. For the type to be visible on all connections it needs to be created and committed before we try to create the shards. Here the just in time situation is most beneficial and follows how we create schemas on the workers. Outside of a transaction block we will just use one connection to propagate the creation.

How propagation works
=================

Just in time
-----------

Just in time propagation hooks into the infrastructure introduced in #2882. It adds types as a supported object in `SupportedDependencyByCitus`. This will make sure that any object being distributed by citus that depends on types will now cascade into types. When types themselves depend on other objects, those will get created first.

Creation later works by getting the ddl commands to create the object by its `ObjectAddress` in `GetDependencyCreateDDLCommands` which will dispatch types to `CreateTypeDDLCommandsIdempotent`.

For the correct walking of the graph we follow array types; when later asked for the ddl commands for array types we return `NIL` (empty list), which means the object will not be recorded as distributed (it's an internal type, dependent on the user type).

Proactive distribution
---------------------

When the user creates a type (composite or enum) we will have a hook running in `multi_ProcessUtility` after the command has been applied locally. Running after the local application means we already have an `ObjectAddress` for the type, which is required to mark the type as being distributed.

Keeping the type up to date
====================

For types that are recorded in `pg_dist_object` (eg. `IsObjectDistributed` returns true for the `ObjectAddress`) we will intercept the utility commands that alter the type.
 - `AlterTableStmt` with `relkind` set to `OBJECT_TYPE` encapsulates changes to the fields of a composite type.
 - `DropStmt` with removeType set to `OBJECT_TYPE` encapsulates `DROP TYPE`.
 - `AlterEnumStmt` encapsulates changes to enum values.
    Enum types cannot be changed transactionally. When the execution on a worker fails, a warning will be shown to the user that the propagation was incomplete due to a worker communication failure. An idempotent command is shown for the user to re-execute when the worker communication is fixed.

Keeping types up to date is done via the executor. Before the statement is executed locally we create a plan on how to apply it on the workers. This plan is executed after we have applied the statement locally.

All changes to types need to be done in the same transaction for types that have already been distributed and will fail with an error if parallel queries have already been executed in the same transaction. Much like foreign keys to reference tables.
2019-09-13 17:46:07 +02:00
Jelte Fennema e4cfea3751 Correctly add schema when distributing sequence definitions
Fixes 2958
2019-09-13 17:19:35 +02:00
Jelte Fennema 389086102a
Refactor 9 argument function to use a struct (#2952)
For another PR I needed to add another column, which would require adding
another argument to an already 9-argument function signature. In this
case it would be a boolean flag and there were already two boolean flags
in there. In my experience it becomes really easy to mess up the order
of these flags at that point. Especially because the type system doesn't
distinguish between the 3 different booleans with completely different
meanings.

So I refactored these signatures to receive a struct containing most of
these arguments. That way you don't mess up the ordering, because the
meaning of the boolean is not order-dependent but field-name-dependent.
It also makes it possible to set good shared defaults for this struct.
2019-09-13 15:49:53 +02:00
Nils Dijk 05f0668cdc
Fix: schema leak onto create index statement cache (#2964)
DESCRIPTION: Fix schema leak on CREATE INDEX statement

When a CREATE INDEX is cached between executions, we might leak the schema name onto the cached statement of an earlier execution, preventing the right index from being created.

Even though the cache is cleared when the search_path changes we can trigger this behaviour by having the schema already on the search path before a colliding table is created in a schema earlier on the `search_path`. When calling an unqualified create index via a function (used to trigger the caching behaviour) we see that the index is created on the wrong table after the schema leaked onto the statement.

By copying the complete `PlannedStmt` and `utilityStmt` during our planning phase for distributed ddls we make sure we are not leaking the schema name onto a cached data structure.

Caveat: COPY statements already have a lot of parse tree copying going on without directly putting it back on the `pstmt`. We should verify whether copies modify the statement and potentially copy the complete `pstmt` there already.
2019-09-13 14:04:23 +02:00
Hadi Moshayedi 48ff4691a0 Return nodeid instead of record in some UDFs 2019-09-12 12:46:21 -07:00
Philip Dubé 2aa6852dea Begin searching AggregateNames from 1, not 0 2019-09-12 16:55:05 +00:00
Jelte Fennema d6deb062aa Add shard rebalancer stubs 2019-09-12 16:40:25 +02:00
Jelte Fennema eb7e45d556 Make LookupNodeForGroup extern 2019-09-12 16:40:25 +02:00
Jelte Fennema 257406fda7 Fix ArrayObjectCount for zero sized arrays 2019-09-12 16:40:25 +02:00
Onder Kalaci 0b0c779c77 Introduce the concept of Local Execution
/*
 * local_executor.c
 *
 * The scope of the local execution is locally executing the queries on the
 * shards. In other words, local execution does not deal with any local tables
 * that are not shards on the node that the query is being executed. In that sense,
 * the local executor is only triggered if the node has both the metadata and the
 * shards (e.g., only Citus MX worker nodes).
 *
 * The goal of the local execution is to skip the unnecessary network round-trip
 * happening on the node itself. Instead, identify the locally executable tasks and
 * simply call PostgreSQL's planner and executor.
 *
 * The local executor is an extension of the adaptive executor. So, the executor uses
 * adaptive executor's custom scan nodes.
 *
 * One thing to note that Citus MX is only supported with replication factor = 1, so
 * keep that in mind while continuing the comments below.
 *
 * On the high level, there are 3 slightly different ways of utilizing local execution:
 *
 * (1) Execution of local single shard queries of a distributed table
 *
 *      This is the simplest case. The executor kicks at the start of the adaptive
 *      executor, and since the query is only a single task the execution finishes
 *      without going to the network at all.
 *
 *      Even if there is a transaction block (or recursively planned CTEs), as long
 *      as the queries hit the shards on the same node, the local execution will kick in.
 *
 * (2) Execution of local single queries and remote multi-shard queries
 *
 *      The rule is simple. If a transaction block starts with a local query execution,
 *      all the other queries in the same transaction block that touch any local shard
 *      have to use the local execution. Although this sounds restrictive, we prefer to
 *      implement it this way; otherwise we'd end up with scenarios as complex as the ones
 *      we have in the connection management due to foreign keys.
 *
 *      See the following example:
 *      BEGIN;
 *          -- assume that the query is executed locally
 *          SELECT count(*) FROM test WHERE key = 1;
 *
 *          -- at this point, all the shards that reside on the
 *          -- node are executed locally one-by-one. After those finish,
 *          -- the remaining tasks are handled by the adaptive executor
 *          SELECT count(*) FROM test;
 *
 *
 * (3) Modifications of reference tables
 *
 *		Modifications to reference tables have to be executed on all nodes. So, after the
 *		local execution, the adaptive executor keeps continuing the execution on the other
 *		nodes.
 *
 *		Note that for read-only queries, after the local execution, there is no need to
 *		kick in adaptive executor.
 *
 *  There are also a few limitations/trade-offs that are worth mentioning. First, the
 *  local execution on multiple shards might be slow because the execution has to
 *  happen one task at a time (e.g., no parallelism). Second, if a transaction
 *  block/CTE starts with a multi-shard command, we do not use local query execution
 *  since local execution is sequential. Basically, we do not want to lose parallelism
 *  across local tasks by switching to local execution. Third, the local execution
 *  currently only supports queries. In other words, any utility command like TRUNCATE
 *  fails if it is executed after a local execution inside a transaction block.
 *  Fourth, the local execution cannot be mixed with executors other than the adaptive one,
 *  namely the task-tracker, real-time and router executors. Finally, related to the
 *  previous item, the COPY command cannot be mixed with local execution in a transaction.
 *  The implication of that is that any part of INSERT..SELECT via the coordinator cannot
 *  happen via local execution.
 */
2019-09-12 11:51:25 +02:00
Marco Slot 810aca8d41 Drop foreign key from pg_dist_poolinfo to pg_dist_node 2019-09-10 09:52:19 +02:00
Onder Kalaci 485189c0b6 Make sure that lost connections are handled properly
Before this patch, when a connection is lost, we'd have the following
situation:

    - Pop a task execution from readyQueue
    - Lost connection
    - Fail the session/pool. -> This step was not acting properly
      because we've popped the task, but not set it to session->currentTask
      yet

After the patch:

    - Pop a task execution from readyQueue
    - Immediately set it to session->currentTask
    - Lost connection
    - Fail the session/pool. -> At this step, failing the
      session would trigger query failures (or failovers)
      properly.
2019-09-10 17:54:27 +02:00
Philip Dubé a28b82d67d get_catalog_object_by_oid requires an extra parameter in pg12 2019-09-05 16:38:07 +00:00
Nils Dijk 511e715ee3
Remove early escape in walking pg_depend (#2930)
This is a bug that got in when we inlined the body of a function into this loop. Earlier revisions had two loops, hence a function that would be reused.

With a return instead of a continue the list of dependencies being walked is dependent on the order in which we find them in pg_depend. This became apparent during pg12 compatibility. The order of entries in pg12 was luckily different causing a random test to fail due to this return.

By changing it to a continue we only skip the entries that we don’t want to follow instead of skipping all entries that happen to be found later.

Side fix for more stable isolation tests around ensure dependency.
2019-09-05 18:03:34 +02:00
Philip Dubé bdd30bb181 Don't allow distributing by a generated column 2019-09-04 14:50:17 +00:00
Philip Dubé 41dca121e2 Support GENERATE ALWAYS AS STORED 2019-09-04 14:50:17 +00:00
Nils Dijk 936d546a3c
Refactor Ensure Schema Exists to Ensure Dependecies Exists (#2882)
DESCRIPTION: Refactor ensure schema exists to dependency exists

Historically we only supported schemas as table dependencies to be created on the workers before a table gets distributed. This PR puts infrastructure in place to walk pg_depend to figure out which dependencies to create on the workers. Currently only schemas are supported as objects to create before creating a table.

We also keep track of dependencies that have been created in the cluster. When we add a new node to the cluster we use this catalog to know which objects need to be created on the worker.

A side effect of knowing which objects are already distributed is that we no longer emit debug messages when creating schemas that already exist on the workers.
2019-09-04 14:10:20 +02:00
Philip Dubé 28d964240f Remove CheckForUpdates
https://reports.citusdata.com/v1/releases/latest
We haven't updated the version CheckForUpdates sees since 7.1.0
2019-09-03 21:11:25 +00:00
Philip Dubé da00c62eea create_distributed_table: include COLLATE on columns 2019-08-29 14:22:54 +00:00
Philip Dubé 32ef459025 backend_data.c: include max_wal_senders in calculating maxBackend, matches changes in pg12's InitializeMaxBackends 2019-08-28 21:24:33 +00:00
Jelte Fennema cbecf97c84
Move tuplestore setup to a helper function (#2898)
* Add tuplestore helpers

* More detailed error messages in tuplestore

* Add CreateTupleDescCopy to SetupTuplestore

* Use new SetupTuplestore helper function

* Remove unnecessary copy

* Remove comment about undefined behaviour
2019-08-27 09:11:08 +02:00
Philip Dubé eba3828ef7 ColocatedShardIntervalList: sort 2019-08-26 17:42:41 +00:00
Philip Dubé 6b0d8ed83d SortList in FinalizedShardPlacementList, makes 3 failure tests consistent between 11/12 2019-08-22 19:30:56 +00:00
Philip Dubé 693d4695d7 Create a test 'pg12' for pg12 features & error on unsupported new features
Unsupported new features: COPY FROM WHERE, GENERATED ALWAYS AS, non-heap table access methods
2019-08-22 19:30:56 +00:00
Philip Dubé e5cd298a98 pg12 revised layout of FunctionCallInfoData
See a9c35cf85c

clang raises a warning due to FunctionCall2InfoData technically being variable sized
This is fine, as the struct is the size we want it to be. So silence the warning
2019-08-22 19:02:35 +00:00
Philip Dubé bee779e7d4 planner/distributed_planner.c: get_func_cost replaced with add_function_cost in pg12 2019-08-22 19:02:10 +00:00
Philip Dubé be3285828f Collations matter for hashing strings in pg12
See https://www.postgresql.org/docs/12/collation.html#COLLATION-NONDETERMINISTIC
2019-08-22 18:58:37 +00:00
Philip Dubé fe10ca453d Implement FileCompat to abstract pg12 requiring API consumer to track file offsets 2019-08-22 18:57:47 +00:00
Philip Dubé 018ad1c58e pg12: version_compat.h, tuples, oids, misc 2019-08-22 18:57:23 +00:00
Philip Dubé 9643ff580e Update commands/vacuum.c with pg12 changes
Adds support for SKIP_LOCKED, INDEX_CLEANUP, TRUNCATE
Removes broken assert
2019-08-22 18:56:54 +00:00
Philip Dubé 68c4b71f93 Fix up includes with pg12 changes 2019-08-22 18:56:21 +00:00
Philip Dubé fbc3e346e8 ruleutils_12.c
Produced this file by copying ruleutils_11.c,
then comparing postgres ruleutils.c changes between REL_11_STABLE & REL_12_STABLE
2019-08-22 18:56:05 +00:00
Hadi Moshayedi 6be1bacddd Fix distributed deadlock for TRUNCATE 2019-08-22 11:03:53 -07:00
Hadi Moshayedi a5b087c89b Support FKs between reference tables 2019-08-21 16:11:27 -07:00
Hadi Moshayedi a3578a6e60 Sort load_shard_placement_array by worker name/port 2019-08-21 14:35:05 -07:00
Philip Dubé 7bf7e41594 commands/index.c: Fix assertion typo 2019-08-21 18:54:05 +00:00
Philip Dubé f4b90419ae Raise an error when REINDEX TABLE or INDEX is invoked on a distributed relation 2019-08-21 17:03:14 +00:00
Philip Dubé db5a7f49a7 Task Tracker: fix error being copy pasted from above block 2019-08-21 15:44:01 +00:00
Philip Dubé f62d4a6712 citus_rm_job_directory for multi_query_directory_cleanup 2019-08-19 17:04:42 +00:00
Philip Dubé 9777f22e1e Avoid invalid array accesses to partitionFileArray 2019-08-19 17:04:42 +00:00
Philip Dubé f4ca02664a single_shard_commit_protocol: GUC_NO_SHOW_ALL 2019-08-18 12:54:32 +00:00
Hadi Moshayedi c582eb89c8 Add some missing locks. 2019-08-15 12:34:31 -07:00
Philip Dubé f4e513b3d4 Introduce citus.single_shard_commit_protocol for if users want 1PC on writes to replicas 2019-08-15 18:49:40 +00:00
Philip Dubé cd951fa9ca Avoid multiple pg_dist_colocation records being created for reference tables
master_deactivate_node is updated to decrement the replication factor
Otherwise deactivation could have create_reference_table produce a second record

UpdateColocationGroupReplicationFactor is renamed UpdateColocationGroupReplicationFactorForReferenceTables
& the implementation looks up the record based on distributioncolumntype == InvalidOid, rather than by id
Otherwise the record's replication factor fails to be maintained when there are no reference tables
2019-08-13 17:21:02 +00:00
Nils Dijk be6b7bec69
Add UDF citus_(prepare|finish)_pg_upgrade to aid with upgrading citus (#2877)
DESCRIPTION: Add functions to help with postgres upgrades

Currently there is [a list of manual steps](https://docs.citusdata.com/en/v8.2/admin_guide/upgrading_citus.html?highlight=upgrade#upgrading-postgresql-version-from-10-to-11) to perform during a postgres upgrade. These steps guarantee our catalog tables are kept and counter values are maintained across upgrades.

Having more than 1 command in our docs for users to manually execute during upgrades is error prone for both the user, and our docs. There are already 2 catalog tables that have been introduced to citus that have not been added to our docs for backing up during upgrades (`pg_authinfo` and `pg_dist_poolinfo`).

As we add more functionality to citus we run into situations where there are more steps required either before or after the upgrade. At the same time, when we move catalog tables to a place where the contents will be maintained automatically during upgrades we could have fewer steps in our docs. This will become a hard-to-maintain matrix of citus versions and steps to be performed.

Instead we could take ownership of these steps within the extension itself. This PR introduces two new functions for the user to call instead of following long lists of error-prone instructions.
 - `citus_prepare_pg_upgrade`
    This function should be called by the user right before shutting down the cluster. This will ensure all citus catalog tables are backed up in a location where the information will be retained during an upgrade.
- `citus_finish_pg_upgrade`
    This function should be called right after a pg_upgrade of the cluster. This will restore the catalog tables to the state they were in before the upgrade happened.

Both functions need to be executed both on the coordinator and on all the workers, in the same fashion our current documentation instructs to do.

There are two known problems with these functions in their current form, which are also problems with our docs. We should schedule time in the future to improve on this, but having it automated now is better as we are about to add extra steps to take after upgrades.
 - When you install citus in a clean cluster we enable ssl for communication between the coordinator and the workers. If an upgrade to a clean cluster is performed we do not set up ssl on the new cluster, causing the communication to fail.
 - There are no automated tests added in this PR to execute an upgrade test during every build.
    Our current test infrastructure does not allow for 2 versions of postgres to exist in the same environment. We will need to invest time to create a new testing harness that could run the following scenario:
      1. Create cluster
      2. Run extensible scripts to execute arbitrary statements on this cluster
      3. Perform an upgrade by preparing, upgrading and finishing
      4. Run extensible scripts to verify all objects created by earlier scripts exists in correct form in the upgraded cluster

    Given the non-trivial amount of work involved for such a suite, I'd like to land this before we have automated testing.

On a side note: as the reviewer noticed, the tables created in the public namespace are not visible in `psql` with `\d`. The backup catalog tables have the same name as the tables in `pg_catalog`. Due to postgres internals `pg_catalog` is first in the search path, and therefore the non-qualified name would always resolve to `pg_catalog.pg_dist_*`. Internally this is called a non-visible table, as it would resolve to a different table without a qualified name. Only visible tables are shown with `\d`.
2019-08-13 15:53:10 +02:00
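The intended usage, as described above, run on the coordinator and on every worker:

```sql
-- before shutting down the old cluster
SELECT citus_prepare_pg_upgrade();
-- ... run pg_upgrade on each node ...
-- after starting the new cluster
SELECT citus_finish_pg_upgrade();
```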
Hadi Moshayedi 009d8b7401 Some cleanup 2019-08-12 15:38:52 -07:00
Philip Dubé 705d1bf0e0 Use PG_JOB_CACHE_DIR 2019-08-09 15:25:59 +00:00
Onder Kalaci 060ac11476 Do not record relation accesses unnecessarily
Before this commit, we've recorded the relation accesses in 3 different
places
    - FindPlacementListConnection         -- applies to all executors in a tx block
    - StartPlacementExecutionOnSession()  -- adaptive executor only
    - StartPlacementListConnection()      -- router/real-time only

This is different from Citus 8.2, and could lead to query execution
times increasing considerably for multi-shard commands in transaction
blocks that touch partitioned tables.

Benchmarks:

```
1+8 c5.4xlarge cluster

Empty distributed partitioned table with 365 partitions: https://gist.github.com/onderkalaci/1edace4ed6bd6f061c8a15594865bb51#file-partitions_365-sql

./pgbench -f /tmp/multi_shard.sql -c10 -j10 -P 1 -T 120 postgres://citus:w3r6KLJpv3mxe9E-NIUeJw@c.fy5fkjcv45vcepaogqcaskmmkee.db.citusdata.com:5432/citus?sslmode=require

cat  /tmp/multi_shard.sql
BEGIN;
	DELETE FROM collections_list;
	DELETE FROM collections_list;
	DELETE FROM collections_list;
COMMIT;
cat  /tmp/single_shard.sql
BEGIN;
	DELETE FROM collections_list WHERE key = :aid;
	DELETE FROM collections_list WHERE key = :aid;
	DELETE FROM collections_list WHERE key = :aid;
COMMIT;

cat  /tmp/mix.sql
BEGIN;
	DELETE FROM collections_list WHERE key = :aid;
	DELETE FROM collections_list WHERE key = :aid;
	DELETE FROM collections_list WHERE key = :aid;

	DELETE FROM collections_list;
	DELETE FROM collections_list;
	DELETE FROM collections_list;
COMMIT;
```

The table shows the `latency average` of the pgbench runs explained above, so we have a pretty solid improvement even over 8.2.2.

| Test  | Citus 8.2.2  |  Citus 8.3.1   | Citus 8.3.2 (this branch)  | Citus 8.3.1 (FKEYs disabled via GUC)  |
| ------------- | ------------- | ------------- |------------- | ------------- |
|multi_shard |  2370.083 ms  |3605.040 ms |1324.094 ms |1247.255 ms  |
| single_shard  | 85.338 ms  |120.934 ms  |73.216 ms  | 78.765 ms |
| mix  | 2434.459 ms | 3727.080 ms  |1306.456 ms  | 1280.326 ms |
2019-08-08 18:42:08 +02:00
Onder Kalaci 35ee896f3d Get rid of an unnecessary parameter
The targetPoolSize parameter of ExecuteUtilityTaskListWithoutResults
becomes obsolete, so just remove it.
2019-08-07 19:35:56 +02:00
Onder Kalaci b2e01d0745 Refactor switching to sequential mode
We don't need to wait until execution time. As soon as we realize
that we need sequential execution, we should switch to it.
2019-08-07 19:35:56 +02:00
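For context, sequential execution is also a mode users can force explicitly via a GUC; this commit is about Citus switching to it internally as soon as the need is detected. A minimal, hedged example of the user-facing knob:

```sql
-- force multi-shard modifications to run shard-by-shard over a single connection
SET citus.multi_shard_modify_mode TO 'sequential';
```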
Philip Dubé b77c52f95b PlanRouterQuery: don't store list of list of shard intervals in relationShardList 2019-08-02 14:08:57 +00:00
Philip Dubé fdc0ef6392 Adaptive executor: use 2PC when replication_factor > 1 2019-08-01 23:55:12 +00:00
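A hedged sketch of when this kicks in, using assumed table and column names: with statement-based shard replication, a write that touches replicated placements is now committed with 2PC by the adaptive executor:

```sql
CREATE TABLE events (event_id bigint PRIMARY KEY);
SET citus.shard_replication_factor TO 2;
SELECT create_distributed_table('events', 'event_id');
-- the write touches two placements of the same shard, so the adaptive
-- executor now commits it with 2PC
INSERT INTO events (event_id) VALUES (1);
```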
Philip Dubé 064bd66a20 Avoid segfault in logging queries 2019-07-31 15:28:46 +00:00
Philip Dubé 3982b4635f CompareShardIntervals: if intervals are equal, compare id. Works around sort being unstable 2019-07-26 16:13:36 +00:00
Marco Slot e2bc09838e Use ereport instead of elog in adaptive executor 2019-07-23 20:40:32 +02:00
Marco Slot bd111366b0 Skip CheckConnectionTimeout when checkForPoolTimeout is false 2019-07-23 20:40:32 +02:00
Marco Slot a3811b1e55 Avoid FindWorkerNode calls in adaptive executor 2019-07-23 20:40:32 +02:00
Marco Slot 4444d92dbc Set initial pool size to cached connection count 2019-07-23 20:40:32 +02:00
Marco Slot 4c0c33365e Avoid creating a redundant event set at the start 2019-07-23 20:40:32 +02:00
Marco Slot 32e7a80960 Avoid unnecessary calls to PQconsumeInput 2019-07-23 20:40:32 +02:00
Marco Slot 71ad5c095b Use ModifyWaitEvent when only wait flags changed 2019-07-23 20:40:32 +02:00
Philip Dubé 50144b75d0 Add check-empty to testing Makefile
Don't create functions multiple times
Move ALTER TABLEs to their declaration
Remove DROP FUNCTIONS IF EXISTS, OR REPLACE
2019-07-24 11:03:54 -07:00
Philip Dubé acbaa38a62 Squash migrations for versions 5/6, don't use WITH OIDS 2019-07-24 11:03:29 -07:00
Hanefi Onaldi 8127297999 update workerNodeList after sorting 2019-07-23 20:57:07 +00:00
Marco Slot efbe58eab2 Fix SQL schema version, we skipped 8.3 2019-07-17 16:05:25 +02:00
Philip Dubé 0915027389 DistributedPlan: replace operation with modLevel
This causes no behavioral changes; it only organizes the code better to implement modifying CTEs

Also rename ExtractInsertRangeTableEntry to ExtractResultRelationRTE,
as the source of this function didn't match the documentation

Remove Task's upsertQuery in favor of ROW_MODIFY_NONCOMMUTATIVE

Split up AcquireExecutorShardLock into more internal functions

Tests: Normalize multi_reference_table multi_create_table_constraints
2019-07-16 13:58:18 -07:00
Hanefi Onaldi 0bdec52761
Fix default_version in citus.control file (#2840) 2019-07-11 14:24:51 +03:00
Hanefi Onaldi 5a6eba6ba9
Bump Citus to 8.4devel 2019-07-10 15:26:10 +03:00
Nils Dijk 791cc26a86
Fix an issue with subquery map merge jobs as non-root
Also automated all manual tests around multi-user isolation for internal citus UDFs

automate upgrade_to_reference_table tests
add negative tests for lock_relation_if_exists
add tests for permissions on worker_cleanup_job_schema_cache
add tests for worker_fetch_partition_file
add tests for worker_merge_files_into_table
fix a problem with worker_merge_files_and_run_query when run as a non-superuser and add tests for its behaviour
2019-07-10 12:40:05 +02:00