citus

Commit Graph

Author	SHA1	Message	Date
Onur Tirtir	f3801143fb	Add cascade option to undistribute_table	2021-01-07 15:41:49 +03:00
Marco Slot	47c1b19174	Revert "Do metadata sync in a separate background worker." This reverts commit `4df723cf9b`.	2021-01-07 10:30:04 +01:00
Marco Slot	5de3337b2f	Support local execution for INSERT..SELECT with re-partitioning	2021-01-06 16:15:53 +01:00
Naisila Puka	bcfc0aa4e9	Rethrow original concurrent index creation failure message (#4469 ) * Rethrow original concurrent index creation failure message * Alter test outputs for concurrent index creation * Detect duplicate table failure in concurrent index creation * Add test for conc. index creation w/out duplicates	2021-01-06 15:27:13 +03:00
Ahmet Gedemenli	1f36ff7c17	Prevent deadlock for long named partitioned index creation on single node (#4461 ) * Prevent deadlock for long named partitioned index creation on single node * Create IsSingleNodeCluster function * Use both local and sequential execution	2021-01-05 13:39:13 +03:00
Ahmet Gedemenli	f27649754b	Add alter index set statistics support (#4455 ) * Add alter index set statistics support * Use attNum instead of attName	2021-01-05 13:23:11 +03:00
Onur Tirtir	87e5276bdd	Fix fkey graph test for self reference (#4450 )	2020-12-28 12:47:39 +03:00
Naisila Puka	04aeb6938b	Merge branch 'master' into issue4237	2020-12-25 12:36:40 +03:00
Hadi Moshayedi	4df723cf9b	Do metadata sync in a separate background worker.	2020-12-24 08:25:55 -08:00
Naisila Puka	0bb2c991f9	Merge branch 'master' into issue4237	2020-12-24 18:05:27 +03:00
Ahmet Gedemenli	5af585269a	Add separate pg13 test for stats targets	2020-12-24 18:01:25 +03:00
naisila	59a81491e8	Add test for master_create_empty_shard on coordinator	2020-12-24 17:59:40 +03:00
Ahmet Gedemenli	d4bc17f6f0	Propagate statistics with altered targets	2020-12-24 17:10:12 +03:00
Ahmet Gedemenli	f7c70f9a63	Propagate alter stats target	2020-12-24 17:10:12 +03:00
Ahmet Gedemenli	5a1607b6c0	Propagate alter stats schema	2020-12-24 17:10:12 +03:00
Ahmet Gedemenli	bdce4a7e67	Propagate rename statistics	2020-12-24 17:10:12 +03:00
Onur Tirtir	5ed9197041	Implement infra to get foreign key connected relations (#4439 ) On top of our foreign key graph, implement the infrastructure to get list of relations that are connected to input relation via a foreign key graph. We need this to support cascading create_citus_local_table & undistribute_table operations. Also add regression tests to see what our foreign key graph is able to capture currently.	2020-12-24 16:42:40 +03:00
Onur Tirtir	57e7defa3c	Support CREATE INDEX commands without index name on citus tables (#4273 )	2020-12-23 23:15:39 +03:00
Marco Slot	e3dcc278e0	Remove upgrade_to_reference_table UDF	2020-12-23 00:40:14 +01:00
jeff-davis	90d63cb792	Add columnar pg_dump test. (#4433 )	2020-12-22 15:57:35 -08:00
Ahmet Gedemenli	874fa1fc09	Propagate Drop Statistics	2020-12-22 18:34:46 +03:00
Marco Slot	321cc784c7	Collapse Citus 7.* scripts into Citus 8.0-1	2020-12-21 22:55:51 +01:00
Hadi Moshayedi	dde0323b57	Columnar: enable zstd & lz4 compilation by default (#4402 ) * Columnar: enable zstd & lz4 compilation by default * Make zstd & lz4 tests more consistent * Don't require lz4 & zstd for postgres 11 Co-authored-by: Nils Dijk <nils@citusdata.com>	2020-12-21 12:11:58 -08:00
Onur Tirtir	cceaf31e4c	Add some more tests with views to test recursive planning on views (#4427 ) (cherry picked from commit `51f422f3c6`)	2020-12-21 11:53:37 +03:00
jeff-davis	49281202af	Add simple follower test for columnar. (#4432 )	2020-12-18 13:59:20 -08:00
jeff-davis	3e0f1aaaab	Prevent inserting into logically-replicated columnar table. (#4429 )	2020-12-18 12:29:30 -08:00
Marco Slot	f2056e553f	Expose partition column of subqueries in optimizer (#4355 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2020-12-18 20:32:52 +01:00
SaitTalhaNisanci	145112f3a0	Fix attribute numbers in subquery conversions (#4426 ) Attribute number in a subquery RTE and relation RTE means different things. In a relation attribute number will point to the column number in the table definition including the dropped columns as well however in subquery, it means the index in the target list. When we convert a relation RTE to subquery RTE we should either correct all the relevant attribute numbers or we can just add a dummy column for the dropped columns. We choose the latter in this commit because it is practically too vulnerable to update all the vars in a query. Another thing this commit fixes is that in case a join restriction clause list contains a false clause, we should just return a false clause instead of the whole list, because the whole list will contain restrictions from other RTEs as well and this breaks the query, which can be seen from the output changes, now it is much simpler. Also instead of adding single tests for dropped columns, we choose to run the whole mixed queries with tables with dropped columns, this revealed some bugs already, which are fixed in this commit.	2020-12-18 20:25:41 +03:00
Nils Dijk	a748729998	rework ci	2020-12-18 18:04:45 +01:00
Ahmet Gedemenli	770d3da1ca	Add dependencies for stat schemas	2020-12-18 17:04:13 +03:00
Ahmet Gedemenli	6c0465566a	Propagate create statistics	2020-12-17 20:38:36 +03:00
Marco Slot	1e2518f83c	Add tests for router queries with catalog tables (#4422 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2020-12-17 15:07:50 +01:00
Marco Slot	100e5d3196	Address review feedback	2020-12-15 15:23:38 +01:00
Marco Slot	23dccd8941	Add some new tests for complex correlated subqueries in WHERE	2020-12-15 14:17:16 +01:00
Marco Slot	707a6554b1	Support co-located/recurring correlated subqueries	2020-12-15 14:17:16 +01:00
Sait Talha Nisanci	181a7e1d36	Skip dropped columns	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	7951273f74	Refactor WrapRteRelationIntoSubquery	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	0e53aa5d3b	Add more tests	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	f5dd5379b2	Add more tests	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	3aed6c3ad0	Rename containsOnlyLocalTable as isLocalTableModification Update error message in Modify View	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	13c43d5744	Improve table conversion logic in dist-local joins	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	5618f3a3fc	Use BaseRestrictInfo for finding equality columns Baseinfo also has pushed down filters etc, so it makes more sense to use BaseRestrictInfo to determine what columns have constant equality filters. Also RteIdentity is used for removing conversion candidates instead of rteIndex.	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	69992d58f9	Add broken local-dist table modifications tests It seems that most of the updates were broken, we weren't aware of it because there wasn't any data in the tables. They are broken mostly because local tables do not have a shard id and some code paths should be updated with that information, currently when there is an invalid shard id, it is assumed to be pruned. Consider local tables in router planner In case there is a local table, the shard id will not be valid and there are some checks that rely on shard id, we should skip these in case of local tables, which is handled with a dummy placement. Add citus local table dist table join tests add local-dist table mixed joins tests	2020-12-15 18:18:36 +03:00
Sait Talha Nisanci	2a44029aaf	Simplify ContainsTableToBeConvertedToSubquery AllDataLocallyAccessible and ContainsLocalTableSubqueryJoin are removed. We can possibly remove ModifiesLocalTableWithRemoteCitusLocalTable as well. Though this removal has a side effect that now when all the data is locally available, we could still wrap a relation into a subquery, I guess that should be resolved in the router planner itself. Add more tests	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	26d9f0b457	Use auto mode in tests and fix debug message	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	3bd53a24a3	Support update on postgres table from citus local table	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	4b6611460a	Support foreign table joins as well	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	7e9204eba9	Update vars in quals while wrapping RTE to subquery When we wrap an RTE to subquery we are updating the variables varno's as 1, however we should also update the varno's of vars in quals. Also some other small code quality improvements are done.	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	0689f2ac1a	Recursively plan distributed tables only if all have unique filters The previous algorithm was not consistent and it could convert different RTEs based on the table orders in the query. Now we convert local tables if there is a distributed table which doesn't have a unique index. So if there are 4 tables, local1, local2, dist1, dist2_with_pkey then we will convert local1 and local2 in `auto` mode. Converting a distributed table is not that logical because as there is a distributed table without a unique index, we will need to convert the local tables anyway. So converting the distributed table with pkey is redundant.	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	a008fc611c	Support materialized view joins as well	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	5f46abffd9	Update check multi tests	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	eebcd995b3	Add some more tests	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	5693cabc41	Not convert an already routable plannable query We should not recursively plan an already routable plannable query. An example of this is (SELECT * FROM local JOIN (SELECT * FROM dist) d1 USING(a)); So we let the recursive planner do all of its work and at the end we convert the final query to handle unsupported joins. While doing each conversion, we check if it is router plannable, if so we stop. Only consider range table entries that are in jointree If a range table is not in jointree then there is no point in considering that because we are trying to convert range table entries to subqueries for join use case.	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	2ff65f3630	Enable partitioned distributed tables in local-dist table joins	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	44953579cf	Enable citus-local distributed table joins Check equality in quals We want to recursively plan distributed tables only if they have an equality filter on a unique column. So '>' and '<' operators will not trigger recursive planning of distributed tables in local-distributed table joins. Recursively plan distributed table only if the filter is constant If the filter is not a constant then the join might return multiple rows and there is a chance that the distributed table will return huge data. Hence if the filter is not constant we choose to recursively plan the local table.	2020-12-15 18:17:10 +03:00
Sait Talha Nisanci	f3d55448b3	Choose distributed table if it has a unique index in filter When doing local-distributed table joins we convert one of them to a subquery. The current policy is that we convert the distributed table to a subquery if it has a filter on a column that has a unique index (a primary key also has a unique index).	2020-12-15 18:17:10 +03:00
Onder Kalaci	945193555b	add basic regression tests	2020-12-15 18:17:10 +03:00
Onder Kalaci	594e001f3b	Add filter pushdown regression tests Also handle WHERE false	2020-12-15 18:17:10 +03:00
Onder Kalaci	82a4830c7d	Adjust the existing regression tests	2020-12-15 18:17:10 +03:00
Marco Slot	f2538a456f	Support co-located/recurring sublinks in the target list	2020-12-13 15:45:24 +01:00
Hadi Moshayedi	4dd22cc4e4	Columnar: Fix ANALYZE for large number of rows.	2020-12-10 09:52:33 -08:00
Hadi Moshayedi	b3dac5e9d1	Columnar: set default compression as zstd if available	2020-12-09 14:32:08 -08:00
Hadi Moshayedi	4668fe51a6	Columnar: Make compression level configurable	2020-12-09 08:48:50 -08:00
Hadi Moshayedi	f5a4a4bc74	Columnar: Support zstd compression	2020-12-09 08:30:55 -08:00
Hadi Moshayedi	3f81ee26fd	Columnar: Support LZ4 compression	2020-12-09 08:29:07 -08:00
jeff-davis	260a02180b	Add tests for unsupported columnar storage features (#4397 ) Add negative tests: * Deletes * Sample scan * Special columns * Tuple locks * Indexes	2020-12-09 00:08:45 -08:00
Jeff Davis	c91e5b052b	more test fixups	2020-12-07 13:43:27 -08:00
Jeff Davis	7169ba21c4	more test fixes	2020-12-07 13:36:46 -08:00
Jeff Davis	3758e83850	Rename cstore->columnar in SQL objects and errors.	2020-12-07 13:01:53 -08:00
Jeff Davis	ad919ff220	Tests for UPDATE and error message improvement. UPDATEs on partitioned tables that affect only row partitions should succeed, the rest should fail. Also rename CStoreScan to ColumnarScan to make the error message more relevant.	2020-12-07 11:25:30 -08:00
Ahmet Gedemenli	936775e8e3	Delete transactions when removing node With this commit, we delete entries in pg_dist_transaction for the primary nodes that are removed by `master_remove_node`.	2020-12-07 11:35:20 +03:00
Hadi Moshayedi	01da2a1c73	Columnar: track decompressed length in metadata	2020-12-04 09:09:39 -08:00
Onder Kalaci	bd9827aed9	Add regression tests with different data types We typically do not test Citus with these uncommon data types. Now, we already have the tests for ADF integration, add it to regression tests as well.	2020-12-04 10:25:00 +03:00
Hadi Moshayedi	4a9aebaa7b	Columnar: rename block to chunk	2020-12-03 08:50:19 -08:00
Hadi Moshayedi	24bfd368a9	Columnar: Fix VACUUM for empty tables	2020-12-03 08:46:09 -08:00
Marco Slot	c9b658daea	Add a public.citus_tables view	2020-12-03 17:31:40 +01:00
Marco Slot	4098d33acb	Allow citus size functions on replicated tables	2020-12-03 16:33:24 +01:00
Marco Slot	c69ea2512a	Fix flappy failure test	2020-12-03 13:54:02 +01:00
Onder Kalaci	c546ec5e78	Local node connection management When Citus needs to parallelize queries on the local node (e.g., the node executing the distributed query and the shards are the same), we need to be mindful about the connection management. The reason is that the client backends that are running distributed queries are competing with the client backends that Citus initiates to parallelize the queries in order to get a slot on the max_connections. In that regard, we implemented a "failover" mechanism where if the distributed queries cannot get a connection, the execution fails over the tasks to the local execution. The failover logic is as follows: - Ask the connection manager if it is OK to get a connection - If yes, we are good. - If no, we fail the workerPool and the failure triggers the failover of the tasks to local execution queue The decision of getting a connection is as follows: /* * For local nodes, solely relying on citus.max_shared_pool_size or * max_connections might not be sufficient. The former gives us * a preview of the future (e.g., we let the new connections to establish, * but they are not established yet). The latter gives us the close to * precise view of the past (e.g., the active number of client backends). * * Overall, we want to limit both of the metrics. The former limit typically * kicks in under regular loads, where the load of the database increases in * a reasonable pace. The latter limit typically kicks in when the database * is issued lots of concurrent sessions at the same time, such as benchmarks. */	2020-12-03 14:16:13 +03:00
Hadi Moshayedi	c2f60b6422	Columnar: pg_upgrade support (#4354 )	2020-12-02 08:46:59 -08:00
Ahmet Gedemenli	5242dcfe99	Add tests for propagating alter schema rename	2020-12-02 15:18:26 +03:00
Nils Dijk	6f9c040f76	DESCRIPTION: Propagate columnar table settings for distributed tables When distributing a columnar table, as well as changing options on a distributed columnar table, this patch will forward the settings from the coordinator to the workers. For propagating options changes on an already distributed table this change is pretty straightforward. Before applying the change in options locally we will create a `DDLJob` that contains a call to `alter_columnar_table_set(...)` for every shard placement with all settings of the current table. This goes both for setting an option as well as resetting. This will reset the values to the defaults configured on the coordinator. Having the effect that the coordinator is authoritative on the settings and makes sure the shards have the same settings set as the table on the coordinator. When a columnar table is distributed it is using the `TableDDLCommand` infrastructure to create a new kind of `TableDDLCommand`. This new type, called a `TableDDLCommandFunction` contains a context and 2 function pointers to execute. One function returns the command as applied on the table, the second function will return the sql command to apply to a shard with a given shard id. The schema name is ignored as it will use the fully qualified name of the shard in the same schema as the base table.	2020-12-02 13:02:42 +01:00
Halil Ozan Akgül	ef0914a7f8	Adds ORDER BY to flaky test (#4305 ) Co-authored-by: Önder Kalacı <onder@citusdata.com>	2020-12-02 14:24:05 +03:00
Onder Kalaci	f7e1aa3f22	Multi-row INSERTs use local execution when placements are local Multi-row execution already uses sequential execution. When shards are local, using local execution is profitable as it avoids an extra connection establishment to the local node.	2020-12-01 21:37:59 +03:00
Marco Slot	48caca4084	Improve regression test settings	2020-11-30 20:34:03 +01:00
Ahmet Gedemenli	8e5f0487eb	Add order by for flaky test	2020-12-01 10:54:52 +03:00
Ahmet Gedemenli	67761897ab	Add test for citus table size func in transaction with modification Add test for citus_relation_size	2020-12-01 10:38:15 +03:00
Hadi Moshayedi	feecb7b423	Columnar: few fixes (#4371 ) * Columnar: fix a memory issue * Columnar: no need for deferred triggers * Columnar: relax memory growth constraints	2020-11-30 18:09:43 -08:00
Hadi Moshayedi	a94e8c9cda	Associate column store metadata with storage id (#4347 )	2020-11-30 18:01:43 -08:00
SaitTalhaNisanci	c31a8df380	Call 6 times not 7 in subquery_prepared_statements (#4357 )	2020-11-30 21:20:51 +03:00
Nils Dijk	383e334023	refactor options to their own table linked to the regclass (#4346 ) Columnar options were by accident linked to the relfilenode instead of the regclass/relation oid. This PR moves everything related to columnar options to their own catalog table.	2020-11-27 11:22:08 -08:00
Onder Kalaci	629ecc3dee	Add the infrastructure to count the number of client backends Considering the adaptive connection management improvements that we plan to roll soon, it makes it very helpful to know the number of active client backends. We are doing this addition to simplify the adaptive connection management for single node Citus. In single node Citus, both the client backends and Citus parallel queries would compete to get slots on Postgres' `max_connections` on the same Citus database. With adaptive connection management, we have the counters for Citus parallel queries. That helps us to adaptively decide on the remote executions pool size (e.g., throttle connections if necessary). However, we do not have any counters for the total number of client backends on the database. For single node Citus, we should consider all the client backends, not only the remote connections that Citus does. Of course Postgres internally knows how many client backends are active. However, to get that number Postgres iterates over all the backends. For example, see [pg_stat_get_db_numbackends](`8e90ec5580/src/backend/utils/adt/pgstatfuncs.c (L1240)`) where Postgres iterates over all the backends. For our purposes, we need this information on every connection establishment. That's why we cannot afford to do this kind of iteration.	2020-11-25 19:19:24 +01:00
Ahmet Gedemenli	a64dc8a72b	Fixes a bug preventing INSERT SELECT .. ON CONFLICT with a constraint name on local shards Separate search relation shard function Add tests	2020-11-25 15:10:46 +03:00
Önder Kalacı	c760cd3470	Move local execution after remote execution (#4301 ) * Move local execution after the remote execution Before this commit, when both local and remote tasks exist, the executor was starting the execution with local execution. There are no strict requirements on this. Especially considering the adaptive connection management improvements that we plan to roll soon, moving the local execution after the remote execution makes more sense. The adaptive connection management for single node Citus would look roughly as follows: - Try to connect back to the coordinator for running parallel queries. - If succeeds, go on and execute tasks in parallel - If fails, fall back to the local execution So, we'll use local execution as a fallback mechanism. And, moving it after the remote execution allows us to implement such further scenarios.	2020-11-24 13:43:38 +01:00
Hadi Moshayedi	40b52ab757	Fix memory leaks in column store	2020-11-23 11:26:12 -08:00
Jeff Davis	8cee2b092b	remove columnar FDW code	2020-11-20 10:03:12 -08:00
Onder Kalaci	c433c66f2b	Do not execute subplans multiple times with cursors Before this commit, we let AdaptiveExecutorPreExecutorRun() be effective multiple times, once on every FETCH on cursors. That does not affect the correctness of the query results, but adds significant overhead.	2020-11-20 10:43:56 +01:00
Hadi Moshayedi	b182a95389	Fix ALTER COLUMN ... SET TYPE for columnar	2020-11-19 15:36:45 -08:00
Jeff Davis	91015deb9d	rename UDFs also	2020-11-19 12:27:40 -08:00
Jeff Davis	a2b698a766	rename cstore_tableam -> columnar	2020-11-19 12:15:51 -08:00
Hadi Moshayedi	2747fd80ff	Add prepared materialized view tests for columnar	2020-11-17 20:13:20 -08:00
Hadi Moshayedi	6711340ea6	Add prepared xact & stmt tests for columnar	2020-11-17 20:00:57 -08:00
Hadi Moshayedi	97cba2d5b6	Implements write state management for tuple inserts. TableAM API doesn't allow us to pass around a state variable along all of the tuple inserts belonging to the same command. We require this in columnar store, since we batch them, and when we have enough rows we flush them as stripes. To do that, we keep a (relfilenode) -> stack of (subxact id, TableWriteState) global mapping. Inserts Whenever we want to insert a tuple, we look up the relation's relfilenode in this mapping. If top of the stack matches current subtransaction, we use the existing TableWriteState. Otherwise, we allocate a new TableWriteState and push it on top of stack. (Sub)Transaction Commit/Aborts When the subtransaction or transaction is committed, we flush and pop all entries matching current SubTransactionId. When the subtransaction or transaction is aborted, we pop all entries matching current SubTransactionId and discard them without flushing. Reads Since we might have unwritten rows which need to be read by a table scan, we flush write states on SELECTs. Since flushing the write state of upper transactions in a subtransaction will cause metadata being written in wrong subtransaction, we ERROR out if any of the upper subtransactions have unflushed rows. Table Drops We record in which subtransaction the table was dropped. When committing a subtransaction in which table was dropped, we propagate the drop to upper transaction. When aborting a subtransaction in which table was dropped, we mark table as not deleted.	2020-11-17 12:07:16 -08:00
Nils Dijk	22df8027b0	add extra output for multi_extension targeting pg11	2020-11-17 19:01:54 +01:00
Nils Dijk	2987535172	add pg upgrade tests verifying table am is created	2020-11-17 18:55:36 +01:00
Nils Dijk	d065bb495d	Prepare downgrade script and bump development version to 10.0-1	2020-11-17 18:55:35 +01:00
Nils Dijk	b6d4a1bbe2	fix style	2020-11-17 18:55:35 +01:00
Nils Dijk	3bb6554976	make tests run	2020-11-17 18:55:35 +01:00
Nils Dijk	f89bd3eeb5	move columnar test files	2020-11-17 18:55:34 +01:00
Onur Tirtir	5e3dc9d707	Bump citus version to 10.0devel	2020-11-09 13:16:54 +03:00
Onur Tirtir	5d5966f700	Fix a flaky test in mixed_relkind_tests (#4300 )	2020-11-06 14:53:30 +03:00
Onder Kalaci	e0d2ac7620	Do not rely on set_rel_pathlist_hook for finding local relations When a relation is used on an OUTER JOIN with FALSE filters, set_rel_pathlist_hook may not be called for the table. There might be other cases as well, so do not rely on the hook for classification of the tables.	2020-11-06 11:14:30 +01:00
Onur Tirtir	cc8be422ce	Fix relkind checks in planner for relkinds other than RELKIND_RELATION (#4294 ) We were qualifying relations with relkind != RELKIND_RELATION as non-relations due to the strict checks around RangeTblEntry->relkind in planner.	2020-11-05 14:21:02 +03:00
Hanefi Önaldı	85a4b61a0e	Prevent undistribute_table calls for partitions	2020-11-03 18:10:20 +03:00
Hanefi Önaldı	5db380f33a	Prevent undistribute_table calls for foreign tables	2020-11-03 17:33:29 +03:00
Halil Ozan Akgul	77b3be8b6d	Turn RelOptInfos to only used field of them, relids, to be able to copy	2020-10-22 13:42:28 +03:00
Onur Tirtir	790beea59f	Add intermediate result tests with unsupported outer joins (#4262 )	2020-10-20 12:11:18 +03:00
SaitTalhaNisanci	0f209377c4	Fix incorrect join related fields (#4242 ) * Fix incorrect join related fields Ruleutils expect to give the original index of join columns hence we should consider the dropped columns while setting the fields in SetJoinRelatedFieldsCompat. * add some more tests for joins * Move tests to join.sql and create a utility function	2020-10-19 18:28:39 +03:00
Onur Tirtir	c49077d594	Disallow outer joins `ON TRUE` with ref & dist tables when ref table is outer relation (#4255 ) Disallow `ON TRUE` outer joins with reference & distributed tables when reference table is outer relation by fixing the logic bug made when calling `LeftListIsSubset` function. Also, be more defensive when removing duplicate join restrictions when join clause is empty for non-inner joins as they might still contain useful information for non-inner joins.	2020-10-19 16:58:11 +03:00
Onder Kalaci	bbedfca761	Improve the relation restriction counters It seems like Postgres could call set_rel_pathlist() for the same relation multiple times. This breaks the logic where we assume relationCount equals the number of entries in relationRestrictionList. In summary, relationRestrictionList may contain duplicate entries.	2020-10-19 08:51:16 +02:00
Hadi Moshayedi	663549db33	Set explicit transfer_mode in tableam tests	2020-10-16 12:40:37 -07:00
Nils Dijk	caabbf4b84	Table access method support for distributed tables	2020-10-16 12:02:25 -07:00
Marco Slot	8976f245ab	Support reference table view in reference table modification	2020-10-16 11:31:24 +02:00
Onder Kalaci	596f7bf4a9	Add more regression test for single node Citus Tests on commands with SCHEMA.	2020-10-15 17:32:32 +02:00
Onder Kalaci	fe3caf3bc8	Local execution considers intermediate result size limit With this commit, we make sure that local execution adds the intermediate result size as the distributed execution adds. Plus, it enforces the citus.max_intermediate_result_size value.	2020-10-15 17:18:55 +02:00
Marco Slot	31858c8a29	Check table existence in EnsureRelationKindSupported	2020-10-15 17:05:06 +02:00
Onder Kalaci	15e724c073	Add regression tests for outer/cross JOINs	2020-10-14 15:17:30 +02:00
Onder Kalaci	de33079065	Improve outer join checks Before this commit, the logic was: - As long as the outer side of the JOIN is not a JOIN (e.g., relation or subquery etc.), we check for the existence of any recurring tuples. There were two implications of this decision. First, even if a subquery which is on the outer side contains distributed table JOIN reference table, Citus would unnecessarily throw an error. Note that, the JOIN inside the subquery would already be going to be tested recursively. But, as long as that check passes, there is no reason for the upper JOIN to fail. An example, which used to fail and now works: SELECT * FROM (SELECT * FROM dist JOIN ref) as foo LEFT JOIN dist; Second, certain JOINs, especially with ON (true) conditions were not represented as Citus expects the JOINs to be in the format DeferredErrorIfUnsupportedRecurringTuplesJoin().	2020-10-14 15:17:30 +02:00
Onur Tirtir	1a28858c47	Disallow field indirection in INSERT/UPDATE queries (#4241 )	2020-10-14 14:11:59 +03:00
Onur Tirtir	8efca3b60a	Fix a crash with inserting domain composite types in coord. evaluation (#4231 ) Use short lived per-tuple context in citus_evaluate_expr like (pg) evaluate_expr does. We should not use planState->ExprContext when evaluating expressions as it might lead to freeing the same executor twice (first one happens in citus_evaluate_expr itself and the other one happens when postgres doing clean-up for the top level executor state), which in turn might cause seg.faults. However, now as we don't have necessary planState info to evaluate prepared statements, we also add planState->es_param_list_info to per-tuple ExprContext.	2020-10-13 14:19:59 +03:00
Halil Ozan Akgul	e2736c25bd	Adds support for WITH TIES option	2020-10-12 19:34:18 +03:00
Sait Talha Nisanci	dc40758355	Return early if there is no citus table in VACUUM	2020-10-09 11:10:00 +03:00
Sait Talha Nisanci	99bb79745a	Commit transaction for VACUUM on shell table With postgres 13, there is a global lock that prevents multiple VACUUMs happening in the current database. This global lock is taken for a short time but this creates a problem because of the following: - We execute the VACUUM for the shell table through the standard process utility. In this step the global lock is taken for the current database. - If the current node has shard placements then it tries to execute VACUUM over a connection to localhost with ExecuteUtilityTaskList. - the VACUUM on shard placements cannot proceed because it is waiting for the global lock for the current database to be released. - The acquired lock from the VACUUM for shell table will not be released until the transaction is committed. - So there is a deadlock. As a solution, we commit the current transaction in case of VACUUM after the VACUUM is executed for the shell table. Executing the VACUUM on a shell table is not important because the data there will probably be truncated. PostprocessVacuumStmt takes the necessary locks on the shell table so we don't need to take any extra locks after we commit the current transaction.	2020-10-09 10:57:44 +03:00
Marco Slot	881e5df780	Fix a bug that could lead to multiple maintenance daemons	2020-10-08 16:18:14 +02:00
Marco Slot	18219843d0	Add maintenance daemon error tests	2020-10-08 16:17:33 +02:00
Marco Slot	dbc348b7e0	Create sequence dependency during metadata syncing	2020-10-06 10:57:39 +02:00
Marco Slot	9bba8bb4e8	Remove master_drop_sequences	2020-10-06 10:57:33 +02:00
Onur Tirtir	2cd0a69dfb	Fix multi-row & router INSERT crash with local exec. when def. cols not specified (#4197 ) Multi-row & router INSERT's were crashing with local execution if at least one of the DEFAULT columns was not specified in VALUES list. This was because the changes we make on query->values_lists and query->targetList were sufficient for deparsing given INSERT for remote execution but not sufficient for local execution. With this commit, DEFAULT value normalization for multi-row & router INSERT's is fixed by adding dummy column references for unspecified DEFAULT columns.	2020-10-05 10:45:17 +03:00
Önder Kalacı	df5aa0f0cc	Switch to sequential execution if the index name is long (#4209 ) Citus has the logic to truncate the long shard names to prevent various issues, including self-deadlocks. However, for partitioned tables, when index is created on the parent table, the index names on the partitions are auto-generated by Postgres. We use the same Postgres function to generate the index names on the shards of the partitions. If the length exceeds the limit, we switch to sequential execution mode.	2020-10-02 13:39:34 +03:00
Ahmet Gedemenli	70e9edb4f2	Add subplan test with insert	2020-10-01 13:58:55 +03:00
Jelte Fennema	13ef8252e7	Add broken distributed subplan test	2020-10-01 13:52:42 +03:00
Ahmet Gedemenli	3357eea46b	Add regression tests for PG13 WAL	2020-10-01 13:52:42 +03:00
Hanefi Önaldı	b0a2c1ee5c	Disallow volatile functions on single shard update queries We currently do not support volatile functions in update/delete statements because the function evaluation logic does not know how to distinguish volatile functions (that need to be evaluated per row) from stable functions (that need to be evaluated per query), and it is also not safe to push the volatile functions down on replicated tables.	2020-09-29 15:40:21 +03:00
Marco Slot	b905c8043d	Fix create index concurrently crash with local execution	2020-09-25 11:49:09 +02:00
Ahmet Gedemenli	abfb79bda6	Sort explain analyze output by task time Add sort method parameter for regression tests Fix check-style Change sorting method parameters to enum Polish Add task fields to OutTask Add test into multi_explain Fix isolation test	2020-09-24 11:38:40 +03:00
Onur Tirtir	64d5ac6a10	Do not downgrade if a citus local table exists (#4174 ) As the previous versions of Citus don't know how to handle citus local tables, we should prevent downgrading from 9.5 to older versions if any citus local tables exists.	2020-09-22 14:19:50 +03:00
Onder Kalaci	5d017cd123	Improve node metadata when coordinator is added Coordinator should always be active, hasmetadata and metadatasynced. Prevent changing those fields.	2020-09-21 14:53:41 +02:00
Onder Kalaci	6fc1dea85c	Improve the robustness of function call delegation Pushing down the CALLs to the node that the CALL is executed is dangerous and could lead to infinite recursion. When the coordinator added as worker, Citus was by chance preventing this. The coordinator was marked as "not metadatasynced" node in pg_dist_node, which prevented CALL/function delegation to happen. With this commit, we do the following: - Fix metadatasynced column for the coordinator on pg_dist_node - Prevent pushdown of function/procedure to the same node that the function/procedure is being executed. Today, we do not sync pg_dist_object (e.g., distributed functions metadata) to the worker nodes. But, even if we do it now, the function call delegation would prevent the infinite recursion.	2020-09-21 14:53:30 +02:00
SaitTalhaNisanci	dae2c69fd7	Not allow removing a single node with ref tables (#4127 ) * Not allow removing a single node with ref tables We should not allow removing a node if it is the only node in the cluster and there is data on it. We have this check for distributed tables but we didn't have it for reference tables. * Update src/test/regress/expected/single_node.out Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com> * Update src/test/regress/sql/single_node.sql Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2020-09-18 15:35:59 +03:00
Ahmet Gedemenli	1cf11b4632	Shorten insert_select_connection_leak_test	2020-09-18 10:07:15 +03:00
Önder Kalacı	8d3f353746	Add more tests for single node citus - distributed tables (#4166 )	2020-09-17 17:50:35 +02:00
Marco Slot	c9d46c618b	Fix EXPLAIN ANALYZE truncation	2020-09-17 14:42:21 +02:00
Onur Tirtir	d81559b7f8	Use "table" instead of "reference table" in sequential truncate log (#4164 ) We might get this debug message for citus local tables as well	2020-09-17 14:37:36 +03:00
Onur Tirtir	4118560b75	Prevent citus local table creation from a catalog table (#4158 )	2020-09-15 14:30:48 +03:00
Önder Kalacı	e7079d1384	Add orderbys to some tests (#4162 )	2020-09-14 16:59:22 +02:00
Marco Slot	b82f6ee163	Add tests for distributing catalog tables	2020-09-10 04:46:11 +02:00
Marco Slot	bd12555b16	Fix distributing tables owned by extensions	2020-09-10 04:46:11 +02:00
Onur Tirtir	9a56c22917	Add udf tests with citus local tables (#4154 )	2020-09-11 12:36:53 +03:00
Onur Tirtir	3a73fba810	Apply planner changes for citus local tables	2020-09-09 11:51:18 +03:00
Onur Tirtir	a58a4395ab	Extend citus local table utility command support This commit brings following features: Foreign key support from citus local tables to reference tables * Foreign key support from reference tables to citus local tables (only with RESTRICT & NO ACTION behavior) * ALTER TABLE ENABLE/DISABLE trigger command support * CREATE/DROP/ALTER trigger command support and disallows: * ALTER TABLE ATTACH/DETACH PARTITION commands * CREATE TABLE <postgres table> ATTACH PARTITION <citus local table> commands * Foreign keys from postgres tables to citus local tables (the other way was already disallowed) for citus local tables.	2020-09-09 11:50:55 +03:00
Onur Tirtir	17cc810372	Implement "citus local table" creation logic	2020-09-09 11:50:48 +03:00
Onur Tirtir	ba208eae4d	Record non-distributed table accesses in local executor (#4139 )	2020-09-07 18:19:08 +03:00
Hanefi Önaldı	024d398cd7	Allow distribution of functions that read from reference tables create_distributed_function(function_name, distribution_arg_name, colocate_with text) This UDF did not allow colocate_with parameters when no distribution_arg_name was supplied. This commit changes the behaviour to allow missing distribution_arg_name parameters when the function should be colocated with a reference table.	2020-09-01 07:28:34 +03:00
SaitTalhaNisanci	20c39fae9a	Loosen the requirement to pushdown a subquery with ref tables (#4110 ) AllTargetExpressionsAreColumnReferences would return false if a query had an entry that is referencing the outer query. It seems safe to not have this for non-distributed tables, such as reference tables. We already have separate checks for other cases such as having limits.	2020-08-14 12:11:15 +03:00
Hadi Moshayedi	7b74eca22d	Support EXPLAIN EXECUTE ANALYZE.	2020-08-10 13:44:30 -07:00
Philip Dubé	212ae7163f	Fix non deterministic collation test to work with ancient libicu versions CentOS 7's libicu is too old for und-u-ks-level2 @colStrength=secondary works with both older & newer versions of libicu	2020-08-07 12:34:32 +00:00
Halil Ozan Akgul	375310b7f1	Adds support for table undistribution	2020-08-05 14:36:03 +03:00
Sait Talha Nisanci	d68bfc5687	Improve error for index operator class parameters The error message when index has opclassopts is improved and the commit from postgres side is also included for future reference. Also some minor style related changes are applied.	2020-08-04 15:38:13 +03:00
Sait Talha Nisanci	288aa58603	add alternative out for pg13 test	2020-08-04 15:38:13 +03:00
Sait Talha Nisanci	d0b0c88920	Changelog: error out if index has opclassopts Error out if index has opclassopts. Changelog entry on PG13: Allow CREATE INDEX to specify the GiST signature length and maximum number of integer ranges (Nikita Glukhov)	2020-08-04 15:38:13 +03:00
Sait Talha Nisanci	f7a1971361	Changelog: Alter type options It seems that we don't support propagating commands related to base types. Therefore Alter TYPE options doesn't seem to apply to us. I have added a test to verify that we don't propagate them. Changelog entry on pg13: Add ALTER TYPE options useful for extensions, like TOAST and I/O functions control (Tomas Vondra, Tom Lane)	2020-08-04 15:38:11 +03:00
Sait Talha Nisanci	00633165fc	Changelog: Test unicode escapes Unicode escapes work as expected, related tests are added. Changelog entry on PG13: Allow Unicode escapes, e.g., E'\u####', U&'\####', to specify any character available in the database encoding, even when the database encoding is not UTF-8 (Tom Lane)	2020-08-04 15:36:30 +03:00
Sait Talha Nisanci	79dcb80140	Changelog: Test IS NORMALIZED for pg13 Tests for is_normalized and normalized are added. One thing that seems to be caused by an existing bug is that when we don't give the second argument to normalize or is_normalized, which is optional, it crashes. Because in the executor part, in the expression we don't have the default argument. Changelog entry in PG-13: Add SQL functions NORMALIZE() to normalize Unicode strings, and IS NORMALIZED to check for normalization (Peter Eisentraut) Commit on Postgres: 2991ac5fc9b3904ca4582be6d323497d7c3d17c9	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	ebabca16b7	Changelog: Test row suffix notation It seems that row suffix notation is working fine with our code, a test is added. Changelog entry in PG13: Allow ROW values to have their members extracted with suffix notation (Tom Lane)	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	275ccd0400	Changelog: Test that alter view rename column works Changelog entry in PG13: Add ALTER VIEW syntax to rename view columns (Fujii Masao)	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	920d7211e4	Changelog: Test that we error out for DROP EXPRESSION PG13 now supports dropping expression from a column such as generated columns. We error out with this currently. Changelog entry in postgres: Add ALTER TABLE clause DROP EXPRESSION to remove generated properties from columns (Peter Eisentraut)	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	87088d92bc	Changelog: handle VACUUM PARALLEL option Postgres 13 added a new VACUUM option, PARALLEL. It is now supported in our code as well. Relevant changelog message on postgres: Allow VACUUM to process indexes in parallel (Masahiko Sawada, Amit Kapila)	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	1070828465	update cte inline output for pg13 Make some macros in version_compat more robust Remove commented code in ruleutils Remove unnecessary variable assignments	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	ff7a563c57	decrease log level to debug1 to prevent flaky debug	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	6ff4e42706	Add alternative output for multi_function_in_join With pg13, constant functions from the "FROM" clause are replaced. This means that on the citus side, we will see the constraints in restriction info, instead of the function call. For example: SELECT * FROM table1 JOIN add(3,5) sum ON (id = sum) ORDER BY id ASC; Assuming that the function `add` returns constant, it will be evaluated on postgres side. This means that this query will be routable because there will be only one shard after pruning with the restrictions. However before pg13, this would be multi shard query. And it would go into recursive planning, the function would be evaluated on the coordinator because it can be. This means that with pg13, users will need to distribute the function because when it is routable executable, it will currently also send the function call to the worker in the query. So the function should exist in the worker. It could be better to replace the constant in the query tree as well so that the query string sent to the worker has the constant value and therefore it doesn't need the function. However I feel like users would already have the function in workers if they have any multi shard query. Commit on Postgres side: 7266d0997dd2a0632da38a594c78e25ff21df67e	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	c5c9ec288f	fix multi_mx_create_table test	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	80d2bc2317	normalize some output and sort test result	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	0f6c21d418	sort result in ch_bench_having_mx test	2020-08-04 15:10:22 +03:00
Onder Kalaci	eeb8c81de2	Implement shared connection count reservation & enable `citus.max_shared_pool_size` for COPY With this patch, we introduce `locally_reserved_shared_connections.c/h` files which are responsible for reserving some space in shared memory counters upfront. We sometimes need to reserve connections, but not necessarily establish them. For example: - COPY command should reserve connections as it cannot know which connections it needs in which order. COPY establishes connections as any input data hits the workers. For example, for router COPY command, it only establishes 1 connection. As discussed here (https://github.com/citusdata/citus/pull/3849#pullrequestreview-431792473), COPY needs to reserve connections up-front, otherwise we can end up with resource starvation/un-detected deadlocks.	2020-08-03 18:51:40 +02:00
nukoyluoglu	38987431e7	propagation of CHECK statements to workers with parentheses (#4039 ) * ensure propagation of CHECK statements to workers with parentheses & adjust regression test outputs * add tests for distributing tables with simple CHECK constraints * added test for CHECK on bool variable	2020-07-27 15:08:37 +03:00
Benjamin Satzger	a35a15a513	Distribute custom aggregates with multiple arguments (#4047 ) Enable custom aggregates with multiple parameters to be executed on workers. #2921 introduces distributed execution of custom aggregates. One of the limitations of this feature is that only aggregate functions with a single aggregation parameter can be pushed to worker nodes. Aim of this change is to remove that limitation and support handling of multi-parameter aggregates. Resolves: #3997 See also: #2921	2020-07-24 15:16:00 -07:00
Halil Ozan Akgul	38b72ddd66	Fixes create index concurrently bug	2020-07-24 12:14:14 +03:00
Halil Ozan Akgül	e9f89ed651	Fixes the non existing table bug (#4058 )	2020-07-23 18:01:21 +03:00
Sait Talha Nisanci	4308d867d9	remove task-tracker in comments, documentation	2020-07-21 16:21:01 +03:00
Hanefi Önaldı	e534dbae4a	Accept list of values in a supported ALTER ROLE .. SET statement Some GUCs support a list of values which is indicated by GUC_LIST_INPUT flag. When an ALTER ROLE .. SET statement is executed, the new configuration default for affected users and databases are stored in the setconfig(text[]) column in a pg_db_role_setting record. If a GUC that supports a list of values is used in an ALTER ROLE .. SET statement, we need to split the text into items delimited by commas.	2020-07-21 03:49:57 +03:00
Nils Dijk	00a4a15d95	fix sorting on string literal (#4045 ) As noted by Talha https://github.com/citusdata/citus/pull/4029#issuecomment-660466972 there was still some sort order flappiness in the test. The root cause is that sorting on `1::text` sorts on the literal `'1'` which causes sorting to be nondeterministic. This behaviour is consistent with Postgres' behaviour, so no bug on Citus' side.	2020-07-20 17:39:27 +02:00
Onder Kalaci	c25de2cf22	Remove flag from As it doesn't make any sense anymore	2020-07-20 12:45:05 +02:00
SaitTalhaNisanci	b3af63c8ce	Remove task tracker executor (#3850 ) * use adaptive executor even if task-tracker is set * Update check-multi-mx tests for adaptive executor Basically repartition joins are enabled where necessary. For parallel tests max adaptive executor pool size is decreased to 2, otherwise we would get too many clients error. * Update limit_intermediate_size test It seems that when we use adaptive executor instead of task tracker, we exceed the intermediate result size less in the test. Therefore updated the tests accordingly. * Update multi_router_planner It seems that there is one problem with multi_router_planner when we use adaptive executor, we should fix the following error: +ERROR: relation "authors_range_840010" does not exist +CONTEXT: while executing command on localhost:57637 * update repartition join tests for check-multi * update isolation tests for repartitioning * Error out if shard_replication_factor > 1 with repartitioning As we are removing the task tracker, we cannot switch to it if shard_replication_factor > 1. In that case, we simply error out. * Remove MULTI_EXECUTOR_TASK_TRACKER * Remove multi_task_tracker_executor Some utility methods are moved to task_execution_utils.c. * Remove task tracker protocol methods * Remove task_tracker.c methods * remove unused methods from multi_server_executor * fix style * remove task tracker specific tests from worker_schedule * comment out task tracker udf calls in tests We were using task tracker udfs to test permissions in multi_multiuser.sql. We should find some other way to test them, then we should remove the commented out task tracker calls.
* remove task tracker test from follower schedule * remove task tracker tests from multi mx schedule * Remove task-tracker specific functions from worker functions * remove multi task tracker extra schedule * Remove unused methods from multi physical planner * remove task_executor_type related things in tests * remove LoadTuplesIntoTupleStore * Do initial cleanup for repartition leftovers During startup, task tracker would call TrackerCleanupJobDirectories and TrackerCleanupJobSchemas to clean up leftover directories and job schemas. With adaptive executor, while doing repartitions it is possible to leak these things as well. We don't retry cleanups, so it is possible to have leftover in case of errors. TrackerCleanupJobDirectories is renamed as RepartitionCleanupJobDirectories since it is repartition specific now, however TrackerCleanupJobSchemas cannot be used currently because it is task tracker specific. The thing is that this function is a no-op currently. We should add cleaning up intermediate schemas to DoInitialCleanup method when that problem is solved(We might want to solve it in this PR as well) * Revert "remove task tracker tests from multi mx schedule" This reverts commit `03ecc0a681`. * update multi mx repartition parallel tests * not error with task_tracker_conninfo_cache_invalidate * not run 4 repartition queries in parallel It seems that when we run 4 repartition queries in parallel we get too many clients error on CI even though we don't get it locally. Our guess is that, it is because we open/close many connections without doing some work and postgres has some delay to close the connections. Hence even though connections are removed from the pg_stat_activity, they might still not be closed. 
If the above assumption is correct, it is unlikely for it to happen in practice because: - There is some network latency in clusters, so this leaves some times for connections to be able to close - Repartition joins return some data and that also leaves some time for connections to be fully closed. As we don't get this error in our local, we currently assume that it is not a bug. Ideally this wouldn't happen when we get rid of the task-tracker repartition methods because they don't do any pruning and might be opening more connections than necessary. If this still gives us "too many clients" error, we can try to increase the max_connections in our test suite(which is 100 by default). Also there are different places where this error is given in postgres, but adding some backtrace it seems that we get this from ProcessStartupPacket. The backtraces can be found in this link: https://circleci.com/gh/citusdata/citus/138702 * Set distributePlan->relationIdList when it is needed It seems that we were setting the distributedPlan->relationIdList after JobExecutorType is called, which would choose task-tracker if replication factor > 1 and there is a repartition query. However, it uses relationIdList to decide if the query has a repartition query, and since it was not set yet, it would always think it is not a repartition query and would choose adaptive executor when it should choose task-tracker. * use adaptive executor even with shard_replication_factor > 1 It seems that we were already using adaptive executor when replication_factor > 1. So this commit removes the check. 
* remove multi_resowner.c and deprecate some settings * remove TaskExecution related leftovers * change deprecated API error message * not recursively plan single relation repartition subquery * recursively plan single relation repartition subquery * test deprecated task tracker functions * fix overlapping shard intervals in range-distributed test * fix error message for citus_metadata_container * drop task-tracker deprecated functions * put the implementation back to worker_cleanup_job_schema_cache since citus cloud uses it * drop some functions, add downgrade script Some deprecated functions are dropped. Downgrade script is added. Some gucs are deprecated. A new guc for repartition joins bucket size is added. * order by a test to fix flappiness	2020-07-18 13:11:36 +03:00
Nils Dijk	23d44eba9f	fix flappy tests due to nondeterministic order of test output (#4029 ) As reported on #4011 https://github.com/citusdata/citus/pull/4011/files#r453804702 some of the tests were flapping due to a nondeterministic order for test outputs. This PR makes the test output ordered for all tests returning non-zero rows. Needs to be backported to 9.2, 9.3, 9.4	2020-07-14 15:47:29 +02:00
SaitTalhaNisanci	ab5be77709	test coordinator reference-distributed table join (#3698 )	2020-07-14 11:43:03 +03:00
Sait Talha Nisanci	1b5ed45a58	add multi follower repartition tests	2020-07-13 19:50:50 +03:00
Sait Talha Nisanci	510535f558	address feedback	2020-07-13 19:45:02 +03:00
Sait Talha Nisanci	41ec76a6ad	use ActiveReadableNodeList in JobExecutorType and task tracker The reason we should use ActiveReadableNodeList instead of ActiveReadableNonCoordinatorNodeList is that if coordinator is added to cluster as a worker, it should be counted as well. Otherwise if there is only coordinator in the cluster, the count will be 0, hence we get a warning. In MultiTaskTrackerExecute, we should connect to coordinator if it is added to the cluster because it will also be assigned tasks.	2020-07-13 19:45:02 +03:00
Sait Talha Nisanci	d97d03ec65	use ActivePrimaryNodeList to include coordinator ActiveReadableWorkerNodeList doesn't include coordinator, however if coordinator is added as a worker, we should also include that while planning. The current methods are very easily misusable and this requires a refactoring to make the distinction between methods that include coordinator and that don't very explicit as they can introduce subtle/major bugs pretty easily.	2020-07-13 19:20:15 +03:00
Sait Talha Nisanci	db1b78148c	send schema creation/cleanup to coordinator in repartitions We were using ALL_WORKERS TargetWorkerSet while sending temporary schema creation and cleanup. We (well, mostly I) thought that ALL_WORKERS would also include coordinator when it is added as a worker. It turns out that it was FILTERING OUT the coordinator even if it is added as a worker to the cluster. So to have some context here, in repartitions, for each jobId we create (at least we were supposed to) a schema in each worker node in the cluster. Then we partition each shard table into some intermediate files, which is called the PARTITION step. So after this partition step each node has some intermediate files having tuples in those nodes. Then we fetch the partition files to necessary worker nodes, which is called the FETCH step. Then from the files we create intermediate tables in the temporarily created schemas, which is called a MERGE step. Then after evaluating the result, we remove the temporary schemas (one for each job ID in each node) and files. If node 1 has file1, and node 2 has file2 after PARTITION step, it is enough to either move file1 from node1 to node2 or vice versa. So we prune one of them. In the MERGE step, if the schema for a given jobID doesn't exist, the node tries to use the `public` schema if it is a superuser, which was actually added for testing in the past. So when we were not sending schema creation commands for each job ID to the coordinator (because we were using ALL_WORKERS flag, and it doesn't include the coordinator), we would basically not have any schemas for repartitions in the coordinator. The PARTITION step would be executed on the coordinator (because the tasks are generated in the planner part) and it wouldn't give us any error because it doesn't have anything to do with the temporary schemas (that we didn't create).
But later two things would happen: - If by chance the fetch is pruned on the coordinator side, the other nodes would fetch the partitioned files from the coordinator and execute the query as expected, because it has all the information. - If the fetch tasks are not pruned in the coordinator, in the MERGE step, the coordinator would either error out saying that the necessary schema doesn't exist, or it would try to create the temporary tables under public schema (if it is a superuser). But then if we had the same task ID with different jobID it would fail saying that the table already exists, which is an error we were getting. In the first case, the query would work okay, but it would still not do the cleanup, hence we would leave the partitioned files from the PARTITION step there. Hence ensure_no_intermediate_data_leak would fail. To make things more explicit and prevent such bugs in the future, ALL_WORKERS is renamed to ALL_NON_COORD_WORKERS. And a new flag to return all the active nodes is added as ALL_DATA_NODES. For repartition case, we don't use the only-reference table nodes but this version makes the code simpler and there shouldn't be any significant performance issue with that.	2020-07-13 19:20:15 +03:00
Nils Dijk	449d1f0e91	force aliases in deparsing for queries with anonymous column references (#4011 ) DESCRIPTION: Force aliases in deparsing for queries with anonymous column references Fixes: #3985 The root cause has to do with discrepancies in the query tree we create. I think in the future we should spend some time on categorising all changes we made to ruleutils and see if we can change the data structure `query` we pass to the deparser to have an actual valid postgres query for the deparser to render. For now the fix is to keep track, besides changing the names of the entries in the target list, also if we have a reference to an anonymous column. If there are anonymous columns we set the `printaliases` flag to true which forces the deparser to add the aliases.	2020-07-13 16:29:24 +02:00
Hadi Moshayedi	3651fc64ee	Fix Subtransaction memory leak	2020-07-09 12:33:39 -07:00
Jelte Fennema	16242d5264	Fix write queries with const expressions and COLLATE in various places (#3973 )	2020-07-08 18:19:53 +02:00
Jelte Fennema	ab01571c9e	Fix crash with single node dummy placement (#3993 ) Static analysis found an issue where we could dereference `NULL`, because `CreateDummyPlacement` could return `NULL` when there were no workers. This PR changes it so that it never returns `NULL`, which was intended by @marcocitus when doing this change: https://github.com/citusdata/citus/pull/3887/files#r438136433 While adding tests for citus on a single node I also added some more basic tests and it turns out we error out on repartition joins. This has been present since `shouldhaveshards` was introduced and is not trivial to fix. So I created a separate issue for this: https://github.com/citusdata/citus/issues/3996	2020-07-08 17:11:25 +02:00
Philip Dubé	444472ffc6	ruleutils: use get_rtable_name for deparsing resultRelation	2020-07-07 12:20:41 +00:00
Marco Slot	b4fec63bc0	Rename master evaluation to coordinator evaluation	2020-07-07 10:37:41 +02:00
Jelte Fennema	8ab47f4f37	Add a CI check to see if all tests are part of a schedule (#3959 ) I recently forgot to add tests to a schedule in two of my PRs. One of these was caught by review, but the other one was not. This adds a script that causes CI to ensure that each test in the repo is included in at least one schedule. Three tests were found that were currently not part of a schedule. This PR adds those three tests to a schedule as well and it also fixes some small issues with these tests.	2020-07-03 11:34:55 +02:00
Onur Tirtir	be17ebb334	Bump citus version to 9.5devel	2020-07-01 14:46:55 +03:00
Hanefi Önaldı	ca2ececb3b	Downgrade path from 9.4 to 9.3 to 9.2	2020-07-01 10:38:11 +03:00
Sait Talha Nisanci	e5a21f07cb	test aggregates with expressions	2020-06-30 11:41:16 -07:00
Jelte Fennema	392c5e2c34	Fix wrong cancellation message about distributed deadlocks (#3956 )	2020-06-30 14:57:46 +02:00
Jelte Fennema	02fa942be1	Fix assertion error when rolling back to savepoint (#3868 ) It was possible to get an assertion error, if a DML command was cancelled that opened a connection and then "ROLLBACK TO SAVEPOINT" was used to continue the transaction. The reason for this was that canceling the transaction might leave the `claimedExclusively` flag on for (some of) its connections. This caused an assertion failure because `CanUseExistingConnection` would return false and a new connection would be opened, and then there would be two connections doing DML for the same placement, which is disallowed. That this situation caused an assertion failure instead of an error, means that without asserts this could possibly result in some visibility bugs, similar to the ones described https://github.com/citusdata/citus/issues/3867	2020-06-30 11:31:46 +02:00
Hadi Moshayedi	cd25a27174	Fix crash caused by EXPLAIN EXECUTE INSERT ... SELECT	2020-06-25 08:55:48 -07:00
Hadi Moshayedi	4e8d79998e	Save INSERT/SELECT method in DistributedPlan. This is so we don't need to calculate it twice in insert_select_executor.c and multi_explain.c, which can cause discrepancy if an update in one of them is not reflected in the other site.	2020-06-25 08:55:48 -07:00
Jelte Fennema	64506143e4	Replace flaky repartition analyze test with a non flaky one (#3950 ) The flaky test was introduced in #3941. This removes that flaky test and adds a new one that fails in the same manner when removing the fix in #3941. An example of a random failure can be found here: https://app.circleci.com/pipelines/github/citusdata/citus/9558/workflows/de76e7a5-6558-46c9-97e7-8b1dae1f173b/jobs/135876/steps	2020-06-25 15:19:15 +02:00
SaitTalhaNisanci	50e115fe3a	test task tracker repartition with replication >1 (#3944 )	2020-06-24 14:54:20 +03:00
SaitTalhaNisanci	f458d1fd1c	Fix/task execution (#3941 ) * Not set TaskExecution with adaptive executor Adaptive executor is using a utility method from task tracker for repartition joins, however adaptive executor doesn't need taskExecution. It is only used by task tracker. This causes a problem when explain analyze is used because what taskExecution is pointing to might be random. We solve this by not setting taskExecution from adaptive executor. So it will stay NULL as set by CreateTask. * use same memory context as task for taskExecution Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2020-06-24 12:10:00 +03:00
Philip Dubé	cd0b2ad5b5	citus_evaluate_expression: call expand_function_arguments beforehand to avoid segfaulting on implicit parameters	2020-06-23 18:06:46 +00:00
Jelte Fennema	0259815d3a	Fix EXPLAIN ANALYZE received data counter issues (#3917 ) In #3901 the "Data received from worker(s)" sections were added to EXPLAIN ANALYZE. After merging @pykello posted some review comments. This addresses those comments as well as fixing other issues that I found while addressing them. The things this does: 1. Fix `EXPLAIN ANALYZE EXECUTE p1` to not increase received data on every execution 2. Fix `EXPLAIN ANALYZE EXECUTE p1(1)` to not always return 0 bytes as received data. 3. Move `EXPLAIN ANALYZE` specific logic to `multi_explain.c` from `adaptive_executor.c` 4. Change naming of new explain sections to `Tuple data received from node(s)`. Firstly because a task can reference the coordinator too, so "worker(s)" was incorrect. Secondly to indicate that this is tuple data and not all network traffic that was performed. 5. Rename `totalReceivedData` in our codebase to `totalReceivedTupleData` to make it clearer that it's a tuple data counter, not all network traffic. 6. Actually add `binary_protocol` test to `multi_schedule` (woops) 7. Fix a randomly failing test in `local_shard_execution.sql`.	2020-06-17 11:33:38 +02:00
Jelte Fennema	799bfdab56	Temporarily disable connection leak tests that fail a lot (#3911 ) MX connection leak failures: 1. https://app.circleci.com/pipelines/github/citusdata/citus/9296/workflows/e36d1088-662a-4f60-acec-293132632c2f/jobs/131908/steps 2. https://app.circleci.com/pipelines/github/citusdata/citus/9258/workflows/37659d82-2c5b-495e-b0e7-905811e30444/jobs/131299 Failure connection leak failures: 1. https://app.circleci.com/pipelines/github/citusdata/citus/9297/workflows/c0ebc326-8c93-468f-8b70-f470bd492fb9/jobs/131920 2. https://app.circleci.com/pipelines/github/citusdata/citus/9283/workflows/9af154d0-ff96-4c5d-ae19-81faae1e0c18/jobs/131668	2020-06-16 13:48:48 +02:00
Jelte Fennema	927de6d187	Show amount of data received in EXPLAIN ANALYZE (#3901 ) Sadly this does not actually work yet for binary protocol data, because when doing EXPLAIN ANALYZE we send two commands at the same time. This means we cannot use `SendRemoteCommandParams`, and thus cannot use the binary protocol. This can still be useful though when using the text protocol, to find out that a lot of data is being sent.	2020-06-15 16:01:05 +02:00
Hadi Moshayedi	ef778c1cd7	address feedback from Sait Talha & Hadi	2020-06-12 18:36:02 -07:00
Marco Slot	24feadc230	Handle joins between local/reference/cte via router planner	2020-06-12 18:36:01 -07:00
Halil Ozan Akgül	8c5eb6b7ea	Insert Select Into Local Table (#3870 ) * Insert select with master query * Use relid to set custom_scan_tlist varno * Reviews * Fixes null check Co-authored-by: Marco Slot <marco.slot@gmail.com>	2020-06-12 17:06:31 +03:00
Jelte Fennema	0e12d045b1	Support use of binary protocol in between nodes (#3877 ) This can save a lot of data to be sent in some cases, thus improving performance for which inter query bandwidth is the bottleneck. There's some issues with enabling this as default, so that's currently not done.	2020-06-12 15:02:51 +02:00
Nils Dijk	da8f2b0134	Feature: tdigest aggregate (#3897 ) DESCRIPTION: Adds support to partially push down tdigest aggregates tdigest extension: https://github.com/tvondra/tdigest This PR implements the partial pushdown of tdigest calculations when possible. The extension adds a tdigest type which can be combined into the same structure. There are several aggregate functions that can be used to get: - a quantile - a list of quantiles - the quantile of a hypothetical value - a list of quantiles for a list of hypothetical values These functions can work both on values or tdigest types. Since we can create tdigest values either by combining them, or based on a group of values we can rewrite the aggregates in such a way that most of the computation gets delegated to the compute on the shards. This both speeds up the percentile calculations because the values don't have to be sorted while at the same time making the transfer size from the shards to the coordinator significantly less.	2020-06-12 13:50:28 +02:00
Philip Dubé	1722d8ac8b	Allow routing modifying CTEs We still recursively plan some cases, eg: - INSERTs - SELECT FOR UPDATE when reference tables in query - Everything must be same single shard & replication model	2020-06-11 15:14:06 +00:00
Hadi Moshayedi	0e3140c14d	Include execution duration in worker_last_saved_explain_analyze	2020-06-11 02:54:54 -07:00
Hadi Moshayedi	7c52c6edb0	CTE statistics in EXPLAIN ANALYZE	2020-06-11 02:39:59 -07:00
Hadi Moshayedi	1f6d6ee4a5	Show query text in EXPLAIN output	2020-06-11 02:19:55 -07:00
Hadi Moshayedi	bb96ef5047	Does the EXPLAIN ANALYZE at the same time as execution, so avoids executing twice. We wrap worker tasks in worker_save_query_explain_analyze() so we can fetch their explain output later by a call to worker_last_saved_explain_analyze(). Fixes #3519 Fixes #2347 Fixes #2613 Fixes #621	2020-06-11 01:55:57 -07:00
Hadi Moshayedi	6ca621bd16	Test we don't support multi-shard EXPLAIN EXECUTE	2020-06-10 17:11:27 -07:00
Onder Kalaci	640717bea2	Copy doesn't use more than MaxAdaptiveExecutor Co-authored-by: Hanefi Önaldı <Hanefi.Onaldi@Microsoft.com>	2020-06-10 16:46:21 +03:00
Jelte Fennema	b87bae71bb	Error out when using different users in the same transaction (#3869 ) Fixes #3867 As described in the issue above we return incorrect results when changing user within a transaction. This causes us to error out instead.	2020-06-10 14:07:40 +02:00
Onder Kalaci	06461ca55f	Coerce types properly for INSERT Also, unify similar code-paths to rely on more accurate function.	2020-06-10 10:40:28 +02:00
Hadi Moshayedi	5cdfa9f571	Implement EXPLAIN ANALYZE udfs. Implements worker_save_query_explain_analyze and worker_last_saved_explain_analyze. worker_save_query_explain_analyze executes and returns results of query while saving its EXPLAIN ANALYZE to be fetched later. worker_last_saved_explain_analyze returns the saved EXPLAIN ANALYZE result.	2020-06-09 10:02:05 -07:00
Hadi Moshayedi	45a41e249f	Test EXPLAIN ANALYZE doesn't show repartition join tasks	2020-06-06 23:24:45 -07:00
Hadi Moshayedi	02cff1a7c6	Test that EXPLAIN ANALYZE is not supported for some forms of INSERT/SELECT	2020-06-06 23:24:45 -07:00
Onur Tirtir	dfcc18468c	Error out for unsupported trigger objects Error out if creating a citus table from a table having triggers. Error out for CREATE TRIGGER commands that are run on citus tables.	2020-05-31 23:10:01 +03:00
MoYi	9e1f198155	Fix composite create type deparsing to preserve typmod	2020-05-15 13:12:54 +00:00
SaitTalhaNisanci	cf98b9d6d5	not wait forever for metadata sync in tests (#3760 ) We shouldn't wait forever for metadata sync in tests, otherwise when a test gets stuck, we don't know which line causes the problem.	2020-05-14 10:51:24 +03:00
Nils Dijk	105de7beb8	Fix for pruned target list entries (#3818 ) DESCRIPTION: Ignore pruned target list entries in coordinator plan The postgres planner has the ability to prune target list entries that are proven not used in the output relation. When this happens at the `CitusCustomScan` boundary we need to _not_ return these pruned columns to not upset the rest of the planner. By using the target list the planner asks us to return we fix issues that lead to Assertion failures, and potentially could be runtime errors when they hit in a production build. Fixes #3809	2020-05-06 13:56:02 +02:00
Marco Slot	6ce2803777	Make sure we don't wrap GROUP BY expressions in any_value	2020-05-05 05:12:45 +02:00
Onder Kalaci	f9d4a9cf38	Remove assertion for subqueries in WHERE clause ANDed with FALSE In the code, we had the assumption that if restriction information is NULL, it means that we cannot have any distributed tables in the subquery. However, for subqueries in WHERE clause, that is not the case when the subquery is ANDed with FALSE. In that case, Citus operates on the originalQuery (which doesn't go through the standard_planner()), and relies on the restriction information generated by standard_planner(). As Postgres is smart enough to not generate restriction information for subqueries ANDed with FALSE, we hit the assertion.	2020-05-04 10:52:15 +02:00
Onder Kalaci	891d99efaf	add order by to some tests to make the output consistent	2020-05-01 12:41:51 +02:00
SaitTalhaNisanci	cbda951395	Fix task copy and appending empty task in ExtractLocalAndRemoteTasks (#3802 ) * Not append empty task in ExtractLocalAndRemoteTasks ExtractLocalAndRemoteTasks extracts the local and remote tasks. If we do not have a local task the localTaskPlacementList will be NIL, in this case we should not append anything to local tasks. Previously we would first check if a task contains a single placement or not, now we first check if there is any local task before doing anything. * fix copy of node task Task node has task query, which might contain a list of strings in its fields. We were using postgres copyObject for these lists. Postgres assumes that each element of list will be a node type. If it is not a node type it will error. As a solution to that, a new macro is introduced to copy a list of strings.	2020-04-29 11:05:34 +03:00
Philip Dubé	b6b3c1bc17	Fix COPY TO's COPY (SELECT) with distributed table having generated columns It's necessary to omit generated columns from output	2020-04-28 14:40:47 +00:00
Onder Kalaci	0cb7ab2d05	Explicitly mark queries in physical planner for [not] having parameters Physical planner doesn't support parameters. If the parameters have already been resolved when the physical planner handles the queries, mark it. The reason is that the executor is unaware of this, and sends the parameters along with the worker queries, which fails for composite types. (See `DissuadePlannerFromUsingPlan()` for the details of parameter resolving)	2020-04-24 12:49:43 +02:00
Onur Tirtir	2e927bd6b7	Bump Citus to 9.4devel (#3788 )	2020-04-22 12:50:00 +03:00
Hanefi Önaldı	e85b835065	Skip dependency setup on coordinator node	2020-04-21 12:06:31 +03:00
Onder Kalaci	e182215d96	Improve connection error message from the worker nodes We currently put the actual error message to the detail part. However, many drivers don't show detail part. As connection errors are somewhat common, and hard to trace back, we added the detail to the message itself. In addition to that, we changed "connection error" message, as it was confusing to the users who think that the error was happening while connecting to the coordinator. In fact, this error is showing up when the coordinator fails to connect to remote nodes.	2020-04-20 13:32:55 +02:00
Hadi Moshayedi	1250d691d3	Replicate reference tables before master_create_empty_shard	2020-04-17 16:47:03 -07:00
SaitTalhaNisanci	1d0f4bdcd2	invalidate plan cache in master_update_node (#3758 ) * invalidate plan cache in master_update_node If a plan is cached by postgres but a user uses master_update_node, then when the plan cache is used for the updated node, they will get the old nodename/nodeport in the plan. This is because the plan cache doesn't know about the master_update_node. This could be a problem in prepared statements or anything that goes into plancache. As a solution the plan cache is invalidated inside master_update_node. * add invalidate_inactive_shared_connections test function We introduce invalidate_inactive_shared_connections udf to be used in testing. It is possible that a connection count for an inactive node will be greater than 0 and in that case it will not be removed at the time of invalidation. However, later we don't have a mechanism to remove it, which means that it will stay in the hash. For this not to cause a problem, we use this udf in testing. * move invalidate_inactive_shared_connections to udfs from test as it will be used in mx * remove the test udf * remove the IsInactive check	2020-04-17 17:43:48 +03:00
Philip Dubé	e4a4707f4a	Avoid setting hasWindowFuncs true after window functions have been optimized out of query	2020-04-17 12:22:48 +00:00
SaitTalhaNisanci	a9a3be15cc	introduce TASK_QUERY_NULL task type (#3774 ) When we call SetTaskQueryString we would set the task type to TASK_QUERY_TEXT, and some parts of the codebase rely on the fact that if TASK_QUERY_TEXT is set, the data can be read safely. However if SetTaskQueryString is called with a NULL taskQueryString this can cause crashes. In that case taskQueryType will simply be set to TASK_QUERY_NULL.	2020-04-17 14:59:22 +03:00
Hanefi Önaldı	d535121f8d	Introduce truncate_local_data_after_distributing_table()	2020-04-17 13:21:34 +03:00
Nils Dijk	1d6ba1d09e	Refactor alter role to work on distributed roles (#3739 ) DESCRIPTION: Alter role only works for citus managed roles Alter role was implemented before we implemented good role management that hooks into the object propagation framework. This is a refactor of all alter role commands that have been implemented to - be on by default - only work for supported roles - make the citus extension owner a supported role Instead of distributing the alter role commands for roles at the beginning of the node activation role it now _only_ executes the alter role commands for all users in all databases and in the current database. In preparation of full role support small refactors have been done in the deparser. Earlier tests targeting other roles than the citus extension owner have been either slightly changed or removed to be put back where we have full role support. Fixes #2549	2020-04-16 12:23:27 +02:00
Hadi Moshayedi	59b9a4e5a1	Detect deadlocks in replicate_reference_tables()	2020-04-15 11:06:18 -07:00
SaitTalhaNisanci	df9048ebaa	update outdated comments related to local_execution (#3759 )	2020-04-15 16:15:43 +03:00
Marco Slot	8b83306a27	Issue worker messages with the same log level	2020-04-14 21:08:25 +02:00
Onder Kalaci	aa6b641828	Throttle connections to the worker nodes With this commit, we're introducing a new infrastructure to throttle connections to the worker nodes. This infrastructure is useful for multi-shard queries, router queries have not been affected by this. The goal is to prevent establishing more than citus.max_shared_pool_size number of connections per worker node in total, across sessions. To do that, we've introduced a new connection flag OPTIONAL_CONNECTION. The idea is that some connections are optional such as the second (and further connections) for the adaptive executor. A single connection is enough to finish the distributed execution, the others are useful to execute the query faster. Thus, they can be considered optional connections. When an optional connection is not allowed to the adaptive executor, it simply skips it and continues the execution with the already established connections. However, it'll keep retrying to establish optional connections, in case some slots are open again.	2020-04-14 10:27:48 +02:00
Hadi Moshayedi	2639a9a19d	Test master_copy_shard_placement errors on foreign constraints	2020-04-13 12:45:27 -07:00
Hadi Moshayedi	f9de734329	Ensure metadata is synced on ReplicateColocatedShardPlacement	2020-04-13 11:45:21 -07:00
SaitTalhaNisanci	2b2a146af4	update gitignores with new files in test folder (#3749 )	2020-04-13 17:09:18 +03:00
Philip Dubé	30f10984e1	Defer get_agg_clause_costs, it happens later & avoids errors	2020-04-10 13:26:05 +00:00
Halil Ozan Akgul	56e814a333	Adds public host to only hyperscale tests	2020-04-10 15:54:47 +03:00
Halil Ozan Akgul	d574ac33a8	Adds next shard ids to multi_create_table tests	2020-04-10 15:54:47 +03:00
Halil Ozan Akgul	a701fc774a	Adds multi_schedule_hyperscale schedule	2020-04-10 15:54:47 +03:00
Halil Ozan Akgul	c2edf989cf	Adds public host parameters	2020-04-10 13:04:24 +03:00
SaitTalhaNisanci	17373d51da	not wait forever in upgrade distributed function before (#3731 )	2020-04-10 09:43:42 +03:00
Marco Slot	a4b2197450	Correctly handle non-constant LIMIT/OFFSET clauses	2020-04-09 19:59:50 +00:00
SaitTalhaNisanci	24dcb02bca	enable local table join with reference table (#3697 ) * enable local table join with reference table * test different cases with local table and reference join	2020-04-09 15:25:54 +03:00
SaitTalhaNisanci	ebda3eff61	read database name inside the function (#3730 )	2020-04-09 13:11:13 +03:00
SaitTalhaNisanci	233e4a24d1	use local execution within transaction block (#3714 ) * use local execution when in a transaction block When we are inside a transaction block, there could be other methods that need local execution, therefore we will use local execution in a transaction block. * update test outputs with transaction block local execution * add a test to verify we dont leak intermediate schemas	2020-04-09 12:41:58 +03:00
SaitTalhaNisanci	fa88046ce1	test that we don't leak intermediate schemas (#3737 ) * test that we don't leak intermediate schemas We have tests to make sure that we don't leak any intermediate files, tables etc but we don't test if we are leaking schemas. It makes sense to test this as well. * remove all repartition schemas in case of error This solution is not an ideal one but it seems to be doing the job. We should have a more generic solution for the cleanup but it seems that putting the cleanup in the abort handler is dangerous and it was crashing.	2020-04-09 12:17:41 +03:00
Hadi Moshayedi	9b8802ba2d	Remove todo from reference_table_utils	2020-04-08 12:46:55 -07:00
Hadi Moshayedi	dda53a0bba	GUC for replicate reference tables on activate.	2020-04-08 12:42:45 -07:00
Hadi Moshayedi	c168a53ebc	Tests for replicate_reference_tables	2020-04-08 12:41:36 -07:00
Hadi Moshayedi	acfa850c38	Make multi_replicate_reference_table check-base friendly	2020-04-08 12:41:36 -07:00
Marco Slot	924cd7343a	Defer reference table replication to shard creation time	2020-04-08 12:41:36 -07:00
Önder Kalacı	70012dfd33	Do not error when an intermediate file does not exist (#3707 ) When the file does not exist, it could mean two different things. First -- and a lot more common -- case is that a failure happened in a concurrent backend on the same distributed transaction. And, one of the backends in that transaction has already been rolled back, which has already removed the file. If we throw an error here, the user might see this error instead of the actual error message. Instead, we prefer to WARN the user and pretend that the file has no data in it. In the end, the user would see the actual error message for the failure. Second, in case of any bugs in intermediate result broadcasts, we could try to read a non-existing file. That is most likely to happen during development. Thus, when asserts are enabled, we throw an error instead of WARNING so that the developers cannot miss it.	2020-04-07 17:06:55 +02:00
Onder Kalaci	a695b44ce9	Add new regression tests	2020-04-07 17:06:55 +02:00
Onder Kalaci	4b3d17f466	Make sure that tests are not failing randomly	2020-04-07 17:06:55 +02:00
Marco Slot	2632343f64	Fix intermediate result pruning for INSERT..SELECT	2020-04-07 11:07:49 +02:00
Marco Slot	84672c3dbd	Simplify intermediate result pruning logic	2020-04-07 10:53:29 +02:00
SaitTalhaNisanci	a710b3cdc5	fix null tupleStoreState case in ExecuteLocalTaskListExtended (#3711 ) In case we don't care about the tupleStoreState in ExecuteLocalTaskListExtended, it could be passed as null. In that case we will get a seg error. This changes it so that a dummy tuple store will be created when it is null. Do not use local execution in ExecuteTaskListOutsideTransaction. As we are going to run the tasks outside transaction, we shouldn't use local execution. However, there is some problem when using local execution related to repartition joins, when we solve that problem, we can execute the tasks coming to this path with local execution. Also logging the local command is simplified. normalize job id in worker_hash_partition_table in test outputs.	2020-04-07 11:47:09 +03:00
Philip Dubé	b01bae5937	Check connections from connection_placement before polling	2020-04-06 17:45:44 +00:00
SaitTalhaNisanci	cd3e499834	not log in debug level in null parameters (#3718 ) The purpose of null_parameters is to make sure that citus doesn't crash with null parameters. (The related issue is #3493.) The logs in this file are not that important and they are flaky. The flakiness is related to postgres part as well so it is hard to reproduce them. Therefore it makes sense to decrease the log level.	2020-04-06 17:59:46 +03:00
SaitTalhaNisanci	3d3605be80	simplify vacuum test and fix the flakiness (#3704 ) look at sent commands to simplify complex logic in vacuum test also normalize connection id as that can differ when we don't have to choose a specific connection.	2020-04-03 21:39:54 +03:00
SaitTalhaNisanci	32156dbf5c	fix flaky log statement in null_parameters (#3705 ) It seems that sometimes the pruning is deferred and sometimes not with this statement. What we care in this test is to see that it doesn't crash. I think we don't care about the log statement for this line. So it makes sense to not log this statement, and care about the result.	2020-04-03 17:01:59 +03:00
Hanefi Önaldı	d1223bd6cc	Remove migration paths to 9.3-1, introduce 9.3-2	2020-04-03 12:50:45 +03:00
SaitTalhaNisanci	710970407f	not wait forever in multi_extension test (#3702 )	2020-04-03 12:21:02 +03:00
SaitTalhaNisanci	659283c9a7	fix multi utilities vacuum test (#3699 )	2020-04-03 11:50:00 +03:00
Marco Slot	fd8cdb92f4	Evaluate nextval in the target list on the coordinator	2020-04-02 02:53:19 +02:00
SaitTalhaNisanci	df88ab71b6	normalize assign_distributed_transaction_id in tests	2020-04-01 18:23:16 +03:00
SaitTalhaNisanci	0aebd78ea7	use localExecution in ExecuteTaskListExtended ExecuteTaskListExtended is the common method for different codepaths, and instead of writing separate local execution logics in different codepaths, it makes more sense to have the logic here. We still need to do some refactoring, this is an initial step. After this commit, we can run create shard commands locally. There is a special case with shard creation commands. A create shard command might have a concatenated query string, however local execution did not know how to execute a task with multiple query strings. This is also implemented in this commit. We go over each query in the concatenated query string and plan/execute them one by one. A more clean solution to this would be to make sure that each task has a single query. We currently cannot do that because we need to ensure the task dependencies. However, it would make sense to do that at some point and it would simplify the code a lot.	2020-04-01 18:23:16 +03:00
Philip Dubé	3bb4f14efd	upgrade_type_after: ORDER BY	2020-04-01 01:07:21 +00:00
Philip Dubé	d155149c18	tests: remove stale comment, fix typo	2020-03-31 20:13:51 +00:00
Marco Slot	252abcce16	Allow table type to be used in target list	2020-03-31 11:11:01 -07:00
Marco Slot	331b45348c	Fix error when using LEFT JOIN with GROUP BY on primary key	2020-03-30 16:42:22 +02:00
Philip Dubé	67d2ad4e37	Fixes flaky test in multi_reference_table: ORDER BY (#3676 ) Fixes app.circleci.com/pipelines/github/citusdata/citus/7744/workflows/0848f36c-af9e-46b7-9dda-a421df54ba56/jobs/109503	2020-03-30 23:31:10 +02:00
Hanefi Onaldi	0e8103b101	Propagate ALTER ROLE .. SET statements In PostgreSQL, user defaults for config parameters can be changed by ALTER ROLE .. SET statements. We wish to propagate those defaults across the Citus cluster so that the behaviour will be similar in different workers. The defaults can either be set in a specific database, or the whole cluster, similarly they can be set for a single role or all roles. We propagate the ALTER ROLE .. SET if all the conditions below are met: - The query affects the current database, or all databases - The user is already created in worker nodes	2020-03-27 13:02:48 +03:00
Marco Slot	a65ffee266	Fixes a bug that causes some DML queries containing aggregates to fail	2020-03-26 16:08:34 +00:00
Marco Slot	b89e9dc158	Fix a bug which caused queries with SRFs and function evaluation to fail	2020-03-25 06:55:53 +01:00
Philip Dubé	917cb6ae93	Don't segfault on queries using GROUPING GROUPING will always return 0 outside of GROUPING SETS, CUBE, or ROLLUP Since we don't support those, it makes sense to reject GROUPING in queries	2020-03-25 15:46:43 +00:00
Philip Dubé	720525cfda	Add support for window functions on coordinator Some refactoring: Consolidate expression which decides whether GROUP BY/HAVING are pushed down Rename early pullUpIntermediateRows to hasNonDistributableAggregates Create WorkerColumnName to handle formatting WORKER_COLUMN_FORMAT Ignore NULL StringInfo pointers to SafeToPushdownWindowFunction Fix bug where SubqueryPushdownMultiNodeTree mutates supplied Query, SafeToPushdownWindowFunction requires the original query as it relies on rtable	2020-03-25 15:31:20 +00:00
Jelte Fennema	2aabe3e2ef	Mark all connections for shutdown when citus.node_conninfo chan… (#3642 ) We cache connections between nodes in our connection management code. This is good for speed. For security this can be a problem though. If the user changes settings related to TLS encryption they want those to be applied to future queries. This is especially important when they did not have TLS enabled before and now they want to enable it. This can normally be achieved by changing citus.node_conninfo. However, because connections are not reopened there will still be old connections that might not be encrypted at all. This commit changes that by marking all connections to be shutdown at the end of their current transaction. This way running transactions will succeed, even if placement requires connections to be reused for this transaction. But after this transaction completes any future statements will use a connection created with the new connection options. If a connection is requested and a connection is found that is marked for shutdown, then we don't return this connection. Instead a new one is created. This is needed to make sure that if there are no running transactions, then the next statement will not use an old cached connection, since connections are only actually shutdown at the end of a transaction.	2020-03-24 15:31:41 +01:00
Hadi Moshayedi	b46b9a68ae	Tests for master_copy_shard_placement	2020-03-23 08:33:55 -07:00
Marco Slot	ede176d849	Implement shard placement copying	2020-03-23 08:33:08 -07:00
Philip Dubé	dd2bd53e5b	PartiallyEvaluateExpression: Avoid unrecognized paramkind: 2	2020-03-23 14:14:01 +00:00
SaitTalhaNisanci	3df578010e	add a UDF to update colocation (#3623 ) If two tables have the same distribution column type, we implicitly colocate them. This is useful since colocation has a big performance impact in most applications. When a table is rebalanced, all of the colocated tables are also rebalanced. If table A and table B are colocated and we want to rebalance table A, table B will also be rebalanced. We need replica identity so that logical replication can replicate updates and deletes during rebalancing. If table B does not have a replica identity we error out. A solution to this is to introduce a UDF so that colocation can be updated. The remaining tables in the colocation group will stay colocated. For example, if tables A, B and C are colocated, then after updating table B's colocation, table A and table C stay colocated. The "updating colocation" step does not move any data around, it only updates pg_dist_partition and pg_dist_colocation tables. Specifically it creates a new colocation group for the table and updates the entry in pg_dist_partition while invalidating any cache.	2020-03-23 13:22:24 +03:00
SaitTalhaNisanci	9d2f3c392a	enable local execution in INSERT..SELECT and add more tests We can use local copy in INSERT..SELECT, so the check that disables local execution is removed. Also a test for local copy where the data size > LOCAL_COPY_FLUSH_THRESHOLD is added. use local execution with insert..select	2020-03-18 09:34:39 +03:00
SaitTalhaNisanci	42cfc4c0e9	apply review items log shard id in local copy and add more comments	2020-03-18 09:33:55 +03:00
SaitTalhaNisanci	c22068e75a	use the right partition for partitioned tables	2020-03-18 09:28:59 +03:00
SaitTalhaNisanci	1df9601e13	not use local copy if current transaction is connected to local group If current transaction is connected to local group we should not use local copy, because we might not see some of the changes that are made over the connection to the local group.	2020-03-18 09:28:59 +03:00
SaitTalhaNisanci	39bbec0f30	add tests for local copy execution	2020-03-18 09:28:59 +03:00
Nils Dijk	e5237b9e20	Fix left join shard pruning (#3569 ) DESCRIPTION: Fix left join shard pruning in pushdown planner Due to #2481 which moves outer join planning through the pushdown planner we caused a regression on the shard pruning behaviour for outer joins. In the pushdown planner we make a union of the placement groups for all shards accessed by a query based on the filters we see during planning. Unfortunately implicit filters for left joins are not available during this part. This causes the inner part of an outer join to not prune any shards away. When we take the union of the placement groups it shows the behaviour of not having any shards pruned. Since the inner part of an outer query will not return any rows if the outer part does not contain any rows we have observed we do not have to add the shard intervals of the inner part of an outer query to the list of shard intervals to query. Fixes: #3512	2020-03-13 15:20:45 +01:00
Onur Tirtir	a14739f808	Local execution of ddl/drop/truncate commands (#3514 ) * reimplement ExecuteUtilityTaskListWithoutResults for local utility command execution * introduce new functions for local execution of utility commands * change ErrorIfTransactionAccessedPlacementsLocally logic for local utility command execution * enable local execution for TRUNCATE command on distributed & reference tables * update existing tests for local utility command execution * enable local execution for DDL commands on distributed & reference tables * enable local execution for DROP command on distributed & reference tables * add normalization rules for cascaded commands * add new tests for local utility command execution	2020-03-13 15:39:32 +03:00
Philip Dubé	11b968bc30	Add runtime type checking to AGGREGATE_CUSTOM_COMBINE helper functions	2020-03-11 17:20:30 +00:00
Philip Dubé	4b68ee12c6	Also check aggregates in havingQual when scanning for non pushdownable aggregates Came across this while coming up with test cases, 'result "68_1" does not exist' I'll seek to address in a future PR, for now avoid segfault	2020-03-11 15:47:04 +00:00
Önder Kalacı	63ced3d901	Improve master evaluation tests (#3609 ) * Add third column to master_evaluation_modify table It was already added in some tests, but now make it globally applicable to the test file. * Add third column to master_evaluation_select table As we'll use the column in some tests * Add modify regression tests For the combinations of: local/remote, router/fast-path: - Distribution key is a const. - Contains a function - A column which is not dist. key is parametrized * Add select regression tests For the combinations of: local/remote, router/fast-path: - Distribution key is a const. - Contains a function - A column which is not dist. key is parametrized * Make some tests consistent to check-base	2020-03-11 15:38:08 +01:00
Onder Kalaci	7d787e3d5e	Prevent create_distributed_function() from the workers As this could cause weird edge cases.	2020-03-10 18:24:20 +01:00
Philip Dubé	81cfa05d3d	First phase of addressing HAVING subquery issues Add failing tests, make changes to avoid crashes at least Fix HAVING subquery pushdown ignoring reference table only subqueries, also include HAVING in recursive planning Given that we have a function IsDistributedTable which includes reference tables, it seems best to have IsDistributedTableRTE & QueryContainsDistributedTableRTE reflect that they do not include reference tables in their check Similarly SublinkList's name should reflect that it only scans WHERE contain_agg_clause asserts that we don't have SubLinks, use contain_aggs_of_level as suggested by pg sourcecode	2020-03-09 17:58:30 +00:00
Onder Kalaci	2ed19181fe	Improve definition of RelationInfoContainsOnlyRecurringTuples Before this commit, we considered !ContainsRecurringRTE() enough for NotContainsOnlyRecurringTuples. However, instead, we can check for existence of any distributed table. DESCRIPTION: Fixes a bug that causes wrong results with complex outer joins	2020-03-09 17:28:33 +01:00
Marco Slot	5b1d1dd413	Remove unnecessary use of max_parallel_workers_per_gather	2020-03-06 13:18:58 +01:00
Hanefi Onaldi	c0ad44f975	Fix early exit bug on intermediate result pruning There are 2 problems with our early exit strategy that this commit fixes: 1- When we decide that a subplan results are sent to all worker nodes, we used to skip traversing the whole distributed plan, instead of skipping only the subplan. 2- We used to consider all available nodes in the cluster (secondaries and inactive nodes as well as active primaries) when deciding on early exit strategy. This resulted in failures to early exit when there are secondaries or inactive nodes.	2020-03-05 16:41:44 +03:00
Onder Kalaci	f72916875f	Expand test coverage for combinations of master evaluation, deferred pruning, parameters, local execution - Router & Remote & Requires Master Evaluation & With Param & Without Param - Fast Path Router & Remote & Requires Master Evaluation & With Param & Without Param	2020-03-05 12:37:22 +01:00
Nils Dijk	268ad741a9	Refactor the deparsing of a CREATE EXTENSION to prevent NULL POINTER dereferences (#3518 ) DESCRIPTION: satisfy static analysis tool for a nullptr dereference During the static analysis project on the codebase this code has been flagged as having the potential for a null pointer dereference. Funnily enough the author had already commented in the code that this was not possible, due to us setting the schema name before we pass in the statement. If we want to reuse this code in a later setting this comment might not always apply and we could actually run into a null pointer dereference. This patch changes a bit of the code around to first of all make sure there is no NULL pointer dereference in this code anymore. Secondly we allow for better deparsing by setting and adhering to the `if_not_exists` flag on the statement. And finally add support for all syntax described in the documentation of postgres (FROM was missing).	2020-03-04 16:47:07 +01:00
Marco Slot	27f23d2c89	Add some distribution column = composite type prepared statement tests	2020-03-04 05:01:43 +01:00
Onder Kalaci	087f6eb4c0	For composite types, add cast to the parameter to help the remote node detect the type.	2020-03-04 11:27:45 +01:00
Philip Dubé	34f241af16	Fix create_distributed_table on a table using GENERATED ALWAYS AS If the generated column does not come at the end of the column list, columnNameList doesn't line up with the column indexes. Seek past CREATE TABLE test_table ( test_id int PRIMARY KEY, gen_n int GENERATED ALWAYS AS (1) STORED, created_at TIMESTAMPTZ NOT NULL DEFAULT now() ); SELECT create_distributed_table('test_table', 'test_id'); Would raise ERROR: cannot cast 23 to 1184	2020-02-28 09:34:26 -08:00
Hadi Moshayedi	1b3e58f0c3	Merge branch 'improve-shard-pruning' of https://github.com/MarkusSintonen/citus into MarkusSintonen-improve-shard-pruning	2020-02-26 07:13:33 -08:00
Nils Dijk	a77ed9cd23	Refactor master query to be planned by postgres' planner (#3326 ) DESCRIPTION: Replace the query planner for the coordinator part with the postgres planner Closes #2761 Citus had a simple rule based planner for the query executed on the query coordinator. This planner grew over time with the addition of SQL support till it was getting close to the functionality of the postgres planner. Except the code was brittle and its complexity rose which made it hard to add new SQL support. Given its resemblance with the postgres planner it was a long outstanding wish to replace our hand crafted planner with the well supported postgres planner. This patch replaces our planner with a call to postgres' planner. Due to the functionality of the postgres planner we needed to support both projections and filters/quals on the citus custom scan node. When a sort operation is planned above the custom scan it might require fields to be reordered in the custom scan before returning the tuple (projection). The postgres planner assumes every custom scan node implements projections. Because we controlled the plan that was created we prevented reordering in the custom scan and never had implemented it before. The same optimisation applies to having clauses that could have been where clauses. Instead of applying the filter as a having on the aggregate it will push it down into the plan which could reach a custom scan node. For both filters and projections we have implemented them when tuples are read from the tuple store. If no projections or filters are required it will directly return the tuple from the tuple store. Otherwise it will loop tuples from the tuple store through the filter and projection until a tuple is found and returned. Besides filters being pushed down a side effect of having quals that could have been a where clause is that a call to read intermediate result could be called before the first tuple is fetched from the custom scan. 
This failed because the intermediate result would only be pulled to the coordinator on the first tuple fetch. To overcome this problem we do run the distributed subplans now before we run the postgres executor. This ensures the intermediate result is present on the coordinator in time. We do account for total time instrumentation by removing the instrumentation before handing control to the postgres executor and update the timings ourselves. For future SQL support it is enough to create a valid query structure for the part of the query to be executed on the query coordinating node. As a utility we do serialise and print the query at debug level4 for engineers to inspect what kind of query is being planned on the query coordinator.	2020-02-25 14:39:56 +01:00
Philip Dubé	bcf54c5014	Address a couple issues with maintenance daemon management: - Stop the daemon when citus extension is dropped - Bail on maintenance daemon startup if myDbData is started with a non-zero pid - Stop maintenance daemon from spawning itself - Don't use postgres die, just wrap proc_exit(0) - Assert(myDbData->workerPid == MyProcPid) The two issues were that multiple daemons could be running for a database, or that a daemon would be leftover after DROP EXTENSION citus	2020-02-21 16:49:01 +00:00
Nils Dijk	6ee82c381e	Add missing pieces for version bump of #3482 (#3523 )	2020-02-21 12:35:29 +01:00
Onur Tirtir	001089783c	Fix null relation name issue in CheckConflictingRelationAccesses	2020-02-19 19:10:35 +03:00
Philip Dubé	d7a4ffdc46	Add test for issue, does not reproduce issue	2020-02-18 23:45:17 +00:00
Marco Slot	038e5999cb	Implement direct COPY table TO stdout	2020-02-17 15:15:10 +01:00
Markus Sintonen	099e266a6c	Force task executor	2020-02-16 01:32:52 +02:00
Markus Sintonen	cf8319b992	Add comment, add subquery NOT tests	2020-02-16 01:21:10 +02:00
Markus Sintonen	3d3d615040	Add comment about NOT_EXPR. Treat it as invalid constraint for safety.	2020-02-15 16:54:38 +02:00
Markus Sintonen	cdedb98c54	Improve shard pruning logic to understand OR-conditions. Previously a limitation in the shard pruning logic caused multi distribution value queries to always go into all the shards/workers whenever query also used OR conditions in WHERE clause. Related to https://github.com/citusdata/citus/issues/2593 and https://github.com/citusdata/citus/issues/1537 There was no good workaround for this limitation. The limitation caused quite a bit of overhead with simple queries being sent to all workers/shards (especially with setups having lot of workers/shards). An example of a previous plan which was inadequately pruned: ``` EXPLAIN SELECT count() FROM orders_hash_partitioned WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22); QUERY PLAN --------------------------------------------------------------------- Aggregate (cost=0.00..0.00 rows=0 width=0) -> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=0 width=0) Task Count: 4 Tasks Shown: One of 4 -> Task Node: host=localhost port=xxxxx dbname=regression -> Aggregate (cost=13.68..13.69 rows=1 width=8) -> Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned (cost=0.00..13.68 rows=1 width=0) Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22))) (9 rows) ``` After this commit the task count is what one would expect from the query defining multiple distinct values for the distribution column: ``` EXPLAIN SELECT count() FROM orders_hash_partitioned WHERE (o_orderkey IN (1,2)) AND (o_custkey = 11 OR o_custkey = 22); QUERY PLAN --------------------------------------------------------------------- Aggregate (cost=0.00..0.00 rows=0 width=0) -> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=0 width=0) Task Count: 2 Tasks Shown: One of 2 -> Task Node: host=localhost port=xxxxx dbname=regression -> Aggregate (cost=13.68..13.69 rows=1 width=8) -> Seq Scan on orders_hash_partitioned_630000 orders_hash_partitioned (cost=0.00..13.68 rows=1 
width=0) Filter: ((o_orderkey = ANY ('{1,2}'::integer[])) AND ((o_custkey = 11) OR (o_custkey = 22))) (9 rows) ``` "Core" of the pruning logic works as previously where it uses `PrunableInstances` to queue ORable valid constraints for shard pruning. The difference is that now we build a compact internal representation of the query expression tree with PruningTreeNodes before actual shard pruning is run. Pruning tree nodes represent boolean operators and the associated constraints of it. This internal format allows us to have compact representation of the query WHERE clauses which allows "core" pruning logic to work with OR-clauses correctly. For example query having `WHERE (o_orderkey IN (1,2)) AND (o_custkey=11 OR (o_shippriority > 1 AND o_shippriority < 10))` gets transformed into: 1. AND(o_orderkey IN (1,2), OR(X, AND(X, X))) 2. AND(o_orderkey IN (1,2), OR(X, X)) 3. AND(o_orderkey IN (1,2), X) Here X is any set of unknown condition(s) for shard pruning. This allow the final shard pruning to correctly recognize that shard pruning is done with the valid condition of `o_orderkey IN (1,2)`. Another example with unprunable condition in query `WHERE (o_orderkey IN (1,2)) OR (o_custkey=11 AND o_custkey=22)` gets transformed into: 1. OR(o_orderkey IN (1,2), AND(X, X)) 2. OR(o_orderkey IN (1,2), X) Which is recognized as unprunable due to the OR condition between distribution column and unknown constraint -> goes to all shards. Issue https://github.com/citusdata/citus/issues/1537 originally suggested transforming the query conditions into a full disjunctive normal form (DNF), but this process of transforming into DNF is quite a heavy operation. It may "blow up" into a really large DNF form with complex queries having non trivial `WHERE` clauses. I think the logic for shard pruning could be simplified further but I decided to leave the "core" of the shard pruning untouched.	2020-02-14 17:58:13 +00:00
Jelte Fennema	5ef3e83ce4	Make multi_utilities test take 2 seconds instead of 20 (#3507 ) On worker 2 it was waiting for dustbunnies_990001 to be vacuumed/analyzed. This table doesn't actually exist, so that never happened. Now it waits for the correct table and throws an error if it waits more than 10 seconds.	2020-02-14 15:38:51 +01:00
Onder Kalaci	975c4c2264	Do not prune shards if the distribution key is NULL The root of the problem is that, standard_planner() converts the following qual ``` {OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 100 :args ( {VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varcollid 100 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 45 } {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull true :location 51 :constvalue <> } ) :location 49 } ``` To ``` ( {CONST :consttype 16 :consttypmod -1 :constcollid 0 :constlen 1 :constbyval true :constisnull true :location -1 :constvalue <> } ) ``` So, Citus doesn't deal with NULL values in real-time or non-fast path router queries. And, in the FastPathRouter planner, we check constisnull in DistKeyInSimpleOpExpression(). However, in deferred pruning case, we do not check for isnull for const. Thus, the fix consists of two parts: - Let PruneShards() not crash when NULL parameter is passed - For deferred shard pruning in fast-path queries, explicitly check that we have CONST which is not NULL	2020-02-13 15:00:31 +01:00
Onur Tirtir	39df51e903	Introduce objects to dist. infrastructure when updating Citus (#3477 ) Mark existing objects that are not included in distributed object infrastructure in older versions of Citus (but now should be) as distributed, after updating Citus successfully.	2020-02-07 18:07:59 +03:00
Nils Dijk	d5433400f9	Fix: Unnecessary repartition on joins with more than 4 tables (#3473 ) DESCRIPTION: Fix unnecessary repartition on joins with more than 4 tables In 9.1 we have introduced support for all CH-benCHmark queries by widening our definitions of joins to include joins with expressions in them. This had the undesired side effect of Q5 regressing on its plan by implementing a repartition join. It turned out this regression was not directly related to widening of the join clause, nor the schema employed by CH-benCHmark. Instead it had to do with 4 or more tables being joined in a chain. A chain meaning: ```sql SELECT * FROM a,b,c,d WHERE a.part = b.part AND b.part = c.part AND .... ``` Due to how our join order planner was implemented it would only keep track of 1 of the partition columns when comparing if the join could be executed locally. This manifested in a join chain of 4 tables to _always_ be executed as a repartition join. 3 tables joined in a chain would have the middle table shared by the two outer tables causing the local join possibility to be found. With this patch we keep a unique list (or set) of all partition columns participating in the join. When a candidate table is checked for a possibility to execute a local join it will check if there is any partition column in that set that matches an equality join clause on the partition column of the candidate table. By taking into account all partition columns in the left relation it will now find the local join path on >= 4 tables joined in a chain. fixes: #3276	2020-02-06 15:07:07 +01:00
Halil Ozan Akgul	8ce4f20061	Fixes the bug of grants on public schema propagation	2020-02-05 18:05:58 +03:00
Marco Slot	64ca5c9acb	Add additional INSERT..SELECT repartition tests	2020-02-05 11:06:44 +01:00
Hadi Moshayedi	9dd14fa90d	Rename discarded target list items in repartitioned INSERT/SELECT	2020-02-05 11:06:44 +01:00
Onder Kalaci	c7e2309f4c	Improve single hash-repartitioning with numeric (or non-int) types We used to treat the shard interval array that we passed as numeric[]. However, it should be int[], as the shard ranges are int[].	2020-02-04 20:30:04 +01:00
Hadi Moshayedi	bc1a800f70	Use current user for repartition join temp schemas. Otherwise when using a less privileged user we might get errors when trying to create the schema.	2020-02-04 09:48:20 -08:00
Hadi Moshayedi	890e23e734	Update multi_insert_select_non_pushable_queries	2020-02-03 13:13:30 -08:00
Hadi Moshayedi	5818bcd27e	Update with_dml	2020-02-03 13:13:30 -08:00
Hadi Moshayedi	46f60e1ac0	Update multi_insert_select_conflict	2020-02-03 13:13:30 -08:00
Hadi Moshayedi	05f58c9ec5	Update multi_insert_select	2020-02-03 13:13:30 -08:00
Hadi Moshayedi	264530311a	Don't use distributed insert/select for repartitioned joins	2020-02-03 13:13:30 -08:00
Onder Kalaci	8be1b0112d	Add failure test for parallel reference table join	2020-02-03 19:35:07 +01:00
Marco Slot	a6bd6c657e	Add tests that exercise parallel reference table join logic	2020-02-03 11:54:29 +01:00
Onder Kalaci	2f274a4fce	Make sure to go deeper into the functions to search for PARAMs For example, a PARAM might reside inside a function just because of a casting of a type such as the following: ``` {FUNCEXPR :funcid 1740 :funcresulttype 1700 :funcretset false :funcvariadic false :funcformat 2 :funccollid 0 :inputcollid 0 :args ( {PARAM :paramkind 0 :paramid 15 :paramtype 23 :paramtypmod -1 :paramcollid 0 :location 356 } ) ``` We should recursively check the expression before bailing out.	2020-02-03 09:36:12 +01:00
Hadi Moshayedi	9d988b3437	Add insert/select connection leak tests	2020-01-30 14:09:07 -08:00
Philip Dubé	d43c80d4d8	pullUpIntermediateRows should not be true when groupedByDisjointPartitionColumn is true This was causing 'SELECT id, stddev(y_int) FROM tbl GROUP BY id' to push down stddev without group by	2020-01-30 21:18:08 +00:00
Philip Dubé	5fccc56d3e	Expand the set of aggregates which cannot have LIMIT approximated Previously we only prevented AVG from being pushed down, but this is incorrect: - array_agg, while somewhat nonsensical to order by, will potentially be missing values - combinefunc aggregation will raise errors about cstrings not being comparable (while we also can't know if the aggregate is commutative) This commit limits approximating LIMIT pushdown when ordering by aggregates to: min, max, sum, count, bit_and, bit_or, every, any Which means of those we previously supported, we now exclude: avg, array_agg, jsonb_agg, jsonb_object_agg, json_agg, json_object_agg, hll_add, hll_union, topn_add, topn_union	2020-01-30 17:45:18 +00:00
Önder Kalacı	8584cb005b	Do not evaluate functions on the coordinator for SELECT queries (#3440 ) Previously, the logic for evaluating the functions and the parameters was the same. That ended up evaluating the functions inaccurately on the coordinator. Instead, split the function evaluation logic from parameter evaluation logic.	2020-01-30 08:47:28 +01:00
Önder Kalacı	e9c17b71a4	Add missing ORDER BY (#3441 ) As it causes some random failures	2020-01-29 17:36:32 +01:00
Jelte Fennema	b9eee70fa5	Fix random output ordering in CTE inlining test (#3434 )	2020-01-27 16:38:27 +01:00
Önder Kalacı	4519d3411d	Improve the representation of used sub plans (#3411 ) Previously, we've identified the usedSubPlans by only looking to the subPlanId. With this commit, we're expanding it to also include information on the location of the subPlan. This is useful to distinguish the cases where the subPlan is used either on only HAVING or both HAVING and any other part of the query.	2020-01-24 10:47:14 +01:00
Philip Dubé	69dde460de	See what flaky multi_extension test is doing with roles	2020-01-23 21:50:40 +00:00
Önder Kalacı	ef7d1ea91d	Locally execute queries that don't need any data access (#3410 ) * Update shardPlacement->nodeId to uint As the source of the shardPlacement->nodeId is always workerNode->nodeId, and that is uint32. We had this hack because of: `0ea4e52df5 (r266421409)` And, that is gone with: `90056f7d3c (diff-c532177d74c72d3f0e7cd10e448ab3c6L1123)` So, we're safe to do it now. * Relax the restrictions on using the local execution Previously, whenever any local execution happens, we disabled further commands to do any remote queries. The basic motivation for doing that is to prevent any accesses in the same transaction block to access the same placements over multiple sessions: one is local session the other is remote session to the same placement. However, the current implementation does not distinguish local accesses being to a placement or not. For example, we could have local accesses that only touch intermediate results. In that case, we should not implement the same restrictions as they become useless. So, this is a pre-requisite for executing the intermediate result only queries locally. * Update the error messages As the underlying implementation has changed, reflect it in the error messages. * Keep track of connections to local node With this commit, we're adding infrastructure to track if any connection to the same local host is done or not. The main motivation for doing this is that we were previously more conservative about not choosing local execution. Simply, we disallowed local execution if any connection to any remote node is done. However, if we want to use local execution for intermediate result only queries, this'd be annoying because we expect all queries to touch remote node before the final query. Note that this approach is still limiting in Citus MX case, but for now we can ignore that. 
* Formalize the concept of Local Node Also some minor refactoring while creating the dummy placement * Write intermediate results locally when the results are only needed locally Before this commit, Citus used to always broadcast all the intermediate results to remote nodes. However, it is possible to skip pushing the results to remote nodes always. There are two notable cases for doing that: (a) When the query consists of only intermediate results (b) When the query is a zero shard query In both of the above cases, we don't need to access any data on the shards. So, it is a valuable optimization to skip pushing the results to remote nodes. The pattern mentioned in (a) is actually a common pattern that Citus users use in practice. For example, if you have the following query: WITH cte_1 AS (...), cte_2 AS (....), ... cte_n (...) SELECT ... FROM cte_1 JOIN cte_2 .... JOIN cte_n ...; The final query could be operating only on intermediate results. With this patch, the intermediate results of the ctes are not unnecessarily pushed to remote nodes. * Add specific regression tests As there are edge cases in Citus MX and with round-robin policy, use the same queries on those cases as well. * Fix failure tests By forcing not to use local execution for intermediate results since all the tests expect the results to be pushed remotely. * Fix flaky test * Apply code-review feedback Mostly style changes * Limit the max value of pg_dist_node_seq to reserve for internal use	2020-01-23 18:28:34 +01:00
Hadi Moshayedi	be647ad944	Output filenames in ensure_no_intermediate_data_leak This can be helpful in guiding us where to look when this test fails. For example, if the result file has repartitioned_results_ prefix, then we need to look into repartitioned insert/select. Otherwise it is probably a CTE or a subquery.	2020-01-22 11:12:16 -08:00
Jelte Fennema	cd5259a25a	Do not place new shards with shards in TO_DELETE state (#3408 ) When creating a new distributed table, the shards would colocate with shards with SHARD_STATE_TO_DELETE (shardstate = 4). This means if that state was because of a shard move the new shard would be created on two nodes and it would not get deleted since its shard state would be 1.	2020-01-22 14:52:12 +01:00
Halil Ozan Akgul	b40f067d05	Adds propagation for grant on schema commands	2020-01-20 14:51:28 +03:00
Onder Kalaci	fd17e4578e	Improve tests	2020-01-17 16:02:57 +01:00
Onder Kalaci	0bf1e81e33	Cache local plans on BeginScan	2020-01-17 16:02:57 +01:00
Onder Kalaci	016f561e45	Ingest data for cte_inline tests	2020-01-17 12:46:00 +01:00
Jelte Fennema	246435be7e	Lazy query deparsing executable queries (#3350 ) Deparsing and parsing a query can be heavy on CPU. When locally executing the query we don't need to do this in theory most of the time. This PR is the first step in allowing to skip deparsing and parsing the query in these cases, by lazily creating the query string and storing the query in the task. Future commits will make use of this and not deparse and parse the query anymore, but use the one from the task directly.	2020-01-17 11:49:43 +01:00
Hadi Moshayedi	6cf1c01660	Don't use repartitioned INSERT/SELECT for repartition joins	2020-01-16 23:40:31 -08:00
Hadi Moshayedi	5eeb07124f	Repartitioned INSERT/SELECT: include job id in result id prefix	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	a079278b0c	Repartitioned INSERT/SELECT: Add a GUC to enable/disable it	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	ce5eea4885	INSERT/SELECT: make SELECT column names unique	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	8b27a9a195	More range partitioned tests	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	8635396cea	Repartitioned INSERT/SELECT: Test rollback behaviour	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	43218eebf6	Failure tests for INSERT/SELECT repartition	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	665b33dca1	MX tests for INSERT/SELECT repartition	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	af2349f21f	Repartitioned INSERT/SELECT: Add a prepared statement test	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	97072c9eb1	INSERT/SELECT: show method in EXPLAIN output	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	b143d9588a	Repartitioned INSERT/SELECT: Test GROUP BY	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	fe548b762f	Repartitioned INSERT/SELECT: Test CTEs	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	494cc383cc	Repartitioned INSERT/SELECT: Enable RETURNING	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	4b14347fc3	Tests for DML followed by insert/select repartition	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	44a2aede16	Don't start a coordinated transaction on workers. Otherwise transaction hooks of Citus kick in and might cause unwanted errors.	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	42c3c03b85	Handle extra columns added in ExpandWorkerTargetEntry() in repartitioned INSERT/SELECT	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	89463f9760	Repartitioned INSERT/SELECT: cast columns in SELECT targets	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	d67a384350	Enable repartitioned INSERT/SELECT ON CONFLICT.	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	b4e5f4b10a	Implement INSERT ... SELECT with repartitioning	2020-01-16 23:24:52 -08:00
Hadi Moshayedi	e30580e2bd	Add ORDER BY to multi_row_insert.sql	2020-01-16 15:20:39 -08:00
Jelte Fennema	cb5154cf03	Add more failing tests, of which some have bad error messages	2020-01-16 18:30:30 +01:00
Onder Kalaci	dc17c2658e	Defer shard pruning for fast-path router queries to execution This is purely to enable better performance with prepared statements. Before this commit, the fast path queries with prepared statements where the distribution key includes a parameter always went through distributed planning. After this change, we only go through distributed planning on the first 5 executions.	2020-01-16 16:59:36 +01:00
Halil Ozan Akgul	c5539d20d9	Adds alter table schema propagation	2020-01-16 17:04:16 +03:00
Nils Dijk	b6e09eb691	Fix: distributed function with table reference in declare (#3384 ) DESCRIPTION: Fixes a problem when adding a new node due to tables referenced in a function's body Fixes #3378 It was reported that `master_add_node` would fail if a distributed function has a table name referenced in its declare section of the body. By default postgres validates the body of a function on creation. This is not a problem in the normal case as tables are replicated to the workers when we distribute functions. However when a new node is added we first create dependencies on the workers before we try to create any tables, and the original tables get created out of bound when the metadata gets synced to the new node. This causes the function body validator to raise an error that the table is not on the worker. To mitigate this issue we set `check_function_bodies` to `off` right before we are creating the function. The added test shows this does resolve the issue. (issue can be reproduced on the commit without the fix)	2020-01-16 14:21:54 +01:00
Jelte Fennema	e76281500c	Replace shardId lock with lock on colocation+shardIntervalIndex (#3374 ) This new locking pattern makes sure that some deadlocks that could happen during rebalancing cannot occur anymore.	2020-01-16 13:14:01 +01:00
Jelte Fennema	86343bcc8f	Re-add test that broke with GUC workaround	2020-01-16 12:34:50 +01:00
Jelte Fennema	6b9b633695	Add more tests for prepared statements	2020-01-16 12:28:15 +01:00
Jelte Fennema	43a3fdd12f	Fix comment	2020-01-16 12:28:15 +01:00
Jelte Fennema	fe3827e499	Add tests for [NOT] MATERIALIZED	2020-01-16 12:28:15 +01:00
Onder Kalaci	326dfab44a	Fix a query which triggers an existing bug, see https://github.com/citusdata/citus/issues/3189#issuecomment-571497051	2020-01-16 12:28:15 +01:00
Onder Kalaci	3818be45a6	Update regression tests-5 Failure tests that rely on intermediate results	2020-01-16 12:28:15 +01:00
Onder Kalaci	1e85938b46	Update regression tests-4 Update the MX tests. Similar to the previous commits, prevent CTE inlining in some cases to prevent divergent test outputs.	2020-01-16 12:28:15 +01:00
Onder Kalaci	64560b07be	Update regression tests-2 In this commit, we're introducing a way to prevent CTE inlining via a GUC. The GUC is used in all the tests where PG 11 and PG 12 tests would diverge otherwise. Note that, in PG 12, the restriction information for CTEs is generated. It means that for some queries involving CTEs, Citus planner (router planner/ pushdown planner) may behave differently. So, via the GUC, we prevent tests from diverging on PG 11 vs PG 12. When we drop PG 11 support, we should get rid of the GUC, and mark relevant ctes as MATERIALIZED, which does the same thing.	2020-01-16 12:28:15 +01:00
Onder Kalaci	421bf68516	Add the specific regression tests With this commit, we're adding the specific tests for CTE inlining. The test has a different output file for pg 11, because as mentioned in the previous commits, PG 12 generates more restriction information for CTEs.	2020-01-16 12:28:15 +01:00
Philip Dubé	4d9a733c2f	Fix inserting multiple values with row expression partition column causing the insert to be ignored Raise an error instead of silently inserting nothing if we hit this condition in the future	2020-01-15 21:10:50 +00:00
Marco Slot	06709ee108	Always use NOTICE in log_remote_commands and avoid redaction when possible	2020-01-13 18:24:36 +01:00
Philip Dubé	ccabf19090	Propagate DROP ROUTINE, ALTER ROUTINE In two places I've made code more straightforward by using ROUTINE in our own codegen Two changes which may seem extraneous: AppendFunctionName was updated to not use pg_get_function_identity_arguments. This is because that function includes ORDER BY when printing an aggregate like my_rank. While ALTER AGGREGATE my_rank(x "any" ORDER BY y "any") is accepted by postgres, ALTER ROUTINE my_rank(x "any" ORDER BY y "any") is not. Tests were updated to use macaddr over integer. Using integer is flaky, our logic could sometimes end up on tables like users_table. I originally wanted to use money, but money isn't hashable.	2020-01-13 15:37:46 +00:00
Philip Dubé	4b5d6c3ebe	Rename RelayFileState to ShardState Replace FILE_ prefix with SHARD_STATE_	2020-01-12 05:57:53 +00:00
Hadi Moshayedi	40ba2cdd6e	Test RedistributeTaskListResult	2020-01-09 23:47:25 -08:00
Philip Dubé	281aacce9b	Fix row-gather for subqueries being handled by task-tracker task-tracker has specific logic for MultiPartition when GROUP BY is missing We were ending up in this code path because row-gather removes GROUP BY	2020-01-10 01:51:37 +00:00
Hadi Moshayedi	bb65669186	Failure tests for PartitionTasklistResults	2020-01-09 10:55:58 -08:00
Hadi Moshayedi	f38d0e5b3f	Partitioned task list results.	2020-01-09 10:32:58 -08:00
Philip Dubé	bf7d86a3e8	Fix typo: aggragate -> aggregate	2020-01-07 01:16:09 +00:00
Philip Dubé	863bf49507	Implement pulling up rows to coordinator when aggregates cannot be pushed down. Enabled by default	2020-01-07 01:16:04 +00:00
Onder Kalaci	c8f14c9f6c	Make sure to update shard states of partitions on failures Fixes #3331 In #2389, we've implemented support for partitioned tables with rep > 1. The implementation is limiting the use of modification queries on the partitions. In fact, we error out when any partition is modified via EnsurePartitionTableNotReplicated(). However, we seem to have forgotten an important case, where the parent table's partition is marked as INVALID. In that case, at least one of the partitions becomes INVALID. However, we do not mark partitions as INVALID ever. If the user queries the partition table directly, Citus could happily send the query to INVALID placements -- which are not marked as INVALID. This PR fixes it by marking the placements of the partitions as INVALID as well. The shard placement repair logic already re-creates all the partitions, so should be fine in that front.	2020-01-06 12:26:08 +01:00
Philip Dubé	566246ecd4	End regression tests with ensure_no_intermediate_data_leak Also update tests to clean up jobs when they're directly testing job udfs	2020-01-03 18:59:02 +00:00
Önder Kalacı	0c70a5470e	Allow RETURNING in fast-path queries (#3352 ) * Allow RETURNING in fast-path queries Because there is no specific reason for that.	2020-01-03 13:42:50 +00:00
Önder Kalacı	a174eb4f7b	Do not go through standard_planner() for INSERTs (#3348 ) That seems unnecessary. We already have the notion of FastPath queries, simply add it there.	2020-01-03 12:15:22 +00:00
Jelte Fennema	5fee9d04c9	Uncomment local execution EXPLAIN ANALYZE tests	2020-01-02 18:56:32 +00:00
Marco Slot	ba39d72fe1	Fix incorrect union all pushdown issue	2020-01-01 09:03:50 +01:00
Jelte Fennema	cf88bdf833	Add tests for complex joins on reference tables	2019-12-27 15:05:51 +01:00
Jelte Fennema	3a042e4611	Allow cartesian products on reference tables	2019-12-27 15:05:51 +01:00
Jelte Fennema	4233cd0d9d	Allow non equi joins on reference tables	2019-12-27 15:05:51 +01:00
Marco Slot	b21b6905ae	Do not repeat GROUP BY distribution_column on coordinator Allow arbitrary aggregates to be pushed down in these scenarios	2019-12-25 01:33:41 +00:00
Philip Dubé	a6ffcab59d	CREATE EXTENSION is propagated now	2019-12-24 21:04:37 +00:00
Marco Slot	a2ddfecd86	Fix inconsistent shard metadata issue	2019-12-24 08:01:32 +01:00
Hadi Moshayedi	d7aea7fa10	Implement partitioned intermediate results.	2019-12-24 03:53:39 -08:00
Marco Slot	b37ef0e394	Fix error in distributed queries when shards are on the coordinator	2019-12-24 06:36:43 +01:00
Philip Dubé	e9bbdb8f31	Fix handling of empty intermediate results when distributing custom aggregates	2019-12-23 17:27:52 +00:00
Jelte Fennema	b655c02352	Add the necessary changes for rebalance strategies on enterprise (#3325 ) This commit adds the SQL and C changes necessary to support custom rebalance strategies in the Enterprise version of Citus.	2019-12-19 15:23:08 +01:00
Hadi Moshayedi	ef487e0792	Implement fetch_intermediate_results	2019-12-18 10:46:35 -08:00
Hadi Moshayedi	249508d267	Estimate cost of read_intermediate_results()	2019-12-17 13:51:51 -08:00
Hadi Moshayedi	113bd1e5f1	Implement read_intermediate_results	2019-12-17 13:51:16 -08:00
SaitTalhaNisanci	7ff4ce2169	Add adaptive executor support for repartition joins (#3169 ) * WIP * wip * add basic logic to run a single job with repartioning joins with adaptive executor * fix some warnings and return in ExecuteDependedTasks if there is none * Add the logic to run depended jobs in adaptive executor The execution of depended tasks logic is changed. With the current logic: - All tasks are created from the top level task list. - At one iteration: - CurTasks whose dependencies are executed are found. - CurTasks are executed in parallel with adapter executor main logic. - The iteration is repeated until all tasks are completed. * Separate adaptive executor repartioning logic * Remove duplicate parts * cleanup directories and schemas * add basic repartion tests for adaptive executor * Use the first placement to fetch data In task tracker, when there are replicas, we try to fetch from a replica for which a map task is succeeded. TaskExecution is used for this, however TaskExecution is not used in adaptive executor. So we cannot use the same thing as task tracker. Since adaptive executor fails when a map task fails (There is no retry logic yet). We know that if we try to execute a fetch task, all of its map tasks already succeeded, so we can just use the first one to fetch from. * fix clean directories logic * do not change the search path while creating a udf * Enable repartition joins with adaptive executor with only enable_reparitition_joins guc * Add comments to adaptive_executor_repartition * dont run adaptive executor repartition test in paralle with other tests * execute cleanup only in the top level execution * do cleanup only in the top level ezecution * not begin a transaction if repartition query is used * use new connections for repartititon specific queries New connections are opened to send repartition specific queries. The opened connections will be closed at the FinishDistributedExecution. 
While sending repartition queries no transaction is begun so that we can see all changes. * error if a modification was done prior to repartition execution * not start a transaction if a repartition query and sql task, and clean temporary files and schemas at each subplan level * fix cleanup logic * update tests * add missing function comments * add test for transaction with DDL before repartition query * do not close repartition connections in adaptive executor * rollback instead of commit in repartition join test * use close connection instead of shutdown connection * remove unnecesary connection list, ensure schema owner before removing directory * rename ExecuteTaskListRepartition * put fetch query string in planner not executor as we currently support only replication factor = 1 with adaptive executor and repartition query and we know the query string in the planner phase in that case * split adaptive executor repartition to DAG execution logic and repartition logic * apply review items * apply review items * use an enum for remote transaction state and fix cleanup for repartition * add outside transaction flag to find connections that are unclaimed instead of always opening a new transaction * fix style * wip * rename removejobdir to partition cleanup * do not close connections at the end of repartition queries * do repartition cleanup in pg catch * apply review items * decide whether to use transaction or not at execution creation * rename isOutsideTransaction and add missing comment * not error in pg catch while doing cleanup * use replication factor of the creation time, not current time to decide if task tracker should be chosen * apply review items * apply review items * apply review item	2019-12-17 19:09:45 +03:00
Marco Slot	2f568ad5a5	Forbid using connections that sent intermediate results for data access and vice versa	2019-12-17 11:49:13 +01:00
Onur TIRTIR	8092529a2c	Split propagate extension test and add alternative output (#3314) * Split extension name tests from propagate_extension_commands.sql * Add alternative output for escape_extension_name.sql	2019-12-17 13:49:16 +03:00
Marco Slot	5f656e22db	Fix issue in IsMultiStatementTransaction detection	2019-12-16 17:01:43 +01:00
Marco Slot	1633123d78	Fix crash in IN (NULL) queries	2019-12-13 08:35:54 +01:00
Hadi Moshayedi	e7a6cc0801	Fix some typos from #3280	2019-12-12 13:29:26 -08:00
Marco Slot	e7a8db5493	Fix issue with some zero-shard modifications	2019-12-12 07:19:10 +01:00
SaitTalhaNisanci	053fe18404	Do not continue in sequential execution if a cancellation is received (#3289)	2019-12-12 17:22:30 +03:00
Hadi Moshayedi	383d34f51b	Tests for multi-statement transactions with subqueries or ctes	2019-12-11 19:54:15 -08:00
Hadi Moshayedi	939d3c955b	Don't plan function joins locally	2019-12-11 16:53:29 -08:00
Hadi Moshayedi	067d92a7f6	Don't plan joins between ref tables and views locally	2019-12-11 14:31:34 -08:00
Hadi Moshayedi	e3e174f30f	Fix the way we check for local/reference table joins in the executor	2019-12-11 12:50:20 -08:00
Önder Kalacı	fecf61ef1f	Add missing ORDER BY in a CTE (#3282) Otherwise, the query output might not be consistent.	2019-12-11 10:24:54 +01:00
Marco Slot	486c620a3c	Fix inserts into local tables with distributed subqueries	2019-12-10 10:17:18 +01:00
Önder Kalacı	f027e9dd77	Improve Recursive CTE tests (#3274) Postgres keeps track of recursive CTEs in the queryTree in two ways: (a) queryTree->hasRecursive is set to true whenever a RECURSIVE CTE is used in the SQL. Citus checks for it. (b) If the CTE is actually a recursive one (i.e., references itself), Postgres marks CommonTableExpr->cterecursive as true as well. The tests that are changed in the PR don't cover (b), and this becomes an issue with CTE inlining (#3161). In that case, Citus/Postgres can inline such CTEs, and the queries work with Citus. However, these tests intend to check if there is any recursive CTE in the queryTree. So, we're actually making the CTEs recursive by having them reference themselves. We'll add cases where a recursive CTE works by inlining in #3161.	2019-12-10 09:38:45 +01:00
Philip Dubé	fcf2fd819b	Add distributioncolumncollation to pg_dist_colocation Use partition column's collation for range distributed tables Don't allow non deterministic collations for hash distributed tables CoPartitionedTables: don't compare unequal types	2019-12-09 19:51:40 +00:00
Philip Dubé	d138bb89bf	Support creating collations as part of dependency resolution. Propagate ALTER/DROP on distributed collations Propagate CREATE COLLATION when outside transaction	2019-12-09 04:42:51 +00:00
Marco Slot	6a9c0ea7fe	Fix errors in DML with sublinks hidden by null expressions	2019-12-06 14:25:04 +01:00
Hadi Moshayedi	d28beb3711	Detect SQL UDF Calls.	2019-12-05 14:31:05 -08:00
Philip Dubé	5a17fd6d9d	Test more reference/local cases, also ALTER ROLE Test ALTER ROLE doesn't deadlock when coordinator added, or propagate from mx workers Consolidate wait_until_metadata_sync & verify_metadata to multi_test_helpers	2019-12-03 22:23:14 +00:00
Philip Dubé	1597fbb369	aggregate_support test: test DISTINCT, ORDER BY, FILTER, & no intermediate results Previously, - we'd push down ORDER BY, but this doesn't order intermediate results between workers - we'd keep FILTER on master aggregate, which would raise an error about unexpected cstrings	2019-12-03 15:46:01 +00:00
Philip Dubé	5fcc169a3a	Stray depended to dependent tidy up	2019-12-03 15:28:32 +00:00
Marco Slot	bb3bc10f0c	Fix segfault in column_to_column_name	2019-12-01 23:57:25 +01:00
Marco Slot	b1b13e394e	Fix segfault when executing DDL via UDF	2019-12-01 22:54:41 +01:00
Marco Slot	4c8d43c5d0	Bump repo version to 9.2devel	2019-11-29 07:33:39 +01:00
Philip Dubé	0d04ff1692	RECORD: Add support for more expression types - OpExpr - NullIfExpr - MinMaxExpr - CoalesceExpr - CaseExpr Also fix case where ARRAY[(1,2), NULL] was rejected	2019-11-27 17:07:22 +00:00
Philip Dubé	168e11cc9b	Implement support for RECORD[] where we support RECORD Support for ARRAY[] expressions is limited to having a consistent shape, eg ARRAY[(int,text),(int,text)] as opposed to ARRAY[(int,text),(float,text)] or ARRAY[(int,text),(int,text,float)]	2019-11-27 15:02:43 +00:00
Hadi Moshayedi	2268a9cae6	Error for metadata commands if any metadata node is out-of-sync (#3226 ) * Error for metadata commands if any metadata node is out-of-sync * Make the functions have separate APIs for all workers/metadata workers	2019-11-27 09:52:57 +01:00
Marco Slot	2329157406	Swap aggregate_support tests to simplify enterprise merge	2019-11-26 13:39:18 +01:00
Philip Dubé	261a9de42d	Fix typos: VAR_SET_VALUE_KIND -> VAR_SET_VALUE kind beginnig -> beginning plannig -> planning the the -> the er then -> er than	2019-11-25 23:24:13 +00:00
Marco Slot	4b0ac4b0dd	Properly escape ALTER FUNCTION .. SET deparsing. Also test	2019-11-25 23:01:30 +00:00
Philip Dubé	3c10c27b13	GetFunctionAlterOwnerCommand: use format_procedure_qualified distributed_functions: test a function with a quote in name AppendDefElemSet: quote variable names	2019-11-25 23:01:30 +00:00
Philip Dubé	a81e6a81ab	Fix distributed aggregation for non superuser roles Moves support functions to pg_catalog for now. We'd prefer a different solution for when we're creating these support functions dynamically	2019-11-25 20:46:25 +00:00
Onur TIRTIR	bef32624c3	Escape extension name in extension command propagation (#3218)	2019-11-24 12:16:10 +03:00
Philip Dubé	99164398bf	Fix potential segfault from standard_planner inlining functions	2019-11-21 18:47:36 +00:00
Philip Dubé	c563e0825c	Strip trailing whitespace and add final newline (#3186) This brings files in line with our editorconfig file	2019-11-21 14:25:37 +01:00
Hanefi Onaldi	d82f3e9406	Introduce intermediate result broadcasting In plain words, each distributed plan pulls the necessary intermediate results to the worker nodes that the plan hits. This is primarily useful in three ways. (i) If the distributed plan that uses intermediate result(s) is a router query, then the intermediate results are only broadcasted to a single node. (ii) If a distributed plan consists of only intermediate results, which is not uncommon, the intermediate results are broadcasted to a single node only. (iii) If a distributed query hits a sub-set of the shards in multiple workers, the intermediate results will be broadcasted to the relevant node(s). The final item (iii) becomes crucial for append/range distributed tables where typically the distributed queries hit a small subset of shards/workers. To do this, for each query that Citus creates a distributed plan, we keep track of the subPlans used in the queryTree, and save it in the distributed plan. Just before Citus executes each subPlan, Citus first keeps track of every worker node that the distributed plan hits, and marks every subPlan should be broadcasted to these nodes. Later, for each subPlan which is a distributed plan, Citus does this operation recursively since these distributed plans may access to different subPlans, and those have to be recorded as well.	2019-11-20 15:26:36 +03:00
Jelte Fennema	1ed05be82c	Flaky test: Fix recover_prepared_transactions (#3205) Failed test: https://app.circleci.com/jobs/github/citusdata/citus/35994 We now always take a new connection	2019-11-19 17:49:13 +01:00
Jelte Fennema	1ac96f228b	Flaky test: Force correct plan (#3203) Failing test: https://app.circleci.com/jobs/github/citusdata/citus/23148	2019-11-19 17:11:05 +01:00
Onur TIRTIR	26c306d188	Add extensions to distributed object propagation infrastructure (#3185)	2019-11-19 17:56:28 +03:00
Jelte Fennema	87f57eb92b	Fix verify_metadata not returning consistent results (#3199) Failing test: https://app.circleci.com/jobs/github/citusdata/citus/58827	2019-11-19 11:02:35 +01:00
Hanefi Onaldi	e3ad4aba94	Bump 9.1devel * Add Changelog entry for 9.0.1 * Bump citus version to 9.1devel	2019-11-19 10:35:57 +03:00
Halil Ozan Akgul	5ae7b219ff	Create the ALTER ROLE propagation	2019-11-18 18:31:28 +03:00
Nils Dijk	217890af5f	Feature: Expression in reference join (#3180) DESCRIPTION: Expression in reference join Fixed: #2582 This patch allows arbitrary expressions in the join clause when joining to a reference table. An example of such joins could be found in CHbenCHmark queries 7, 8, 9 and 11; `mod((s_w_id * s_i_id),10000) = su_suppkey` and `ascii(substr(c_state,1,1)) = n2.n_nationkey`. Since the join is on a reference table these queries are able to be pushed down to the workers. To implement these queries we will widen the `IsJoinClause` predicate to not check if the expressions are of type `Var` after stripping the implicit coercions. Instead we define a join clause when the `Var`'s in a clause come from more than 1 table. This allows more clauses to pass into the logical planner's `MultiNodeTree(...)` planning function. To compensate for this we tighten down the `LocalJoin`, `SinglePartitionJoin` and `DualPartitionJoin` to check for direct column references when planning. This allows the planner to work with arbitrary join expressions on reference tables.	2019-11-18 16:25:46 +01:00
Hadi Moshayedi	d9dcba25e3	Plan reference/local table joins locally	2019-11-15 07:36:50 -08:00
Onder Kalaci	90943a6ce6	Do not include coordinator shards when round-robin is selected When the user picks "round-robin" policy, the aim is that the load is distributed across nodes. However, for reference tables on the coordinator, since local execution kicks in immediately, round-robin is ignored. With this change, we're excluding the placement on the coordinator. Although the approach seems a little bit invasive because of modifications in the placement list, that sounds acceptable. We could have done this in some other ways such as: 1) Add a field to "Task->roundRobinPlacement" (or such), which is updated as the first element after RoundRobinPolicy is applied. During the execution, if that placement is local to the coordinator, skip it and try the other remote placements. 2) On TaskAccessesLocalNode()@local_execution.c, check task_assignment_policy, if round-robin selected and there is local placement on the coordinator, skip it. However, task assignment is done on planning, but this decision is happening on the execution, which could create weird edge cases.	2019-11-15 06:03:32 -08:00
Hadi Moshayedi	15af1637aa	Replicate reference tables to coordinator.	2019-11-15 05:50:19 -08:00
Hadi Moshayedi	cb011bb30f	Propagate isactive to metadata nodes.	2019-11-15 05:48:42 -08:00
Philip Dubé	495c0f5117	Phase 1 implementation of custom aggregates Phase 1 seeks to implement minimal infrastructure, so does not include: - dynamic generation of support aggregates to handle multiple arguments - configuration methods to direct aggregation strategy, or mark an aggregate's serialize/deserialize as safe to operate across nodes Aggregates can be distributed when: - they have a single argument - they have a combinefunc - their transition type is not a pseudotype	2019-11-14 19:01:24 +00:00
Philip Dubé	edc7a2ee38	Improve RECORD support	2019-11-14 18:32:22 +00:00
Philip Dubé	eb35743c3f	Remove citus.worker_list_file & master_initialize_node_metadata	2019-11-13 00:49:58 +00:00
Jelte Fennema	adc6ca6100	Make simple IN queries on unique columns work with repartition join (#3171) This is necessary to support Q20 of the CHbenCHmark: #2582. To summarize the fix: The subquery is converted into an INNER JOIN on a table. This fixes the issue, since an INNER JOIN on a table is already supported by the repartition planner. The way this replacement happens: 1. Postgres replaces `col in (subquery)` with a SEMI JOIN (subquery) on col = subquery_result 2. If this subquery is simple enough Postgres will replace it with a regular read from a table 3. If the subquery returns unique results (e.g. a primary key) Postgres will convert the SEMI JOIN into an INNER JOIN during the planning. It will not change this in the rewritten query though. 4. We check if Postgres sends us any SEMI JOINs during its join order planning, if it doesn't we replace all SEMI JOINs in the rewritten query with INNER JOIN (which we already support).	2019-11-11 13:44:28 +01:00
Önder Kalacı	460f000218	Remove failure tests related to real-time executor (#3174 ) Since we've removed the executor, we don't need the specific tests. Since the tests are already using adaptive executor, they were passing. But, we've plenty of extra tests for adaptive executor, so seems safe to remove.	2019-11-11 10:18:37 +01:00
Philip Dubé	ad86c1b866	AcquireDistributedLockOnRelations: escape relation names	2019-11-08 21:23:01 +00:00
Jelte Fennema	9fb897a074	Fix queries with repartition joins and group by unique column (#3157) Postgres doesn't require you to add all columns that are in the target list to the GROUP BY when you group by a unique column (or columns). It even actively removes these group by clauses when you do. This is normally fine, but for repartition joins it is not. The reason for this is that the temporary tables don't have these primary key columns. So when the worker executes the query it will complain that it is missing columns in the group by. This PR fixes that by adding an ANY_VALUE aggregate around each variable in the target list that is not contained in the group by or in an aggregate. This is done only for repartition joins. The ANY_VALUE aggregate chooses the value from an undefined row in the group.	2019-11-08 15:36:18 +01:00
Önder Kalacı	0b3d4e55d9	Local execution should not change hasReturning for distributed tables (#3160 ) It looks like the logic to prevent RETURNING in reference tables to have duplicate entries that comes from local and remote executions leads to missing some tuples for distributed tables. With this PR, we're ensuring to kick in the logic for reference tables only.	2019-11-08 12:49:56 +01:00
Philip Dubé	2fc45e5897	create_distributed_function: accept aggregates Adds support for OCLASS_PROC to worker_create_or_replace_object	2019-11-06 18:23:37 +00:00
Önder Kalacı	960cd02c67	Remove real time router executors (#3142 ) * Remove unused executor codes All of the codes of real-time executor. Some functions in router executor still remains there because there are common functions. We'll move them to accurate places in the follow-up commits. * Move GUCs to transaction mngnt and remove unused struct * Update test output * Get rid of references of real-time executor from code * Warn if real-time executor is picked * Remove lots of unused connection codes * Removed unused code for connection restrictions Real-time and router executors cannot handle re-using of the existing connections within a transaction block. Adaptive executor and COPY can re-use the connections. So, there is no reason to keep the code around for applying the restrictions in the placement connection logic.	2019-11-05 12:48:10 +01:00
Önder Kalacı	ffd89e4e01	Include all relevant relations in the ExtractRangeTableRelationWalker (#3135 ) We've changed the logic for pulling RTE_RELATIONs in #3109 and non-colocated subquery joins and partitioned tables. @onurctirtir found this steps where I traced back and found the issues. While looking into it in more detail, we decided to expand the list in a way that the callers get all the relevant RTE_RELATIONs RELKIND_RELATION, RELKIND_PARTITIONED_TABLE, RELKIND_FOREIGN_TABLE and RELKIND_MATVIEW. These are all relation kinds that Citus planner is aware of.	2019-11-01 16:06:58 +01:00
Onur TIRTIR	d3f68bf44f	Fix view is not distributed error when view is used in modify statements (#3104)	2019-11-01 16:34:01 +03:00
Marco Slot	03cae27782	Add tests for distributing functions with replication_model statement	2019-10-26 23:57:59 +02:00
SaitTalhaNisanci	29d45bd1b9	Do not assign InvalidOid for local execution while extracting parameters (#3131) * do not assign InvalidOid for local execution while extracting parameters * rename functions * rename parameter and replace function	2019-10-28 14:28:22 +03:00
Önder Kalacı	dceaddbe4d	Remove real-time/router executors (step 1) (#3125 ) See #3125 for details on each item. * Remove real-time/router executor tests-1 These are the ones which doesn't have '_%d' in the test output files. * Remove real-time/router executor tests-2 These are the ones which has in the test output files. * Move the tests outputs to correct place * Make sure that single shard commits use 2PC on adaptive executor It looks like we've messed the tests in #2891. Fixing back. * Use adaptive executor for all router queries This becomes important because when task-tracker is picked, we used to pick router executor, which doesn't make sense. * Remove explicit references to real-time/router executors in the tests * JobExecutorType never picks real-time/router executors * Make sure to go incremental in test output numbers * Even users cannot pick real-time anymore * Do not use real-time/router custom scans * Get rid of unnecessary normalizations * Reflect unneeded normalizations * Get rid of unnecessary test output file	2019-10-25 10:54:54 +02:00
Marco Slot	b8c8fd4612	Fix run_command_on_colocated_placements tests	2019-10-23 00:08:17 +02:00
Onder Kalaci	c2460a1c31	Add upgrade test for distributed functions Simply make sure that Citus can pushdown functions after pg upgrade.	2019-10-23 12:07:51 +02:00
Philip Dubé	2a969fe4bb	ssl_by_default: remove stray PG10 check	2019-10-23 00:27:54 +00:00
Jelte Fennema	78e495e030	Add shouldhaveshards to pg_dist_node (#2960) This is an improvement over #2512. This adds the boolean shouldhaveshards column to pg_dist_node. When it's false, create_distributed_table for new colocation groups will not create shards on that node. Reference tables will still be created on nodes where it is false.	2019-10-22 16:47:16 +02:00
Halil Ozan Akgul	5f04ac774f	Adds the tests for refresh materialized views	2019-10-17 16:00:56 +03:00
Jelte Fennema	7abedc38b0	Support subqueries in HAVING (#3098) Areas for further optimization: - Don't save subquery results to a local file on the coordinator when the subquery is not in the having clause - Push the HAVING with subquery to the workers if there's a group by on the distribution column - Don't push down the results to the workers when we don't push down the HAVING clause, only the coordinator needs it Fixes #520 Fixes #756 Closes #2047	2019-10-16 16:40:14 +02:00
Jelte Fennema	9b2f4d71ac	Make sure some MX tests use defined shard_ids (#3103)	2019-10-12 22:46:14 +02:00
Philip Dubé	74cb168205	Remove Postgres 10 support	2019-10-11 21:56:56 +00:00
Philip Dubé	4063e7ca67	CALL delegation: apply strip_implicit_coercions to distribution argument	2019-10-10 17:42:43 +00:00
Nils Dijk	4a4a220945	Fix enum add value order and pg12 (#3082 ) DESCRIPTION: Fix order for enum values and correctly support pg12 PG 12 introduces `ALTER TYPE ... ADD VALUE ...` during transactions. Earlier versions would error out when called in a transaction, hence we connect to workers outside of the transaction which could cause inconsistencies on pg12 now that postgres doesn't error with this syntax anymore. During the implementation of this fix it became apparent there was an error with the ordering of enum labels when the type was recreated. A patch and test have been included.	2019-10-07 17:16:19 +02:00
Jelte Fennema	01da11f264	Change citus truncate trigger to AFTER and add more upgrade tests (#3070 ) * Add more upgrade tests * Fix citus trigger generation after upgrade citus_truncate_trigger runs before truncate when created by create_distributed_table: `492d1b2cba/src/backend/distributed/commands/create_distributed_table.c (L1163)` * Remove pg_dist_jobid_seq	2019-10-07 16:43:04 +02:00
Onder Kalaci	3be72ce42f	Make sure that distributed functions always have the correct user Objectives: (a) both super user and regular user should have the correct owner for the function on the worker (b) The transactional semantics would work fine for both super user and regular user (c) non-super-user and non-function owner would get a reasonable error message if tries to distribute the function Co-authored-by: @serprex	2019-10-04 21:38:49 +00:00
SaitTalhaNisanci	c547664fae	Add Citus upgrade tests with its job (#3003 ) * Add initial citus upgrade test * Add restart databases and run tests in all nodes * Add output for citus versions 8.0 8.1 8.2 and 8.3 * Add verify step for citus upgrade * Add target for citus upgrade test in makefile * Add check citus upgrade job * Fix installation file path and add missing tar * Run citus upgrade for v8.0 v8.1 v8.2 and v8.3 * Create upgrade_common file and rename upgrade check * Add pg version to citus upgrade test * Test with postgres 10 and 11 in citus upgrade tests * Add readme for citus upgrade test * Add some basic tests to citus upgrade tests * Add citus upgrade mixed mode test * Remove citus artifacts before installing another one * Refactor citus upgrade test according to reviews * quick and dirty rewrite of citus upgrade tests to support local execution. I think we need to change the makefile in such a way that the tar files can be injected from the circle ci config file. Also I removed some of the citus version checks you had to not have the requirement to pass that in separately from the pre tar file. I am not super happy with it, but two flags that need to be kept in sync is also not desirable. Instead I print out the citus version that is installed per node. This will not cause a failure if they are not what one would expect but it lets us verify we are running the expected version. * use latest citusupgradetester in circleci * update readme and use common alias for upgrade_common import	2019-10-04 17:44:49 +03:00
Marco Slot	1a3a174f67	Grant usage on schema citus to public	2019-10-04 12:26:08 +02:00
Hadi Moshayedi	217db2a03e	Don't block for locks in SyncMetadataToNodes()	2019-10-03 16:53:36 -07:00
SaitTalhaNisanci	19bdca14d8	Add jobs to run tests with pg 12 (#3033 ) * Add PG12 test outputs * Add jobs to run tests with pg 12 * use POSIX collate for compatibility between pg10/pg11/pg12 * do not override the new default value when running vanilla tests * fix 2 problems with pg12 tests * update pg12 images with pg12 rc1 * remove pg10 jobs * Revert "Add PG12 test outputs" This reverts commit `f3545b92ef`. * change images to use latest instead of dev * add missing coverage flags	2019-10-02 15:33:12 +03:00
Hanefi Onaldi	bd416ef68f	Fix empty FROM clauses in PG12	2019-10-01 19:54:11 +00:00
Philip Dubé	89d35e9692	Attempt to force custom plans for prepared statements when trying to delegate function calls We discern between PARAM_EXEC & PARAM_EXTERN: `d52eaa0948/src/include/nodes/primnodes.h (L211)` According to primnodes.h we should only run into PARAM_EXEC or PARAM_EXTERN	2019-09-30 23:49:14 +00:00
Hadi Moshayedi	5e97e5c98e	Don't push down queries when in subqueries/ctes	2019-09-30 14:22:05 -07:00
Nils Dijk	01b26cf91a	Disallow distributed functions for functions depending on an extension (#3049 ) DESCRIPTION: Disallow distributed functions for functions depending on an extension Functions depending on an extension cannot (yet) be distributed by citus. If we would allow this it would cause issues with our dependency following mechanism as we stop following objects depending on an extension. By not allowing functions to be distributed when they depend on an extension as well as not allowing to make distributed functions depend on an extension we won't break the ability to add new nodes. Allowing functions depending on extensions to be distributed at the moment could cause problems in that area.	2019-09-30 15:19:47 +02:00
Nils Dijk	473cbc0115	Propagate CREATE OR REPLACE FUNCTION to workers for distributed functions (#3043 ) DESCRIPTION: Propagate CREATE OR REPLACE FUNCTION Distributed functions could be replaced, which should be propagated to the workers to keep the function in sync between all nodes. Due to the complexity of deparsing the `CreateFunctionStmt` we actually produce the plan during the processing phase of our utilityhook. Since the changes have already been made in the catalog tables we can reuse `pg_get_functiondef` to get us the generated `CREATE OR REPLACE` sql.	2019-09-30 12:41:17 +02:00
Jelte Fennema	82ec918b29	Add explain summary support (#3046) Fixes #2922 and also adds explain analyze regression tests	2019-09-30 10:58:49 +02:00
Nils Dijk	9c2c50d875	Hookup function/procedure deparsing to our utility hook (#3041) DESCRIPTION: Propagate ALTER FUNCTION statements for distributed functions Using the implemented deparser for function statements to propagate changes to both functions and procedures that are previously distributed.	2019-09-27 22:06:49 +02:00
Philip Dubé	363409a0c2	Propagate REINDEX TABLE & REINDEX INDEX	2019-09-27 18:14:53 +00:00
Hanefi Onaldi	66b9f2e887	Deparsing and qualifying for FUNCTION/PROCEDURE statements (#3014) This PR aims to add all the necessary logic to qualify and deparse all possible `{ALTER|DROP} .. {FUNCTION|PROCEDURE}` queries. As Procedures are introduced in PG11, the code contains many PG version checks. I tried my best to make it easy to clean up once we drop PG10 support. Here are some caveats: - I assumed that the parse tree is a valid one. There are some queries that are not allowed, but still are parsed successfully by postgres planner. Such queries will result in errors in execution time. (e.g. `ALTER PROCEDURE p STRICT` -> `STRICT` action is valid for functions but not procedures. Postgres decides to parse them nevertheless.)	2019-09-27 19:02:52 +02:00
Marco Slot	2868e02a3d	Implement SELECT function call delegation. When a function is marked as colocated with a distributed table, we try delegating queries of kind "SELECT func(...)" to workers. We currently only support this simple form, and don't delegate forms like "SELECT f1(...), f2(...)", "SELECT f1(...) FROM ...", or function calls inside transactions. As a side effect, we also fix the transactional semantics of DO blocks. Previously we didn't consider a DO block a multi-statement transaction. Now we do. Co-authored-by: Marco Slot <marco@citusdata.com> Co-authored-by: serprex <serprex@users.noreply.github.com> Co-authored-by: pykello <hadi.moshayedi@microsoft.com>	2019-09-27 09:13:25 -07:00
Onder Kalaci	219f3676a0	Improve some tests around local execution and CTE inlining on pg 12	2019-09-25 10:53:19 +02:00
Philip Dubé	4f60e3a149	Feedback	2019-09-24 17:31:09 +00:00
Marco Slot	c1e43b25da	Use the new create_distributed_function API in some call tests	2019-09-24 17:31:09 +00:00
Philip Dubé	90e1f1442a	Annotated tests for multi_mx_call. Co-authored-by: pykello <hadi.moshayedi@microsoft.com>	2019-09-24 17:31:09 +00:00
Philip Dubé	c95d46b4f3	Extend multi_mx_call with some of Hadi's suggestions for better test coverage	2019-09-24 17:31:09 +00:00
Philip Dubé	16b8d17aba	Test: multi_mx_call	2019-09-24 17:31:09 +00:00
Onder Kalaci	18de78f386	Relax the colocation checks for distributed functions As long as the types can be coerced, it is safe to pushdown functions.	2019-09-24 16:31:08 +02:00
Marco Slot	0dea485c68	Fix misspelling in multi_colocation_utils	2019-09-24 11:27:30 +02:00
Hadi Moshayedi	48078a30e6	Fix wait_until_metadata_sync() for postgres 12. Postgres 12 now has an assertion that the calls to WaitLatchOrSocket handle postmaster death.	2019-09-23 14:15:35 -07:00
Philip Dubé	06faba91c0	Include ifdefs for pg12 API changes, update local_shard_execution test to avoid CTE inlining	2019-09-23 20:22:35 +00:00
Onder Kalaci	d37745bfc7	Sync metadata to worker nodes after create_distributed_function Since the distributed functions are useful when the workers have metadata, we automatically sync it. Also, after master_add_node(). We do it lazily and let the daemon sync it. That's mainly because the metadata syncing cannot be done in transaction blocks, and we don't want to add lots of transactional limitations to master_add_node() and create_distributed_function().	2019-09-23 18:30:53 +02:00
Marco Slot	5f23b951c7	Support serial and smallserial when syncing metadata	2019-09-23 17:39:21 +02:00
Marco Slot	e58d76c5f6	Fix assert failure in bare SELECT FROM reference table FOR UPDATE in MX	2019-09-23 17:00:09 +02:00
SaitTalhaNisanci	71e7047e65	Enhance pg upgrade tests (#3002 ) * Enhance pg upgrade tests * Add a specific upgrade test for pg_dist_partition. We store the index of the distribution column, and when a column with an index smaller than the distribution column index is dropped before an upgrade, the index should still match the distribution column after the upgrade.	2019-09-23 17:37:14 +03:00
Marco Slot	d85d77634d	Handle anonymous composite types on the target list	2019-09-23 14:53:02 +02:00
Onder Kalaci	d7e2968120	Add parameters to create_distributed_function() With this commit, we're changing the API for create_distributed_function() such that users can provide the distribution argument and the colocation information.	2019-09-22 21:53:33 +02:00
Nils Dijk	72015faeb2	fix disable_object_propagation test for pg12	2019-09-19 17:40:24 +02:00
Hadi Moshayedi	d2f2acc4b2	Make master_update_node citus-ha friendly.	2019-09-18 09:32:54 -07:00
Hadi Moshayedi	76f3933b05	Add metadatasynced, and sync on master_update_node() Co-authored-by: pykello <hadi.moshayedi@microsoft.com> Co-authored-by: serprex <serprex@users.noreply.github.com>	2019-09-18 09:32:54 -07:00
Nils Dijk	db5d03931d	Feature disable object propagation (#2986 ) DESCRIPTION: Provide a GUC to turn off the new dependency propagation functionality. In case the dependency propagation functionality introduced in 9.0 causes issues for a user's cluster, they can turn it off almost completely. The only dependency that will still be propagated and tracked is the schema, to emulate the old behaviour. The GUC to change is `citus.enable_object_propagation`. When set to `false`, the functionality will be mostly turned off. Be aware that objects marked as distributed in `pg_dist_object` will still be kept in the catalog as distributed objects. Alter statements to these objects will not be propagated to workers and may cause desynchronisation.	2019-09-18 17:16:22 +02:00
Philip Dubé	ac14f1dd49	pg12 doesn't support client_min_messages as 'fatal'	2019-09-17 20:37:06 +00:00
Nils Dijk	2b7f5552c8	Fix: rename remote type on conflict (#2983 ) DESCRIPTION: Rename remote types during type propagation. To prevent data from being destroyed when a remote type differs from the type on the coordinator during type propagation, we wanted to rename the type instead of using `DROP CASCADE`. This patch removes the `DROP` logic and adds the creation of a statement that renames the type to a free name.	2019-09-17 18:54:10 +02:00
Nils Dijk	0a3152d09c	Add feature flag to turn off create type propagation (#2982 ) DESCRIPTION: Add feature flag to turn off create type propagation. When `citus.enable_create_type_propagation` is set to `false`, Citus will not propagate `CREATE TYPE` statements to the workers. Types are still distributed when tables that depend on these types are distributed.	2019-09-17 15:50:06 +02:00

1766 Commits (1d7dda991f54a2da75febf9040a4356221e9a4ba)