citus

Commit Graph

Author	SHA1	Message	Date
Halil Ozan Akgul	7166901492	Fixes the bug where undistribute can drop Citus extension (cherry picked from commit `b255706189`)	2022-06-01 18:56:56 +03:00
Hanefi Onaldi	8ef705012a	Add normalization rules for flaky isolation tests We remove `<waiting ...>` and `<... completed>` outputs for some CREATE INDEX CONCURRENTLY commands since they can cause flakiness in some scenarios. Postgres calls WaitForOlderSnapshots() and this can cause CREATE INDEX CONCURRENTLY commands for shards to get blocked by each other for brief periods of time. The extra waits can pop-up, or they can get completed at different lines in the output files. To remedy that, we rename those indexes to be captured by the new normalization rule. (cherry picked from commit `52541c5802`)	2022-06-01 16:12:01 +03:00
Hanefi Onaldi	530aafd8ee	Grep logs for deterministic global_cancel test results (#5948 ) (cherry picked from commit `313104ab9b`)	2022-06-01 16:12:01 +03:00
Gledis Zeneli	c440cbb643	Fix memory error with citus_add_node reported by valgrind test (#5967 ) The error occurs because the jsonb datum in pg_dist_node_metadata.metadata is 0 in some scenarios. This is likely due to not copying the data when receiving a datum from a tuple and pg deciding to deallocate that memory when the table that the tuple was from is closed. Also fix another place in the code that might have been susceptible to this issue. I tested on both multi-vg and multi-1-vg and the tests were successful. (cherry picked from commit `beef392f5a`)	2022-06-01 13:06:54 +03:00
gledis69	a64e135a36	Revert "Copy data from heap tuples instead of using references" This reverts commit `50e8638ede`.	2022-06-01 13:06:38 +03:00
gledis69	50e8638ede	Copy data from heap tuples instead of using references The general rule is: If the data is used within the bounds of table_open ... table_close > no need to copy If the data is required for use even after the table is closed > copy (cherry picked from commit `dc9da7630f`)	2022-06-01 12:27:11 +03:00
jeff-davis	b34b1ce06b	Columnar: fix wraparound bug. (#5962 ) columnar_vacuum_rel() now advances relfrozenxid. Fixes #5958. (cherry picked from commit `74ce210f8b`)	2022-05-31 07:46:12 -07:00
Onder Kalaci	0d0dd0af1c	Show that no metadata is sent when disabled (cherry picked from commit `89c1ccb7a5`)	2022-05-30 17:01:49 +02:00
Onder Kalaci	3227d6551e	Do not send metadata changes during add node if citus.enable_metadata_sync is set to false (cherry picked from commit `7157152f6c`)	2022-05-30 17:01:44 +02:00
Onder Kalaci	d147d5d0c5	Avoid assertion failure on citus_add_node (cherry picked from commit `010a2a408e`)	2022-05-30 17:01:38 +02:00
Ahmet Gedemenli	4b5f749c23	Propagate dependent views upon distribution (#5950 ) (cherry picked from commit `26d927178c`)	2022-05-26 18:58:04 +03:00
Burak Velioglu	29c67c660d	Create view and materialized views with right schema and owner while altering the distributed table. To be able to alter a view's owner without enforcing sequential mode, the alter view process functions have been updated to use the metadata connection.	2022-05-25 10:42:54 +03:00
Gledis Zeneli	6da2d41e00	Do not obtain AccessShareLock before actual lock (#5965 ) Do not obtain AccessShareLock before acquiring the distributed locks. Acquiring an AccessShareLock ensures that the relations which we are trying to get a distributed lock on will not be dropped in the time between when the LOCK command is issued and the LOCK commands are sent to the worker. However, this also leads to distributed deadlocks in such scenarios: ```sql -- for dist lock acquiring order coor, w1, w2 -- on w2 LOCK t1 IN ACCESS EXCLUSIVE MODE; -- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock -- concurrently on w1 LOCK t1 IN ACCESS EXCLUSIVE MODE; -- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock -- acquire dist lock on coor, w1, gets blocked on local AccessShareLock on w2 -- on w2 continuation of the execution above -- starts to acquire dist locks and gets blocked on the coor by the lock acquired by w1 -- distributed deadlock ``` We opt for avoiding such deadlocks at the cost of possibly running into errors when the relations on which we are trying to acquire locks get dropped. (cherry picked from commit `27ddb4fc8e`)	2022-05-23 17:28:37 +03:00
Onder Kalaci	2d5560537b	Due to new commits in master branch, outputs diverged	2022-05-23 09:36:38 +02:00
Onder Kalaci	8b0499c91a	Parallelize metadata syncing on node activate It is often useful to be able to sync the metadata in parallel across nodes. Also citus_finalize_upgrade_to_citus11() uses start_metadata_sync_to_primary_nodes() after this commit. Note that this commit does not parallelize all pieces of node activation or metadata syncing. Instead, it tries to parallelize potentially large parts of metadata, which are the objects and distributed tables (in general Citus tables). In the future, it would be nice to sync the reference tables in parallel across nodes. Create ~720 distributed tables / ~23450 shards ```SQL -- declaratively partitioned table CREATE TABLE github_events_looooooooooooooong_name ( event_id bigint, event_type text, event_public boolean, repo_id bigint, payload jsonb, repo jsonb, actor jsonb, org jsonb, created_at timestamp ) PARTITION BY RANGE (created_at); SELECT create_time_partitions( table_name := 'github_events_looooooooooooooong_name', partition_interval := '1 day', end_at := now() + '24 months' ); CREATE INDEX ON github_events_looooooooooooooong_name USING btree (event_id, event_type, event_public, repo_id); SELECT create_distributed_table('github_events_looooooooooooooong_name', 'repo_id'); SET client_min_messages TO ERROR; ``` across 1 node: almost the same, as expected ```SQL SELECT start_metadata_sync_to_primary_nodes(); Time: 15664.418 ms (00:15.664) select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node; Time: 14284.069 ms (00:14.284) ``` across 7 nodes: ~3.5x improvement ```SQL SELECT start_metadata_sync_to_primary_nodes(); ┌──────────────────────────────────────┐ │ start_metadata_sync_to_primary_nodes │ ├──────────────────────────────────────┤ │ t │ └──────────────────────────────────────┘ (1 row) Time: 25711.192 ms (00:25.711) -- across 7 nodes select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node; Time: 82126.075 ms (01:22.126) ``` (cherry picked from commit `dd02e1755f`)	2022-05-23 09:25:31 +02:00
Onder Kalaci	513e073206	Fixes a bug that prevents dropping/altering indexes There are two problems in this area. First, when there are expressions on the index name, we should call `transformIndexExpression()` before generating the index name. That is what Postgres does. Second, because of `40c24bfef9` PG 13 and PG 14 generates different names for indexes with function calls even for local PG tables. Assume we have: ```SQL create table t(id int); select create_distributed_table('t', 'id'); create index ON t (my_very_boring_function(id)); ``` On PG 13, the name of the index is `t_expr_idx` ```SQL \d t Table "public.t" ┌────────┬─────────┬───────────┬──────────┬─────────┐ │ Column │ Type │ Collation │ Nullable │ Default │ ├────────┼─────────┼───────────┼──────────┼─────────┤ │ id │ integer │ │ │ │ └────────┴─────────┴───────────┴──────────┴─────────┘ Indexes: "t_expr_idx" btree (my_very_boring_function(id::bigint)) ``` On PG 14, the name of the index is `t_my_very_boring_function_idx` ```SQL \d t Table "public.t" ┌────────┬─────────┬───────────┬──────────┬─────────┐ │ Column │ Type │ Collation │ Nullable │ Default │ ├────────┼─────────┼───────────┼──────────┼─────────┤ │ id │ integer │ │ │ │ └────────┴─────────┴───────────┴──────────┴─────────┘ Indexes: "t_my_very_boring_function_idx" btree (my_very_boring_function(id::bigint)) ``` The second issue is not very critical. The important part is that we adjust regression tests to drop all the indexes, which ensures the index names are sane on any version. (cherry picked from commit `2cc4053fc1`)	2022-05-23 09:22:25 +02:00
Onder Kalaci	4b5cb7e2b9	Mark existing views as distributed when upgrade to 11.0+ We have a mechanism which ensures that newly distributed objects are recorded during `alter extension citus update`. However, the logic was lacking "view"s. With this commit, we make sure that existing views are also marked as distributed during upgrade. (cherry picked from commit `ee45e7bfbf`)	2022-05-23 09:22:17 +02:00
Gledis Zeneli	97b453e679	Add TRUNCATE arbitrary config tests (#5848 ) Adds TRUNCATE arbitrary config tests. Also adds the ability to skip tests from particular configs.	2022-05-20 19:53:18 +02:00
Marco Slot	8c5035c0a5	Improve nested execution checks and add GUC to disable	2022-05-20 19:35:59 +02:00
Marco Slot	7c6784b1f4	Add caching for functions that check the backend type	2022-05-20 19:35:52 +02:00
Marco Slot	556f43f24a	Fix prepared statement bug when switching from local to remote execution	2022-05-20 19:35:45 +02:00
gledis69	909b72b027	Add distributing lock command support (cherry picked from commit `4731630741`)	2022-05-20 18:02:34 +03:00
Gledis Zeneli	3f282c660b	Switch to using LOCK instead of lock_relation_if_exists in TRUNCATE (#5930 ) Breaking down #5899 into smaller PRs. This particular PR changes the way TRUNCATE acquires distributed locks on the relations it is truncating to use the LOCK command instead of lock_relation_if_exists. This has the benefit of using pg's recursive locking logic it implements for the LOCK command instead of us having to resolve relation dependencies and lock them explicitly. While this does not directly affect truncate, it will allow us to generalize this locking logic to then lock different relations where the pg recursive locking will become useful (e.g. locking views). This implementation is a bit more complex than it needs to be due to pg not supporting locking foreign tables. We can, however, still lock foreign tables with lock_relation_if_exists. So for a command: TRUNCATE dist_table_1, dist_table_2, foreign_table_1, foreign_table_2, dist_table_3; We generate and send the following command to all the workers in metadata: ```sql SET citus.enable_ddl_propagation TO FALSE; LOCK dist_table_1, dist_table_2 IN ACCESS EXCLUSIVE MODE; SELECT lock_relation_if_exists('foreign_table_1', 'ACCESS EXCLUSIVE'); SELECT lock_relation_if_exists('foreign_table_2', 'ACCESS EXCLUSIVE'); LOCK dist_table_3 IN ACCESS EXCLUSIVE MODE; SET citus.enable_ddl_propagation TO TRUE; ``` Note that we need to alternate between the LOCK command and lock_relation_if_exists in order to preserve the TRUNCATE order of relations. When pg supports locking foreign tables, we will be able to massively simplify this logic and send a single LOCK command. (cherry picked from commit `4c6f62efc6`)	2022-05-20 17:24:44 +03:00
Marco Slot	73fd4f7ded	Allow distributed execution from run_command_on_* functions	2022-05-20 15:42:50 +02:00
Burak Velioglu	8229d4b7ee	Add ALTER VIEW support Adds support for propagating ALTER VIEW commands to - Change owner of view - SET/RESET option - Rename view and view's column name - Change schema of the view Since PG also supports targeting views with ALTER TABLE commands, related code is also added to direct such ALTER TABLE commands to ALTER VIEW commands while sending them to workers.	2022-05-20 12:18:14 +03:00
Burak Velioglu	0cf769c43a	Introduce CREATE/DROP VIEW Adds support for propagating create/drop view commands and views to worker node while scaling out the cluster. Since views are dropped while converting the table type, metadata connection will be used while propagating view commands to not switch to sequential mode.	2022-05-20 12:18:02 +03:00
Burak Velioglu	591f2565cc	Use object address instead of relation id on DDLJob to decide on syncing metadata	2022-05-20 12:17:56 +03:00
Ahmet Gedemenli	ddfcbfdca1	Add tests for materialized views	2022-05-20 12:17:48 +03:00
Ahmet Gedemenli	16071fac1d	Add view tests to arbitrary configs	2022-05-20 12:17:41 +03:00
Onder Kalaci	9c4e3329f6	Rename metadata sync to node metadata sync where applicable	2022-05-19 11:00:51 +02:00
Onder Kalaci	36f641c586	Serialize reference table modifications with node changes & restore point With Citus MX enabled, when a reference table is modified, it does some operations on the first worker node(e.g., acquire locks). If node metadata is locked (via add node or create restore point), the changes to the reference tables should be blocked.	2022-05-19 11:00:51 +02:00
Onder Kalaci	5fe384329e	Adds "sync" option to citus_disable_node() UDF	2022-05-19 11:00:51 +02:00
Marco Slot	c20732142e	Add a run_command_on_coordinator function	2022-05-19 10:41:10 +02:00
Marco Slot	082a14656d	Fix downgrade scripts and add new downgrade tests	2022-05-19 10:37:56 +02:00
Marco Slot	33dede5b75	Add a citus_is_coordinator function	2022-05-19 10:36:22 +02:00
Nils Dijk	5e4c0e4bea	Merge pull request #5931 from citusdata/refactor/dedupe-object-propagation Refactor: reduce complexity and code duplication for Object Propagation	2022-05-18 18:06:24 +02:00
Ahmet Gedemenli	c2d9e88bf5	Fix schema name bug for sequences (#5937 )	2022-05-18 17:29:30 +02:00
Ahmet Gedemenli	88369b6b23	Merge pull request #5934 from citusdata/fix-alter-statistics-nspname Fix alter statistics namespace name	2022-05-18 17:29:30 +02:00
Onder Kalaci	b7a39a232d	Refrain from reading the metadata cache for all tables during upgrade First, it is not needed. Second, in the past we had issues regarding this: https://github.com/citusdata/citus/pull/4344 When I create 10k tables, ~120K shards, this saves 40 MB of memory during ALTER EXTENSION citus UPDATE. Before the change: MetadataCacheMemoryContext: 41943040 ~ 40MB After the change: MetadataCacheMemoryContext: 8192 (cherry picked from commit `f193e16a01`)	2022-05-06 13:53:43 +02:00
Marco Slot	e8b41d1e5b	Convert citus.hide_shards_from_app_name_prefixes to citus.show_shards_for_app_name_prefixes	2022-05-05 13:24:23 +02:00
Onder Kalaci	b4a65b9c45	Do not set coordinator's metadatasynced column to false After a disable_node (cherry picked from commit `5fc7661169`)	2022-04-25 09:35:00 +02:00
Onder Kalaci	6ca3478c8d	Do not assign distributed transaction ids for local execution In the past, for all modifications on the local execution, we enabled 2PC (with `6a7ed7b309`). This also required us to enable coordinated transactions via https://github.com/citusdata/citus/pull/4831 . However, it does have a very substantial impact on the distributed deadlock detection. The distributed deadlock detection is designed to avoid single-statement transactions because they cannot lead to any actual deadlocks. The implementation skips backends that have no distributed transaction assigned. Now that we assign single statement local executions in the lock graphs, we are conflicting with the design of distributed deadlock detection. In general, we should fix it. However, one might think that it is not a big deal, even if the processes show up in the lock graphs, the deadlock detection should not be causing any false positives. That is false, unless https://github.com/citusdata/citus/issues/1803 is fixed. Now that local processes are considered as a single distributed backend, the lock graphs might find: local execution 1 [tx id: 1] -> any local process [tx id: 0] any local process [tx id: 0] -> local execution 2 [tx id: 2] And decide that there is a distributed deadlock. This commit: (a) is the right thing to do, as local execution should not need any distributed tx id (b) eliminates performance issues that might come up when deadlock detection does a lot of unnecessary checks (c) After moving local execution after the remote execution via https://github.com/citusdata/citus/pull/4301, the vague requirement for assigning distributed tx ids is already gone. (cherry picked from commit `a2debe0f02`)	2022-04-25 09:34:32 +02:00
Hanefi Onaldi	86df61cae8	Bump Citus to 11.0.1_beta	2022-04-11 16:09:11 +03:00
Burak Velioglu	6eed51b75c	Create function in transaction according to create object propagation guc (cherry picked from commit `5d9599f964`)	2022-04-11 13:01:14 +03:00
Nils Dijk	675ba65f22	Implement DOMAIN propagation for citus	2022-04-08 16:18:02 +02:00
Marco Slot	d611a50a80	Allow adding a unique constraint with an index	2022-04-07 16:41:10 +02:00
Marco Slot	c5797030de	Fix EXPLAIN ANALYZE JSON format for subplans	2022-04-07 16:00:12 +02:00
Marco Slot	a74d991445	Handle user-defined type parameters in EXPLAIN ANALYZE	2022-04-07 11:37:43 +02:00
Marco Slot	cb9e510e40	Add TABLESAMPLE support	2022-04-01 16:48:29 +02:00
Onder Kalaci	e336b92552	Only hide shards from client backends and pg bg workers The aim of hiding shards is to hide shards from client applications. Certain bg workers (such as pg_cron or Citus maintenance daemon) should be treated like client applications because users can run queries from such bg workers. And, these bg workers should follow the similar application_name checks as client backends. Certain other bg workers, such as logical replication or postgres' parallel workers, should never hide shards. They are internal operations. Similarly the other backend types like the walsender or checkpointer or autovacuum should never hide shards. (cherry picked from commit `9043a1ed3f`)	2022-03-30 17:44:03 +02:00
Hanefi Onaldi	4784d5579b	Bump Citus to 11.0.0_beta	2022-03-24 16:17:47 +03:00
Halil Ozan Akgul	c843ebe48e	Turn metadata sync on in arbitrary config tests	2022-03-23 15:19:52 +03:00
Jelte Fennema	3a44fa827a	Add versions of forboth that don't need ListCell (#5856 ) We've had custom versions of Postgres's `foreach` macro with a hidden ListCell for quite some time now. People like these custom macros, because they are easier to use and require less boilerplate. This adds similar custom versions of Postgres's `forboth` macro. Now you don't need ListCells anymore when looping over two lists at the same time.	2022-03-23 14:50:36 +03:00
Ahmet Gedemenli	b5448e43e3	Fix aggregate signature bug (#5854 )	2022-03-23 13:42:03 +03:00
Burak Velioglu	db9f0d926c	Add support for deparsing ALTER FUNCTION ... SUPPORT ... commands	2022-03-22 21:55:55 +03:00
Onder Kalaci	af4ba3eb1f	Remove citus.enable_cte_inlining GUC In Postgres 12+, users can adjust whether to inline/not inline CTEs by [NOT] MATERIALIZED keywords. So, this GUC is already useless.	2022-03-22 17:14:44 +01:00
Halil Ozan Akgul	4690c42121	Fixes ALTER COLLATION encoding does not exist bug	2022-03-22 17:42:20 +03:00
Marco Slot	32c23c2775	Disallow re-partition joins when no hash function defined	2022-03-22 13:42:53 +01:00
Onur Tirtir	11433ed357	Create DDL job for create enum command in postprocess as we do for composite types Since now we don't throw an error for enums that user attempts creating in temp schema, the preprocess / DDL job that contains the prepared statement (to idempotently create the enum type) gets executed. As a result, we were emitting the following warning because of the error the underlying worker connection throws: ```sql WARNING: cannot PREPARE a transaction that has operated on temporary objects CONTEXT: while executing command on localhost:xxxxx WARNING: connection to the remote node localhost:xxxxx failed with the following error: another command is already in progress ERROR: cannot PREPARE a transaction that has operated on temporary objects CONTEXT: while executing command on localhost:xxxxx ```	2022-03-22 15:09:23 +03:00
Onur Tirtir	dc31102630	Locally create objects having a dependency that we cannot distribute We were already doing so for functions & types believing that this cannot be the case for other object types. However, as in #5830, we cannot distribute an object that user attempts creating in temp schema. Even more, this doesn't only apply to functions and types but also to many other object types. So with this commit, we teach preprocess/postprocess functions (that need to create dependencies on worker nodes) how to skip trying to distribute such objects. We also start identifying temp schemas as the objects that we don't know how to propagate to worker nodes so that we can simply create objects locally if user attempts creating them in a temp schema. There are 36 callers of `EnsureDependenciesExistOnAllNodes` in the codebase atm and for the most we still need to throw a hard error (i.e.: not use `DeferErrorIfHasUnsupportedDependency` beforehand), such as: i) user explicitly wants to create a distributed object * CreateCitusLocalTable * CreateDistributedTable * master_create_worker_shards * master_create_empty_shard * create_distributed_function * EnsureExtensionFunctionCanBeDistributed ii) we don't want to skip altering distributed table on worker nodes * PostprocessIndexStmt * PostprocessCreateTriggerStmt * PostprocessCreateStatisticsStmt iii) object is already distributed / handled by Citus before, so we aren't okay with not propagating the ALTER command * PostprocessAlterTableSchemaStmt * PostprocessAlterCollationOwnerStmt * PostprocessAlterCollationSchemaStmt * PostprocessAlterDatabaseOwnerStmt * PostprocessAlterExtensionSchemaStmt * PostprocessAlterFunctionOwnerStmt * PostprocessAlterFunctionSchemaStmt * PostprocessAlterSequenceOwnerStmt * PostprocessAlterSequenceSchemaStmt * PostprocessAlterStatisticsSchemaStmt * PostprocessAlterStatisticsOwnerStmt * PostprocessAlterTextSearchConfigurationSchemaStmt * PostprocessAlterTextSearchDictionarySchemaStmt * PostprocessAlterTextSearchConfigurationOwnerStmt * PostprocessAlterTextSearchDictionaryOwnerStmt * PostprocessAlterTypeSchemaStmt * PostprocessAlterForeignServerOwnerStmt iv) we already cannot create those objects in temp schemas, so skipping for now * PostprocessCreateExtensionStmt * PostprocessCreateForeignServerStmt Also note that there are 3 more callers of `EnsureDependenciesExistOnAllNodes` in enterprise in addition to those 36 but we don't need to do anything specific about them due to the same reasoning given in iii).	2022-03-22 15:09:23 +03:00
Halil Ozan Akgul	50bace9cfb	Fixes the type names that start with underscore bug	2022-03-22 14:24:30 +03:00
Halil Ozan Akgul	4dbc760603	Introduces citus_coordinator_node_id	2022-03-22 10:34:22 +03:00
Hanefi Onaldi	9f204600af	Allow all possible option types for text search objects (#5838 )	2022-03-21 20:01:53 +01:00
Halil Ozan Akgül	6c05e4b35c	Add check_mx to operations schedule (#5818 )	2022-03-21 19:09:26 +03:00
Burak Velioglu	d4625ec6a1	Add support for zero-argument polymorphic aggregates	2022-03-21 16:10:40 +03:00
Ahmet Gedemenli	46c6630328	Qualify CREATE AGGREGATE stmts in Preprocess (#5834 )	2022-03-21 13:55:09 +03:00
Burak Velioglu	2c2064bf36	Create type locally if it has undistributable dependency	2022-03-18 18:23:32 +03:00
Marco Slot	055bbd6212	Use coordinated transaction when there are multiple queries per task	2022-03-18 15:04:27 +01:00
Marco Slot	cab243218d	Avoid locks in relation_is_a_known_shard	2022-03-18 14:37:39 +01:00
Marco Slot	5bb5359da0	Fix worker node version check	2022-03-17 13:23:02 +01:00
Marco Slot	22a18fc1f2	Fix typo in upgrade function	2022-03-17 13:23:02 +01:00
Jelte Fennema	68bfc8d1c0	Use good initdb options in arbitrary configs tests (#5802 ) In `pg_regress_multi.pl` we're running `initdb` with some options that the `common.py` `initdb` is currently not using. All these flags seem reasonable, so this brings `common.py` in line with `pg_regress_multi.pl`. In passing change the `--nosync` flag to `--no-sync`, since that's what the PG documentation lists as the official option name (but both work).	2022-03-17 13:22:23 +01:00
Jelte Fennema	b0e406a478	Disable ddl propagation when creating users in arbitrary config tests (#5814 ) This should help with failing enterprise tests.	2022-03-16 15:12:20 +01:00
Ahmet Gedemenli	eddfea18c2	Fix role creation issue on schema tests (#5812 )	2022-03-16 13:49:28 +01:00
Burak Velioglu	333c73a53c	Drop distributed table on worker with ProcessUtilityParseTree	2022-03-15 17:42:01 +03:00
Gledis Zeneli	56ab64b747	Patches #5758 with some more error checks (#5804 ) Add error checks to detect failed connection and don't ping secondary nodes to detect self reference.	2022-03-15 15:02:47 +03:00
Hanefi Onaldi	c0cd8f3d56	Wait until metadata sync before testing distributed sequences	2022-03-15 10:28:51 +01:00
Marco Slot	e42a798707	Always use RowShareLock in pg_dist_node when syncing metadata	2022-03-15 10:28:51 +01:00
Ahmet Gedemenli	36b33e2491	Add sequence tests to arbitrary config (#5771 )	2022-03-14 19:16:24 +03:00
Jelte Fennema	41c6393e82	Parallelize cluster setup in arbitrary config tests (#5738 ) Cluster setup time is significant in arbitrary configs. We can parallelize this a bit more. Runtime of the following command decreases from ~25 seconds to ~22 seconds on my machine with this change: ``` make -C src/test/regress/ check-arbitrary-base CONFIGS=CitusDefaultClusterConfig EXTRA_TESTS=prepared_statements_1 ``` Currently we can only run different configs in parallel. However, when working on a feature or trying to fix a bug this is not important. In those cases you simply want to run a single test file on a single config. And you want to run that every time you make a change to the code that you think fixes the issue. This PR allows running bash commands in parallel, so `initdb` and `pg_ctl start` are run in parallel for all nodes in the cluster instead of one waiting for the other. Before this PR, nothing ran in parallel when you ran the above command. After this PR, cluster setup runs in parallel.	2022-03-14 16:42:20 +01:00
Jelte Fennema	5063257252	Disable fsync in arbitrary config tests (#5800 ) We have fsync disabled for regular tests already in `pg_regress_multi.pl`. This does the same for the arbitrary config tests. On my machine this changes the runtime of the following command from ~37 to ~25 seconds: ```bash make -C src/test/regress/ check-arbitrary-configs CONFIGS=CitusDefaultClusterConfig ```	2022-03-14 18:12:38 +03:00
Onder Kalaci	338752d96e	Guard against hard wait event set errors Similar to https://github.com/citusdata/citus/pull/5158, but this time instead of the executor, use this in all the remaining places.	2022-03-14 14:35:56 +01:00
Onder Kalaci	953951007c	Move wait event error checks to connection manager	2022-03-14 14:35:56 +01:00
Onur Tirtir	216b9b5b7a	Fix an incorrect error message related with fkeys between replicated dist tables (#5796 ) This is not supported in enterprise too.	2022-03-14 14:34:09 +01:00
Hanefi Onaldi	b24e1dfccc	Propagate text search commands to all worker nodes (#5797 ) Here is a list of some functions, and the `TargetWorkerSet` parameters they supply to `NodeDDLTaskList`: PostprocessCreateTextSearchConfigurationStmt - NON_COORDINATOR_NODES PreprocessDropTextSearchConfigurationStmt - NON_COORDINATOR_METADATA_NODES PreprocessAlterTextSearchConfigurationSchemaStmt - NON_COORDINATOR_METADATA_NODES I guess this means that, if metadata syncing is disabled on the node, we may have some issues. Consider the following: Let's assume the user has metadata syncing disabled. 2 workers. `CREATE TEXT SEARCH CONFIGURATION ...` will get propagated to all workers. `ALTER ... CONFIGURATION ...` will not get propagated to workers. After adding a new non-metadata node, the new node will get the altered configuration as it reads from catalog. At this point CONFIGURATION definitions got diverged in the cluster. I suggest that we always use `NON_COORDINATOR_METADATA_NODES` in all the TEXT SEARCH operations here.	2022-03-14 14:44:34 +03:00
Onder Kalaci	db529facab	Only change the sequence types if the target column type is a supported sequence type Before this commit, we erroneously converted the sequence type to the type of the column it is used in. However, it is possible that the sequence is used in an expression which is then converted to a type that cannot be a sequence, such as text. With this commit, we only try this conversion if the column type is a supported sequence type (e.g., smallint, int and bigint). Note that we do this conversion because if the column type is a bigint and the sequence is NOT a bigint, users would be in trouble because sequences would generate values that are out of the range of the column. (The other ways are already not supported such as the column is int and the sequence is bigint would fail on the worker.) In other words, with this commit, we scope this optimization only when the target column type is a supported sequence type. Otherwise, we let users use the sequences more freely.	2022-03-11 16:06:00 +01:00
Halil Ozan Akgül	37fafd007c	Turn metadata sync on in isolation_update_node and isolation_update_node_lock_writes tests (#5779 )	2022-03-11 16:39:20 +03:00
Ahmet Gedemenli	d06146360d	Support GRANT ON SCHEMA commands in CREATE SCHEMA statements (#5789 ) * Support GRANT ON SCHEMA commands in CREATE SCHEMA statements * Add test * add comment * Rename to GetGrantCommandsFromCreateSchemaStmt	2022-03-11 14:47:45 +03:00
Jelte Fennema	e5d5c7be93	Start erroring out for unsupported lateral subqueries (#5753 ) With the introduction of #4385 we inadvertently started allowing and pushing down certain lateral subqueries that were unsafe to push down. To be precise the type of LATERAL subqueries that is unsafe to push down has all of the following properties: 1. The lateral subquery contains some non recurring tuples 2. The lateral subquery references a recurring tuple from outside of the subquery (recurringRelids) 3. The lateral subquery requires a merge step (e.g. a LIMIT) 4. The reference to the recurring tuple should be something else than an equality check on the distribution column, e.g. equality on a non distribution column. Property number four is considered both hard to detect and probably not used very often. Thus this PR ignores property number four and causes query planning to error out if the first three properties hold. Fixes #5327	2022-03-11 11:59:18 +01:00
Halil Ozan Akgül	c9913b135c	Turn metadata sync on in isolation_ref2ref_foreign_keys test (#5791 )	2022-03-11 13:30:11 +03:00
Halil Ozan Akgül	2edaf0971c	Turn metadata sync on in isolation reference copy vs all (#5790 ) * Turn metadata sync on in isolation_reference_copy_vs_all test * Update the output of isolation_reference_copy_vs_all test	2022-03-11 11:27:46 +03:00
Hanefi Onaldi	b0eb685101	Add support for TEXT SEARCH DICTIONARY objects TEXT SEARCH DICTIONARY objects depend on TEXT SEARCH TEMPLATE objects. Since we do not yet support distributed TS TEMPLATE objects, we skip dependency checks for text search templates, similar to what we do for roles. The user is expected to manually create the TEXT SEARCH TEMPLATE objects before a) adding new nodes, b) creating TEXT SEARCH DICTIONARY objects.	2022-03-11 03:40:20 +03:00
Marco Slot	49467e27e6	Ensure worker_save_query_explain_analyze always fully qualifies types (#5776 ) Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com> Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-03-10 07:30:11 -08:00
Gledis Zeneli	2cb02bfb56	Fix node adding itself with citus_add_node leading to deadlock (Fix #5720 ) (#5758 ) If a worker node is being added, a command is sent to get the server_id of the worker from the pg_dist_node_metadata table. If the worker's id is the same as the node executing the code, we will know the node is trying to add itself. If the node tries to add itself without specifying `groupid:=0` the operation will result in an error.	2022-03-10 17:46:33 +03:00
Burak Velioglu	547f6b18ef	Ensure dependencies exist for all alter owner commands	2022-03-10 16:37:55 +03:00
Ahmet Gedemenli	4312486141	Remove unnecessary schema name from CREATE SCHEMA stmts (#5785 )	2022-03-10 15:19:14 +03:00
Hanefi Onaldi	d153c2de0d	Fix some typos in comments	2022-03-10 15:03:26 +03:00
Ahmet Gedemenli	551a7d1383	Support CREATE SCHEMA without name (#5782 )	2022-03-10 13:38:00 +03:00
Marco Slot	8e43c8094d	Fix CREATE EXTENSION propagation with custom version	2022-03-09 17:40:50 +01:00
Marco Slot	7559ad12ba	Change create_object_propagation default to immediate	2022-03-09 17:40:50 +01:00
Burak Velioglu	bbe1b16125	Check whether the object has unsupported or circular dependency	2022-03-09 16:37:53 +03:00
Jelte Fennema	c8839de68b	Don't use cascading deletes in Citus 11 migration script (#5767 ) Using CASCADE in a DELETE can inadvertently delete things we don't intend to. It's safer to fail hard and make the user delete depending things manually.	2022-03-09 14:35:23 +01:00
Halil Ozan Akgül	333bcc7948	Global PID Helper Functions (#5768 ) * Introduces citus_nodename_for_nodeid and citus_nodeport_for_nodeid functions * Introduces citus_nodeid_for_gpid and citus_pid_for_gpid functions * Add tests	2022-03-09 13:15:59 +03:00
Ahmet Gedemenli	264cf78842	Disable use_citus_managed_tables for Postgres config (#5773 )	2022-03-08 17:13:49 +03:00
Onder Kalaci	c32b2de1a7	Improve citus_lock_waits 1) Remove useless columns 2) Show backends that are blocked on a DDL even before gpid is assigned 3) One minor bugfix, where we clear distributedCommandOriginator properly.	2022-03-07 11:10:44 +01:00
Ahmet Gedemenli	2a3c0c1914	Revert upgrade script changes (#5757 )	2022-03-07 13:04:58 +03:00
Onder Kalaci	24fcd2a88c	Handle dropping the partitioned tables properly Before this commit, we might be leaving some metadata on the workers. Now, we handle DROP SCHEMA .. CASCADE properly to avoid any metadata leakage.	2022-03-07 10:02:54 +01:00
Nils Dijk	3801576dfb	Move pg_dist_object to pg_catalog (#5765 ) DESCRIPTION: Move pg_dist_object to pg_catalog Historically `pg_dist_object` had been created in the `citus` schema as an experiment to understand if we could move our catalog tables to a branded schema. We quickly realised that this interfered with the UX on our managed services and other environments, where users connected via a user with the name of `citus`. By default postgres puts the username on the search_path. To be able to read the catalog in the `citus` schema we would need to grant access permissions to the schema. This caused newly created objects like tables etc, to default to this schema for creation. This failed due to the write permissions to that schema. With this change we move the `pg_dist_object` catalog table to the `pg_catalog` schema, where our other catalog tables are also located. This makes the catalog table visible and readable by any user, like our other catalog tables, for debugging purposes. Note: due to the change of schema, we had to disable 1 test that was running into a discrepancy between the schema and binary. Secondly, we needed to make the lookup functions for the `pg_dist_object` relation and their indexes less strict on the fallback of the naming due to another test that, due to an unfortunate cache invalidation, needed to lookup the relation again. This means that we won't default to _only_ resolving from `pg_catalog` outside of upgrades.	2022-03-04 17:40:38 +00:00
Halil Ozan Akgul	0500a62515	Updates citus_dist_stat_activity to use citus_stat_activity	2022-03-04 17:28:17 +03:00
Ahmet Gedemenli	b8eedcd261	Notice when create_distributed_function called without params (#5752 ) * Notice when create_distributed_function called without params * Move variable comments to top * Add valid check for cache entry * add objtype to notice msg * update test outputs * Add more tests * Address feedback	2022-03-04 17:26:39 +03:00
Önder Kalacı	bd6a6563ff	Merge branch 'master' into calculate_gpid	2022-03-04 11:34:12 +01:00
Burak Velioglu	cb6d67a9a9	Make sure that all dependencies of citus tables can be distributed	2022-03-03 20:08:09 +03:00
Onder Kalaci	c7b67ba0ea	Add citus_backend_gpid() And also citus_calculate_gpid(nodeId,pid). These UDFs are just wrappers for the existing functions. Useful for testing and simple manipulation of citus_stat_activity.	2022-03-03 15:29:40 +01:00
Halil Ozan Akgul	06a0509b1a	Introduces citus_stat_activity view	2022-03-03 16:19:20 +03:00
Marco Slot	ddf7cf29f3	Sync pg_dist_colocation as a batch	2022-03-03 12:48:48 +01:00
Marco Slot	3ba61244b8	Synchronize pg_dist_colocation metadata	2022-03-03 11:01:59 +01:00
Marco Slot	43e4dd3808	Add a citus.internal_reserved_connections setting	2022-03-02 19:13:53 +01:00
Onder Kalaci	e80a36c4b6	Improve visibility rules for non-privileged roles It seems like our approach is way too restrictive and some places are wrong. Now, we follow a very similar approach to pg_stat_activity. Some of the changes are prerequisites for implementing citus_dist_stat_activity via citus_stat_activity.	2022-03-02 18:04:01 +01:00
Onder Kalaci	35ec9721b4	Add a new API for enabling Citus MX for clusters upgrading from earlier versions Clusters created pre-Citus 11 mostly didn't have metadata sync enabled. For those clusters, we add a utility UDF which fixes some minor issues and syncs the necessary objects to the workers.	2022-03-02 17:02:55 +01:00
Onder Kalaci	98751058a9	Add Primary key to the table Otherwise enterprise tests fail	2022-03-02 12:03:59 +01:00
Marco Slot	dcfbb51b6b	Revert "Build Columnar.so and make Citus depends on it (#5661 )" This reverts commit `a4133c69e8`.	2022-03-02 11:33:15 +01:00
Ahmet Gedemenli	e1809af376	Propagate CREATE AGGREGATE commands	2022-03-02 10:52:43 +03:00
Onder Kalaci	b79a0052a4	Drop function in the tests on a newer version As dropping the function now relies on pg_dist_object, which exists with 9.0+	2022-03-02 08:45:35 +01:00
ywj	a4133c69e8	Build Columnar.so and make Citus depends on it (#5661 ) * [Columnar] Build columnar.so and let citus depends on it Co-authored-by: Yanwen Jin <yanwjin@microsoft.com> Co-authored-by: Ying Xu <32597660+yxu2162@users.noreply.github.com> Co-authored-by: jeff-davis <Jeffrey.Davis@microsoft.com>	2022-03-01 23:31:14 +03:00
Nils Dijk	65bd540943	Feature: configure object propagation behaviour in transactions (#5724 ) DESCRIPTION: Add GUC to control ddl creation behaviour in transactions Historically we would _not_ propagate objects when we are in a transaction block. Creation of distributed tables would not always work in sequential mode, hence objects created in the same transaction as distributing a table that would use the just created object wouldn't work. The benefit was that the user could still benefit from parallelism. Now that the creation of distributed tables is supported in sequential mode it would make sense for users to force transactional consistency of ddl commands for distributed tables. A transaction could switch more aggressively to sequential mode when creating new objects in a transaction. We don't change the default behaviour just yet. Also, many objects would not even propagate their creation when the transaction was already set to sequential, leaving the probability of a self deadlock. The new policy checks solve this discrepancy between objects as well.	2022-03-01 17:29:31 +03:00
Burak Velioglu	f17872aed4	Expand functions while resolving dependencies	2022-03-01 17:08:46 +03:00
Gledis Zeneli	b825232ecb	Handle rebalance / replication when a node is disabled (Fix #5664 ) (#5729 ) The issue in question is caused when rebalance / replication call `FullShardPlacementList` which returns all shard placements (including those in disabled nodes with `citus_disable_node`). Eventually, `FindFillStateForPlacement` looks for the state across active workers and fails to find a state for the placements which are in the disabled workers causing a seg fault shortly after. Approach: * `ActivePlacementHash` was not using the status of the shard placement's node to determine if the node is active. Initially, I just fixed that. * Additionally, I refactored the code which handles active shards in replication / rebalance to: * use a single function to determine if a shard placement is active. * do the active shard filtering before calling `RebalancePlacementUpdates` and `ReplicationPlacementUpdates`, so test methods like `shard_placement_rebalance_array` and `shard_placement_replication_array` which have different shard placement active requirements can do their own filtering while using the same rebalance / replicate logic that `rebalance_table_shards` and `replicate_table_shards` use. Fix #5664	2022-02-25 19:54:30 +03:00
Hanefi Onaldi	6c25eea62f	Fix some typos in comments	2022-02-24 19:48:52 +03:00
Onder Kalaci	df95d59e33	Drop support for CitusInitiatedBackend CitusInitiatedBackend was a premature implementation of the whole GlobalPID infrastructure. We used it to track whether any individual query is triggered by Citus or not. As of now, after GlobalPID is already in place, we don't need CitusInitiatedBackend, in fact it could even be wrong.	2022-02-24 12:12:43 +01:00
Marco Slot	0c4e3cb69c	Drop worker_partition_query_result on downgrade	2022-02-24 10:18:56 +01:00
Hanefi Onaldi	7bd6c2c9ac	Isolation tests for various ddl operations and metadata sync	2022-02-24 03:19:56 +03:00
Hanefi Onaldi	f4e8af2c22	Do not acquire locks on node metadata explicitly	2022-02-24 03:19:56 +03:00
Hanefi Onaldi	b70949ae8c	Lock nodes when building ddl task lists	2022-02-24 03:19:56 +03:00
Marco Slot	ef1ceb3953	Only use a single placement for map tasks	2022-02-23 19:40:21 +01:00
Marco Slot	8de802eec5	Enable local_shared_pool_size 5 in arbitrary configs test	2022-02-23 19:40:21 +01:00
Marco Slot	490765a754	Enable re-partition joins after local execution	2022-02-23 19:40:21 +01:00
Marco Slot	3cd9aa655a	Stop using citus.binary_worker_copy_format	2022-02-23 19:40:21 +01:00
Marco Slot	5ac0d31e8b	Fix re-partition hash range generation	2022-02-23 19:40:21 +01:00
Marco Slot	72d8fde28b	Use intermediate results for re-partition joins	2022-02-23 19:40:21 +01:00
Nils Dijk	1fb970224e	Fix: partitioned index dependencies (#5741 ) #5685 introduced the resolution of dependencies for indices. This missed support for indices on partitioned tables. This change adds support for partitioned indices to the dependency resolution code.	2022-02-23 17:53:26 +03:00
Jelte Fennema	e1afd30263	Speed up test runs on WSL2 a lot (#5736 ) It turns out `whereis` is incredibly slow on WSL2 (at least on my machine): ``` $ time whereis diff diff: /usr/bin/diff /usr/share/man/man1/diff.1.gz real 0m0.408s user 0m0.010s sys 0m0.101s ``` This command is run by our custom `diff` script, which is run for every test file that is run. So this adds lots of unnecessary runtime time to tests. This changes our custom `diff` script to only call `whereis` in the strange case that `/usr/bin/diff` does not exist. The impact of this small change on the total runtime of the tests on WSL is huge. As an example the following command takes 18 seconds without this change and 7 seconds with it: ``` make -C src/test/regress/ check-arbitrary-configs CONFIGS=PostgresConfig ```	2022-02-23 13:03:29 +01:00
Ahmet Gedemenli	8b9402540f	Add use_citus_managed_tables to arbitrary configs (cherry picked from commit 4e93afd1f78854e1aaab63690c441b0b0598a82c) (cherry picked from commit `0295fe2f5b`) (cherry picked from commit 878510725fab9cb6870b4504e0b1f055d7bbc68d)	2022-02-22 11:39:30 +03:00
Teja Mupparti	a62901396b	Allow unsafe triggers via a GUC	2022-02-21 22:45:17 -08:00
Onder Kalaci	95d5918967	Properly set worker_query and use	2022-02-21 18:22:33 +01:00
Onder Kalaci	dffcafc096	Use global pids in citus_lock_waits	2022-02-21 17:46:34 +01:00
Onder Kalaci	331af3dce8	Dumping wait edges can optionally scan all backends Before this commit, dumping wait edges could only be used for distributed deadlock detection purposes. With this commit, we open the possibility that we can use it for any backend.	2022-02-21 17:37:07 +01:00
Halil Ozan Akgul	f6cd4d0f07	Overrides pg_cancel_backend and pg_terminate_backend to accept global pid	2022-02-21 16:41:35 +03:00
Ahmet Gedemenli	c1d5ca9896	Do distributed check first, for DropSchema stmts	2022-02-21 14:43:04 +03:00
Ahmet Gedemenli	28aa715ce2	Add test for citus local tables with dropped columns	2022-02-21 12:07:17 +03:00
Ahmet Gedemenli	2bc6a00408	Refactor CreateDistributedTable to take column name	2022-02-21 12:07:17 +03:00

3793 Commits (01c9ee30b57502d776f23cc1c621ac68f7e19745)