citus

Commit Graph

Author	SHA1	Message	Date
Halil Ozan Akgul	f74447b3b7	Creates new colocation for colocate_with:='none' too	2022-05-16 13:39:05 +03:00
Gledis Zeneli	4c6f62efc6	Switch to using LOCK instead of lock_relation_if_exists in TRUNCATE (#5930 ) Breaking down #5899 into smaller PR-s This particular PR changes the way TRUNCATE acquires distributed locks on the relations it is truncating to use the LOCK command instead of lock_relation_if_exists. This has the benefit of using pg's recursive locking logic it implements for the LOCK command instead of us having to resolve relation dependencies and lock them explicitly. While this does not directly affect truncate, it will allow us to generalize this locking logic to then log different relations where the pg recursive locking will become useful (e.g. locking views). This implementation is a bit more complex that it needs to be due to pg not supporting locking foreign tables. We can however, still lock foreign tables with lock_relation_if_exists. So for a command: TRUNCATE dist_table_1, dist_table_2, foreign_table_1, foreign_table_2, dist_table_3; We generate and send the following command to all the workers in metadata: ```sql SEL citus.enable_ddl_propagation TO FALSE; LOCK dist_table_1, dist_table_2 IN ACCESS EXCLUSIVE MODE; SELECT lock_relation_if_exists('foreign_table_1', 'ACCESS EXCLUSIVE'); SELECT lock_relation_if_exists('foreign_table_2', 'ACCESS EXCLUSIVE'); LOCK dist_table_3 IN ACCESS EXCLUSIVE MODE; SEL citus.enable_ddl_propagation TO TRUE; ``` Note that we need to alternate between the lock command and lock_table_if_exists in order to preserve the TRUNCATE order of relations. When pg supports locking foreign tables, we will be able to massive simplify this logic and send a single LOCK command.	2022-05-11 18:38:48 +03:00
Burak Velioglu	1460452442	Introduce CREATE/DROP VIEW Adds support for propagating create/drop view commands and views to worker node while scaling out the cluster. Since views are dropped while converting the table type, metadata connection will be used while propagating view commands to not switch to sequential mode.	2022-05-10 13:07:14 +03:00
Önder Kalacı	dd78c81378	Fix flaky isolation - 1 (#5900 ) * Do not show any PG internal queries	2022-04-11 20:43:51 -07:00
Burak Velioglu	5d9599f964	Create function in transaction according to create object propagation guc	2022-04-08 17:15:31 +03:00
Halil Ozan Akgül	37fafd007c	Turn metadata sync on in isolation_update_node and isolation_update_node_lock_writes tests (#5779 )	2022-03-11 16:39:20 +03:00
Halil Ozan Akgül	c9913b135c	Turn metadata sync on in isolation_ref2ref_foreign_keys test (#5791 )	2022-03-11 13:30:11 +03:00
Halil Ozan Akgül	2edaf0971c	Turn metadata sync on in isolation reference copy vs all (#5790 ) * Turn metadata sync on in isolation_reference_copy_vs_all test * Update the output of isolation_reference_copy_vs_all test	2022-03-11 11:27:46 +03:00
Marco Slot	7559ad12ba	Change create_object_propagation default to immediate	2022-03-09 17:40:50 +01:00
Halil Ozan Akgül	333bcc7948	Global PID Helper Functions (#5768 ) * Introduces citus_nodename_for_nodeid and citus_nodeport_for_nodeid functions * Introduces citus_nodeid_for_gpid and citus_pid_for_gpid functions * Add tests	2022-03-09 13:15:59 +03:00
Onder Kalaci	c32b2de1a7	Improve citus_lock_waits 1) Remove useless columns 2) Show backends that are blocked on a DDL even before gpid is assigned 3) One minor bugfix, where we clear distributedCommandOriginator properly.	2022-03-07 11:10:44 +01:00
Nils Dijk	3801576dfb	Move pg_dist_object to pg_catalog (#5765 ) DESCRIPTION: Move pg_dist_object to pg_catalog Historically `pg_dist_object` had been created in the `citus` schema as an experiment to understand if we could move our catalog tables to a branded schema. We quickly realised that this interfered with the UX on our managed services and other environments, where users connected via a user with the name of `citus`. By default postgres put the username on the search_path. To be able to read the catalog in the `citus` schema we would need to grant access permissions to the schema. This caused newly created objects like tables etc, to default to this schema for creation. This failed due to the write permissions to that schema. With this change we move the `pg_dist_object` catalog table to the `pg_catalog` schema, where our other schema's are also located. This makes the catalog table visible and readable by any user, like our other catalog tables, for debugging purposes. Note: due to the change of schema, we had to disable 1 test that was running into a discrepancy between the schema and binary. Secondly, we needed to make the lookup functions for the `pg_dist_object` relation and their indexes less strict on the fallback of the naming due to an other test that, due to an unfortunate cache invalidation, needed to lookup the relation again. This makes that we won't default to _only_ resolving from `pg_catalog` outside of upgrades.	2022-03-04 17:40:38 +00:00
Halil Ozan Akgul	0500a62515	Updates citus_dist_stat_activity to use citus_stat_activity	2022-03-04 17:28:17 +03:00
Halil Ozan Akgul	06a0509b1a	Introduces citus_stat_activity view	2022-03-03 16:19:20 +03:00
Marco Slot	43e4dd3808	Add a citus.internal_reserved_connections setting	2022-03-02 19:13:53 +01:00
Onder Kalaci	e80a36c4b6	Improve visibility rules for non-priviledge roles It seems like our approach is way too restrictive and some places are wrong. Now, we follow very similar approach to pg_stat_activity. Some of the changes are pre-requsite for implementing citus_dist_stat_activity via citus_stat_activity.	2022-03-02 18:04:01 +01:00
Onder Kalaci	df95d59e33	Drop support for CitusInitiatedBackend CitusInitiatedBackend was a pre-mature implemenation of the whole GlobalPID infrastructure. We used it to track whether any individual query is triggered by Citus or not. As of now, after GlobalPID is already in place, we don't need CitusInitiatedBackend, in fact it could even be wrong.	2022-02-24 12:12:43 +01:00
Hanefi Onaldi	7bd6c2c9ac	Isolation tests for various ddl operations and metadata sync	2022-02-24 03:19:56 +03:00
Onder Kalaci	95d5918967	Properly set worker_query and use	2022-02-21 18:22:33 +01:00
Onder Kalaci	dffcafc096	Use global pids in citus_lock_waits	2022-02-21 17:46:34 +01:00
Hanefi Onaldi	2e5ca8ba2b	Add isolation tests for metadata sync vs all This commit introduces several test cases for concurrent operations that change metadata, and a concurrent metadata sync operation. The overall structure is as follows: - Session#1 starts metadata syncing in a transaction block - Session#2 does an operation that change metadata - Both sessions are committed - Another session checks whether the metadata are the same accross all nodes in the cluster.	2022-02-11 01:55:04 +03:00
Önder Kalacı	dc6c194916	Show IDLE backends in citus_dist_stat_activity (#5700 ) * Break the dependency to CitusInitiatedBackend infrastructure With this change, we start to show non-distributed backends as well in citus_dist_stat_activity. I think that (a) it is essential for making citus_lock_waits to work for blocked on DDL commands. (b) it is more expected from the user's perspective. The name of the view is a little inconsistent now (e.g., citus_dist_stat_activity) but we are already planning to improve the names with followup PRs. Also, we have global pids assigned, the CitusInitiatedBackend becomes obsolete.	2022-02-10 08:59:28 -08:00
Ahmet Gedemenli	76b63a307b	Propagate create/drop schema commands	2022-02-10 14:58:09 +03:00
Halil Ozan Akgul	8ee02b29d0	Introduce global PID	2022-02-08 16:49:38 +03:00
Onder Kalaci	923bb194a4	Move isolation_multiuser_locking to MX tests	2022-02-04 10:52:57 +01:00
Onder Kalaci	ff234fbfd2	Unify old GUCs into a single one Replaces citus.enable_object_propagation with citus.enable_metadata_sync Also, within Citus 11 release cycle, we added citus.enable_metadata_sync_by_default, that is also replaced with citus.enable_metadata_sync. In essence, when citus.enable_metadata_sync is set to true, all the objects and the metadata is send to the remote node. We strongly advice that the users never changes the value of this GUC.	2022-02-04 10:52:56 +01:00
Burak Velioglu	f88cc230bf	Handle tables and objects as metadata. Update UDFs accordingly With this commit we've started to propagate sequences and shell tables within the object dependency resolution. So, ensuring any dependencies for any object will consider shell tables and sequences as well. Separate logics for both shell tables and sequences have been removed. Since both shell tables and sequences logic were implemented as a part of the metadata handling before that logic, we were propagating them while syncing table metadata. With this commit we've divided metadata (which means anything except shards thereafter) syncing logic into multiple parts and implemented it either as a part of ActivateNode. You can check the functions called in ActivateNode to check definition of different metadata. Definitions of start_metadata_sync_to_node and citus_activate_node have also been updated. citus_activate_node will basically create an active node with all metadata and reference table shards. start_metadata_sync_to_node will be same with citus_activate_node except replicating reference tables. stop_metadata_sync_to_node will remove all the metadata. All of those UDFs need to be called by superuser.	2022-01-31 16:20:15 +03:00
Halil Ozan Akgul	9547228e8d	Add isolation_check_mx test	2021-12-30 14:58:30 +03:00
Onder Kalaci	549edcabb6	Allow disabling node(s) when multiple failures happen As of master branch, Citus does all the modifications to replicated tables (e.g., reference tables and distributed tables with replication factor > 1), via 2PC and avoids any shardstate=3. As a side-effect of those changes, handling node failures for replicated tables change. With this PR, when one (or multiple) node failures happen, the users would see query errors on modifications. If the problem is intermitant, that's OK, once the node failure(s) recover by themselves, the modification queries would succeed. If the node failure(s) are permenant, the users should call `SELECT citus_disable_node(...)` to disable the node. As soon as the node is disabled, modification would start to succeed. However, now the old node gets behind. It means that, when the node is up again, the placements should be re-created on the node. First, use `SELECT citus_activate_node()`. Then, use `SELECT replicate_table_shards(...)` to replicate the missing placements on the re-activated node.	2021-12-01 10:19:48 +01:00
Onder Kalaci	38b08ebde9	Generalize the error checks while removing node The checks for preventing to remove a node are very much reference table centric. We are soon going to add the same checks for replicated tables. So, make the checks generic such that: (a) replicated tables fit naturally (b) we can the same checks in `citus_disable_node`.	2021-11-26 14:25:29 +01:00
Hanefi Onaldi	4c135de9e4	Introduce CI checks for hash comments in specs We do not use comments starting with # in spec files because it creates errors from C preprocessor that expects directives after this character. Instead use C style comments, i.e: // single line comment You can also use multiline comments as well /* * multi line comment */	2021-11-26 14:52:51 +03:00
Marco Slot	f49d26fbeb	Remove citus_update_table_statistics isolation test	2021-11-19 10:51:15 +01:00
Marco Slot	9e6ca23286	Remove cstore_fdw-related logic	2021-11-16 13:59:03 +01:00
Önder Kalacı	8c0bc94b51	Enable replication factor > 1 in metadata syncing (#5392 ) - [x] Add some more regression test coverage - [x] Make sure returning works fine in case of local execution + remote execution (task->partiallyLocalOrRemote works as expected, already added tests) - [x] Implement locking properly (and add isolation tests) - [x] We do #shardcount round-trips on `SerializeNonCommutativeWrites`. We made it a single round-trip. - [x] Acquire locks for subselects on the workers & add isolation tests - [x] Add a GUC to prevent modification from the workers, hence increase the coordinator-only throughput - The performance slightly drops (~%15), unless `citus.allow_modifications_from_workers_to_replicated_tables` is set to false	2021-11-15 15:10:18 +03:00
Önder Kalacı	d5b371b2e0	Merge branch 'master' into naisila/fix-partitioned-index	2021-11-08 10:53:16 +01:00
naisila	385ba94d15	Run fix_partition_shard_index_names after each wrong naming command	2021-11-08 10:43:34 +01:00
Marco Slot	78866df13c	Remove master_append_table_to_shard UDF	2021-11-08 10:43:24 +01:00
Marco Slot	fba93df4b0	Remove copy into new append shard logic	2021-11-07 21:01:40 +01:00
Halil Ozan Akgul	a8f3f712cc	Turns mx on in isolations tests	2021-11-04 17:12:30 +03:00
Marco Slot	096660d61d	Remove master_apply_delete_command	2021-10-18 22:29:37 +02:00
Halil Ozan Akgul	9c9d4b5eeb	Turn MX on by default	2021-10-08 18:17:21 +03:00
Naisila Puka	d0390af72d	Add fix_partition_shard_index_names udf to fix currently broken names (#5291 ) * Add udf to include shardId in broken partition shard index names * Address reviews: rename index such that operations can be done on it * More comprehensive index tests * Final touches and formatting	2021-10-07 19:34:52 +03:00
Sait Talha Nisanci	f3fa133caa	Bind seg version to 1.3 in isolation_textension_commands	2021-09-03 15:41:28 +03:00
Onur Tirtir	889a2731cb	Split columnar stripe reservation into two phases (#5188 ) Previously, we were doing `first_row_number` reservation for the first row written to current `WriteState` but were doing `stripe_id` reservation when flushing the `WriteState` and were inserting the related record to `columnar.stripe` at that time as well. However, inserting `columnar.stripe` record at flush-time is problematic. This is because, as told in #5160, if relation has any index-based constraints and if there are two concurrent writes that are inserting conflicting key values for that constraint, then postgres relies on `tableAM->fetch_index_tuple` (=`columnar_fetch_index_tuple`) callback to return `true` when indexAM is checking against possible constraint violations. However, pending writes of other backends are not visible to concurrent sessions in columnar since we were not inserting the stripe metadata record until flushing the stripe. With this commit, we split stripe reservation into two phases: i) Reserve `stripe_id` and insert a "dummy" record to `columnar.stripe` at the very same time we reserve `first_row_number`, i.e. when writing the first row to the current `WriteState`. ii) At flush time, do the storage level allocation and complete the missing fields of the dummy record inserted into `columnar.stripe` during i). That way, any concurrent writes would be able to check against possible constraint violations by using `SnapshotDirty` when scanning `columnar.stripe`. Note that `columnar_fetch_index_tuple` still wouldn't be able to fill the output tupleslot for the requested tid but it would at least return `true` for such index look-up's and we believe this should be sufficient for the caller indexAM callback to make the concurrent writer block on prior one. That is how we fix #5160. Only downside of reserving `stripe_id` at the same time we reserve `first_row_number` is that now any aborted writes would also waste some amount of `stripe_id` as in the case of `first_row_number` but we are just wasting them one-by-one. Considering the fact that we waste `first_row_number` by the amount stripe row limit (=150k by default) in such cases, this shouldn't be important at all.	2021-09-02 11:49:14 +03:00
Onur Tirtir	bf4dfad6f7	Update curcid of given snapshot if it is MVCC Before starting to scan a columnar table, we always flush the pending writes to disk. However, we increment command counter after modifying metadata tables. On the other hand, now that we _don't always use_ xact snapshot to scan a columnar table, writes that we just flushed might not be visible to the query that just flushed pending writes to disk since curcid of provided snapshot would become smaller than the command id being used when modifying metadata tables. To give an example, before this change, below was a possible scenario due to the changes that we made to use the correct snapshot. ```sql CREATE TABLE t(a int, b int) USING columnar; BEGIN; INSERT INTO t VALUES (5, 10); SELECT * FROM t; ┌───┬───┐ │ a │ b │ ├───┼───┤ └───┴───┘ (0 rows) SELECT * FROM t; ┌───┬────┐ │ a │ b │ ├───┼────┤ │ 5 │ 10 │ └───┴────┘ (1 row) ```	2021-09-02 11:11:59 +03:00
SaitTalhaNisanci	b923d51fc6	Bump pg12 and pg13 images to pg12.8 and pg13.8 (#5208 ) In our testing infra structure, even though we use pinned versions of postgres, the auxiliary libraries might pull in newer versions. This is for example the case for libpq, which will now use the libpq libraries from 14beta3. The changes in this PR are a lot due to the libpq changes. We also have changed the citus version that is used as a base for the citus upgrades, from 10.0 to 10.1 . This caused columnar to enforce some extra limits on the settings, which conflicted with our upgrade tests. The changes in failure tests are due to the libpq changes. There are also a lot of changes on isolation tests outputs, hence we updated all of them. Co-authored-by: Nils Dijk <nils@citusdata.com>	2021-08-25 16:04:57 +03:00
Naisila Puka	e5b32b2c3c	Acquire AccessShareLock before updating table statistics (#5155 )	2021-08-12 13:58:15 +03:00
Onur Tirtir	f00c63c33d	Support columnar table index builds with CONCURRENTLY option (#5032 ) With this commit, we add (`CREATE INDEX` / `REINDEX`) `CONCURRENTLY` support for columnar tables. For that, we implement `columnar_index_validate_scan` callback. The reasoning behind the implementation is as follows: * Postgres function `validate_index` provides all the TIDs that are currently in the index to `columnar_index_validate_scan` callback via a `tupleSort` object.. * We start scanning the table by using `columnar_getnextslot` as usual. Before moving forward, note that `columnar_getnextslot` guarantees to return tuples in the order of their TIDs. * For us to use during table scan, postgres provides a snapshot guaranteeing that any tuples that are valid according to that snapshot but are not in the index must be added to the index. * Then for each tuple that we read from our table, we continue iterating given `tupleSort` to find the first TID that is greater than or equal to our tuple's TID. If both TID's are equal to each other, then we skip the tuple since it's already indexed. If the TID that we read from tupleSort is greater then our tuple's TID, then we decide to insert this tuple into index.	2021-07-09 13:44:58 +03:00
Onur Tirtir	3d11c0f9ef	Merge remote-tracking branch 'origin/master' into columnar-index Conflicts: src/test/regress/expected/columnar_empty.out src/test/regress/expected/multi_extension.out	2021-06-16 20:23:50 +03:00
Jelte Fennema	503c70b619	Cleanup orphaned shards before moving when necessary A shard move would fail if there was an orphaned version of the shard on the target node. With this change before actually fail, we try to clean up orphaned shards to see if that fixes the issue.	2021-06-04 11:23:07 +02:00

1 2

93 Commits (f74447b3b7f024ecc88294a0fff923d056692b00)