citus

Commit Graph

Author	SHA1	Message	Date
Jelte Fennema	fd07cc9baf	Fix flakiness in create index concurrently isolation tests (#6158 ) This creates consistent test output for isolation tests that involve `CREATE INDEX CONCURRENTLY`. `CREATE INDEX CONCURRENTLY` is sometimes temporarily detected as blocking, even though it will complete without any other queries needing to be run. This change makes sure that we wait until that happens without running any other queries in the meantime. This way we always get consistent output. We do that by using an empty step in the same session as the `CREATE INDEX CONCURRENTLY` command. Doing so forces the isolation tester to wait until the command is finished and not continue with steps from other sessions. This is [the recommended approach by Postgres][1]. There are two separate cases, which are addressed in slightly different ways: 1. If `CREATE INDEX CONCURRENTLY` is actually blocked on another session: Add an empty step right after the commit of the blocking session. e.g. `"s2-ddl-create-index-concurrently" "s1-commit" "s2-empty"` 2. If it's not actually blocked on another session: Add [an asterisk marker][2] to make it look like it's blocked (because sometimes this happens randomly) and right after that add an empty step to trigger waiting. e.g. `"s2-ddl-create-index-concurrently"(*) "s2-empty" "s1-commit"` In passing this also enables isolation tests that were disabled due to a bug that has already been fixed for a while. Fixes #5993 Related to #5910 and #2966 [1]: `5f0adec253/src/test/isolation/README (L197-L204)` [2]: `5f0adec253/src/test/isolation/README (L174-L179)` Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>	2022-08-11 10:29:11 +02:00
Naisila Puka	3806f6f6a9	Add ORDER BY in pg_locks to avoid output order diffs (#6145 )	2022-08-09 06:02:07 +03:00
Sameer Awasekar	e236711eea	Introduce Non-Blocking Shard Split Workflow	2022-08-04 16:32:38 +02:00
Jelte Fennema	8bbc1a45e1	Fix flakiness in isolation_replicate_reference_tables_to_coordinator.spec (#6123 ) When the deadlock detector kills s2-update-dist-table both sessions finish at the same time. The order in which they are displayed can be swapped. To counteract this we start using the ["marker" feature][1] of the isolationtester framework to create consistent output. In passing this also sets the next_shard_id to the value expected by this test so it can be run using `make check-isolation-base`. Failed CI test: https://app.circleci.com/pipelines/github/citusdata/citus/25562/workflows/dfe6f88a-c306-4d91-b771-d5d1deb1798d/jobs/713417 [1]: `ec62ce55a8/src/test/isolation/README (L152)`	2022-08-03 12:00:30 +02:00
Naisila Puka	85324f3acc	Clean up multi_shard_commit_protocol guc leftovers (#6110 )	2022-08-01 15:22:02 +03:00
Onder Kalaci	5bc8a81aa7	Add colocation checks for shard splits	2022-07-27 10:01:19 +02:00
Onder Kalaci	12fa3aaf6b	Concurrent shard move/copy and colocated table creation fix It turns out that create_distributed_table and citus_move/copy_shard_placement do not work well concurrently. To fix that, we need to acquire a lock, which sounds like a good use of the colocation lock. However, the current usage of the colocation lock is limited to higher level UDFs like rebalance_table_shards etc. Those usages of the lock are still useful, but we cannot acquire the same lock in citus_move_shard_placement etc. because the coordinator connects to itself to acquire the lock. Hence, the high level UDF blocks itself. To fix that, we use one more colocation lock, with the placements as the main objects to consider.	2022-07-27 10:01:19 +02:00
Onder Kalaci	6c65d29924	Check the PGPROC's validity properly We used to only check whether the PID is valid or not. However, Postgres does not necessarily set the PID of the backend to 0 when it exits. Instead, we need to be able to check it from procArray. IsBackendPid() is what pg_stat_activity also relies on for a similar purpose.	2022-07-26 17:44:44 +02:00
Hanefi Onaldi	eb3e5ee227	Introduce citus_locks view citus_locks combines the pg_locks views from all nodes and adds global_pid, nodeid, and relation_name. The columns of citus_locks don't change based on the Postgres version, however the pg_locks's columns do. Postgres 14 added one more column to pg_locks (waitstart timestamptz). citus_locks has the most expansive column set, including the newly added column. If citus_locks is queried in a Postgres version where pg_locks doesn't have some columns, the values for those columns in citus_locks will be NULL	2022-07-21 03:06:57 +03:00
Nitish Upreti	5b3537cdff	Shard Split for Citus (#6029 ) * Blocking split setup * Add missing type * Missing API from Metadata Sync * Shard Split e2e code * Worker Split Copy DestReceiver skeleton * Basic destreceiver code * worker_split_copy UDF * UDF calling * Split points are text * Isolate Tenant and Split Shard Unification * Fixing executor and misc * Reindent code * Fixing UDF definitions * Hello World Local Copy works * Remote copy hello world works * Local and Remote binary test * Fixing text local copy and adding tests * Hello World shard split works * Negative tests * Blocking Split workflow works * Refactor * Bug fix * Reindent * Cleaning up and adding comments * Basic test for shard split workflow * ReIndent * Circle CI integration * Removing include causing circle-ci build failure * Remove SplitCopyDestReceiver and use PartitionedResultDestReceiver * Add support for citus.enable_binary_protocol * Reindent * Fix build break * Update Test * Cleanup on catch * Addressing open comments * Update downgrade script and quote schema/table in COPY statement * Fix metadata sync issue. Update regression test * Isolation test and bug fix * Add Isolation test, fix foreign constraint deadlock issue * Misc code review comments * Test name needing to be quoted * Refactor code from review comments * Explaining shardGroupSplitIntervalListList * Fix upgrade & downgrade * Fix broken test * Test fix Round 2 * Fixing bug and modifying test appropriately * Fully qualify copy udf name. Run Reindent * Address PR comments * Fix null handling when creating AuxiliaryStructures * Ensure local copy is triggered in tests * Limit max shards that can be created with split * Test failure fix * Remove split_mode and use shard_transfer_mode instead * Fix test failure * Fix test failure * Fixing permission issue when splitting non-superuser owned tables * Fix test expected output * Remove extra space * Fix test * attempt to fix test * Addressing Marco's PR comment * Only clean shards created by workflow * Remove from merge * Update test	2022-07-18 02:54:15 -07:00
Hanefi Onaldi	ae58ca5783	Replace isolation tester func only once on enterprise tests (#6064 ) This is a continuation of a refactor (with commit sha `2b7cf0c097`) that aimed to use Citus helper UDFs by default in iso tests. PostgreSQL isolation test infrastructure uses some UDFs to detect whether concurrent sessions block each other. Citus implements alternatives to that UDF so that we are able to detect and report distributed transactions that get blocked on the worker nodes as well. We needed to explicitly replace PG helper functions with Citus implementations in each isolation file. Now we replace them by default.	2022-07-14 19:16:53 +03:00
Hanefi Onaldi	2b7cf0c097	Replace iso tester func only once (#5964 ) Use Citus helper UDFs by default in iso tests PostgreSQL isolation test infrastructure uses some UDFs to detect whether concurrent sessions block each other. Citus implements alternatives to that UDF so that we are able to detect and report distributed transactions that get blocked on the worker nodes as well. We needed to explicitly replace PG helper functions with Citus implementations in each isolation file. Now we replace them by default.	2022-07-06 11:04:31 +03:00
aykutbozkurt	8194dc4c62	* Added isolation tests for vacuum, * Added more regression tests for more vacuum options, * Fixed deadlock for unqualified vacuum when there is only 1 worker, * Supported lock_skipped for vacuum.	2022-06-23 15:33:14 +03:00
Jelte Fennema	184c7c0bce	Make enterprise features open source (#6008 ) This PR makes all of the features open source that were previously only available in Citus Enterprise. Features that this adds: 1. Non blocking shard moves/shard rebalancer (`citus.logical_replication_timeout`) 2. Propagation of CREATE/DROP/ALTER ROLE statements 3. Propagation of GRANT statements 4. Propagation of CLUSTER statements 5. Propagation of ALTER DATABASE ... OWNER TO ... 6. Optimization for COPY when loading JSON to avoid double parsing of the JSON object (`citus.skip_jsonb_validation_in_copy`) 7. Support for row level security 8. Support for `pg_dist_authinfo`, which allows storing different authentication options for different users, e.g. you can store passwords or certificates here. 9. Support for `pg_dist_poolinfo`, which allows using connection poolers in between coordinator and workers 10. Tracking distributed query execution times using citus_stat_statements (`citus.stat_statements_max`, `citus.stat_statements_purge_interval`, `citus.stat_statements_track`). This is disabled by default. 11. Blocking tenant_isolation 12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`	2022-06-16 00:23:46 -07:00
Gledis Zeneli	27ddb4fc8e	Do not obtain AccessShareLock before actual lock (#5965 ) Do not obtain AccessShareLock before acquiring the distributed locks. Acquiring an AccessShareLock ensures that the relations which we are trying to get a distributed lock on will not be dropped in the time between when the LOCK command is issued and the LOCK commands are sent to the worker. However, this also leads to distributed deadlocks in such scenarios: ```sql -- for dist lock acquiring order coor, w1, w2 -- on w2 LOCK t1 IN ACCESS EXCLUSIVE MODE; -- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock -- concurrently on w1 LOCK t1 IN ACCESS EXCLUSIVE MODE; -- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock -- acquire dist lock on coor, w1, gets blocked on local AccessShareLock on w2 -- on w2 continuation of the execution above -- starts to acquire dist locks and gets blocked on the coor by the lock acquired by w1 -- distributed deadlock ``` We opt for avoiding such deadlocks at the cost of possibly running into errors when the relations on which we are trying to acquire locks get dropped.	2022-05-23 13:06:38 +03:00
Hanefi Onaldi	52541c5802	Add normalization rules for flaky isolation tests We remove `<waiting ...>` and `<... completed>` outputs for some CREATE INDEX CONCURRENTLY commands since they can cause flakiness in some scenarios. Postgres calls WaitForOlderSnapshots() and this can cause CREATE INDEX CONCURRENTLY commands for shards to get blocked by each other for brief periods of time. The extra waits can pop-up, or they can get completed at different lines in the output files. To remedy that, we rename those indexes to be captured by the new normalization rule.	2022-05-21 00:55:47 +03:00
Marco Slot	ad5214b50c	Allow distributed execution from run_command_on_* functions	2022-05-20 15:26:47 +02:00
gledis69	4731630741	Add distributing lock command support	2022-05-20 12:28:07 +03:00
Halil Ozan Akgul	d171a736ab	Revert "Creates new colocation for colocate_with:='none' too" This reverts commit `f74447b3b7`.	2022-05-17 15:32:22 +03:00
Halil Ozan Akgul	f74447b3b7	Creates new colocation for colocate_with:='none' too	2022-05-16 13:39:05 +03:00
Gledis Zeneli	4c6f62efc6	Switch to using LOCK instead of lock_relation_if_exists in TRUNCATE (#5930 ) Breaking down #5899 into smaller PRs This particular PR changes the way TRUNCATE acquires distributed locks on the relations it is truncating to use the LOCK command instead of lock_relation_if_exists. This has the benefit of using the recursive locking logic that pg implements for the LOCK command instead of us having to resolve relation dependencies and lock them explicitly. While this does not directly affect truncate, it will allow us to generalize this locking logic to then lock different relations where the pg recursive locking will become useful (e.g. locking views). This implementation is a bit more complex than it needs to be due to pg not supporting locking foreign tables. We can, however, still lock foreign tables with lock_relation_if_exists. So for a command: TRUNCATE dist_table_1, dist_table_2, foreign_table_1, foreign_table_2, dist_table_3; We generate and send the following command to all the workers in metadata: ```sql SET citus.enable_ddl_propagation TO FALSE; LOCK dist_table_1, dist_table_2 IN ACCESS EXCLUSIVE MODE; SELECT lock_relation_if_exists('foreign_table_1', 'ACCESS EXCLUSIVE'); SELECT lock_relation_if_exists('foreign_table_2', 'ACCESS EXCLUSIVE'); LOCK dist_table_3 IN ACCESS EXCLUSIVE MODE; SET citus.enable_ddl_propagation TO TRUE; ``` Note that we need to alternate between the LOCK command and lock_relation_if_exists in order to preserve the TRUNCATE order of relations. When pg supports locking foreign tables, we will be able to massively simplify this logic and send a single LOCK command.	2022-05-11 18:38:48 +03:00
Burak Velioglu	1460452442	Introduce CREATE/DROP VIEW Adds support for propagating create/drop view commands and views to worker node while scaling out the cluster. Since views are dropped while converting the table type, metadata connection will be used while propagating view commands to not switch to sequential mode.	2022-05-10 13:07:14 +03:00
Önder Kalacı	dd78c81378	Fix flaky isolation - 1 (#5900 ) * Do not show any PG internal queries	2022-04-11 20:43:51 -07:00
Burak Velioglu	5d9599f964	Create function in transaction according to create object propagation guc	2022-04-08 17:15:31 +03:00
Halil Ozan Akgül	37fafd007c	Turn metadata sync on in isolation_update_node and isolation_update_node_lock_writes tests (#5779 )	2022-03-11 16:39:20 +03:00
Halil Ozan Akgül	c9913b135c	Turn metadata sync on in isolation_ref2ref_foreign_keys test (#5791 )	2022-03-11 13:30:11 +03:00
Halil Ozan Akgül	2edaf0971c	Turn metadata sync on in isolation reference copy vs all (#5790 ) * Turn metadata sync on in isolation_reference_copy_vs_all test * Update the output of isolation_reference_copy_vs_all test	2022-03-11 11:27:46 +03:00
Marco Slot	7559ad12ba	Change create_object_propagation default to immediate	2022-03-09 17:40:50 +01:00
Halil Ozan Akgül	333bcc7948	Global PID Helper Functions (#5768 ) * Introduces citus_nodename_for_nodeid and citus_nodeport_for_nodeid functions * Introduces citus_nodeid_for_gpid and citus_pid_for_gpid functions * Add tests	2022-03-09 13:15:59 +03:00
Onder Kalaci	c32b2de1a7	Improve citus_lock_waits 1) Remove useless columns 2) Show backends that are blocked on a DDL even before gpid is assigned 3) One minor bugfix, where we clear distributedCommandOriginator properly.	2022-03-07 11:10:44 +01:00
Nils Dijk	3801576dfb	Move pg_dist_object to pg_catalog (#5765 ) DESCRIPTION: Move pg_dist_object to pg_catalog Historically `pg_dist_object` had been created in the `citus` schema as an experiment to understand if we could move our catalog tables to a branded schema. We quickly realised that this interfered with the UX on our managed services and other environments, where users connected via a user with the name of `citus`. By default postgres puts the username on the search_path. To be able to read the catalog in the `citus` schema we would need to grant access permissions to the schema. This caused newly created objects like tables etc. to default to this schema for creation. This failed due to the write permissions to that schema. With this change we move the `pg_dist_object` catalog table to the `pg_catalog` schema, where our other catalog tables are also located. This makes the catalog table visible and readable by any user, like our other catalog tables, for debugging purposes. Note: due to the change of schema, we had to disable 1 test that was running into a discrepancy between the schema and binary. Secondly, we needed to make the lookup functions for the `pg_dist_object` relation and their indexes less strict on the fallback of the naming due to another test that, due to an unfortunate cache invalidation, needed to look up the relation again. As a result, we won't default to _only_ resolving from `pg_catalog` outside of upgrades.	2022-03-04 17:40:38 +00:00
Halil Ozan Akgul	0500a62515	Updates citus_dist_stat_activity to use citus_stat_activity	2022-03-04 17:28:17 +03:00
Halil Ozan Akgul	06a0509b1a	Introduces citus_stat_activity view	2022-03-03 16:19:20 +03:00
Marco Slot	43e4dd3808	Add a citus.internal_reserved_connections setting	2022-03-02 19:13:53 +01:00
Onder Kalaci	e80a36c4b6	Improve visibility rules for non-privileged roles It seems like our approach is way too restrictive and some places are wrong. Now, we follow a very similar approach to pg_stat_activity. Some of the changes are prerequisites for implementing citus_dist_stat_activity via citus_stat_activity.	2022-03-02 18:04:01 +01:00
Onder Kalaci	df95d59e33	Drop support for CitusInitiatedBackend CitusInitiatedBackend was a premature implementation of the whole GlobalPID infrastructure. We used it to track whether any individual query is triggered by Citus or not. As of now, after GlobalPID is already in place, we don't need CitusInitiatedBackend; in fact it could even be wrong.	2022-02-24 12:12:43 +01:00
Hanefi Onaldi	7bd6c2c9ac	Isolation tests for various ddl operations and metadata sync	2022-02-24 03:19:56 +03:00
Onder Kalaci	95d5918967	Properly set worker_query and use	2022-02-21 18:22:33 +01:00
Onder Kalaci	dffcafc096	Use global pids in citus_lock_waits	2022-02-21 17:46:34 +01:00
Hanefi Onaldi	2e5ca8ba2b	Add isolation tests for metadata sync vs all This commit introduces several test cases for concurrent operations that change metadata, and a concurrent metadata sync operation. The overall structure is as follows: - Session#1 starts metadata syncing in a transaction block - Session#2 does an operation that changes metadata - Both sessions are committed - Another session checks whether the metadata are the same across all nodes in the cluster.	2022-02-11 01:55:04 +03:00
Önder Kalacı	dc6c194916	Show IDLE backends in citus_dist_stat_activity (#5700 ) * Break the dependency to CitusInitiatedBackend infrastructure With this change, we start to show non-distributed backends as well in citus_dist_stat_activity. I think that (a) it is essential for making citus_lock_waits work for backends blocked on DDL commands. (b) it is more expected from the user's perspective. The name of the view is a little inconsistent now (e.g., citus_dist_stat_activity) but we are already planning to improve the names with followup PRs. Also, now that we have global pids assigned, the CitusInitiatedBackend becomes obsolete.	2022-02-10 08:59:28 -08:00
Ahmet Gedemenli	76b63a307b	Propagate create/drop schema commands	2022-02-10 14:58:09 +03:00
Halil Ozan Akgul	8ee02b29d0	Introduce global PID	2022-02-08 16:49:38 +03:00
Onder Kalaci	923bb194a4	Move isolation_multiuser_locking to MX tests	2022-02-04 10:52:57 +01:00
Onder Kalaci	ff234fbfd2	Unify old GUCs into a single one Replaces citus.enable_object_propagation with citus.enable_metadata_sync Also, within Citus 11 release cycle, we added citus.enable_metadata_sync_by_default, that is also replaced with citus.enable_metadata_sync. In essence, when citus.enable_metadata_sync is set to true, all the objects and the metadata are sent to the remote node. We strongly advise that users never change the value of this GUC.	2022-02-04 10:52:56 +01:00
Burak Velioglu	f88cc230bf	Handle tables and objects as metadata. Update UDFs accordingly With this commit we've started to propagate sequences and shell tables within the object dependency resolution. So, ensuring any dependencies for any object will consider shell tables and sequences as well. Separate logics for both shell tables and sequences have been removed. Since both shell tables and sequences logic were implemented as a part of the metadata handling before that logic, we were propagating them while syncing table metadata. With this commit we've divided metadata (which means anything except shards thereafter) syncing logic into multiple parts and implemented it as a part of ActivateNode. You can check the functions called in ActivateNode to check definition of different metadata. Definitions of start_metadata_sync_to_node and citus_activate_node have also been updated. citus_activate_node will basically create an active node with all metadata and reference table shards. start_metadata_sync_to_node will be the same as citus_activate_node except replicating reference tables. stop_metadata_sync_to_node will remove all the metadata. All of those UDFs need to be called by superuser.	2022-01-31 16:20:15 +03:00
Halil Ozan Akgul	9547228e8d	Add isolation_check_mx test	2021-12-30 14:58:30 +03:00
Onder Kalaci	549edcabb6	Allow disabling node(s) when multiple failures happen As of master branch, Citus does all the modifications to replicated tables (e.g., reference tables and distributed tables with replication factor > 1), via 2PC and avoids any shardstate=3. As a side-effect of those changes, handling node failures for replicated tables changes. With this PR, when one (or multiple) node failures happen, the users would see query errors on modifications. If the problem is intermittent, that's OK, once the node failure(s) recover by themselves, the modification queries would succeed. If the node failure(s) are permanent, the users should call `SELECT citus_disable_node(...)` to disable the node. As soon as the node is disabled, modifications would start to succeed. However, now the old node falls behind. It means that, when the node is up again, the placements should be re-created on the node. First, use `SELECT citus_activate_node()`. Then, use `SELECT replicate_table_shards(...)` to replicate the missing placements on the re-activated node.	2021-12-01 10:19:48 +01:00
Onder Kalaci	38b08ebde9	Generalize the error checks while removing node The checks for preventing the removal of a node are very much reference table centric. We are soon going to add the same checks for replicated tables. So, make the checks generic such that: (a) replicated tables fit naturally (b) we can use the same checks in `citus_disable_node`.	2021-11-26 14:25:29 +01:00
Hanefi Onaldi	4c135de9e4	Introduce CI checks for hash comments in specs We do not use comments starting with # in spec files because it creates errors from C preprocessor that expects directives after this character. Instead use C style comments, i.e: // single line comment You can also use multiline comments as well /* * multi line comment */	2021-11-26 14:52:51 +03:00
Marco Slot	f49d26fbeb	Remove citus_update_table_statistics isolation test	2021-11-19 10:51:15 +01:00
Marco Slot	9e6ca23286	Remove cstore_fdw-related logic	2021-11-16 13:59:03 +01:00
Önder Kalacı	8c0bc94b51	Enable replication factor > 1 in metadata syncing (#5392 ) - [x] Add some more regression test coverage - [x] Make sure returning works fine in case of local execution + remote execution (task->partiallyLocalOrRemote works as expected, already added tests) - [x] Implement locking properly (and add isolation tests) - [x] We do #shardcount round-trips on `SerializeNonCommutativeWrites`. We made it a single round-trip. - [x] Acquire locks for subselects on the workers & add isolation tests - [x] Add a GUC to prevent modification from the workers, hence increase the coordinator-only throughput - The performance slightly drops (~15%), unless `citus.allow_modifications_from_workers_to_replicated_tables` is set to false	2021-11-15 15:10:18 +03:00
Önder Kalacı	d5b371b2e0	Merge branch 'master' into naisila/fix-partitioned-index	2021-11-08 10:53:16 +01:00
naisila	385ba94d15	Run fix_partition_shard_index_names after each wrong naming command	2021-11-08 10:43:34 +01:00
Marco Slot	78866df13c	Remove master_append_table_to_shard UDF	2021-11-08 10:43:24 +01:00
Marco Slot	fba93df4b0	Remove copy into new append shard logic	2021-11-07 21:01:40 +01:00
Halil Ozan Akgul	a8f3f712cc	Turns mx on in isolation tests	2021-11-04 17:12:30 +03:00
Marco Slot	096660d61d	Remove master_apply_delete_command	2021-10-18 22:29:37 +02:00
Halil Ozan Akgul	9c9d4b5eeb	Turn MX on by default	2021-10-08 18:17:21 +03:00
Naisila Puka	d0390af72d	Add fix_partition_shard_index_names udf to fix currently broken names (#5291 ) * Add udf to include shardId in broken partition shard index names * Address reviews: rename index such that operations can be done on it * More comprehensive index tests * Final touches and formatting	2021-10-07 19:34:52 +03:00
Sait Talha Nisanci	f3fa133caa	Bind seg version to 1.3 in isolation_textension_commands	2021-09-03 15:41:28 +03:00
Onur Tirtir	889a2731cb	Split columnar stripe reservation into two phases (#5188 ) Previously, we were doing `first_row_number` reservation for the first row written to current `WriteState` but were doing `stripe_id` reservation when flushing the `WriteState` and were inserting the related record to `columnar.stripe` at that time as well. However, inserting `columnar.stripe` record at flush-time is problematic. This is because, as told in #5160, if relation has any index-based constraints and if there are two concurrent writes that are inserting conflicting key values for that constraint, then postgres relies on `tableAM->fetch_index_tuple` (=`columnar_fetch_index_tuple`) callback to return `true` when indexAM is checking against possible constraint violations. However, pending writes of other backends are not visible to concurrent sessions in columnar since we were not inserting the stripe metadata record until flushing the stripe. With this commit, we split stripe reservation into two phases: i) Reserve `stripe_id` and insert a "dummy" record to `columnar.stripe` at the very same time we reserve `first_row_number`, i.e. when writing the first row to the current `WriteState`. ii) At flush time, do the storage level allocation and complete the missing fields of the dummy record inserted into `columnar.stripe` during i). That way, any concurrent writes would be able to check against possible constraint violations by using `SnapshotDirty` when scanning `columnar.stripe`. Note that `columnar_fetch_index_tuple` still wouldn't be able to fill the output tupleslot for the requested tid but it would at least return `true` for such index look-up's and we believe this should be sufficient for the caller indexAM callback to make the concurrent writer block on prior one. That is how we fix #5160. Only downside of reserving `stripe_id` at the same time we reserve `first_row_number` is that now any aborted writes would also waste some amount of `stripe_id` as in the case of `first_row_number` but we are just wasting them one-by-one. Considering the fact that we waste `first_row_number` by the amount stripe row limit (=150k by default) in such cases, this shouldn't be important at all.	2021-09-02 11:49:14 +03:00
Onur Tirtir	bf4dfad6f7	Update curcid of given snapshot if it is MVCC Before starting to scan a columnar table, we always flush the pending writes to disk. However, we increment command counter after modifying metadata tables. On the other hand, now that we _don't always use_ xact snapshot to scan a columnar table, writes that we just flushed might not be visible to the query that just flushed pending writes to disk since curcid of provided snapshot would become smaller than the command id being used when modifying metadata tables. To give an example, before this change, below was a possible scenario due to the changes that we made to use the correct snapshot. ```sql CREATE TABLE t(a int, b int) USING columnar; BEGIN; INSERT INTO t VALUES (5, 10); SELECT * FROM t; ┌───┬───┐ │ a │ b │ ├───┼───┤ └───┴───┘ (0 rows) SELECT * FROM t; ┌───┬────┐ │ a │ b │ ├───┼────┤ │ 5 │ 10 │ └───┴────┘ (1 row) ```	2021-09-02 11:11:59 +03:00
SaitTalhaNisanci	b923d51fc6	Bump pg12 and pg13 images to pg12.8 and pg13.8 (#5208 ) In our testing infrastructure, even though we use pinned versions of postgres, the auxiliary libraries might pull in newer versions. This is for example the case for libpq, which will now use the libpq libraries from 14beta3. Many of the changes in this PR are due to the libpq changes. We also have changed the citus version that is used as a base for the citus upgrades, from 10.0 to 10.1. This caused columnar to enforce some extra limits on the settings, which conflicted with our upgrade tests. The changes in failure tests are due to the libpq changes. There are also a lot of changes on isolation tests outputs, hence we updated all of them. Co-authored-by: Nils Dijk <nils@citusdata.com>	2021-08-25 16:04:57 +03:00
Naisila Puka	e5b32b2c3c	Acquire AccessShareLock before updating table statistics (#5155 )	2021-08-12 13:58:15 +03:00
Onur Tirtir	f00c63c33d	Support columnar table index builds with CONCURRENTLY option (#5032 ) With this commit, we add (`CREATE INDEX` / `REINDEX`) `CONCURRENTLY` support for columnar tables. For that, we implement `columnar_index_validate_scan` callback. The reasoning behind the implementation is as follows: * Postgres function `validate_index` provides all the TIDs that are currently in the index to `columnar_index_validate_scan` callback via a `tupleSort` object. * We start scanning the table by using `columnar_getnextslot` as usual. Before moving forward, note that `columnar_getnextslot` guarantees to return tuples in the order of their TIDs. * For us to use during table scan, postgres provides a snapshot guaranteeing that any tuples that are valid according to that snapshot but are not in the index must be added to the index. * Then for each tuple that we read from our table, we continue iterating given `tupleSort` to find the first TID that is greater than or equal to our tuple's TID. If both TIDs are equal to each other, then we skip the tuple since it's already indexed. If the TID that we read from tupleSort is greater than our tuple's TID, then we decide to insert this tuple into index.	2021-07-09 13:44:58 +03:00
Onur Tirtir	3d11c0f9ef	Merge remote-tracking branch 'origin/master' into columnar-index Conflicts: src/test/regress/expected/columnar_empty.out src/test/regress/expected/multi_extension.out	2021-06-16 20:23:50 +03:00
Jelte Fennema	503c70b619	Cleanup orphaned shards before moving when necessary A shard move would fail if there was an orphaned version of the shard on the target node. With this change, before actually failing, we try to clean up orphaned shards to see if that fixes the issue.	2021-06-04 11:23:07 +02:00
Jelte Fennema	7015049ea5	Add citus_cleanup_orphaned_shards UDF Sometimes the background daemon doesn't cleanup orphaned shards quickly enough. It's useful to have a UDF to trigger this removal when needed. We already had a UDF like this but it was only used during testing. This exposes that UDF to users. As a safety measure it cannot be run in a transaction, because that would cause the background daemon to stop cleaning up shards while this transaction is running.	2021-06-04 11:23:07 +02:00
Hanefi Onaldi	878513f325	Remove all occurrences of replication_model GUC	2021-05-21 16:14:59 +03:00
SaitTalhaNisanci	82f34a8d88	Enable citus.defer_drop_after_shard_move by default (#4961 ) Enable citus.defer_drop_after_shard_move by default	2021-05-21 10:48:32 +03:00
Jelte Fennema	10f06ad753	Fetch shard size on the fly for the rebalance monitor Without this change the rebalancer progress monitor gets the shard sizes from the `shardlength` column in `pg_dist_placement`. This column needs to be updated manually by calling `citus_update_table_statistics`. However, `citus_update_table_statistics` could lead to distributed deadlocks while database traffic is on-going (see #4752). To work around this we don't use `shardlength` column anymore. Instead for every rebalance we now fetch all shard sizes on the fly. Two additional things this does are: 1. It adds tests for the rebalance progress function. 2. If a shard move cannot be done because a source or target node is unreachable, then we error in stop the rebalance, instead of showing a warning and continuing. When using the by_disk_size rebalance strategy it's not safe to continue with other moves if a specific move failed. It's possible that the failed move made space for the next move, and because the failed move never happened this space now does not exist. 3. Adds two new columns to the result of `get_rebalancer_progress` which shows the size of the shard on the source and target node. Fixes #4930	2021-05-20 16:38:17 +02:00
Nils Dijk	a6c2d2a4c4	Feature: alter database owner (#4986 ) DESCRIPTION: Add support for ALTER DATABASE OWNER This adds support for changing the database owner. It achieves this by marking the database as a distributed object. By marking the database as a distributed object it will look for its dependencies and order the user creation commands (enterprise only) before the alter of the database owner. This is mostly important when adding new nodes. By having the database marked as a distributed object it can easily determine which `ALTER DATABASE ... OWNER TO ...` commands to propagate by resolving the object address of the database and verifying it is a distributed object, and hence should propagate changes of ownership to all workers. Given the ownership of the database might have implications on subsequent commands in transactions we force sequential mode for transactions that have an `ALTER DATABASE ... OWNER TO ...` command in them. This will fail the transaction with meaningful help when the transaction already executed parallel statements. By default the feature is turned off since roles are not automatically propagated; having it turned on would cause hard-to-understand errors for the user. It can be turned on by the user by setting `citus.enable_alter_database_owner`.	2021-05-20 13:27:44 +02:00
SaitTalhaNisanci	ff2a125a5b	Lookup hostname before execution (#4976 ) We lookup the hostname just before the execution so that even if there are cached entries in the prepared statement cache we use the updated entries.	2021-05-18 16:46:31 +03:00
SaitTalhaNisanci	eaa7d2bada	Not block maintenance daemon (#4972 ) It was possible to block the maintenance daemon by taking a SHARE ROW EXCLUSIVE lock on pg_dist_placement. Until the lock is released the maintenance daemon would be blocked. We should not block the maintenance daemon in any case, hence now we try to get the pg_dist_placement lock without waiting; if we cannot get it then we don't try to drop the old placements.	2021-05-17 03:22:35 -07:00
Onur Tirtir	2e419ea177	Add first_row_number column to columnar.stripe for tid mapping	2021-05-10 20:16:50 +03:00
Jelte Fennema	2f29d4e53e	Continue to remove shards after first failure in DropMarkedShards The comment of DropMarkedShards described the behaviour that after a failure we would continue trying to drop other shards. However the code did not do this and would stop after the first failure. Instead of simply fixing the comment I fixed the code, because the described behaviour is more useful. Now a single shard that cannot be removed yet does not block others from being removed.	2021-04-30 15:42:09 +03:00
SaitTalhaNisanci	93c2dcf3d2	Fix data-race with concurrent calls of DropMarkedShards (#4909 ) * Fix problems with concurrent calls of DropMarkedShards When trying to enable `citus.defer_drop_after_shard_move` by default it turned out that DropMarkedShards was not safe to call concurrently. This could especially cause big problems when also moving shards at the same time. During tests it was possible to trigger a state where a shard that was moved would not be available on any of the nodes anymore after the move. Currently DropMarkedShards is only called in production by the maintenance daemon. Since this is only a single process triggering such a race is currently impossible in production settings. In future changes we will want to call DropMarkedShards from other places too though. * Add some isolation tests Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2021-04-21 10:59:48 +03:00
Onur Tirtir	1d3e075e62	Support temporary columnar tables (#4766 )	2021-03-12 12:01:36 +03:00
jeff-davis	9da9bd3dfd	Columnar: rename files and tests. (#4751 ) * Columnar: rename files and tests. * Columnar: Rename TableState to ColumnarState.	2021-03-01 08:34:24 -08:00
Hadi Moshayedi	4e53314e3f	Make isolation_metadata_sync_deadlock more resilient	2021-02-06 01:05:24 -08:00
Ahmet Gedemenli	2443b20b2c	Rename master to distributed for worker stat activity	2021-02-04 12:20:06 +03:00
Ahmet Gedemenli	34840ddc5c	Rename master to citus for dist stat activity cols	2021-02-04 11:12:23 +03:00
Onur Tirtir	dfcdccd0e7	Rename udf in regression tests (as per prev commit)	2021-01-27 15:52:37 +03:00
Hadi Moshayedi	bc01c795a2	Reland #4419	2021-01-19 07:48:47 -08:00
Ahmet Gedemenli	436c9d9d79	Remove the word 'master' from Citus UDFs (#4472 ) * Replace master_add_node with citus_add_node * Replace master_activate_node with citus_activate_node * Replace master_add_inactive_node with citus_add_inactive_node * Use master udfs in old scripts * Replace master_add_secondary_node with citus_add_secondary_node * Replace master_disable_node with citus_disable_node * Replace master_drain_node with citus_drain_node * Replace master_remove_node with citus_remove_node * Replace master_set_node_property with citus_set_node_property * Replace master_unmark_object_distributed with citus_unmark_object_distributed * Replace master_update_node with citus_update_node * Replace master_update_shard_statistics with citus_update_shard_statistics * Replace master_update_table_statistics with citus_update_table_statistics * Rename master_conninfo_cache_invalidate to citus_conninfo_cache_invalidate Rename master_dist_local_group_cache_invalidate to citus_dist_local_group_cache_invalidate * Replace master_copy_shard_placement with citus_copy_shard_placement * Replace master_move_shard_placement with citus_move_shard_placement * Rename master_dist_node_cache_invalidate to citus_dist_node_cache_invalidate * Rename master_dist_object_cache_invalidate to citus_dist_object_cache_invalidate * Rename master_dist_partition_cache_invalidate to citus_dist_partition_cache_invalidate * Rename master_dist_placement_cache_invalidate to citus_dist_placement_cache_invalidate * Rename master_dist_shard_cache_invalidate to citus_dist_shard_cache_invalidate * Drop master_modify_multiple_shards * Rename master_drop_all_shards to citus_drop_all_shards * Drop master_create_distributed_table * Drop master_create_worker_shards * Revert old function definitions * Add missing revoke statement for citus_disable_node	2021-01-13 12:10:43 +03:00
Marco Slot	011283122b	Add the shard rebalancer implementation	2021-01-07 16:51:55 +01:00
Marco Slot	47c1b19174	Revert "Do metadata sync in a separate background worker." This reverts commit `4df723cf9b`.	2021-01-07 10:30:04 +01:00
Marco Slot	d9f175532b	Revert "Trigger metadata sync at transaction commit" This reverts commit `a2c73bef27`.	2021-01-07 10:30:00 +01:00
Hadi Moshayedi	a2c73bef27	Trigger metadata sync at transaction commit	2020-12-24 08:28:38 -08:00
Hadi Moshayedi	4df723cf9b	Do metadata sync in a separate background worker.	2020-12-24 08:25:55 -08:00
Hadi Moshayedi	b3dac5e9d1	Columnar: set default compression as zstd if available	2020-12-09 14:32:08 -08:00
Jeff Davis	a2b698a766	rename cstore_tableam -> columnar	2020-11-19 12:15:51 -08:00
Nils Dijk	f89bd3eeb5	move columnar test files	2020-11-17 18:55:34 +01:00
Onur Tirtir	17cc810372	Implement "citus local table" creation logic	2020-09-09 11:50:48 +03:00
Halil Ozan Akgul	375310b7f1	Adds support for table undistribution	2020-08-05 14:36:03 +03:00
Sait Talha Nisanci	fe4ac51d8c	Normalize Output:.. since it changes with pg13 Fix indentation for better readability	2020-08-04 15:38:13 +03:00
Sait Talha Nisanci	76c7b3d1c6	Remove unused steps in isolation tests PG13 gives a warning for unused steps therefore we should remove the unused steps in isolation tests.	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	1dbd545cf4	replace task-tracker with adaptive in tests	2020-07-21 16:21:01 +03:00

162 Commits (7a428d1d24d2cbc34620447ba9163cfb47a593cd)