Sometimes in CI our isolation_citus_dist_activity test fails randomly
like this:
```diff
step s2-view-dist:
SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;
query |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state |wait_event_type|wait_event|usename |datname
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------
INSERT INTO test_table VALUES (100, 100);
|localhost | 57636|idle in transaction|Client |ClientRead|postgres|regression
-(1 row)
+
+ SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+ FROM (
+ SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+ pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+ ) AS csa_from_one_node;
+ |localhost | 57636|active | | |postgres|regression
+(2 rows)
step s3-view-worker:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26692/workflows/3406e4b4-b686-4667-bec6-8253ee0809b1/jobs/765119
I intended to fix this with #6263, but the fix turned out to be
insufficient. This PR tries to address the issue by setting
distributedCommandOriginator correctly in more situations. However, even
with this change it's still possible to reproduce the flaky test in CI.
In any case this should fix at least some instances of this issue.
In passing this changes the isolation_citus_dist_activity test to allow
running it multiple times in a row.
Added create_distributed_table_concurrently, a nonblocking variant of create_distributed_table.
It is based on the split API, which takes advantage of logical replication to support nonblocking split operations.
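A minimal usage sketch (the table and column names are made up):
```sql
CREATE TABLE orders (
    order_id bigint PRIMARY KEY,
    customer_id bigint
);

-- Unlike create_distributed_table, this variant does not block concurrent
-- writes to the table while its data is moved into the new shards.
SELECT create_distributed_table_concurrently('orders', 'customer_id');
```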
Co-authored-by: Marco Slot <marco.slot@gmail.com>
Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
Our isolation_distributed_deadlock_detection test would fail randomly in
CI in three different ways.
The first type of failure looked like this:
```diff
check_distributed_deadlocks
---------------------------
t
(1 row)
-step s1-update-5: <... completed>
step s5-update-1: <... completed>
ERROR: canceling the transaction since it was involved in a distributed deadlock
+step s1-update-5: <... completed>
step s1-commit:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26399/workflows/d213ee85-397a-467a-9ffb-39e4f44e6688/jobs/749533
This random change in output was harmless and happened because when the
deadlock detector cancelled a query, two queries would continue: The one
that was cancelled would throw an error (and thus complete), and the one
that was unblocked would now complete.
It was random which of the two the isolation tester would detect as
completed first. To resolve this, this PR starts using the ["marker" feature][1],
which allows us to make sure one of the steps won't be marked as
completed until the other one has completed first.
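A hypothetical sketch of that spec syntax (the step names below are made up, not the test's literal permutation):
```
# Sketch only: "s1-step"("s2-step") means the isolation tester will not report
# s1-step as completed until s2-step has been reported as completed, which
# makes the completion order in the expected output deterministic.
permutation "s1-begin" "s2-begin" "s2-step" "s1-step"("s2-step") "s1-commit" "s2-commit"
```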
The second random failure was very similar:
```diff
check_distributed_deadlocks
---------------------------
t
(1 row)
-step s2-update-2: <... completed>
-step s3-update-3: <... completed>
-ERROR: canceling the transaction since it was involved in a distributed deadlock
step s6-commit:
COMMIT;
step s5-update-6: <... completed>
+step s2-update-2: <... completed>
+step s3-update-3: <... completed>
+ERROR: canceling the transaction since it was involved in a distributed deadlock
step s5-commit:
```
Again a harmless difference in test output. In this case it's possible
that the isolation tester would not detect the unblocked processes
right away, and would thus continue to the next step. This step was
a commit on a session that was not blocked, and which could thus
complete without issues.
To solve this I changed the order of the commits at the end of the
permutation, so that the first session to commit is always the session
that is unblocked last. This ensures that no commit will ever be
executed before all the queries have completed.
The third issue was different and looked like this:
```diff
step s4-update-5: <... completed>
step s4-commit:
COMMIT;
+step s1-update-4: <... completed>
+isolationtester: canceling step s3-update-4 after 5 seconds
step s3-update-4: <... completed>
+ERROR: canceling statement due to user request
+step s2-update-2: <... completed>
step s3-commit:
COMMIT;
-step s2-update-2: <... completed>
-step s1-update-4: <... completed>
step s1-commit:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26411/workflows/9089beec-4f0f-4027-b4ce-0e84889afc06/jobs/750143
The reason for this failure is not entirely clear to me, but I was able
to remove the flakiness without impacting the goal of the test. What was
happening was that both `s1` and `s3` were waiting for `s4` to commit
and release its lock on row 4. For some reason it wasn't
deterministic which of the two sessions would be granted the lock after
it was released by `s4`. The test expected `s3` to be granted the lock,
but sometimes it would be granted to `s1` instead, which would in turn
cause `s3` to still be blocked.
To solve this I simply removed `s1` completely from this test. It wasn't
actually part of the cycle that the deadlock detector should detect and
was an unrelated appendage:
```mermaid
graph TD;
s2-->s3;
s3-->s4;
s1-->s4;
s4-->s5;
s5-->s6;
s6-->s5;
```
By removing `s1` completely there was no contention for the lock and
`s3` could always acquire it.
[1]: a73d6c87f2/src/test/isolation/README (L163-L188)
The isolation_tenant_isolation_nonblocking test would sometimes randomly
fail in CI, because we have a runtime limit of 2 minutes per test.
```
test isolation_tenant_isolation_nonblocking ... make: *** [Makefile:171: check-enterprise-isolation] Terminated
Too long with no output (exceeded 2m0s): context deadline exceeded
```
One solution would obviously be to increase the timeout, but instead I
spent some time to increase the speed of our tests by tweaking some
timings. On my local machine the time it took to run the
isolation_tenant_isolation_nonblocking test went from 75s to 15s.
So now we should easily stay within the 2 minute per test limit.
I also checked if the new settings improved other logical replication
tests, but the impact differs wildly per test. One other example of a
test that runs much quicker due to the change is
isolation_non_blocking_shard_split_fkey, but the shard move tests I
tried are impacted much less.
Example of failed tests: https://app.circleci.com/pipelines/github/citusdata/citus/26373/workflows/4fa660e4-63c8-4844-bef8-70a7bea902b7/jobs/748199
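The exact settings tweaked aren't restated here; as one hedged example of the kind of knob involved, lowering how long logical replication waits before retrying removes multi-second idle gaps from tests like these:
```sql
-- Illustration only: wal_retrieve_retry_interval defaults to 5s, so every
-- retry during a nonblocking split or move can add seconds of pure waiting.
ALTER SYSTEM SET wal_retrieve_retry_interval = '250ms';
SELECT pg_reload_conf();
```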
Using binary encoding can save a lot of CPU cycles, both on the sender
and on the receiver. Since the walsender and walreceiver processes are
single threaded, this can matter a lot for the throughput if they are
bottlenecked on CPU.
This feature is only available in PG14, not PG13. It should be safe to
always enable because it's only used for types that support binary
encoding according to the PG docs:
> Even when this option is enabled, only data types that have binary
> send and receive functions will be transferred in binary.
But in case it causes problems, it can still be disabled by setting
`citus.enable_binary_protocol` to `false`.
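For reference, a sketch of both sides of this; only the GUC name comes from this change, the subscription option is the underlying PostgreSQL 14 feature:
```sql
-- Opt out of binary encoding for Citus' logical replication if it causes
-- problems:
SET citus.enable_binary_protocol TO false;

-- The PostgreSQL 14 feature this builds on, shown only for illustration:
-- CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub WITH (binary = true);
```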
* Adjust some isolation tests for the recent PG commits
As of 3f32395612,
Postgres starts any isolation session with `set application_name`.
However, one of our tests expected its own command to be exactly the first
command in the session. The test tried to show that even if a gpid
has not been assigned yet, we can still show it in the citus_lock_waits graph.
Now it is literally not possible to have such a test, as a gpid
would already be assigned after the `set application_name` command. Still,
it is good to have a test where a command is blocked on the parser.
As shown in #6196 the output of s1-view-locks is sometimes not as
expected. However, because its output is very minimal, it's hard to
understand the reason for that. This adds some more columns and
aggregates less, so we can more easily see which locks are unexpectedly
held or released.
In passing this also fixes the following flaky part of this test by excluding
locks taken by the maintenance daemon. After running it with this more
detailed output for s1-view-locks, it became obvious that that was the
problem here.
```diff
diff -dU10 -w /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out
--- /home/jelte/work/citus/src/test/regress/expected/isolation_ref2ref_foreign_keys.out.modified 2022-08-18 15:42:08.689525233 +0200
+++ /home/jelte/work/citus/src/test/regress/results/isolation_ref2ref_foreign_keys.out.modified 2022-08-18 15:42:08.729525233 +0200
@@ -288,21 +288,22 @@
step s1-view-locks:
SELECT mode, count(*)
FROM pg_locks
WHERE locktype='advisory'
GROUP BY mode
ORDER BY 1, 2;
mode |count
------------------------+-----
-(0 rows)
+ShareUpdateExclusiveLock| 1
+(1 row)
starting permutation: s2-begin s2-insert-table-3 s1-view-locks s2-rollback s1-view-locks
step s2-begin:
BEGIN;
step s2-insert-table-3:
INSERT INTO ref_table_3 VALUES (7, 5);
step s1-view-locks:
```
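A hedged sketch of what the more detailed s1-view-locks step could look like; the exact column list in the real test differs, and the maintenance daemon's application_name is an assumption here:
```sql
SELECT pl.locktype, pl.mode, pl.granted, pa.application_name
FROM pg_locks pl
JOIN pg_stat_activity pa ON pa.pid = pl.pid
WHERE pl.locktype = 'advisory'
  -- exclude locks taken by the Citus maintenance daemon, which showed up
  -- intermittently and caused the diff above
  AND pa.application_name != 'Citus Maintenance Daemon'
ORDER BY 1, 2;
```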
We're in the process of totally changing the shard rebalancer
experience and infrastructure. Soon the shard rebalancer will include
retries, crash recovery, and support for running in the background.
These improvements come at a cost though: the way the
get_rebalance_progress UDF currently works is very hard to replicate
with this new structure. This is mostly because the old behaviour
doesn't really make sense anymore with the new infrastructure. A new
and better way to track the progress will be included as part of the new
infrastructure.
This PR is in preparation for the new shard rebalancer experience.
It changes the get_rebalance_progress UDF to only display the moves that
are in progress at the moment, not the ones that happened in the past or
that are planned in the future. Another option would have been to
completely remove the current get_rebalance_progress functionality and
point people to the new way of tracking progress. But old blogposts
still reference the old UDF and users might have some automation on top
of it. Showing the progress of the current moves is fairly simple to
achieve, even with the new infrastructure.
So this PR is a kind of compromise: It doesn't have complete feature
parity with the old get_rebalance_progress, but the most common use
cases will still work.
There's also an advantage to the change: you can now see the progress of
shard moves that were triggered by calling citus_move_shard_placement
manually, instead of only being able to see the progress of moves that
were initiated using get_rebalance_table_shards.
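A usage sketch (the shard id and node names are placeholders):
```sql
-- Start a move by hand ...
SELECT citus_move_shard_placement(102008, 'worker-1', 5432, 'worker-2', 5432);

-- ... and, from another session while it runs, watch its progress. Only the
-- moves that are currently in progress are returned.
SELECT * FROM get_rebalance_progress();
```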
We used to rely on a separate session to add the coordinator.
However, that might prevent the existing sessions from getting
assigned proper gpids, which causes flaky tests.
DESCRIPTION: Fix reference table lock contention
Dropping and creating reference tables unintentionally blocked on each other due to the use of an ExclusiveLock for both the drop and the conditional copying of existing reference tables to (new) nodes.
The patch does the following:
- Lower the lock level for dropping (reference) tables to `ShareLock` so they don't self conflict
- Treat reference tables and distributed tables equally and acquire the colocation lock when dropping any table that is in a colocation group
- Perform the precondition check for copying reference tables twice, the first time with a lower lock that doesn't conflict with anything. It could have been NoLock; however, in preparation for dropping a colocation group, it is an `AccessShareLock`
During normal operation the first check will always pass and we don't have to escalate that lock, meaning we won't be blocked on adding and removing reference tables. Only after a node addition will the first `create_reference_table` still need to acquire an `ExclusiveLock` on the colocation group to perform the copy.
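A rough illustration of the contention this removes (table names are made up):
```sql
-- Session A: dropping a reference table now takes a lower-level lock that
-- does not conflict with creating another reference table.
BEGIN;
DROP TABLE ref_orders;

-- Session B, concurrently: previously this could block behind session A; now
-- the ExclusiveLock on the colocation group is only needed right after a node
-- addition, when existing reference tables actually have to be copied.
CREATE TABLE ref_products (product_id bigint PRIMARY KEY);
SELECT create_reference_table('ref_products');
```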
This creates consistent test output for isolation tests that involve
`CREATE INDEX CONCURRENTLY`. `CREATE INDEX CONCURRENTLY` is sometimes
temporarily detected as blocking, even though it will complete without any other
queries needing to be run. This change makes sure that we wait until that happens
without running any other queries in the meantime. This way we always get consistent
output. The way we do that is by using an empty step in the same
session as the `CREATE INDEX CONCURRENTLY` command. Doing so forces
the isolation tester to wait until the command is finished and not continue with
steps from other sessions. This is [the recommended approach by Postgres][1].
There are two separate cases, which are addressed in slightly different ways:
1. If `CREATE INDEX CONCURRENTLY` is actually blocked on another session: add an
empty step right after the commit of the blocking session,
e.g. `"s2-ddl-create-index-concurrently" "s1-commit" "s2-empty"`
2. If it's not actually blocked on another session: add [an asterisk marker][2] to make
it look like it's blocked (because sometimes this happens randomly), and right
after that add an empty step to trigger waiting,
e.g. `"s2-ddl-create-index-concurrently"(*) "s2-empty" "s1-commit"`
In passing this also enables isolation tests that were disabled due to a
bug that has already been fixed for a while.
Fixes #5993
Related to #5910 and #2966
[1]: 5f0adec253/src/test/isolation/README (L197-L204)
[2]: 5f0adec253/src/test/isolation/README (L174-L179)
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
It turns out that create_distributed_table
and citus_move/copy_shard_placement do not
work well concurrently.
To fix that, we need to acquire a lock, which
sounds like a good use of the colocation lock.
However, the current usage of the colocation lock is
limited to higher level UDFs like rebalance_table_shards
etc. Those usages of the lock are still useful, but
we cannot acquire the same lock in citus_move_shard_placement
etc. because the coordinator connects to itself to acquire
the lock. Hence, the high level UDF blocks itself.
To fix that, we use one more colocation lock, with the placements
as the main objects to consider.
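A hedged sketch of the concurrent pair this serializes (the table, shard id, and node names are made up):
```sql
-- Session A
CREATE TABLE items (item_id bigint, value text);
SELECT create_distributed_table('items', 'item_id');

-- Session B, at the same time: with the new placement-level colocation lock,
-- these two operations wait for each other instead of interfering.
SELECT citus_move_shard_placement(102008, 'worker-1', 5432, 'worker-2', 5432);
```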
We used to only check whether the PID is valid
or not. However, Postgres does not necessarily
set the PID of the backend to 0 when it exits.
Instead, we need to be able to check it from procArray.
IsBackendPid() is what pg_stat_activity also relies
on for a similar purpose.
citus_locks combines the pg_locks views from all nodes and adds
global_pid, nodeid, and relation_name. The columns of citus_locks don't
change based on the Postgres version; however, pg_locks's columns do.
Postgres 14 added one more column to pg_locks (waitstart timestamptz).
citus_locks has the most expansive column set, including the newly added
column. If citus_locks is queried in a Postgres version where pg_locks
doesn't have some columns, the values for those columns in citus_locks
will be NULL.
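For illustration, a query against the view; the Citus-specific columns named here follow the description above, the rest come from pg_locks:
```sql
-- Show ungranted locks across the whole cluster.
SELECT global_pid, nodeid, relation_name, mode, granted
FROM citus_locks
WHERE NOT granted
ORDER BY global_pid;
```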
* Blocking split setup
* Add missing type
* Missing API from Metadata Sync
* Shard Split e2e code
* Worker Split Copy DestReceiver skeleton
* Basic destreceiver code
* worker_split_copy UDF
* UDF calling
* Split points are text
* Isolate Tenant and Split Shard Unification
* Fixing executor and misc
* Reindent code
* Fixing UDF definitions
* Hello World Local Copy works
* Remote copy hello world works
* Local and Remote binary test
* Fixing text local copy and adding tests
* Hello World shard split works
* Negative tests
* Blocking Split workflow works
* Refactor
* Bug fix
* Reindent
* Cleaning up and adding comments
* Basic test for shard split workflow
* ReIndent
* Circle CI integration
* Removing include causing circle-ci build failure
* Remove SplitCopyDestReceiver and use PartitionedResultDestReceiver
* Add support for citus.enable_binary_protocol
* Reindent
* Fix build break
* Update Test
* Cleanup on catch
* Addressing open comments
* Update downgrade script and quote schema/table in COPY statement
* Fix metadata sync issue. Update regression test
* Isolation test and bug fix
* Add Isolation test, fix foreign constraint deadlock issue
* Misc code review comments
* Test name needing to be quoted
* Refactor code from review comments
* Explaining shardGroupSplitIntervalListList
* Fix upgrade & downgrade
* Fix broken test
* Test fix Round 2
* Fixing bug and modifying test appropriately
* Fully qualify copy udf name. Run Reindent
* Address PR comments
* Fix null handling when creating AuxiliaryStructures
* Ensure local copy is triggered in tests
* Limit max shards that can be created with split
* Test failure fix
* Remove split_mode and use shard_transfer_mode instead
* Fix test failure
* Fix test failure
* Fixing permission issue when splitting non-superuser owned tables
* Fix test expected output
* Remove extra space
* Fix test
* attempt to fix test
* Addressing Marco's PR comment
* Only clean shards created by workflow
* Remove from merge
* Update test
This is a continuation of a refactor (with commit sha
2b7cf0c097) that aimed to use Citus helper
UDFs by default in iso tests.
PostgreSQL isolation test infrastructure uses some UDFs to detect
whether concurrent sessions block each other. Citus implements
alternatives to that UDF so that we are able to detect and report
distributed transactions that get blocked on the worker nodes as well.
We needed to explicitly replace PG helper functions with Citus
implementations in each isolation file. Now we replace them by default.
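As a rough illustration, the Citus helper mirrors Postgres' pg_isolation_test_session_is_blocked() but is aware of blocking that happens on the worker nodes; the signature is assumed to mirror the PG one, and the pids are placeholders:
```sql
-- The built-in helper only sees blocking on the local node:
SELECT pg_catalog.pg_isolation_test_session_is_blocked(12345, ARRAY[12346]);

-- The Citus replacement also reports sessions that are blocked on the workers:
SELECT citus_isolation_test_session_is_blocked(12345, ARRAY[12346]);
```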
* Added more regression tests for more vacuum options
* Fixed a deadlock for unqualified vacuum when there is only 1 worker
* Supported lock_skipped for vacuum (see the sketch below)
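A hedged sketch of the kind of VACUUM options involved (the table name is made up, and the exact option set covered by this change is not restated here):
```sql
-- Options are forwarded to the VACUUM commands run on the worker shards;
-- SKIP_LOCKED lets vacuum skip relations whose locks cannot be obtained.
VACUUM (VERBOSE, SKIP_LOCKED) dist_table;
```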
This PR makes all of the features open source that were previously only
available in Citus Enterprise.
Features that this adds:
1. Non blocking shard moves/shard rebalancer
(`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
authentication options for different users, e.g. you can store
passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
in between coordinator and workers
10. Tracking distributed query execution times using
citus_stat_statements (`citus.stat_statements_max`,
`citus.stat_statements_purge_interval`,
`citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
Do not obtain AccessShareLock before acquiring the distributed locks.
Acquiring an AccessShareLock ensures that the relations which we are trying to get a distributed lock on will not be dropped in the time between when the LOCK command is issued and when the LOCK commands are sent to the workers. However, this also leads to distributed deadlocks in scenarios such as:
```sql
-- for dist lock acquiring order coor, w1, w2
-- on w2
LOCK t1 IN ACCESS EXCLUSIVE MODE;
-- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock
-- concurrently on w1
LOCK t1 IN ACCESS EXCLUSIVE MODE;
-- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock
-- acquire dist lock on coor, w1, gets blocked on local AccessShareLock on w2
-- on w2 continuation of the execution above
-- starts to acquire dist locks and gets blocked on the coor by the lock acquired by w1
-- distributed deadlock
```
We opt for avoiding such deadlocks at the cost of possibly running into errors when the relations on which we are trying to acquire locks get dropped.
We remove `<waiting ...>` and `<... completed>` outputs for some CREATE
INDEX CONCURRENTLY commands since they can cause flakiness in some scenarios.
Postgres calls WaitForOlderSnapshots() and this can cause CREATE INDEX
CONCURRENTLY commands for shards to get blocked by each other for brief
periods of time. The extra waits can pop up, or they can get completed
at different lines in the output files. To remedy that, we rename those
indexes so they are captured by the new normalization rule.
Breaking down #5899 into smaller PRs
This particular PR changes the way TRUNCATE acquires distributed locks on the relations it is truncating to use the LOCK command instead of lock_relation_if_exists. This has the benefit of using pg's recursive locking logic, which it implements for the LOCK command, instead of us having to resolve relation dependencies and lock them explicitly. While this does not directly affect truncate, it will allow us to generalize this locking logic to lock different relations where the pg recursive locking will become useful (e.g. locking views).
This implementation is a bit more complex than it needs to be due to pg not supporting locking foreign tables. We can, however, still lock foreign tables with lock_relation_if_exists. So for a command:
TRUNCATE dist_table_1, dist_table_2, foreign_table_1, foreign_table_2, dist_table_3;
We generate and send the following command to all the workers in metadata:
```sql
SET citus.enable_ddl_propagation TO FALSE;
LOCK dist_table_1, dist_table_2 IN ACCESS EXCLUSIVE MODE;
SELECT lock_relation_if_exists('foreign_table_1', 'ACCESS EXCLUSIVE');
SELECT lock_relation_if_exists('foreign_table_2', 'ACCESS EXCLUSIVE');
LOCK dist_table_3 IN ACCESS EXCLUSIVE MODE;
SET citus.enable_ddl_propagation TO TRUE;
```
Note that we need to alternate between the LOCK command and lock_relation_if_exists in order to preserve the TRUNCATE order of relations.
When pg supports locking foreign tables, we will be able to massively simplify this logic and send a single LOCK command.
Adds support for propagating create/drop view commands and views to
worker nodes while scaling out the cluster. Since views are dropped while
converting the table type, the metadata connection is used while
propagating view commands, to avoid switching to sequential mode.
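A minimal illustration of what now gets propagated (the names are made up):
```sql
CREATE TABLE events (event_id bigint, tenant_id bigint, payload jsonb);
SELECT create_distributed_table('events', 'tenant_id');

-- The view definition is now propagated to worker nodes with metadata, and a
-- later DROP VIEW is propagated in the same way.
CREATE VIEW recent_events AS
    SELECT * FROM events WHERE event_id > 1000;
```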
1) Remove useless columns
2) Show backends that are blocked on a DDL even before
a gpid is assigned
3) One minor bugfix, where we now clear distributedCommandOriginator
properly.
DESCRIPTION: Move pg_dist_object to pg_catalog
Historically `pg_dist_object` had been created in the `citus` schema as an experiment to understand if we could move our catalog tables to a branded schema. We quickly realised that this interfered with the UX on our managed services and other environments, where users connected via a user with the name of `citus`.
By default postgres puts the username on the search_path. To be able to read the catalog in the `citus` schema we would need to grant access permissions to the schema. This caused newly created objects, like tables etc., to default to this schema for creation, which then failed due to the write permissions on that schema.
With this change we move the `pg_dist_object` catalog table to the `pg_catalog` schema, where our other catalog tables are also located. This makes the catalog table visible and readable by any user, like our other catalog tables, for debugging purposes.
Note: due to the change of schema, we had to disable 1 test that was running into a discrepancy between the schema and binary. Secondly, we needed to make the lookup functions for the `pg_dist_object` relation and its indexes less strict on the fallback of the naming, due to another test that, because of an unfortunate cache invalidation, needed to look up the relation again. This means we won't default to _only_ resolving from `pg_catalog` outside of upgrades.
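For illustration, after the move the catalog reads like any other catalog table, without search_path or permission tweaks (the column list here is approximate):
```sql
SELECT classid::regclass, objid, colocationid
FROM pg_catalog.pg_dist_object
LIMIT 5;
```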
It seems like our approach was way too restrictive and some places
were wrong. Now, we follow a very similar approach to pg_stat_activity.
Some of the changes are a prerequisite for implementing citus_dist_stat_activity
via citus_stat_activity.
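For context, a hedged example of the kind of query this enables; the Citus-specific columns named here (global_pid, nodeid, is_worker_query) are assumed to match the view's actual shape:
```sql
SELECT global_pid, nodeid, is_worker_query, state, query
FROM citus_stat_activity
WHERE backend_type = 'client backend'
ORDER BY global_pid;
```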