citus

Commit Graph

Author	SHA1	Message	Date
aykutbozkurt	98abd68178	PR #6728 / commit - 1 Add a method to send multiple commands to worker list reusing the same bare connections. Change will be useful for metadata sync api.	2023-03-30 10:52:46 +03:00
Marco Slot	8ad444f8ef	Hide shards from CDC subscriptions	2023-03-29 00:59:12 +02:00
rajeshkt78	85b8a2c7a1	CDC implementation for Citus using Logical Replication (#6623 ) Description: Implementing CDC changes using Logical Replication to avoid re-publishing events multiple times by setting up replication origin session, which will add "DoNotReplicateId" to every WAL entry. - shard splits - shard moves - create distributed table - undistribute table - alter distributed tables (for some cases) - reference table operations The citus decoder which will be decoding WAL events for CDC clients, ignores any WAL entry with replication origin that is not zero. It also maps the shard names to distributed table names.	2023-03-28 16:00:21 +05:30
Onur Tirtir	20a5f3af2b	Replace CITUS_TABLE_WITH_NO_DIST_KEY checks with HasDistributionKey() (#6743 ) Now that we will soon add another table type having DISTRIBUTE_BY_NONE as distribution method and that we want the code to interpret such tables mostly as distributed tables, let's make the definition of those other two table types more strict by removing CITUS_TABLE_WITH_NO_DIST_KEY macro. And instead, use HasDistributionKey() check in the places where the logic applies to all table types that have / don't have a distribution key. In future PRs, we might want to convert some of those HasDistributionKey() checks if logic only applies to Citus local / reference tables, not the others. And adding HasDistributionKey() also allows us to consider having DISTRIBUTE_BY_NONE as the distribution method as a "table attribute" that can apply to distributed tables too, rather something that determines the table type.	2023-03-10 13:55:52 +03:00
Jelte Fennema	f061dbb253	Also reset transactions at connection shutdown (#6685 ) In #6314 I refactored the connection cleanup to be simpler to understand and use. However, by doing so I introduced a use-after-free possibility (that valgrind luckily picked up): In the `ShouldShutdownConnection` path of `AfterXactHostConnectionHandling` we free connections without removing the `transactionNode` from the dlist that it might be part of. Before the refactoring this wasn't a problem, because the dlist would be completely reset quickly after in `ResetGlobalVariables` (without reading or writing the dlist entries). The refactoring changed this by moving the `dlist_delete` call to `ResetRemoteTransaction`, which in turn was called in the `!ShouldShutdownConnection` path of `AfterXactHostConnectionHandling`. Thus this `!ShouldShutdownConnection` path would now delete from the `dlist`, but the `ShouldShutdownConnection` path would not. Thus to remove itself the deleting path would sometimes update nodes in the list that were freed right before. There are two ways of fixing this: 1. Call `dlist_delete` from both paths. 2. Call `dlist_delete` from neither of the paths. This commit implements the second approach, and #6684 implements the first. We need to choose which approach we prefer. To make calling `dlist_delete` from both paths actually work, we also need to use a slightly different check to determine if we need to call dlist_delete. Various regression tests showed that there can be cases where the `transactionState` is something else than `REMOTE_TRANS_NOT_STARTED` but the connection was not added to the `InProgressTransactions` list. One example of such a case is when running `TransactionStateMachine` without calling `StartRemoteTransactionBegin` beforehand. In those cases the connection won't be added to `InProgressTransactions`, but the `transactionState` is changed to `REMOTE_TRANS_SENT_COMMAND`.
Sidenote: This bug already existed in 11.1, but valgrind didn't catch it back then. My guess is that this happened because #6314 was merged after the initial release branch was cut. Fixes #6638	2023-02-02 16:05:34 +01:00
Jelte Fennema	92689a8362	Make GPIDs work with pg_dist_poolinfo (#6588 ) The original implementation of GPIDs didn't work correctly when using `pg_dist_poolinfo` together with PgBouncer. The reason is that it assumed that once a connection was made to a worker, the originating GPID should stay the same for ever. But when pg_dist_poolinfo is used this isn't the case, because the same connection on the worker might be used by different backends of the coordinator. This fixes that issue by updating the GPID whenever a new application name is set on a connection. This is the only thing that's needed, because PgBouncer already sets the application name correctly on the server connection whenever a client is updated.	2023-01-13 14:39:19 +00:00
Jelte Fennema	34df853bda	Fix bug introduced by #6412 (#6590 ) In #6412 I made a change to not re-assign the global PID if it was already set. This inadvertently introduced a regression where `userId` and `databaseId` would not be set on the backend data when the global PID was assigned in the authentication hook. This fixes it by doing two things: 1. Removing `userId` from `BackendData`, since it's not used anywhere anyway. 2. Move assignment of `databaseId` to dedicated `SetBackendDataDatabaseId` function, that isn't a no-op when global pid is already set. Since #6412 is not released yet this does not need a description.	2023-01-10 16:21:57 +01:00
Philip Dubé	cf69fc3652	Grammar: it's to its Includes an error message & one case of its to it's Also fix "to the to" typos	2022-11-28 20:43:44 +00:00
Jelte Fennema	68de2ce601	Include gpid in all internal application names (#6431 ) When debugging issues it's quite useful to see the originating gpid in the application_name of a query on a worker. This already happens for most queries, but not for queries created by the rebalancer or by run_command_on_worker. This adds a gpid to those two application_names too. Note, that if the GPID of the new application_names is different than the current GPID of the backend the backend will continue to keep the old gpid as its actual GPID. This PR is just meant to make sure that the application_name is as useful as it can be for users to look at. Updating of gpids will be done in a follow-up PR, and adding gpids to all internal connections will make this easier.	2022-11-25 11:16:33 +01:00
Marco Slot	77fbcfaf14	Propagate BEGIN properties to worker nodes (#6483 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-11-10 18:08:43 +01:00
Onur Tirtir	1af28b3f27	Use CommitContext for subxact mgmt and reduce memory usage in CommitContext (#6099 ) (Hopefully) Fixes #5000. If memory allocation done for `SubXactContext state` in `PushSubXact()` fails, then `PopSubXact()` might segfault, for example, when grabbing the topmost `SubXactContext` from `activeSubXactContexts` if this is the first ever subxact within the current xact, with the following stack trace: ```c citus.so!list_nth_cell(const List list, int n) (\opt\pgenv\pgsql-14.3\include\server\nodes\pg_list.h:260) citus.so!PopSubXact(SubTransactionId subId) (\home\onurctirtir\citus\src\backend\distributed\transaction\transaction_management.c:761) citus.so!CoordinatedSubTransactionCallback(SubXactEvent event, SubTransactionId subId, SubTransactionId parentSubid, void * arg) (\home\onurctirtir\citus\src\backend\distributed\transaction\transaction_management.c:673) CallSubXactCallbacks(SubXactEvent event, SubTransactionId mySubid, SubTransactionId parentSubid) (\opt\pgenv\src\postgresql-14.3\src\backend\access\transam\xact.c:3644) AbortSubTransaction() (\opt\pgenv\src\postgresql-14.3\src\backend\access\transam\xact.c:5058) AbortCurrentTransaction() (\opt\pgenv\src\postgresql-14.3\src\backend\access\transam\xact.c:3366) PostgresMain(int argc, char ** argv, const char * dbname, const char * username) (\opt\pgenv\src\postgresql-14.3\src\backend\tcop\postgres.c:4250) BackendRun(Port * port) (\opt\pgenv\src\postgresql-14.3\src\backend\postmaster\postmaster.c:4530) BackendStartup(Port * port) (\opt\pgenv\src\postgresql-14.3\src\backend\postmaster\postmaster.c:4252) ServerLoop() (\opt\pgenv\src\postgresql-14.3\src\backend\postmaster\postmaster.c:1745) PostmasterMain(int argc, char argv) (\opt\pgenv\src\postgresql-14.3\src\backend\postmaster\postmaster.c:1417) main(int argc, char argv) (\opt\pgenv\src\postgresql-14.3\src\backend\main\main.c:209) ``` For this reason, to be more defensive against memory-allocation errors that could happen at `PushSubXact()`, now 
we use our pre-allocated memory context for the objects created in `PushSubXact()`. This commit also attempts reducing the memory allocations done under CommitContext to reduce the chances of consuming all the memory available to CommitContext. Note that it's problematic to encounter with such a memory-allocation error for other objects created in `PushSubXact()` as well, so above is an example scenario that might result in a segfault. DESCRIPTION: Fixes a bug that might cause segfaults when handling deeply nested subtransactions	2022-11-03 00:57:32 +03:00
Jelte Fennema	cb34adf7ac	Don't reassign global PID when already assigned (#6412 ) DESCRIPTION: Fix bug in global PID assignment for rebalancer sub-connections In CI our isolation_shard_rebalancer_progress test would sometimes fail like this: ```diff +isolationtester: canceling step s1-rebalance-c1-block-writes after 60 seconds step s1-rebalance-c1-block-writes: SELECT rebalance_table_shards('colocated1', shard_transfer_mode:='block_writes'); - <waiting ...> + +ERROR: canceling statement due to user request step s7-get-progress: ``` Source: https://app.circleci.com/pipelines/github/citusdata/citus/27855/workflows/2a7e335a-f3e8-46ed-b6bd-6920d42f7214/jobs/831710 It turned out this was an actual bug in the way our assigning of global PIDs interacts with the way we connect to ourselves as the shard rebalancer. The first command the shard rebalancer sends is a SET command to change the application_name to `citus_rebalancer`. If `StartupCitusBackend` is called after this command is processed, then it overwrites the global PID that was extracted from the previous application_name. This makes sure that we don't do that, and continue to use the original global PID. While it might seem that we only call `StartupCitusBackend` once for each query backend, this isn't actually the case. Whenever pg_dist_partition gets ANALYZEd by autovacuum we indirectly call `StartupCitusBackend` again, because we invalidate the cache then. In passing this fixes two other things as well: 1. It sets `distributedCommandOriginator` correctly in `AssignGlobalPID`, by using IsExternalClientBackend(). This doesn't matter much anymore, since AssignGlobalPID effectively becomes a no-op in this PR for any non-external client backends. 2. It passes the application_name to InitializeBackendData in StartupCitusBackend, instead of INVALID_CITUS_INTERNAL_BACKEND_GPID (which effectively got cast to NULL).
In practice this doesn't change the behaviour of the call, since the call is a no-op for every backend except the maintenance daemon. And the behaviour of the call is the same for NULL as for the application_name of the maintenance daemon.	2022-10-11 16:41:01 +02:00
Jelte Fennema	24e06af6d2	Reuse connections for Splits and Logical Replication (#6314 ) In Split, Logical replication logic and ShardCleaner we call `SendCommandListToWorkerOutsideTransaction` and `SendOptionalCommandListToWorkerOutsideTransaction` frequently. This opens new connection for each of those calls, even though we already have a perfectly good connection lying around. This PR adds two new APIs `SendCommandListToWorkerOutsideTransactionWithConnection` and `SendOptionalCommandListToWorkerOutsideTransactionWithConnection` that allow sending a list of queries in a transaction over an existing connection. We also update the callers (Split, ShardCleaner, Logical Replication) to use these new APIs instead. Co-authored-by: Nitish Upreti <niupre@microsoft.com> Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>	2022-09-26 13:37:40 +02:00
Nitish Upreti	d7404a9446	'Deferred Drop' and robust 'Shard Cleanup' for Splits. (#6258 ) DESCRIPTION: This PR adds support for 'Deferred Drop' and robust 'Shard Cleanup' for Splits. Common Infrastructure This PR introduces new common infrastructure so as any operation that wants robust cleanup of resources can register with the cleaner and have the resources cleaned appropriately based on a specified policy. 'Shard Split' is the first consumer using this new infrastructure. Note : We only support adding 'shards' as resources to be cleaned-up right now but the framework will be extended to support other resources in future. Deferred Drop for Split Deferred Drop Support ensures that shards undergoing split are not dropped inline as part of operation but dropped later when no active read queries are running on shard. This helps with : Avoids any potential deadlock scenarios that can cause long running Split operation to rollback. Avoids Split operation blocking writes and then getting blocked (due to running queries on the shard) when trying to drop shards. Deferred drop is the new default behavior going forward. Shard Cleaner Extension Shard Cleaner is a background task responsible for deferred drops in case of 'Move' operations. The cleaner has been extended to ensure robust cleanup of shards (dummy shards and split children) in case of a failure based on the new infrastructure mentioned above. The cleaner also handles deferred drop for 'Splits'. TESTING: New test ''citus_split_shard_by_split_points_deferred_drop' to test deferred drop support. New test 'failure_split_cleanup' to test shard cleanup with failures in different stages. Update 'isolation_blocking_shard_split and isolation_non_blocking_shard_split' for deferred drop. Added non-deferred drop version of existing tests : 'citus_split_shard_no_deferred_drop' and 'citus_non_blocking_splits_no_deferred_drop'	2022-09-06 12:11:20 -07:00
Jelte Fennema	1c5b8588fe	Address race condition in InitializeBackendData (#6285 ) Sometimes in CI our isolation_citus_dist_activity test fails randomly like this: ```diff step s2-view-dist: SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC; query \|citus_nodename_for_nodeid\|citus_nodeport_for_nodeid\|state \|wait_event_type\|wait_event\|usename \|datname ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+---------- INSERT INTO test_table VALUES (100, 100); \|localhost \| 57636\|idle in transaction\|Client \|ClientRead\|postgres\|regression -(1 row) + + SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.)), '[{}]'::JSONB) + FROM ( + SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity. FROM + pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid + ) AS csa_from_one_node; + \|localhost \| 57636\|active \| \| \|postgres\|regression +(2 rows) step s3-view-worker: ``` Source: https://app.circleci.com/pipelines/github/citusdata/citus/26692/workflows/3406e4b4-b686-4667-bec6-8253ee0809b1/jobs/765119 I intended to fix this with #6263, but the fix turned out to be insufficient. 
This PR tries to address the issue by setting distributedCommandOriginator correctly in more situations. However, even with this change it's still possible to reproduce the flaky test in CI. In any case this should fix at least some instances of this issue. In passing this changes the isolation_citus_dist_activity test to allow running it multiple times in a row.	2022-09-02 14:23:47 +02:00
Marco Slot	432f399a5d	Allow citus_internal application_name with additional suffix (#6282 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-09-01 14:26:43 +02:00
Jelte Fennema	d68654680b	Fix flakyness in isolation_citus_dist_activity (#6263 ) Sometimes in CI our isolation_citus_dist_activity test fails randomly like this: ```diff step s2-view-dist: SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC; query \|citus_nodename_for_nodeid\|citus_nodeport_for_nodeid\|state \|wait_event_type\|wait_event\|usename \|datname ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+---------- INSERT INTO test_table VALUES (100, 100); \|localhost \| 57636\|idle in transaction\|Client \|ClientRead\|postgres\|regression -(1 row) + + SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.)), '[{}]'::JSONB) + FROM ( + SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity. FROM + pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid + ) AS csa_from_one_node; + \|localhost \| 57636\|active \| \| \|postgres\|regression +(2 rows) step s3-view-worker: ``` Source: https://app.circleci.com/pipelines/github/citusdata/citus/26605/workflows/56d284d2-5bb3-4e64-a0ea-7b9b1626e7cd/jobs/760633 The reason for this is that citus_dist_stat_activity sometimes shows the query that it uses itself to get the data from pg_stat_activity. 
This is actually a bug, because it's a worker query and thus shouldn't show up there. To try and solve this bug, we remove two small opportunities for a race condition. These race conditions could happen when the backenddata was marked as active, but the distributedCommandOriginator was not set correctly yet/anymore. There was an opportunity for this to happen both during connection start and shutdown.	2022-08-30 12:57:37 +03:00
Marco Slot	639588bee0	Remove unused functions (#6220 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-08-22 11:53:25 +03:00
Jelte Fennema	dd548ee3c7	Use faster custom copy logic for non-blocking shard moves (#6119 ) DESCRIPTION: Use faster custom copy logic for non-blocking shard moves Non-blocking shard moves consist of two main phases: 1. Initial data copy 2. Catchup phase This changes the first of these phases significantly. Previously we used the copy logic provided by postgres subscriptions. This meant we didn't have to implement it ourselves, but it came with the downside of little control. When implementing shard splits we needed more control to even make it work, so we implemented our own logic for copying data between nodes. This PR starts using that logic for non-blocking shard moves. Doing so has four main advantages: 1. It uses COPY in binary format when possible, which is cheaper to encode and decode. Furthermore it very often results in less data that needs to be sent over the network. 2. It allows us to create the primary key (or other replica identity) after doing the initial data copy. This should give some speed up over the total run, because creating an index in bulk is much faster than incrementally building it. 3. It doesn't require a replication slot per parallel copy. Increasing the maximum number of replication slots uses resources in postgres, even if they are not used. So reducing the number of replication slots that shard moves need is nice. 4. Logical replication table_sync workers are slow to start up, so if lots of shards need to be copied that can make it quite slow. This can happen easily when combining Postgres partitioning with Citus.	2022-08-08 17:09:43 +02:00
Teja Mupparti	430c201d03	get_current_transaction_id() UDF is not printing the timestamp of the current transaction on the coordinator even when non-null	2022-08-05 10:12:07 -07:00
Onder Kalaci	149771792b	Remove useless version compats most likely leftover from earlier versions	2022-07-29 10:31:55 +02:00
Onder Kalaci	0a5112964d	Call relation access hash clean-up irrespective of remote transaction state Mainly because local-only transactions should be cleaned up	2022-07-28 11:27:59 +02:00
Onder Kalaci	d67cf907a2	Detach relation access tracking from connection management	2022-07-28 11:27:59 +02:00
Onder Kalaci	6c65d29924	Check the PGPROC's validity properly We used to only check whether the PID is valid or not. However, Postgres does not necessarily set the PID of the backend to 0 when it exits. Instead, we need to be able to check it from procArray. IsBackendPid() is what pg_stat_activity also relies on for a similar purpose.	2022-07-26 17:44:44 +02:00
Naisila Puka	7d6410c838	Drop postgres 12 support (#6040 ) * Remove if conditions with PG_VERSION_NUM < 13 * Remove server_above_twelve(&eleven) checks from tests * Fix tests * Remove pg12 and pg11 alternative test output files * Remove pg12 specific normalization rules * Some more if conditions in the code * Change RemoteCollationIdExpression and some pg12/pg13 comments * Remove some more normalization rules	2022-07-20 17:49:36 +03:00
Onder Kalaci	483a3a5875	PG 15 Compat: Resolve compile issues + shmem requests Similar to #5897, one more step for running Citus with PG 15. This PR at least makes Citus run with PG 15. I have not tried running the tests with PG 15. Shmem changes are based on `4f2400cb3f` Compile breaks are mostly due to #6008	2022-07-15 10:11:39 +02:00
Jelte Fennema	184c7c0bce	Make enterprise features open source (#6008 ) This PR makes all of the features open source that were previously only available in Citus Enterprise. Features that this adds: 1. Non blocking shard moves/shard rebalancer (`citus.logical_replication_timeout`) 2. Propagation of CREATE/DROP/ALTER ROLE statements 3. Propagation of GRANT statements 4. Propagation of CLUSTER statements 5. Propagation of ALTER DATABASE ... OWNER TO ... 6. Optimization for COPY when loading JSON to avoid double parsing of the JSON object (`citus.skip_jsonb_validation_in_copy`) 7. Support for row level security 8. Support for `pg_dist_authinfo`, which allows storing different authentication options for different users, e.g. you can store passwords or certificates here. 9. Support for `pg_dist_poolinfo`, which allows using connection poolers in between coordinator and workers 10. Tracking distributed query execution times using citus_stat_statements (`citus.stat_statements_max`, `citus.stat_statements_purge_interval`, `citus.stat_statements_track`). This is disabled by default. 11. Blocking tenant_isolation 12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`	2022-06-16 00:23:46 -07:00
Ahmet Gedemenli	268d3fa3a6	Fix materialized view intermediate result filename (#5982 )	2022-06-14 15:07:08 +03:00
Onder Kalaci	dd02e1755f	Parallelize metadata syncing on node activate It is often useful to be able to sync the metadata in parallel across nodes. Also citus_finalize_upgrade_to_citus11() uses start_metadata_sync_to_primary_nodes() after this commit. Note that this commit does not parallelize all pieces of node activation or metadata syncing. Instead, it tries to parallelize potentially large parts of metadata, which are the objects and distributed tables (in general Citus tables). In the future, it would be nice to sync the reference tables in parallel across nodes. Create ~720 distributed tables / ~23450 shards ```SQL -- declaratively partitioned table CREATE TABLE github_events_looooooooooooooong_name ( event_id bigint, event_type text, event_public boolean, repo_id bigint, payload jsonb, repo jsonb, actor jsonb, org jsonb, created_at timestamp ) PARTITION BY RANGE (created_at); SELECT create_time_partitions( table_name := 'github_events_looooooooooooooong_name', partition_interval := '1 day', end_at := now() + '24 months' ); CREATE INDEX ON github_events_looooooooooooooong_name USING btree (event_id, event_type, event_public, repo_id); SELECT create_distributed_table('github_events_looooooooooooooong_name', 'repo_id'); SET client_min_messages TO ERROR; ``` across 1 node: almost same as expected ```SQL SELECT start_metadata_sync_to_primary_nodes(); Time: 15664.418 ms (00:15.664) select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node; Time: 14284.069 ms (00:14.284) ``` across 7 nodes: ~3.5x improvement ```SQL SELECT start_metadata_sync_to_primary_nodes(); ┌──────────────────────────────────────┐ │ start_metadata_sync_to_primary_nodes │ ├──────────────────────────────────────┤ │ t │ └──────────────────────────────────────┘ (1 row) Time: 25711.192 ms (00:25.711) -- across 7 nodes select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node; Time: 82126.075 ms (01:22.126) ```	2022-05-23 09:15:48 +02:00
Ying Xu	a1151c2395	Clear metadatacache during abort for create extension (#5907 ) * Bug fix for bug #5876. Memset MetadataCacheSystem every time there is an abort * Created an ObjectAccessHook that saves the transactionlevel of when citus was created and will clear metadatacache if that transaction level is rolled back. Added additional tests to make sure metadatacache is cleared	2022-05-20 13:47:58 -07:00
Marco Slot	7abcfac61f	Add caching for functions that check the backend type	2022-05-20 19:02:37 +02:00
Gledis Zeneli	4c6f62efc6	Switch to using LOCK instead of lock_relation_if_exists in TRUNCATE (#5930 ) Breaking down #5899 into smaller PRs. This particular PR changes the way TRUNCATE acquires distributed locks on the relations it is truncating to use the LOCK command instead of lock_relation_if_exists. This has the benefit of using the recursive locking logic that pg implements for the LOCK command instead of us having to resolve relation dependencies and lock them explicitly. While this does not directly affect truncate, it will allow us to generalize this locking logic to then lock different relations where the pg recursive locking will become useful (e.g. locking views). This implementation is a bit more complex than it needs to be due to pg not supporting locking foreign tables. We can, however, still lock foreign tables with lock_relation_if_exists. So for a command: TRUNCATE dist_table_1, dist_table_2, foreign_table_1, foreign_table_2, dist_table_3; We generate and send the following command to all the workers in metadata: ```sql SET citus.enable_ddl_propagation TO FALSE; LOCK dist_table_1, dist_table_2 IN ACCESS EXCLUSIVE MODE; SELECT lock_relation_if_exists('foreign_table_1', 'ACCESS EXCLUSIVE'); SELECT lock_relation_if_exists('foreign_table_2', 'ACCESS EXCLUSIVE'); LOCK dist_table_3 IN ACCESS EXCLUSIVE MODE; SET citus.enable_ddl_propagation TO TRUE; ``` Note that we need to alternate between the LOCK command and lock_relation_if_exists in order to preserve the TRUNCATE order of relations. When pg supports locking foreign tables, we will be able to massively simplify this logic and send a single LOCK command.	2022-05-11 18:38:48 +03:00
Burak Velioglu	1460452442	Introduce CREATE/DROP VIEW Adds support for propagating create/drop view commands and views to worker node while scaling out the cluster. Since views are dropped while converting the table type, metadata connection will be used while propagating view commands to not switch to sequential mode.	2022-05-10 13:07:14 +03:00
Jeff Davis	26f5e20580	PG15: update integer parsing APIs. Account for PG commits 3c6f8c011f and cfc7191dfe.	2022-05-02 10:12:03 -07:00
Onder Kalaci	a2debe0f02	Do not assign distributed transaction ids for local execution In the past, for all modifications on the local execution, we enabled 2PC (with `6a7ed7b309`). This also required us to enable coordinated transactions via https://github.com/citusdata/citus/pull/4831 . However, it does have a very substantial impact on the distributed deadlock detection. The distributed deadlock detection is designed to avoid single-statement transactions because they cannot lead to any actual deadlocks. The implementation skips backends that have no distributed transaction assigned. Now that we include single-statement local executions in the lock graphs, we are conflicting with the design of distributed deadlock detection. In general, we should fix it. However, one might think that it is not a big deal: even if the processes show up in the lock graphs, the deadlock detection should not be causing any false positives. That is false, unless https://github.com/citusdata/citus/issues/1803 is fixed. Now that local processes are considered as a single distributed backend, the lock graphs might find: local execution 1 [tx id: 1] -> any local process [tx id: 0] any local process [tx id: 0] -> local execution 2 [tx id: 2] and decide that there is a distributed deadlock. This commit: (a) is the right thing to do, as local execution should not need any distributed tx id; (b) eliminates performance issues that might come up when deadlock detection does a lot of unnecessary checks; (c) after moving local execution after the remote execution via https://github.com/citusdata/citus/pull/4301, the vague requirement for assigning distributed tx ids is already gone.	2022-04-13 13:25:12 +02:00
Onder Kalaci	b0b91bab04	Rename metadata sync to node metadata sync where applicable	2022-04-07 17:51:31 +02:00
Gledis Zeneli	56ab64b747	Patches #5758 with some more error checks (#5804 ) Add error checks to detect failed connection and don't ping secondary nodes to detect self reference.	2022-03-15 15:02:47 +03:00
Marco Slot	e42a798707	Always use RowShareLock in pg_dist_node when syncing metadata	2022-03-15 10:28:51 +01:00
Gledis Zeneli	2cb02bfb56	Fix node adding itself with citus_add_node leading to deadlock (Fix #5720 ) (#5758 ) If a worker node is being added, a command is sent to get the server_id of the worker from the pg_dist_node_metadata table. If the worker's id is the same as the node executing the code, we will know the node is trying to add itself. If the node tries to add itself without specifying `groupid:=0` the operation will result in an error.	2022-03-10 17:46:33 +03:00
Hanefi Onaldi	d153c2de0d	Fix some typos in comments	2022-03-10 15:03:26 +03:00
Halil Ozan Akgül	333bcc7948	Global PID Helper Functions (#5768 ) * Introduces citus_nodename_for_nodeid and citus_nodeport_for_nodeid functions * Introduces citus_nodeid_for_gpid and citus_pid_for_gpid functions * Add tests	2022-03-09 13:15:59 +03:00
Onder Kalaci	c32b2de1a7	Improve citus_lock_waits 1) Remove useless columns 2) Show backends that are blocked on a DDL even before gpid is assigned 3) One minor bugfix, where we clear distributedCommandOriginator properly.	2022-03-07 11:10:44 +01:00
Halil Ozan Akgul	0500a62515	Updates citus_dist_stat_activity to use citus_stat_activity	2022-03-04 17:28:17 +03:00
Onder Kalaci	c7b67ba0ea	Add citus_backend_gpid() And also citus_calculate_gpid(nodeId,pid). These UDFs are just wrappers for the existing functions. Useful for testing and simple manipulation of citus_stat_activity.	2022-03-03 15:29:40 +01:00
Marco Slot	43e4dd3808	Add a citus.internal_reserved_connections setting	2022-03-02 19:13:53 +01:00
Onder Kalaci	e80a36c4b6	Improve visibility rules for non-privileged roles It seems like our approach is way too restrictive and some places are wrong. Now, we follow a very similar approach to pg_stat_activity. Some of the changes are prerequisites for implementing citus_dist_stat_activity via citus_stat_activity.	2022-03-02 18:04:01 +01:00
Onder Kalaci	df95d59e33	Drop support for CitusInitiatedBackend CitusInitiatedBackend was a premature implementation of the whole GlobalPID infrastructure. We used it to track whether any individual query is triggered by Citus or not. As of now, after GlobalPID is already in place, we don't need CitusInitiatedBackend, in fact it could even be wrong.	2022-02-24 12:12:43 +01:00
Onder Kalaci	95d5918967	Properly set worker_query and use	2022-02-21 18:22:33 +01:00
Onder Kalaci	dffcafc096	Use global pids in citus_lock_waits	2022-02-21 17:46:34 +01:00
Onder Kalaci	331af3dce8	Dumping wait edges can optionally scan all backends Before this commit, dumping wait edges could only be used for distributed deadlock detection purposes. With this commit, we open the possibility that we can use it for any backend.	2022-02-21 17:37:07 +01:00


330 Commits (1fb3de14dfa6b5ad1c997f79067007cfdce8fc4f)
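
Commit 4c6f62efc6 above describes alternating between a single LOCK statement for consecutive lockable tables and one lock_relation_if_exists call per foreign table, so that the TRUNCATE order of relations is preserved. The following is a minimal sketch of that grouping rule in C, not the Citus implementation. `TruncateTarget`, `AppendText`, and `BuildTruncateLockCommands` are hypothetical names introduced only for illustration, and relation names are assumed to be already quoted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one TRUNCATE target; not a Citus type. */
typedef struct TruncateTarget
{
	const char *name;   /* assumed to be already quoted/escaped */
	bool isForeign;     /* true if the LOCK command cannot be used on it */
} TruncateTarget;

/* Appends text to out if it fits; returns false instead of overflowing. */
static bool
AppendText(char *out, size_t outSize, size_t *used, const char *text)
{
	size_t length = strlen(text);

	if (*used + length + 1 > outSize)
	{
		return false;
	}

	memcpy(out + *used, text, length + 1);
	*used += length;
	return true;
}

/*
 * Hypothetical helper: writes the lock commands for the given targets into
 * out, keeping the original relation order. Consecutive non-foreign tables
 * share one LOCK statement; each foreign table gets its own
 * lock_relation_if_exists call. Returns false if out is too small.
 */
bool
BuildTruncateLockCommands(const TruncateTarget *targets, size_t count,
						  char *out, size_t outSize)
{
	const char *lockSuffix = " IN ACCESS EXCLUSIVE MODE;\n";
	size_t used = 0;
	bool inLockList = false;
	bool ok = true;

	assert(out != NULL && outSize > 0);
	assert(targets != NULL || count == 0);
	out[0] = '\0';

	ok = ok && AppendText(out, outSize, &used,
						  "SET citus.enable_ddl_propagation TO FALSE;\n");

	for (size_t i = 0; ok && i < count; i++)
	{
		if (targets[i].isForeign)
		{
			/* a foreign table ends the LOCK statement being built, if any */
			if (inLockList)
			{
				ok = ok && AppendText(out, outSize, &used, lockSuffix);
				inLockList = false;
			}

			ok = ok && AppendText(out, outSize, &used,
								  "SELECT lock_relation_if_exists('");
			ok = ok && AppendText(out, outSize, &used, targets[i].name);
			ok = ok && AppendText(out, outSize, &used,
								  "', 'ACCESS EXCLUSIVE');\n");
		}
		else
		{
			/* start a new LOCK statement or extend the current one */
			ok = ok && AppendText(out, outSize, &used,
								  inLockList ? ", " : "LOCK ");
			ok = ok && AppendText(out, outSize, &used, targets[i].name);
			inLockList = true;
		}
	}

	if (inLockList)
	{
		ok = ok && AppendText(out, outSize, &used, lockSuffix);
	}

	ok = ok && AppendText(out, outSize, &used,
						  "SET citus.enable_ddl_propagation TO TRUE;\n");
	return ok;
}
```

Called with the five relations from the commit message (two distributed tables, two foreign tables, one distributed table), the sketch produces two LOCK statements with the two lock_relation_if_exists calls between them. Building into a caller-supplied buffer keeps the example free of PostgreSQL headers; real backend code would more likely use a `StringInfo`.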