citus

Commit Graph

Author	SHA1	Message	Date
gurkanindibay	a5385d9f9c	Fixes test errors	2024-03-18 11:56:13 +03:00
gurkanindibay	3c3477efec	Fixes test errors	2024-03-15 18:17:42 +03:00
gurkanindibay	8bc2623687	Removes unnecessary test	2024-03-15 18:01:03 +03:00
gurkanindibay	5c5b19676b	Removes alter tablespace	2024-03-15 15:17:00 +03:00
gurkanindibay	2ca4d3eba7	Adds alter database stmt from non-main db	2024-03-15 12:02:05 +03:00
eaydingol	8afa2d0386	Change the order in which the locks are acquired (#7542 ) This PR changes the order in which the locks are acquired (for the target and reference tables), when a modify request is initiated from a worker node that is not the "FirstWorkerNode". To prevent concurrent writes, locks are acquired on the first worker node for the replicated tables. When the update statement originates from the first worker node, it acquires the lock on the reference table(s) first, followed by the target table(s). However, if the update statement is initiated in another worker node, the lock requests are sent to the first worker in a different order. This PR unifies the modification order on the first worker node. With the third commit, independent of the node that received the request, the locks are acquired for the modified table and then the reference tables on the first node. The first commit shows a sample output for the test prior to the fix. Fixes #7477 --------- Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>	2024-03-10 10:20:08 +03:00
copetol	12f56438fc	Fix segfault when using certain DO block in function (#7554 ) When using a CASE WHEN expression in the body of the function that is used in the DO block, a segmentation fault occurred. This fixes that. Fixes #7381 --------- Co-authored-by: Konstantin Morozov <vzbdryn@yahoo.com>	2024-03-08 14:21:42 +01:00
Gürkan İndibay	51009d0191	Add support for alter/drop role propagation from non-main databases (#7461 ) DESCRIPTION: Adds support for distributed `ALTER/DROP ROLE` commands from the databases where Citus is not installed --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2024-02-28 08:58:28 +00:00
Onur Tirtir	f4242685e3	Add failure handling for CREATE DATABASE commands (#7483 ) In preprocess phase, we save the original database name, replace dbname field of CreatedbStmt with a temporary name (to let Postgres create the database with the temporary name locally) and then we insert a cleanup record for the temporary database name on all nodes (*). And in postprocess phase, we first rename the temporary database back to its original name for local node and then return a list of distributed DDL jobs i) to create the database with the temporary name and then ii) to rename it back to its original name on other nodes. That way, if CREATE DATABASE fails on any of the nodes, the temporary database will be cleaned up by the cleanup records that we inserted in preprocess phase and in case of a failure, we won't leak any database with the name that the user intended to use for the database. Solves the problem documented in https://github.com/citusdata/citus/issues/7369 for CREATE DATABASE commands. (*): To ensure that we insert cleanup records on all nodes, with this PR we also start requiring having the coordinator in the metadata because otherwise we would skip inserting a cleanup record for the coordinator.	2024-02-23 17:02:32 +00:00
Onur Tirtir	9ddee5d02a	Test that we check unsupported options for CREATE DATABASE from non-main dbs (#7532 ) When adding CREATE/DROP DATABASE propagation in #7240, luckily we've added EnsureSupportedCreateDatabaseCommand() check into deparser too just to be on the safe side. That way, today CREATE DATABASE commands from non-main dbs don't silently allow unsupported options. I wasn't aware of this when merging #7439 and hence wanted to add a test so that we don't mistakenly remove that check from deparser in future.	2024-02-23 10:37:11 +00:00
eaydingol	3509b7df5a	Add support for SECURITY LABEL on ROLE propagation from non-main databases (#7525 ) DESCRIPTION: Adds support for distributed "SECURITY LABEL on ROLE" commands from the databases where Citus is not installed.	2024-02-23 09:54:19 +03:00
Gürkan İndibay	211415dd4b	Removes granted by statement to fix flaky test errors (#7526 ) Fix for the #7519 In metadata sync phase, grant statements for roles are being fetched and propagated from catalog tables. However, in some cases grant .. with admin option clauses executes after the granted by statements which causes #7519 error. We will fix this issue with the grantor propagation task in the project	2024-02-21 18:37:25 +03:00
Halil Ozan Akgül	852bcc5483	Add support for create / drop database propagation from non-main databases (#7439 ) DESCRIPTION: Adds support for distributed `CREATE/DROP DATABASE ` commands from the databases where Citus is not installed --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2024-02-21 10:44:01 +00:00
Gürkan İndibay	b3ef1b7e39	Add support for grant on database propagation from non-main databases (#7443 ) DESCRIPTION: Adds support for distributed `GRANT .. ON DATABASE TO USER` commands from the databases where Citus is not installed --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2024-02-21 13:14:58 +03:00
Gürkan İndibay	2cbfdbfa46	Adds Grant Role support from non-main db (#7404 ) DESCRIPTION: Adds support for distributed role-membership management commands from the databases where Citus is not installed (`GRANT <role> TO <role>`) This PR also refactors the code-path that allows executing some of the node-wide commands so that we use send deparsed query string to other nodes instead of the `queryString` passed into utility hook. --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2024-02-19 17:53:27 +03:00
Gürkan İndibay	9a0cdbf5af	Fixes granted by cascade/restrict statements for revoke (#7517 ) DESCRIPTION: Fixes incorrect propagation of `GRANTED BY` and `CASCADE/RESTRICT` clauses for `REVOKE` statements There are two issues fixed in this PR: 1. The granted by clause now appears for revoke statements as well. 2. The cascade/restrict clause now appears after granted by. Since the granted by clause did not appear in revoke statements, this bug was not visible until now. However, after activating the granted by clause for revoke, an ordering problem arose, so this PR fixes the ordering of cascade/restrict as well. In summary, this PR makes granted by clauses usable with the correct clause order. Both fixes can be verified with a single statement: REVOKE dist_role_3 from non_dist_role_3 granted by test_admin_role cascade;	2024-02-19 15:44:21 +03:00
eaydingol	15a3adebe8	Support SECURITY LABEL ON ROLE from any node (#7508 ) DESCRIPTION: Propagates SECURITY LABEL ON ROLE statement from any node	2024-02-15 20:34:15 +03:00
Gürkan İndibay	59da0633bb	Fixes invalid grantor field parsing in grant role propagation (#7451 ) DESCRIPTION: Resolves an issue that disrupts distributed GRANT statements with the grantor option This PR solves three issues: 1. Correcting the erroneous appending of multiple granted by in the deparser. 2. Adding support for grantor (granted by) in grant role propagation. 3. Implementing grantor (granted by) support during the metadata sync grant role propagation phase. Limitations: Currently, the grantor must be created prior to the metadata sync phase. During metadata sync, both the creation of the grantor and the grants given by that role cannot be performed, as the grantor role is not detected during the dependency resolution phase. --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2024-02-15 08:27:29 +00:00
eaydingol	f01c5f2593	Move remaining citus_internal functions (#7478 ) Moves the following functions to the Citus internal schema: citus_internal_local_blocked_processes citus_internal_global_blocked_processes citus_internal_mark_node_not_synced citus_internal_unregister_tenant_schema_globally citus_internal_update_none_dist_table_metadata citus_internal_update_placement_metadata citus_internal_update_relation_colocation citus_internal_start_replication_origin_tracking citus_internal_stop_replication_origin_tracking citus_internal_is_replication_origin_tracking_active #7405 --------- Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>	2024-02-07 16:58:17 +03:00
Filip Sedlák	6869b3ad10	Fail early when shard can't be safely moved to a new node (#7467 ) DESCRIPTION: citus_move_shard_placement now fails early when shard cannot be safely moved The implementation is quite simplistic - `citus_move_shard_placement(...)` will fail with an error if there's any new node in the cluster that doesn't have reference tables yet. It could have been finer-grained, i.e. erroring only when trying to move a shard to an uninitialized node. Looking at the related functions - `replicate_reference_tables()` or `citus_rebalance_start()`, I think it's acceptable behaviour. These other functions also treat "any" uninitialized node as a temporary anomaly. Fixes #7426 --------- Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>	2024-02-07 12:04:52 +00:00
eaydingol	594cb6f274	Move more citus internal functions (#7473 ) Moves the following functions: citus_internal_delete_colocation_metadata citus_internal_delete_partition_metadata citus_internal_delete_placement_metadata citus_internal_delete_shard_metadata citus_internal_delete_tenant_schema	2024-01-31 23:00:04 +03:00
eaydingol	d05174093b	Move citus internal functions (#7470 ) Move more functions to citus_internal schema, the list: citus_internal_add_placement_metadata citus_internal_add_shard_metadata citus_internal_add_tenant_schema citus_internal_adjust_local_clock_to_remote citus_internal_database_command #7405	2024-01-31 11:45:19 +00:00
eaydingol	f6ea619e27	Move citus internal functions (#7466 ) Move the following functions from pg_catalog to citus_internal: citus_internal_add_object_metadata citus_internal_add_partition_metadata #7405	2024-01-30 12:27:10 +03:00
eaydingol	5d673874f7	Move citus internal functions (#7456 ) Move citus_internal_acquire_citus_advisory_object_class_lock and citus_internal_add_colocation_metadata functions from pg_catalog to citus_internal. #7405	2024-01-26 11:46:05 +03:00
eaydingol	542212c3d8	Make citus_internal schema public (#7450 ) DESCRIPTION: Makes citus_internal schema public #7405	2024-01-24 17:11:10 +03:00
Onur Tirtir	1d096df7f4	Not use hardcoded LOCAL_HOST_NAME but citus.local_hostname to distinguish loopback connections (#7436 ) Fixes a bug that breaks queries from non-maindbs when citus.local_hostname is set to a value different than "localhost". This is a very old bug that doesn't cause a problem as long as Citus catalog is available to FindWorkerNode(). And the catalog is always available unless we're in non-main database, which might be the case on main but not on older releases, hence not adding a `DESCRIPTION`. For this reason, I don't see a reason to backport this. Maybe we should totally refrain from using LOCAL_HOST_NAME in all code-paths, but not doing that in this PR as the other paths don't seem to be breaking something that is user-facing. ```c char * GetAuthinfo(char *hostname, int32 port, char *user) { char *authinfo = NULL; bool isLoopback = (strncmp(LOCAL_HOST_NAME, hostname, MAX_NODE_LENGTH) == 0 && PostPortNumber == port); if (IsTransactionState()) { int64 nodeId = WILDCARD_NODE_ID; /* -1 is a special value for loopback connections (task tracker) */ if (isLoopback) { nodeId = LOCALHOST_NODE_ID; } else { WorkerNode *worker = FindWorkerNode(hostname, port); if (worker != NULL) { nodeId = worker->nodeId; } } authinfo = GetAuthinfoViaCatalog(user, nodeId); } return (authinfo != NULL) ? authinfo : ""; } ```	2024-01-24 12:58:55 +00:00
Filip Sedlák	8b48d6ab02	Log username in the failed connection message (#7432 ) This patch includes the username in the reported error message. This makes debugging easier when certain commands open connections as other users than the user that is executing the command. ``` monitora_snapshot=# SELECT citus_move_shard_placement(102030, 'monitora.db-dev-worker-a', 6005, 'monitora.db-dev-worker-a', 6017); ERROR: connection to the remote node monitora_user@monitora.db-dev-worker-a:6017 failed with the following error: fe_sendauth: no password supplied Time: 40,198 ms ```	2024-01-24 11:24:23 +00:00
Halil Ozan Akgül	1cb2e1e4e8	Fixes create user queries from Citus non-main databases with other users (#7442 ) This PR makes the connections to other nodes for `mark_object_distributed` use the same user as `execute_command_on_remote_nodes_as_user` so they'll use the same connection.	2024-01-24 12:57:54 +03:00
Gürkan İndibay	188614512f	Adds comment on database and role propagation (#7388 ) DESCRIPTION: Adds comment on database and role propagation. Example commands are as below comment on database <db_name> is '<comment_text>' comment on database <db_name> is NULL comment on role <role_name> is '<comment_text>' comment on role <role_name> is NULL --------- Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>	2024-01-18 20:58:44 +03:00
Onur Tirtir	04b374fc01	Fix upgrade tests (#7413 ) Adding upgrade_basic_before_non_mixed.sql file because while upgrade_basic_after_non_mixed exists, its before variation didn't exist as we don't have any "before" steps. However, run_test.py assumes that all "after" files do have a "before" variation as well. So this PR adds an empty upgrade_basic_before_non_mixed.sql file. Also, given that we don't have such a version called 12.1devel anymore, change it to 12.1.1. And finally, let CI skip testing flakiness for upgrade tests both because it's quite hard to get flaky-test-detection job working for upgrade tests and also because in the end it is not very useful to test upgrade tests against flakiness.	2024-01-16 12:37:18 +00:00
Teja Mupparti	00068e07c5	Fix the incorrect column count after ALTER TABLE, this fixes the bug #7378 (please read the analysis in the bug for more information)	2024-01-10 12:49:44 -08:00
Onur Tirtir	1d55debb98	Support CREATE / DROP database commands from any node (#7359 ) DESCRIPTION: Adds support for issuing `CREATE`/`DROP` DATABASE commands from worker nodes With this commit, we allow issuing CREATE / DROP DATABASE commands from worker nodes too. As in #7278, this is not allowed when the coordinator is not added to metadata because we don't ever sync metadata changes to coordinator when adding coordinator to the metadata via `SELECT citus_set_coordinator_host('<hostname>')`, or equivalently, via `SELECT citus_add_node(<coordinator_node_name>, <coordinator_node_port>, 0)`. We serialize database management commands by acquiring a Citus specific advisory lock on the first primary worker node if there are any workers in the cluster. As opposed to what we've done in https://github.com/citusdata/citus/pull/7278 for role management commands, we try to avoid from running into distributed deadlocks as much as possible. This is because, while distributed deadlocks that can happen around role management commands can be detected by Citus, this is not the case for database management commands because most of them cannot be run inside in a transaction block. In that case, Citus cannot even detect the distributed deadlock because the command is not part of a distributed transaction at all, then the command execution might not return the control back to the user for an indefinite amount of time.	2024-01-08 16:47:49 +00:00
Karina	20dc58cf5d	Fix getting heap tuple size (#7387 ) This fixes #7230. First of all, using HeapTupleHeaderGetDatumLength(heapTuple) is definitely wrong, it gives a number that's 4 times less than the correct tuple size (heapTuple.t_len). See https://github.com/postgres/postgres/blob/REL_16_0/src/include/access/htup_details.h#L455-L456 https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L279 https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L225-L226 When I fixed it, the limit_intermediate_size test failed, so I tried to understand what's going on there. In original commit `fd546cf` these queries were supposed to fail. Then in `b3af63c` three of the queries that were supposed to fail suddenly worked and tests were changed to pass without understanding why the output had changed or how to keep the test testing what it had to test. Even comments saying that these queries should fail were left untouched. Commit message gives no clue about why exactly the test has changed: > It seems that when we use adaptive executor instead of task tracker, we > exceed the intermediate result size less in the test. Therefore updated > the tests accordingly. Then `3fda2c3` also blindly raised the limit for one of the queries to keep it working: `3fda2c3254 (diff-a9b7b617f9dfd345318cb8987d5897143ca1b723c87b81049bbadd94dcc86570R19)` When in `fe3caf3` that HeapTupleHeaderGetDatumLength(heapTuple) call was finally added, one of those test queries started failing again. The other two of them are now also failing after the fix. I don't understand how exactly the calculation of "intermediate result size" that is limited by citus.max_intermediate_result_size had changed through `b3af63c` and `fe3caf3`, but these numbers are now closer to what they originally were when this limitation was added in `fd546cf`. So these queries should fail, like in the original version of the limit_intermediate_size test. Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>	2024-01-08 17:09:30 +01:00
Onur Tirtir	968ac74cde	Fix foreign_key_to_reference_shard_rebalance test (#7400 ) foreign_key_to_reference_shard_rebalance failed because partition of 2024 year does not exist, fixed by add default partition. Replaces https://github.com/citusdata/citus/pull/7396 by adding a rule that allows properly testing foreign_key_to_reference_shard_rebalance via run_test.py. Closes #7396 Co-authored-by: chuhx <148182736+cstarc1@users.noreply.github.com>	2024-01-04 13:16:45 +01:00
Onur Tirtir	d940cfa992	Do nothing if the database is not distributed (#7392 ) Fixes the remaining cases reported in https://github.com/citusdata/citus/issues/7370.	2024-01-03 17:03:06 +03:00
Gürkan İndibay	c3579eef06	Adds REASSIGN OWNED BY propagation (#7319 ) DESCRIPTION: Adds REASSIGN OWNED BY propagation This pull request introduces the propagation of the "Reassign owned by" statement. It accommodates both local and distributed roles for both the old and new assignments. However, when the old role is a local role, it undergoes filtering and is not propagated. On the other hand, if the new role is a local role, the process involves first creating the role on worker nodes before propagating the "Reassign owned" statement.	2023-12-28 15:15:58 +03:00
Gürkan İndibay	181b8ab6d5	Adds additional alter database propagation support (#7253 ) DESCRIPTION: Adds database connection limit, rename and set tablespace propagation In this PR, below statement propagations are added alter database <database_name> with allow_connections = <boolean_value>; alter database <database_name> rename to <database_name2>; alter database <database_name> set TABLESPACE <table_space_name> --------- Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl> Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com> Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2023-12-26 14:55:04 +03:00
Halil Ozan Akgül	b877d606c7	Adds 2PC distributed commands from other databases (#7203 ) DESCRIPTION: Adds support for 2PC from non-Citus main databases This PR only adds support for `CREATE USER` queries, other queries need to be added. But it should be simple because this PR creates the underlying structure. Citus main database is the database where the Citus extension is created. A non-main database is all the other databases that are in the same node with a Citus main database. When a `CREATE USER` query is run on a non-main database we: 1. Run `start_management_transaction` on the main database. This function saves the outer transaction's xid (the non-main database query's transaction id) and marks the current query as main db command. 2. Run `execute_command_on_remote_nodes_as_user("CREATE USER <username>", <username to run the command>)` on the main database. This function creates the users in the rest of the cluster by running the query on the other nodes. The user on the current node is created by the query on the outer, non-main db, query to make sure consequent commands in the same transaction can see this user. 3. Run `mark_object_distributed` on the main database. This function adds the user to `pg_dist_object` in all of the nodes, including the current one. This PR also implements transaction recovery for the queries from non-main databases.	2023-12-22 19:19:41 +03:00
Jodi-Ann Francis	6801a1ed1e	PG16 update GRANT... ADMIN \| INHERIT \| SET, and REVOKE Allowing GRANT ADMIN to now also be INHERIT or SET in support of psql16 GRANT role_name [, ...] TO role_specification [, ...] [ WITH { ADMIN \| INHERIT \| SET } { OPTION \| TRUE \| FALSE } ] [ GRANTED BY role_specification ] Fixes: #7148 Related: #7138 See review changes from https://github.com/citusdata/citus/pull/7164	2023-12-13 15:57:02 -05:00
Naisila Puka	dbdde111c1	Add missing order by clause in failure_split_cleanup test (#7363 ) https://github.com/citusdata/citus/actions/runs/6903353045/attempts/1#summary-18781959638 ```diff ARRAY['-100000'], ARRAY[:worker_1_node, :worker_2_node], 'force_logical'); ERROR: server closed the connection unexpectedly CONTEXT: while executing command on localhost:9060 SELECT operation_id, object_type, object_name, node_group_id, policy_type FROM pg_dist_cleanup where operation_id = 777 ORDER BY object_name; operation_id \| object_type \| object_name \| node_group_id \| policy_type --------------+-------------+-----------------------------------------------------------+---------------+------------- 777 \| 1 \| citus_failure_split_cleanup_schema.table_to_split_8981000 \| 1 \| 0 - 777 \| 1 \| citus_failure_split_cleanup_schema.table_to_split_8981002 \| 1 \| 1 777 \| 1 \| citus_failure_split_cleanup_schema.table_to_split_8981002 \| 2 \| 0 + 777 \| 1 \| citus_failure_split_cleanup_schema.table_to_split_8981002 \| 1 \| 1 777 \| 1 \| citus_failure_split_cleanup_schema.table_to_split_8981003 \| 2 \| 1 777 \| 4 \| citus_shard_split_publication_1_10_777 \| 2 \| 0 (5 rows) ``` Similar attempt to fix in `c9f2fc892d` There were some more missing ORDER BY stuff, so I added them	2023-11-24 18:26:06 +03:00
Gürkan İndibay	3b556cb5ed	Adds create / drop database propagation support (#7240 ) DESCRIPTION: Adds support for propagating `CREATE`/`DROP` database In this PR, create and drop database support is added. For CREATE DATABASE: * "oid" option is not supported * specifying "strategy" to be different than "wal_log" is not supported * specifying "template" to be different than "template1" is not supported The last two are because those are not saved in `pg_database` and when activating a node, we cannot assume what parameters were provided when creating the database. And "oid" is not supported because whether user specified an arbitrary oid when creating the database is not saved in pg_database and we want to avoid from oid collisions that might arise from attempting to use an auto-assigned oid on workers. Finally, in case of node activation, GRANTs for the database are also propagated. --------- Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl> Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com> Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2023-11-21 16:43:51 +03:00
Naisila Puka	cedcc220bf	Fixes flaky VACUUM (freeze, process toast true) result (#7348 ) https://app.circleci.com/pipelines/github/citusdata/citus/34550/workflows/5b802f66-2666-4623-a209-6d7799f7ee5f/jobs/1229153 ```diff VACUUM (FREEZE, PROCESS_TOAST true) local_vacuum_table; SELECT relfrozenxid::text::integer > :frozenxid AS frozen_performed FROM pg_class WHERE oid=:reltoastrelid::regclass; frozen_performed ------------------ - t + f (1 row) ``` Process toast option in vacuum was introduced in PG14. The failing test was supposed to be a part of `multi_utilities.sql`, but it was included in `pg14.sql` to avoid alternative output for PG13. See `ba62c0a148 (diff-ed03478f693155e2fe092e9ad356bf884dc097f554e8d75eff562d52bbcf7a75L255-L272)` for reference. However, now that we don't support PG13 anymore, we can move this test to `multi_utilities.sql`. Moving the test, plus inserting data before running vacuum freeze such that the freeze is more meaningful and not flaky, fixes the flakiness problem of the test.	2023-11-17 18:58:06 +03:00
Naisila Puka	c88bf5ff1c	Cleanup leftover replication slots in publication test (#7354 )	2023-11-17 15:11:38 +03:00
Naisila Puka	0d1f18862b	Propagates SECURITY LABEL ON ROLE stmt (#7304 ) We propagate `SECURITY LABEL [for provider] ON ROLE rolename IS labelname` to the worker nodes. We also make sure to run the relevant `SecLabelStmt` commands on a newly added node by looking at roles found in `pg_shseclabel`. See official docs for explanation on how this command works: https://www.postgresql.org/docs/current/sql-security-label.html This command stores the role label in the `pg_shseclabel` catalog table. This commit also fixes the regex string in `check_gucs_are_alphabetically_sorted.sh` script such that it escapes the dot. Previously it was looking for all strings starting with "citus" instead of "citus." as it should. To test this feature, I currently make use of a special GUC to control label provider registration in PG_init when creating the Citus extension.	2023-11-16 13:12:30 +03:00
Naisila Puka	c6fbb72c02	Fix flaky multi_prepare_plsql (#7346 ) Simple need of an `ORDER BY` clause Ran into this twice this week already! https://github.com/citusdata/citus/actions/runs/6849701315/attempts/1#summary-18622563506 https://github.com/citusdata/citus/actions/runs/6875051160/attempts/1#summary-18698009952 ```diff SELECT nspname, typname FROM pg_type JOIN pg_namespace ON pg_namespace.oid = pg_type.typnamespace WHERE typname = 'prepare_ddl_type_backup'; nspname \| typname -------------+------------------------- - public \| prepare_ddl_type_backup otherschema \| prepare_ddl_type_backup + public \| prepare_ddl_type_backup (2 rows) ```	2023-11-15 13:28:43 +03:00
Naisila Puka	a960799dfb	Clean up leftover replication slots in tests (#7338 ) This commit fixes the flakiness in `logical_replication` and `citus_non_blocking_split_shard_cleanup` tests. The flakiness was related to leftover replication slots. Below is a flaky example for each test: logical_replication https://github.com/citusdata/citus/actions/runs/6721324131/attempts/1#summary-18267030604 citus_non_blocking_split_shard_cleanup https://github.com/citusdata/citus/actions/runs/6721324131/attempts/1#summary-18267006967 ```diff -- Replication slots should be cleaned up SELECT slot_name FROM pg_replication_slots; slot_name --------------------------------- -(0 rows) + citus_shard_split_slot_19_10_17 +(1 row) ``` The tests by themselves are not flaky: 32 flaky test schedules each with 20 runs run successfully. https://github.com/citusdata/citus/actions/runs/6822020127?pr=7338 The conclusion is that: 1. `multi_tenant_isolation_nonblocking` is the problematic test running before `logical_replication` in the `enterprise_schedule`, so I added a cleanup at the end of `multi_tenant_isolation_nonblocking`. https://github.com/citusdata/citus/actions/runs/6824334614/attempts/1#summary-18560127461 2. `citus_split_shard_by_split_points_negative` is the problematic test running before `citus_non_blocking_split_shards_cleanup` in the split schedule. Also added cleanup line. For details on the investigation of leftover replication slots, please check the PR https://github.com/citusdata/citus/pull/7338	2023-11-14 18:50:54 +03:00
Naisila Puka	cdef2d5224	Random tests refactoring (#7342 ) While investigating replication slots leftovers in PR https://github.com/citusdata/citus/pull/7338, I ran into the following refactoring/cleanup that can be done in our test suite: - Add separate test to remove non default nodes - Remove coordinator removal from `add_coordinator` test Use `remove_coordinator_from_metadata` test where needed - Don't print nodeids in `multi_multiuser_auth` and `multi_poolinfo_usage` tests - Use `startswith` when checking for isolation or failure tests - Add some dependencies accordingly in `run_test.py` for running flaky test schedules	2023-11-14 12:49:15 +03:00
Onur Tirtir	240313e286	Support role commands from any node (#7278 ) DESCRIPTION: Adds support for issuing role management commands from worker nodes It's unlikely to get into a distributed deadlock with role commands, so we don't care much about them at the moment. There were several attempts to reduce the chances of a deadlock but we didn't get any of them merged into main branch yet, see: #7325 #7016 #7009	2023-11-10 09:58:51 +00:00
Naisila Puka	57ff762c82	Fix VACUUM flakiness in multi_utilities (#7334 ) When I run this test in my local, the size of the table after the DELETE command is around 58785792. Hence, I assume that the diffs suggest that the Vacuum had no effect. The current solution is to run the VACUUM command three times instead of once. Example diff: https://github.com/citusdata/citus/actions/runs/6722231142/attempts/1#summary-18269870674 ```diff insert into local_vacuum_table select i from generate_series(1,1000000) i; delete from local_vacuum_table; VACUUM local_vacuum_table; SELECT CASE WHEN s BETWEEN 20000000 AND 25000000 THEN 22500000 ELSE s END FROM pg_total_relation_size('local_vacuum_table') s ; s ---------- - 22500000 + 58785792 (1 row) ``` See more diff examples in the PR description https://github.com/citusdata/citus/pull/7334	2023-11-09 21:00:24 +03:00
Naisila Puka	0dc41ee5a0	Fix flaky multi_mx_insert_select_repartition test (#7331 ) https://github.com/citusdata/citus/actions/runs/6745019678/attempts/1#summary-18336188930 ```diff insert into target_table SELECT a*2 FROM source_table RETURNING a; -NOTICE: executing the command locally: SELECT bytes FROM fetch_intermediate_results(ARRAY['repartitioned_results_xxxxx_from_4213582_to_0','repartitioned_results_xxxxx_from_4213584_to_0']::text[],'localhost',57638) bytes +NOTICE: executing the command locally: SELECT bytes FROM fetch_intermediate_results(ARRAY['repartitioned_results_3940758121873413_from_4213584_to_0','repartitioned_results_3940758121873413_from_4213582_to_0']::text[],'localhost',57638) bytes ``` The elements in the array passed to `fetch_intermediate_results` are the same, but in the opposite order than expected. To fix this flakiness, we can omit the `"SELECT bytes FROM fetch_intermediate_results..."` line. From the following logs, it is understandable that the intermediate results have been fetched.	2023-11-08 15:15:33 +03:00

2240 Commits (a5385d9f9cae73dc1eebf6d02f843d379194f1d9)