Commit Graph

3019 Commits (b9c5f6be76529bc6c89d0ae296fbcb85f341f1f2)

Author SHA1 Message Date
Onur Tirtir 5aedec4242
Improve error message for recursive CTEs (#7407)
Fixes #2870
2024-01-30 15:12:48 +00:00
eaydingol f6ea619e27
Move citus internal functions (#7466)
Move the following functions from pg_catalog to citus_internal:

citus_internal_add_object_metadata
citus_internal_add_partition_metadata


#7405
2024-01-30 12:27:10 +03:00
eaydingol 5d673874f7
Move citus internal functions (#7456)
Move citus_internal_acquire_citus_advisory_object_class_lock and
citus_internal_add_colocation_metadata functions from pg_catalog to
citus_internal.

#7405
2024-01-26 11:46:05 +03:00
eaydingol 542212c3d8
Make citus_internal schema public (#7450)
DESCRIPTION: Makes citus_internal schema public



#7405
2024-01-24 17:11:10 +03:00
Onur Tirtir 1d096df7f4
Use citus.local_hostname instead of hardcoded LOCAL_HOST_NAME to distinguish loopback connections (#7436)
Fixes a bug that breaks queries from non-maindbs when
citus.local_hostname is set to a value different than "localhost".

This is a very old bug that doesn't cause a problem as long as the Citus catalog
is available to FindWorkerNode(). And the catalog is always available
unless we're in a non-main database, which might be the case on main but
not on older releases, hence not adding a `DESCRIPTION`. For this
reason, I don't see a reason to backport this.

Maybe we should refrain from using LOCAL_HOST_NAME in all code paths entirely,
but I'm not doing that in this PR as the other paths don't seem to be
breaking anything that is user-facing.

```c
char *
GetAuthinfo(char *hostname, int32 port, char *user)
{
	char *authinfo = NULL;
	bool isLoopback = (strncmp(LOCAL_HOST_NAME, hostname, MAX_NODE_LENGTH) == 0 &&
					   PostPortNumber == port);

	if (IsTransactionState())
	{
		int64 nodeId = WILDCARD_NODE_ID;

		/* -1 is a special value for loopback connections (task tracker) */
		if (isLoopback)
		{
			nodeId = LOCALHOST_NODE_ID;
		}
		else
		{
			WorkerNode *worker = FindWorkerNode(hostname, port);
			if (worker != NULL)
			{
				nodeId = worker->nodeId;
			}
		}

		authinfo = GetAuthinfoViaCatalog(user, nodeId);
	}

	return (authinfo != NULL) ? authinfo : "";
}
```
2024-01-24 12:58:55 +00:00
Filip Sedlák 8b48d6ab02
Log username in the failed connection message (#7432)
This patch includes the username in the reported error message.
This makes debugging easier when certain commands open connections
as users other than the user that is executing the command.

```
monitora_snapshot=# SELECT citus_move_shard_placement(102030, 'monitora.db-dev-worker-a', 6005, 'monitora.db-dev-worker-a', 6017);
ERROR:  connection to the remote node monitora_user@monitora.db-dev-worker-a:6017 failed with the following error: fe_sendauth: no password supplied
Time: 40,198 ms
```
2024-01-24 11:24:23 +00:00
Halil Ozan Akgül 1cb2e1e4e8
Fixes create user queries from Citus non-main databases with other users (#7442)
This PR makes the connections to other nodes for
`mark_object_distributed` use the same user as
`execute_command_on_remote_nodes_as_user` so they'll use the same
connection.
2024-01-24 12:57:54 +03:00
Gürkan İndibay 188614512f
Adds comment on database and role propagation (#7388)
DESCRIPTION: Adds comment on database and role propagation.
Example commands are as below

comment on database <db_name> is '<comment_text>'
comment on database <db_name> is NULL
comment on role <role_name> is '<comment_text>'
comment on role <role_name> is NULL

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2024-01-18 20:58:44 +03:00
Jelte Fennema-Nio 5ec056a172
Add pytest test example about connecting to a worker (#7386)
I noticed while reviewing #7203 that there was no example of executing
sql on a worker for the pytest README. Since this is a pretty common
thing that people want to do, this PR adds that.
2024-01-18 15:05:24 +03:00
Jelte Fennema-Nio fcfedff8d1
Support running isolation_update_node in flaky test detection (#7425)
I noticed in #7423 that `isolation_update_node` could not be run using
flaky test detection. This fixes that.
2024-01-17 15:36:26 +00:00
Valery 6cf6cf37fd
Adds information to explain output when using citus.explain_distributed_queries=false (#7412)
Fixes https://github.com/citusdata/citus/issues/6490
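
A minimal sketch of the setting this affects (the table name is a placeholder; the exact wording added to the plan output is in the PR, not reproduced here):

```sql
SET citus.explain_distributed_queries TO false;
-- With the GUC disabled, the plan now carries a note explaining why the
-- distributed part of the plan is not shown.
EXPLAIN SELECT count(*) FROM dist_table;
```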
2024-01-17 15:04:42 +00:00
Karina 21464adfec
Make isolation_update_node test system independent (#7423)
Test isolation_update_node fails on some systems with the following error:
```
-s2: WARNING:  connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Name or service not known
+s2: WARNING:  connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Temporary failure in name resolution
```

This slightly modifies an already existing [normalization
rule](739c6d26df/src/test/regress/bin/normalize.sed (L217-L218))
to fix it.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
2024-01-17 13:39:07 +00:00
Onur Tirtir 04b374fc01
Fix upgrade tests (#7413)
Adding an upgrade_basic_before_non_mixed.sql file because while
upgrade_basic_after_non_mixed exists, its before variation didn't exist
as we don't have any "before" steps. However, run_test.py assumes that
all "after" files do have a "before" variation as well. So this PR adds
an empty upgrade_basic_before_non_mixed.sql file.

Also, given that we no longer have a version called 12.1devel,
change it to 12.1.1.

And finally, let CI skip flakiness testing for upgrade tests, both
because it's quite hard to get the flaky-test-detection job working for
upgrade tests and because, in the end, it is not very useful to test
upgrade tests for flakiness.
2024-01-16 12:37:18 +00:00
Halil Ozan Akgül 739c6d26df
Fix inserting to pg_dist_object for queries from other nodes (#7402)
Running a query from a Citus non-main database that inserts to
pg_dist_object requires a new connection to the main database itself.
This PR adds that connection to the main database.

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
2024-01-11 16:05:14 +03:00
Teja Mupparti 00068e07c5 Fix the incorrect column count after ALTER TABLE; this fixes bug #7378 (please read the analysis in the bug for more information) 2024-01-10 12:49:44 -08:00
Onur Tirtir 1d55debb98
Support CREATE / DROP database commands from any node (#7359)
DESCRIPTION: Adds support for issuing `CREATE`/`DROP` DATABASE commands
from worker nodes

With this commit, we allow issuing CREATE / DROP DATABASE commands from
worker nodes too.
As in #7278, this is not allowed when the coordinator is not added to
metadata because we don't ever sync metadata changes to coordinator
when adding coordinator to the metadata via
`SELECT citus_set_coordinator_host('<hostname>')`, or equivalently, via
`SELECT citus_add_node(<coordinator_node_name>, <coordinator_node_port>, 0)`.

We serialize database management commands by acquiring a Citus-specific
advisory lock on the first primary worker node if there are any workers in the
cluster. As opposed to what we've done in https://github.com/citusdata/citus/pull/7278
for role management commands, we try to avoid running into distributed deadlocks
as much as possible. This is because, while distributed deadlocks that can happen around
role management commands can be detected by Citus, this is not the case for database
management commands because most of them cannot be run inside a transaction block.
In that case, Citus cannot even detect the distributed deadlock because the command is not
part of a distributed transaction at all, so the command execution might not return
control back to the user for an indefinite amount of time.
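
A minimal sketch of the prerequisite described above (host name and port are placeholders):

```sql
-- On the coordinator, add it to the metadata first:
SELECT citus_set_coordinator_host('<hostname>');
-- or equivalently:
-- SELECT citus_add_node('<hostname>', 5432, 0);

-- Afterwards, CREATE / DROP DATABASE can also be issued from a worker node:
CREATE DATABASE appdb;
DROP DATABASE appdb;
```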
2024-01-08 16:47:49 +00:00
Karina 20dc58cf5d
Fix getting heap tuple size (#7387)
This fixes #7230. 

First of all, using HeapTupleHeaderGetDatumLength(heapTuple) is
definitely wrong; it gives a number that's 4 times less than the correct
tuple size (heapTuple.t_len). See

https://github.com/postgres/postgres/blob/REL_16_0/src/include/access/htup_details.h#L455-L456

https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L279

https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L225-L226

When I fixed it, the limit_intermediate_size test failed, so I tried to
understand what's going on there. In the original commit fd546cf these
queries were supposed to fail. Then in b3af63c three of the queries that
were supposed to fail suddenly worked and the tests were changed to pass
without understanding why the output had changed or how to keep the test
testing what it was meant to test. Even comments saying that these queries
should fail were left untouched. The commit message gives no clue about why
exactly the test output changed:

> It seems that when we use adaptive executor instead of task tracker, we
> exceed the intermediate result size less in the test. Therefore updated
> the tests accordingly.

Then 3fda2c3 also blindly raised the limit for one of the queries to
keep it working:


3fda2c3254 (diff-a9b7b617f9dfd345318cb8987d5897143ca1b723c87b81049bbadd94dcc86570R19)

When in fe3caf3 that HeapTupleHeaderGetDatumLength(heapTuple) call was
finally added, one of those test queries started failing again.

The other two of them are now also failing after the fix. I don't understand
how exactly the calculation of "intermediate result size" that is
limited by citus.max_intermediate_result_size had changed through
b3af63c and fe3caf3, but these numbers are now closer to what
they originally were when this limitation was added in
fd546cf. So these queries should fail, like in the original
version of the limit_intermediate_size test.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
2024-01-08 17:09:30 +01:00
Onur Tirtir 968ac74cde
Fix foreign_key_to_reference_shard_rebalance test (#7400)
foreign_key_to_reference_shard_rebalance failed because the partition for
year 2024 does not exist; fixed by adding a default partition.

Replaces https://github.com/citusdata/citus/pull/7396 by adding a rule
that allows properly testing foreign_key_to_reference_shard_rebalance
via run_test.py.

Closes #7396

Co-authored-by: chuhx <148182736+cstarc1@users.noreply.github.com>
2024-01-04 13:16:45 +01:00
Onur Tirtir d940cfa992
Do nothing if the database is not distributed (#7392)
Fixes the remaining cases reported in
https://github.com/citusdata/citus/issues/7370.
2024-01-03 17:03:06 +03:00
Gürkan İndibay c3579eef06
Adds REASSIGN OWNED BY propagation (#7319)
DESCRIPTION: Adds REASSIGN OWNED BY propagation

This pull request introduces the propagation of the "Reassign owned by"
statement. It accommodates both local and distributed roles for both the
old and new assignments. However, when the old role is a local role, it
undergoes filtering and is not propagated. On the other hand, if the new
role is a local role, the process involves first creating the role on
worker nodes before propagating the "Reassign owned" statement.
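
A minimal sketch of the statement being propagated (role names are placeholders):

```sql
REASSIGN OWNED BY old_owner TO new_owner;
```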
2023-12-28 15:15:58 +03:00
Gürkan İndibay 181b8ab6d5
Adds additional alter database propagation support (#7253)
DESCRIPTION: Adds database connection limit, rename and set tablespace
propagation
In this PR, the statement propagations below are added:

alter database <database_name> with allow_connections = <boolean_value>;
alter database <database_name> rename to <database_name2>;
alter database <database_name> set TABLESPACE <table_space_name>

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-12-26 14:55:04 +03:00
Halil Ozan Akgül b877d606c7
Adds 2PC distributed commands from other databases (#7203)
DESCRIPTION: Adds support for 2PC from non-Citus main databases

This PR only adds support for `CREATE USER` queries, other queries need
to be added. But it should be simple because this PR creates the
underlying structure.

Citus main database is the database where the Citus extension is
created. A non-main database is all the other databases that are in the
same node with a Citus main database.

When a `CREATE USER` query is run on a non-main database we:

1. Run `start_management_transaction` on the main database. This
function saves the outer transaction's xid (the non-main database
query's transaction id) and marks the current query as main db command.
2. Run `execute_command_on_remote_nodes_as_user("CREATE USER
<username>", <username to run the command>)` on the main database. This
function creates the users in the rest of the cluster by running the
query on the other nodes. The user on the current node is created by the
outer, non-main-db query itself, to make sure subsequent commands
in the same transaction can see this user.
3. Run `mark_object_distributed` on the main database. This function
adds the user to `pg_dist_object` in all of the nodes, including the
current one.

This PR also implements transaction recovery for the queries from
non-main databases.
2023-12-22 19:19:41 +03:00
Jodi-Ann Francis 6801a1ed1e
PG16 update GRANT... ADMIN | INHERIT | SET, and REVOKE
Allowing GRANT ... WITH ADMIN to now also be INHERIT or SET, in support of PG16.

GRANT role_name [, ...] TO role_specification [, ...] [ WITH { ADMIN |
INHERIT | SET } { OPTION | TRUE | FALSE } ] [ GRANTED BY
role_specification ]

Fixes: #7148 
Related: #7138

See review changes from https://github.com/citusdata/citus/pull/7164
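
A minimal sketch of the newly supported grammar (role names are placeholders):

```sql
GRANT reporting TO analyst WITH ADMIN OPTION;
GRANT reporting TO analyst WITH INHERIT TRUE;
GRANT reporting TO analyst WITH SET FALSE;
```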
2023-12-13 15:57:02 -05:00
Naisila Puka dbdde111c1
Add missing order by clause in failure_split_cleanup test (#7363)
https://github.com/citusdata/citus/actions/runs/6903353045/attempts/1#summary-18781959638
```diff
         ARRAY['-100000'],
         ARRAY[:worker_1_node, :worker_2_node],
         'force_logical');
 ERROR:  server closed the connection unexpectedly
 CONTEXT:  while executing command on localhost:9060
     SELECT operation_id, object_type, object_name, node_group_id, policy_type
     FROM pg_dist_cleanup where operation_id = 777 ORDER BY object_name;
  operation_id | object_type |                        object_name                        | node_group_id | policy_type 
 --------------+-------------+-----------------------------------------------------------+---------------+-------------
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981000 |             1 |           0
-          777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             1 |           1
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             2 |           0
+          777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             1 |           1
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981003 |             2 |           1
           777 |           4 | citus_shard_split_publication_1_10_777                    |             2 |           0
 (5 rows)
```

A similar fix was attempted in

c9f2fc892d
There were some more missing ORDER BY clauses, so I added them.
2023-11-24 18:26:06 +03:00
Naisila Puka c019acc01b
Run wal2json cdc test for pg16 as well (#7361)
The pg16 wal2json package is now available, so this adds the tests back,
basically reverting
f253bb3210

Sister PR https://github.com/citusdata/the-process/pull/153
2023-11-24 14:40:23 +03:00
Gürkan İndibay 3b556cb5ed
Adds create / drop database propagation support (#7240)
DESCRIPTION: Adds support for propagating `CREATE`/`DROP` database

In this PR, create and drop database support is added.

For CREATE DATABASE:
* "oid" option is not supported
* specifying "strategy" to be different than "wal_log" is not supported
* specifying "template" to be different than "template1" is not
supported

The last two are because those are not saved in `pg_database` and when
activating a node, we cannot assume what parameters were provided when
creating the database.

And "oid" is not supported because whether user specified an arbitrary
oid when creating the database is not saved in pg_database and we want
to avoid from oid collisions that might arise from attempting to use an
auto-assigned oid on workers.

Finally, in case of node activation, GRANTs for the database are also
propagated.
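
A minimal sketch of a command that would now be propagated, given the restrictions above (names and values are placeholders):

```sql
-- Propagated to all nodes; OID, a STRATEGY other than 'wal_log', and a
-- TEMPLATE other than 'template1' are not supported.
CREATE DATABASE appdb WITH CONNECTION LIMIT = 100;
DROP DATABASE appdb;
```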

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-11-21 16:43:51 +03:00
Naisila Puka cedcc220bf
Fixes flaky VACUUM (freeze, process toast true) result (#7348)
https://app.circleci.com/pipelines/github/citusdata/citus/34550/workflows/5b802f66-2666-4623-a209-6d7799f7ee5f/jobs/1229153
```diff
VACUUM (FREEZE, PROCESS_TOAST true) local_vacuum_table;
 SELECT relfrozenxid::text::integer > :frozenxid AS frozen_performed FROM pg_class
 WHERE oid=:reltoastrelid::regclass;
  frozen_performed 
 ------------------
- t
+ f
 (1 row)
```
The PROCESS_TOAST option in VACUUM was introduced in PG14. The failing test
was supposed to be a part of `multi_utilities.sql`, but it was included
in `pg14.sql` to avoid alternative output for PG13. See
ba62c0a148 (diff-ed03478f693155e2fe092e9ad356bf884dc097f554e8d75eff562d52bbcf7a75L255-L272)
for reference.
However, now that we don't support PG13 anymore, we can move this test
to `multi_utilities.sql`. Moving the test, plus inserting data before
running vacuum freeze such that the freeze is more meaningful and not
flaky, fixes the flakiness problem of the test.
2023-11-17 18:58:06 +03:00
Naisila Puka c88bf5ff1c
Cleanup leftover replication slots in publication test (#7354) 2023-11-17 15:11:38 +03:00
Japin Li e14e8667cc
Fix redundant variable declaration (#7353)
The `$workerCount` variable is declared twice in
src/test/regress/pg_regress_multi.pl.
2023-11-17 13:01:23 +03:00
Naisila Puka 0d1f18862b
Propagates SECURITY LABEL ON ROLE stmt (#7304)
We propagate `SECURITY LABEL [for provider] ON ROLE rolename IS
labelname` to the worker nodes.
We also make sure to run the relevant `SecLabelStmt` commands on a
newly added node by looking at roles found in `pg_shseclabel`.

See official docs for explanation on how this command works:
https://www.postgresql.org/docs/current/sql-security-label.html
This command stores the role label in the `pg_shseclabel` catalog table.
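
A minimal sketch of the statement being propagated (the provider, role, and label are placeholders and require a loaded label provider):

```sql
SECURITY LABEL FOR my_provider ON ROLE app_user IS 'sensitive';
```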

This commit also fixes the regex string in
`check_gucs_are_alphabetically_sorted.sh` script such that it escapes
the dot. Previously it was looking for all strings starting with "citus"
instead of "citus." as it should.

To test this feature, I currently make use of a special GUC to control
label provider registration in PG_init when creating the Citus extension.
2023-11-16 13:12:30 +03:00
Naisila Puka c6fbb72c02
Fix flaky multi_prepare_plsql (#7346)
Simple need of an `ORDER BY` clause

Ran into this twice this week already!

https://github.com/citusdata/citus/actions/runs/6849701315/attempts/1#summary-18622563506

https://github.com/citusdata/citus/actions/runs/6875051160/attempts/1#summary-18698009952

```diff
 SELECT nspname, typname FROM pg_type JOIN pg_namespace ON pg_namespace.oid = pg_type.typnamespace WHERE typname = 'prepare_ddl_type_backup';
    nspname   |         typname         
 -------------+-------------------------
- public      | prepare_ddl_type_backup
  otherschema | prepare_ddl_type_backup
+ public      | prepare_ddl_type_backup
 (2 rows)
```
2023-11-15 13:28:43 +03:00
Naisila Puka a960799dfb
Clean up leftover replication slots in tests (#7338)
This commit fixes the flakiness in `logical_replication` and
`citus_non_blocking_split_shard_cleanup` tests. The flakiness
was related to leftover replication slots.
Below is a flaky example for each test:

logical_replication https://github.com/citusdata/citus/actions/runs/6721324131/attempts/1#summary-18267030604
citus_non_blocking_split_shard_cleanup https://github.com/citusdata/citus/actions/runs/6721324131/attempts/1#summary-18267006967

```diff
 -- Replication slots should be cleaned up
 SELECT slot_name FROM pg_replication_slots;
             slot_name            
 ---------------------------------
-(0 rows)
+ citus_shard_split_slot_19_10_17
+(1 row)
```

The tests by themselves are not flaky: 32 flaky test
schedules each with 20 runs run successfully.
https://github.com/citusdata/citus/actions/runs/6822020127?pr=7338

The conclusion is that:
1. `multi_tenant_isolation_nonblocking` is the problematic test running
before `logical_replication` in the `enterprise_schedule`, so I added a
cleanup at the end of `multi_tenant_isolation_nonblocking`.
https://github.com/citusdata/citus/actions/runs/6824334614/attempts/1#summary-18560127461
2. `citus_split_shard_by_split_points_negative` is the problematic test
running before `citus_non_blocking_split_shards_cleanup` in the split
schedule. Also added cleanup line.

For details on the investigation of leftover replication slots,
please check the PR https://github.com/citusdata/citus/pull/7338
2023-11-14 18:50:54 +03:00
Naisila Puka cdef2d5224
Random tests refactoring (#7342)
While investigating replication slots leftovers
in PR https://github.com/citusdata/citus/pull/7338,
I ran into the following refactoring/cleanup
that can be done in our test suite:

- Add separate test to remove non default nodes
- Remove coordinator removal from `add_coordinator` test
  Use `remove_coordinator_from_metadata` test where needed
- Don't print nodeids in `multi_multiuser_auth` and
`multi_poolinfo_usage`
  tests
- Use `startswith` when checking for isolation or failure tests
- Add some dependencies accordingly in `run_test.py` for running flaky
test schedules
2023-11-14 12:49:15 +03:00
Onur Tirtir 240313e286
Support role commands from any node (#7278)
DESCRIPTION: Adds support for issuing role management commands from worker nodes

It's unlikely to get into a distributed deadlock with role commands, so we
don't care much about them at the moment.
There were several attempts to reduce the chances of a deadlock, but we
haven't merged any of them into the main branch yet, see:
#7325
#7016
#7009
2023-11-10 09:58:51 +00:00
Naisila Puka 57ff762c82
Fix VACUUM flakiness in multi_utilities (#7334)
When I run this test locally, the size of the table after the DELETE
command is around 58785792. Hence, I assume that the diffs suggest that
the VACUUM had no effect. The current solution is to run the VACUUM
command three times instead of once.

Example diff:
https://github.com/citusdata/citus/actions/runs/6722231142/attempts/1#summary-18269870674
```diff
insert into local_vacuum_table select i from generate_series(1,1000000) i;
 delete from local_vacuum_table;
 VACUUM local_vacuum_table;
 SELECT CASE WHEN s BETWEEN 20000000 AND 25000000 THEN 22500000 ELSE s END
 FROM pg_total_relation_size('local_vacuum_table') s ;
     s     
 ----------
- 22500000
+ 58785792
 (1 row)
```
See more diff examples in the PR description
https://github.com/citusdata/citus/pull/7334
2023-11-09 21:00:24 +03:00
dependabot[bot] d4663212f4 Bump werkzeug from 2.3.7 to 3.0.1 in /src/test/regress
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.3.7 to 3.0.1.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/2.3.7...3.0.1)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-09 17:14:14 +01:00
Naisila Puka 0dc41ee5a0
Fix flaky multi_mx_insert_select_repartition test (#7331)
https://github.com/citusdata/citus/actions/runs/6745019678/attempts/1#summary-18336188930
```diff
     insert into target_table SELECT a*2 FROM source_table RETURNING a;
-NOTICE:  executing the command locally: SELECT bytes FROM fetch_intermediate_results(ARRAY['repartitioned_results_xxxxx_from_4213582_to_0','repartitioned_results_xxxxx_from_4213584_to_0']::text[],'localhost',57638) bytes
+NOTICE:  executing the command locally: SELECT bytes FROM fetch_intermediate_results(ARRAY['repartitioned_results_3940758121873413_from_4213584_to_0','repartitioned_results_3940758121873413_from_4213582_to_0']::text[],'localhost',57638) bytes
```

The elements in the array passed to `fetch_intermediate_results` are the
same, but in the opposite order than expected.

To fix this flakiness, we can omit the `"SELECT bytes FROM
fetch_intermediate_results..."` line. From the following logs, it is
understandable that the intermediate results have been fetched.
2023-11-08 15:15:33 +03:00
Onur Tirtir 21646ca1e9
Fix flaky isolation_get_all_active_transactions.spec test (#7323)
Fix the flaky test that results in the following diff by waiting (up to 5
seconds) until the backend that we want to terminate really terminates.

```diff
--- /__w/citus/citus/src/test/regress/expected/isolation_get_all_active_transactions.out.modified	2023-11-01 16:30:57.648749795 +0000
+++ /__w/citus/citus/src/test/regress/results/isolation_get_all_active_transactions.out.modified	2023-11-01 16:30:57.656749877 +0000
@@ -114,13 +114,13 @@
 --------------------
 t                   
 (1 row)
 
 step s3-show-activity: 
  SET ROLE postgres;
  select count(*) from get_all_active_transactions() where process_id IN (SELECT * FROM selected_pid);
 
 count
 -----
-    0
+    1
 (1 row)
```
2023-11-03 09:00:32 +01:00
Onur Tirtir 5e2439a117
Make some more tests re-runable (#7322)
* multi_mx_create_table
* multi_mx_function_table_reference
* multi_mx_add_coordinator
* create_role_propagation
* metadata_sync_helpers
* text_search

https://github.com/citusdata/citus/pull/7278 requires this.
2023-11-02 18:32:56 +03:00
Jelte Fennema-Nio 85b997a0fb
Fix flaky multi_alter_table_statements (#7321)
Sometimes multi_alter_table_statements would fail in CI like this:

```diff
 -- Verify that DROP NOT NULL works
 ALTER TABLE lineitem_alter ALTER COLUMN int_column2 DROP NOT NULL;
 SELECT "Column", "Type", "Modifiers" FROM table_desc WHERE relid='lineitem_alter'::regclass;
-     Column      |         Type          | Modifiers
----------------------------------------------------------------------
- l_orderkey      | bigint                | not null
- l_partkey       | integer               | not null
- l_suppkey       | integer               | not null
- l_linenumber    | integer               | not null
- l_quantity      | numeric(15,2)         | not null
- l_extendedprice | numeric(15,2)         | not null
- l_discount      | numeric(15,2)         | not null
- l_tax           | numeric(15,2)         | not null
- l_returnflag    | character(1)          | not null
- l_linestatus    | character(1)          | not null
- l_shipdate      | date                  | not null
- l_commitdate    | date                  | not null
- l_receiptdate   | date                  | not null
- l_shipinstruct  | character(25)         | not null
- l_shipmode      | character(10)         | not null
- l_comment       | character varying(44) | not null
- float_column    | double precision      | default 1
- date_column     | date                  |
- int_column1     | integer               |
- int_column2     | integer               |
- null_column     | integer               |
-(21 rows)
-
+ERROR:  schema "alter_table_add_column" does not exist
 -- COPY should succeed now
 SELECT master_create_empty_shard('lineitem_alter') as shardid \gset
 ```

Reading from table_desc apparently has an issue: if the schema of one of
the items gets deleted while it is being read, we get such
an error.

This change fixes that by not running multi_alter_table_statements in parallel
with alter_table_add_column anymore.

This is another instance of the same issue as in #7294
2023-11-02 16:42:45 +03:00
Jelte Fennema-Nio f171ec98fc
Fix flaky failure_distributed_results (#7307)
Sometimes in CI we run into this failure:

```diff
   SELECT resultId, nodeport, rowcount, targetShardId, targetShardIndex
   FROM partition_task_list_results('test', $$ SELECT * FROM source_table $$, 'target_table')
           NATURAL JOIN pg_dist_node;
-WARNING:  connection to the remote node localhost:xxxxx failed with the following error: connection not open
+ERROR:  connection to the remote node localhost:9060 failed with the following error: connection not open
 SELECT * FROM distributed_result_info ORDER BY resultId;
-       resultid        | nodeport | rowcount | targetshardid | targetshardindex
----------------------------------------------------------------------
- test_from_100800_to_0 |     9060 |       22 |        100805 |                0
- test_from_100801_to_0 |    57637 |        2 |        100805 |                0
- test_from_100801_to_1 |    57637 |       15 |        100806 |                1
- test_from_100802_to_1 |    57637 |       10 |        100806 |                1
- test_from_100802_to_2 |    57637 |        5 |        100807 |                2
- test_from_100803_to_2 |    57637 |       18 |        100807 |                2
- test_from_100803_to_3 |    57637 |        4 |        100808 |                3
- test_from_100804_to_3 |     9060 |       24 |        100808 |                3
-(8 rows)
-
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 -- fetch from worker 2 should fail
 SAVEPOINT s1;
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 SELECT fetch_intermediate_results('{test_from_100802_to_1,test_from_100802_to_2}'::text[], 'localhost', :worker_2_port) > 0 AS fetched;
-ERROR:  could not open file "base/pgsql_job_cache/xx_x_xxx/test_from_100802_to_1.data": No such file or directory
-CONTEXT:  while executing command on localhost:xxxxx
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 ROLLBACK TO SAVEPOINT s1;
+ERROR:  savepoint "s1" does not exist
 -- fetch from worker 1 should succeed
 SELECT fetch_intermediate_results('{test_from_100802_to_1,test_from_100802_to_2}'::text[], 'localhost', :worker_1_port) > 0 AS fetched;
- fetched
----------------------------------------------------------------------
- t
-(1 row)
-
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 -- make sure the results read are same as the previous transaction block
 SELECT count(*), sum(x) FROM
   read_intermediate_results('{test_from_100802_to_1,test_from_100802_to_2}'::text[],'binary') AS res (x int);
- count | sum
----------------------------------------------------------------------
-    15 | 863
-(1 row)
-
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 ROLLBACk;
```

As outlined in #7306, which I created, the reason for this is related to
only having a single connection open to the node. Finding and fixing the
full cause is not trivial, so instead this PR starts working around
this bug by forcing maximum parallelism. Preferably we'd want
this workaround not to be necessary, but that requires
spending time to fix this. For now having a less flaky CI is
good enough.
2023-11-02 12:31:56 +00:00
Jelte Fennema-Nio b47c8b3fb0
Fix flaky insert_select_connection_leak (#7302)
Sometimes in CI insert_select_connection_leak would fail like this:

```diff
 END;
 SELECT worker_connection_count(:worker_1_port) - :pre_xact_worker_1_connections AS leaked_worker_1_connections,
        worker_connection_count(:worker_2_port) - :pre_xact_worker_2_connections AS leaked_worker_2_connections;
  leaked_worker_1_connections | leaked_worker_2_connections
 -----------------------------+-----------------------------
-                           0 |                           0
+                          -1 |                           0
 (1 row)

 -- ROLLBACK
 BEGIN;
 INSERT INTO target_table SELECT * FROM source_table;
 INSERT INTO target_table SELECT * FROM source_table;
 ROLLBACK;
 SELECT worker_connection_count(:worker_1_port) - :pre_xact_worker_1_connections AS leaked_worker_1_connections,
        worker_connection_count(:worker_2_port) - :pre_xact_worker_2_connections AS leaked_worker_2_connections;
  leaked_worker_1_connections | leaked_worker_2_connections
 -----------------------------+-----------------------------
-                           0 |                           0
+                          -1 |                           0
 (1 row)

 \set VERBOSITY TERSE
 -- Error on constraint failure
 BEGIN;
 INSERT INTO target_table SELECT * FROM source_table;
 SELECT worker_connection_count(:worker_1_port) AS worker_1_connections,
        worker_connection_count(:worker_2_port) AS worker_2_connections \gset
 SAVEPOINT s1;
 INSERT INTO target_table SELECT a, CASE WHEN a < 50 THEN b ELSE null END  FROM source_table;
@@ -89,15 +89,15 @@
  leaked_worker_1_connections | leaked_worker_2_connections
 -----------------------------+-----------------------------
                            0 |                           0
 (1 row)

 END;
 SELECT worker_connection_count(:worker_1_port) - :pre_xact_worker_1_connections AS leaked_worker_1_connections,
        worker_connection_count(:worker_2_port) - :pre_xact_worker_2_connections AS leaked_worker_2_connections;
  leaked_worker_1_connections | leaked_worker_2_connections
 -----------------------------+-----------------------------
-                           0 |                           0
+                          -1 |                           0
 (1 row)
```

Source:
https://github.com/citusdata/citus/actions/runs/6718401194/attempts/1#summary-18258258387

A negative amount of leaked connections is obviously not possible. For
some reason there was a connection open when we checked the initial
amount of connections that was closed afterwards. This could be
from the maintenance daemon or maybe from the previous test that had not
fully closed its connections just yet.

The change in this PR doesn't actually fix the cause of the negative
connection count, but it simply considers it good as well, by changing the
result to zero for negative values.

With this fix we might sometimes miss a leak, because the negative
number can cancel out the leak and still result in a 0. But since the
negative number only occurs sometimes, we'll still find the leak often
enough.
2023-11-02 13:15:43 +01:00
Cédric Villemain 0678a2fd89
Fix #7242, CALL(@0) crash backend (#7288)
When executing a prepared CALL, which is not pure SQL but available with
some drivers like npgsql and jpgdbc, Citus entered a code path where a
plan is not defined while trying to increase its cost, thus a SIGSEGV
(signal 11) when the plan is a NULL pointer.

Fix by only increasing the plan cost when the plan is not null.

However, it is a bit suspicious to get here with a NULL plan and maybe a
better change would be to not call
ShardPlacementForFunctionColocatedWithDistTable() with a NULL plan at
all (in call.c:134).

The bug was hit with, for example:
```
CallableStatement proc = con.prepareCall("{CALL p(?)}");
proc.registerOutParameter(1, java.sql.Types.BIGINT);
proc.setInt(1, -100);
proc.execute();
```

where `p(bigint)` is a distributed "function" and the parameter is the
distribution key (also in a distributed table); see #7242 for details.

Fixes #7242
2023-11-02 13:15:24 +01:00
Jelte Fennema-Nio 5a48a1602e
Debug flaky logical_replication test (#7309)
Sometimes in CI our logical_replication test fails like this:

```diff
+++ /__w/citus/citus/src/test/regress/results/logical_replication.out.modified	2023-11-01 14:15:08.562758546 +0000
@@ -40,21 +40,21 @@

 SELECT count(*) from pg_publication;
  count
 -------
      0
 (1 row)

 SELECT count(*) from pg_replication_slots;
  count
 -------
-     0
+     1
 (1 row)

 SELECT count(*) FROM dist;
  count
 -------
```

It's hard to understand what is going on here, just based on the wrong
number. So this PR changes the test to show the name of the
subscription, publication and replication slot to make finding the cause
easier.

In passing this also fixes another flaky test in the same file that our
flaky test detection picked up. This is done by waiting for resource
cleanup after the shard move.
2023-11-02 13:15:02 +01:00
Onur Tirtir 9867c5b949
Fix flaky multi_mx_node_metadata.sql test (#7317)
Fixes the flaky test that results in following diff:
```diff
--- /__w/citus/citus/src/test/regress/expected/multi_mx_node_metadata.out.modified	2023-11-01 14:22:12.890476575 +0000
+++ /__w/citus/citus/src/test/regress/results/multi_mx_node_metadata.out.modified	2023-11-01 14:22:12.914476657 +0000
@@ -840,24 +840,26 @@
 (1 row)
 
 \c :datname - - :master_port
 SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
   datname   
 ------------
  db_to_drop
 (1 row)
 
 DROP DATABASE db_to_drop;
+ERROR:  database "db_to_drop" is being accessed by other users
 SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
   datname   
 ------------
-(0 rows)
+ db_to_drop
+(1 row)
 
 -- cleanup
 DROP SEQUENCE sequence CASCADE;
 NOTICE:  drop cascades to default value for column a of table reference_table
```
2023-11-02 11:02:34 +00:00
Jelte Fennema-Nio a6e86884f6
Fix flaky isolation_metadata_sync_deadlock (#7312)
Sometimes isolation_metadata_sync_deadlock fails in CI like this:

```diff
diff -dU10 -w /__w/citus/citus/src/test/regress/expected/isolation_metadata_sync_deadlock.out /__w/citus/citus/src/test/regress/results/isolation_metadata_sync_deadlock.out
--- /__w/citus/citus/src/test/regress/expected/isolation_metadata_sync_deadlock.out.modified	2023-11-01 16:03:15.090199229 +0000
+++ /__w/citus/citus/src/test/regress/results/isolation_metadata_sync_deadlock.out.modified	2023-11-01 16:03:15.098199312 +0000
@@ -110,10 +110,14 @@
 t
 (1 row)

 step s2-stop-connection:
  SELECT stop_session_level_connection_to_node();

 stop_session_level_connection_to_node
 -------------------------------------

 (1 row)
+
+teardown failed: ERROR:  localhost:57638 is a metadata node, but is out of sync
+HINT:  If the node is up, wait until metadata gets synced to it and try again.
+CONTEXT:  SQL statement "SELECT master_remove_distributed_table_metadata_from_workers(v_obj.objid, v_obj.schema_name, v_obj.object_name)"
```

Source:
https://github.com/citusdata/citus/actions/runs/6721938040/attempts/1#summary-18268946448

To fix this we now wait for the metadata to be fully synced to all
nodes at the start of the teardown steps.
2023-11-02 10:39:05 +01:00
Onur Tirtir 2cf4c04023
Fix flaky global_cancel.sql test (#7316) 2023-11-01 23:59:41 +01:00
Jelte Fennema-Nio e3c93c303d
Fix flaky citus_non_blocking_split_shard_cleanup (#7311)
Sometimes in CI citus_non_blocking_split_shard_cleanup failed like this:

```diff
--- /__w/citus/citus/src/test/regress/expected/citus_non_blocking_split_shard_cleanup.out.modified	2023-11-01 15:07:14.280551207 +0000
+++ /__w/citus/citus/src/test/regress/results/citus_non_blocking_split_shard_cleanup.out.modified	2023-11-01 15:07:14.292551358 +0000
@@ -106,21 +106,22 @@
 -----------------------------------

 (1 row)

 \c - - - :worker_2_port
 SET search_path TO "citus_split_test_schema";
 -- Replication slots should be cleaned up
 SELECT slot_name FROM pg_replication_slots;
             slot_name
 ---------------------------------
-(0 rows)
+ citus_shard_split_slot_19_10_17
+(1 row)

 -- Publications should be cleanedup
 SELECT count(*) FROM pg_publication;
  count
```

It's expected that the replication slot is sometimes not cleaned up if
we don't wait until resource cleanup completes. This PR starts doing
that here.
2023-11-01 16:21:12 +00:00
Jelte Fennema-Nio c9f2fc892d
Fix flaky failure_split_cleanup (#7299)
Sometimes failure_split_cleanup failed in CI like this:

```diff
 ERROR:  server closed the connection unexpectedly
 CONTEXT:  while executing command on localhost:9060
     SELECT operation_id, object_type, object_name, node_group_id, policy_type
     FROM pg_dist_cleanup where operation_id = 777 ORDER BY object_name;
  operation_id | object_type |                        object_name                        | node_group_id | policy_type
 --------------+-------------+-----------------------------------------------------------+---------------+-------------
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981000 |             1 |           0
-          777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             1 |           1
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             2 |           0
+          777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             1 |           1
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981003 |             2 |           1
           777 |           4 | citus_shard_split_publication_1_10_777                    |             2 |           0
 (5 rows)

     -- we need to allow connection so that we can connect to proxy
```

Source:
https://github.com/citusdata/citus/actions/runs/6717642291/attempts/1#summary-18256014949

It's the common problem where we're missing a column in the ORDER BY
clause. This fixes that by adding node_group_id to the query in
question.
2023-11-01 14:08:51 +00:00
Jelte Fennema-Nio c83c556702
Fix flaky isolation_master_update_node (#7303)
Sometimes in CI isolation_master_update_node fails like this:

```diff
 ------------------

 (1 row)

 step s2-abort: ABORT;
 step s1-abort: ABORT;
 FATAL:  terminating connection due to administrator command
 FATAL:  terminating connection due to administrator command
 SSL connection has been closed unexpectedly
+server closed the connection unexpectedly

 master_remove_node
 ------------------

```

This just seems like a random error line. The only way to reasonably fix
this is by adding an extra output file. So that's what this PR does.
2023-11-01 16:44:45 +03:00
Jelte Fennema-Nio 0d83ab57de
Fix flaky multi_cluster_management (#7295)
One of our most flaky and most annoying tests is
multi_cluster_management. It usually fails like this:
```diff
 SELECT citus_disable_node('localhost', :worker_2_port);
  citus_disable_node
 --------------------

 (1 row)

 SELECT public.wait_until_metadata_sync(60000);
+WARNING:  waiting for metadata sync timed out
  wait_until_metadata_sync
 --------------------------

 (1 row)

```

This tries to address that by hardening wait_until_metadata_sync. I
believe the reason for this warning is that there is a race condition in
wait_until_metadata_sync. It's possible for the pre-check to fail, then
have the maintenance daemon send a notification. And only then have the
backend start to listen. I tried to fix it in two ways:
1. First run LISTEN, and only then do the pre-check.
2. If we time out, check again just to make sure that we did not miss
   the notification somehow. And don't show a warning if all metadata is
   synced after the timeout.

It's hard to know for sure that this fixes it because the test is not
repeatable and I could not reproduce it locally. Let's just hope for the
best.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-11-01 10:46:01 +00:00
Jelte Fennema-Nio 20ae42e7fa
Fix flaky multi_reference_table test (#7294)
Sometimes multi_reference_table failed in CI like this:

```diff
 \c - - - :master_port
 DROP INDEX reference_schema.reference_index_2;
 \c - - - :worker_1_port
 SELECT "Column", "Type", "Modifiers" FROM table_desc WHERE relid='reference_schema.reference_table_ddl_1250019'::regclass;
- Column  |            Type             |  Modifiers
----------------------------------------------------------------------
- value_2 | double precision            | default 25.0
- value_3 | text                        | not null
- value_4 | timestamp without time zone |
- value_5 | double precision            |
-(4 rows)
-
+ERROR:  schema "citus_local_table_queries" does not exist
 \di reference_schema.reference_index_2*
           List of relations
  Schema | Name | Type | Owner | Table
```

Source:
https://github.com/citusdata/citus/actions/runs/6707535961/attempts/2#summary-18226879513

Reading from table_desc apparently has an issue: if the schema of one of
the items gets deleted while it is being read, we get such
an error.

This change fixes that by not running multi_reference_table in parallel
with citus_local_tables_queries anymore.
2023-11-01 10:12:06 +00:00
Cédric Villemain 37415ef8f5
Allow citus_*_size on index related to a distributed table (#7271)
I just enhanced the existing code to check if the relation is an index
belonging to a distributed table.
If so, the shardId is appended to the relation (index) name and the *_size
functions are executed as before.

There is a change in an extern function:
  `extern StringInfo GenerateSizeQueryOnMultiplePlacements(...)`
It's possible to create a new function and deprecate this one later if
compatibility is an issue.

Fixes https://github.com/citusdata/citus/issues/6496.

DESCRIPTION: Allows using Citus size functions on distributed tables
indexes.
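
A minimal sketch of the now-allowed usage (table, column, and index names are placeholders):

```sql
CREATE TABLE dist_table (a int);
SELECT create_distributed_table('dist_table', 'a');
CREATE INDEX dist_table_a_idx ON dist_table (a);

-- Per the description above, the size functions can now be used on the index as well:
SELECT citus_table_size('dist_table_a_idx');
SELECT citus_total_relation_size('dist_table_a_idx');
```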

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-11-01 09:05:51 +00:00
Jelte Fennema-Nio a76a832553
Fix flaky validate_constraint test (#7293)
Sometimes validate constraint would fail like this:

```diff
  validatable_constraint_8000016 | t
 (10 rows)

 DROP TABLE constrained_table;
+ERROR:  deadlock detected
+DETAIL:  Process 16602 waits for ShareRowExclusiveLock on relation 56258 of database 16384; blocked by process 16601.
+Process 16601 waits for AccessShareLock on relation 56120 of database 16384; blocked by process 16602.
+HINT:  See server log for query details.
 DROP TABLE referenced_table CASCADE;
 DROP TABLE referencing_table;
 DROP SCHEMA validate_constraint CASCADE;
-NOTICE:  drop cascades to 3 other objects
+NOTICE:  drop cascades to 4 other objects
 DETAIL:  drop cascades to type constraint_validity
 drop cascades to view constraint_validations_in_workers
 drop cascades to view constraint_validations
+drop cascades to table constrained_table
 SET search_path TO DEFAULT;

```

Source:
https://github.com/citusdata/citus/actions/runs/6708383699?pr=7291

This change fixes that by not running together with the
foreign_key_to_reference_table test anymore. In passing it also
simplifies dropping of the test's resources.
2023-11-01 09:41:28 +01:00
Emel Şimşek ee8f4bb7e8
Start Maintenance Daemon for Main DB at the server start. (#7254)
DESCRIPTION: This change starts a maintenance daemon at the time of
server start if there is a designated main database.

This is the code flow:

1. User designates a main database:
   `ALTER SYSTEM SET citus.main_db =  "myadmindb";`

2. When the postmaster starts, in _PG_Init, Citus calls
    `InitializeMaintenanceDaemonForMainDb`

This function registers a background worker to run
`CitusMaintenanceDaemonMain` with `databaseOid = 0`

3. `CitusMaintenanceDaemonMain` takes some special actions when
databaseOid is 0:
     - Gets the citus.main_db value.
     - Connects to the citus.main_db
     - Now that `MyDatabaseId` is available, creates a hash entry for it.
     - Then follows the same control flow as for a regular db.
2023-10-30 09:44:13 +03:00
Benjamin O f9218d9780
Support replacing IPv6 Loopback in `normalize.sed` (#7269)
I had a test failure issue due to my machine using the IPv6 loopback
address. This change to the `normalize.sed` solves that issue.
2023-10-27 16:42:55 +02:00
Onur Tirtir db13afaa7b
Fix flaky columnar_create.sql test (#7266) 2023-10-17 16:58:17 +03:00
Jelte Fennema-Nio 788e09a39a
Add a test for citus_shards where table names have spaces (#7224)
There was a bug reported for previous versions of Citus where
shard_size was returning NULL for tables with spaces in them. It works
fine on the main branch, but I'm still adding a test for this to
the main branch because it seems like a good test to have.
2023-10-16 11:38:24 +02:00
Onur Tirtir 858d99be33
Take improvement_threshold into the account in citus_add_rebalance_strategy() (#7247)
DESCRIPTION: Makes sure to take improvement_threshold into the account
in `citus_add_rebalance_strategy()`.

Fixes https://github.com/citusdata/citus/issues/7188.
2023-10-09 13:13:08 +03:00
dependabot[bot] c323f49e83
Bump cryptography from 41.0.3 to 41.0.4 in /src/test/regress (#7231)
Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.3
to 41.0.4.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nils Dijk <nils@citusdata.com>
2023-09-27 15:36:58 +02:00
Onur Tirtir 27ac44eb2a
Fix mixed Citus upgrade tests (#7218)
When testing rolling Citus upgrades, coordinator should not be upgraded
until we upgrade all the workers.

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
2023-09-26 17:52:52 +03:00
Gürkan İndibay 7fa109c977
Adds alter user missing features (#7204)
DESCRIPTION: Adds alter user rename propagation and enriches alter user
tests
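
A minimal sketch of the statement now being propagated (user names are placeholders):

```sql
ALTER USER app_user RENAME TO app_user_v2;
```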

---------

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2023-09-26 12:28:07 +03:00
Onur Tirtir 111b4c19bc
Make sure to disallow creating a replicated distributed table concurrently (#7219)
See explanation in https://github.com/citusdata/citus/issues/7216.
Fixes https://github.com/citusdata/citus/issues/7216.

DESCRIPTION: Makes sure to disallow creating a replicated distributed
table concurrently
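
A minimal sketch of the scenario being disallowed, assuming "replicated" here means a shard replication factor greater than 1 (table and column are placeholders):

```sql
CREATE TABLE orders (order_id bigint);
SET citus.shard_replication_factor TO 2;
-- Presumably rejected after this change:
SELECT create_distributed_table_concurrently('orders', 'order_id');
```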
2023-09-25 11:14:35 +03:00
Jelte Fennema-Nio 71e556e090
Remove useless test output (#7209)
This was sometimes failing when running locally due to some local shard
still existing. This fixes that. We normally silence all
`drop schema cascade` output like this anyway to avoid unnecessary
diffs when modifying a test later on.
2023-09-19 14:12:46 +02:00
Gürkan İndibay 7c0b289761
Adds alter database set option (#7181)
DESCRIPTION: Adds support for ALTER DATABASE <db_name> SET .. statement
propagation
SET statements in Postgres have a common structure, which is already
being used in the ALTER FUNCTION statement.
In this PR, I added a util file, citus_setutils, and made it usable both
for ALTER DATABASE <db_name> SET .. and ALTER FUNCTION ... SET ...
statements.
With this PR, below statements will be propagated
```sql
ALTER DATABASE name SET configuration_parameter { TO | = } { value | DEFAULT }
ALTER DATABASE name SET configuration_parameter FROM CURRENT
ALTER DATABASE name RESET configuration_parameter
ALTER DATABASE name RESET ALL
```
Additionally, there was a bug in processing float values in the common
code block.
I fixed this one as well.

Previous
```C
case T_Float:
			{
				appendStringInfo(buf, " %s", strVal(value));
				break;
			}
```
Now
```C
case T_Float:
			{
				appendStringInfo(buf, " %s", nodeToString(value));
				break;
			}
```
2023-09-14 16:29:16 +03:00
aykut-bozkurt 26dc407f4a
bump citus and columnar into 12.2devel (#7200) 2023-09-14 12:03:09 +03:00
Gürkan İndibay e5e64b7454
Adds alter database propagation - with and refresh collation (#7172)
DESCRIPTION: Adds ALTER DATABASE WITH ... and REFRESH COLLATION VERSION
support

This PR adds support for propagating basic ALTER DATABASE statements.
The statements below are supported:
ALTER DATABASE <database_name> with IS_TEMPLATE <true/false>;
ALTER DATABASE <database_name> with CONNECTION LIMIT <integer_value>;
ALTER DATABASE <database_name> REFRESH COLLATION VERSION;

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2023-09-12 14:09:15 +03:00
Naisila Puka 1da99f8423
PG16 - Don't propagate GRANT ROLE with INHERIT/SET option (#7190)
We currently don't support propagating these options in Citus.
Relevant PG commits:
https://github.com/postgres/postgres/commit/e3ce2de
https://github.com/postgres/postgres/commit/3d14e17

Limitation:
We also need to take care of GRANT statements generated by dependencies
when attempting to distribute something else. Specifically, this part of the
code in `GenerateGrantRoleStmtsOfRole`:
```
grantRoleStmt->admin_opt = membership->admin_option;
```
In PG16, membership also has `inherit_option` and `set_option` which
need to properly be part of the `grantRoleStmt`. We can skip for now
since #7164 will take care of this soon, and also this is not an
expected use-case.
2023-09-12 12:47:37 +03:00
Naisila Puka c1dc378504
Fix WITH ADMIN FALSE propagation (#7191) 2023-09-11 15:58:24 +03:00
Onur Tirtir d628a4c21a
Add citus_schema_move() function (#7180)
Add citus_schema_move(), which can be used to move tenant tables within a distributed
schema to another node. The function has two variations, as simple wrappers around
citus_move_shard_placement() and citus_move_shard_placement_with_nodeid() respectively.
They pick a shard that belongs to the given tenant schema and resolve the source node
that contains the shards under the given tenant schema. Hence their signatures are quite
similar to the underlying functions:

```sql
-- citus_schema_move(), using target node name and node port
CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move(
	schema_id regnamespace,
	target_node_name text,
	target_node_port integer,
	shard_transfer_mode citus.shard_transfer_mode default 'auto')
RETURNS void
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_schema_move$$;

-- citus_schema_move(), using target node id
CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move(
	schema_id regnamespace,
	target_node_id integer,
	shard_transfer_mode citus.shard_transfer_mode default 'auto')
RETURNS void
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_schema_move_with_nodeid$$;
```
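
A minimal sketch of how these might be called (schema name, host, port, and node id are placeholders):

```sql
-- move the shards of the tenant schema to a node given by host/port
SELECT citus_schema_move('tenant_a', 'worker-2.example.com', 5432, 'block_writes');

-- or to a node given by its node id
SELECT citus_schema_move('tenant_a', 3);
```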
2023-09-08 12:03:53 +03:00
Naisila Puka 8894c76ec0
PG16 - Add rules option to CREATE COLLATION (#7185)
Relevant PG commit:
https://github.com/postgres/postgres/commit/30a53b7
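
A minimal sketch of the new option (collation name and rules string are placeholders; requires an ICU-enabled build):

```sql
CREATE COLLATION custom_rules (provider = icu, locale = 'und', rules = '&V << w <<< W');
```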
2023-09-07 13:50:47 +03:00
Naisila Puka 2df88042b3
Add tests with JSON_ARRAYAGG and JSON_OBJECTAGG aggregates (#7186)
Relevant PG commit:
7081ac46ace8c459966174400b53418683c9fe5c
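
A minimal sketch of the aggregates being tested (the inline VALUES lists are placeholders):

```sql
SELECT JSON_ARRAYAGG(v ORDER BY v) FROM (VALUES (3), (1), (2)) AS t(v);
SELECT JSON_OBJECTAGG(k : v) FROM (VALUES ('a', 1), ('b', 2)) AS t(k, v);
```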
2023-09-07 13:29:39 +03:00
Naisila Puka 7e5136f2de
Add tests with publications with schema and table of the same schema (#7184)
Relevant PG commit:
https://github.com/postgres/postgres/commit/13a185f

It was backpatched through PG15 so I added this test in publication.sql
instead of pg16.sql
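
A minimal sketch of the case being tested (schema, table, and publication names are placeholders):

```sql
CREATE PUBLICATION sales_pub FOR TABLES IN SCHEMA sales, TABLE sales.orders;
```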
2023-09-06 16:40:36 +03:00
Naisila Puka b2fc763bc3
PG16 - Add tests with random_normal (#7183)
Relevant PG commit:
https://github.com/postgres/postgres/commit/38d8176
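
A minimal sketch of the function being tested (the arguments are placeholders):

```sql
SELECT random_normal();        -- defaults: mean 0, standard deviation 1
SELECT random_normal(10, 2.5); -- explicit mean and standard deviation
```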
2023-09-06 14:57:24 +03:00
Naisila Puka 5c658b4eb7
PG16 - Add citus_truncate_trigger for Citus foreign tables (#7170)
Since in PG16, truncate triggers are supported on foreign tables, we add
the citus_truncate_trigger to Citus foreign tables as well, such that the TRUNCATE
command is propagated to the table's single local shard as well.
Note that TRUNCATE command was working for foreign tables even before this
commit: see https://github.com/citusdata/citus/pull/7170#issuecomment-1706240593 for details

This commit also adds tests with user-enabled truncate triggers on Citus foreign tables:
both trigger on the shell table and on its single foreign local shard.

Relevant PG commit:
https://github.com/postgres/postgres/commit/3b00a94
2023-09-05 19:42:39 +03:00
aykut-bozkurt 8eb3360017
Fixes visibility problems with dependency propagation (#7028)
**Problem:**
Previously we always used an outside superuser connection to overcome
permission issues for the current user while propagating dependencies.
That has mainly 2 problems:
1. Visibility issues during dependency propagation, (metadata connection
propagates some objects like a schema, and outside transaction does not
see it and tries to create it again)
2. Security issues (it is preferable to use the current user's connection
instead of extension superuser)

**Solution (high level):**
Now, we try to make a smarter decision on whether should we use an
outside superuser connection or current user's metadata connection. We
prefer using current user's connection if any of the objects, which is
already propagated in the current transaction, is a dependency for a
target object. We do that since we assume if current user has
permissions to create the dependency, then it can most probably
propagate the target as well.

Our assumption is expected to hold most of the time, but it can still be
wrong. In those cases, the transaction would fail and the user should set the
GUC `citus.create_object_propagation` to `deferred` to work around it.

**Solution:**
1. We track all objects propagated in the current transaction (we can
handle subtransactions),
2. We propagate dependencies via the current user's metadata connection
if any dependency is created in the current transaction to address
issues listed above. Otherwise, we still use an outside superuser
connection.


DESCRIPTION: Fixes some object propagation errors seen with transaction
blocks.

Fixes https://github.com/citusdata/citus/issues/6614

---------

Co-authored-by: Nils Dijk <nils@citusdata.com>
2023-09-05 18:04:16 +03:00
Naisila Puka 9f067731c0
Adds PostgreSQL 16 RC1 support (#7173) 2023-09-05 14:32:41 +03:00
Emel Şimşek a849570f3f
Improve the performance of the CitusHasBeenLoaded function for a database that does not run CREATE EXTENSION citus but loads citus.so. (#7123)
For a database that does not create the citus extension by running

`CREATE EXTENSION citus;`

the `CitusHasBeenLoaded` function ends up querying the `pg_extension` table
every time it is invoked. This is not an ideal situation for such a
database.

The idea in this PR is as follows:

### A new field in MetadataCache.
 Add a new variable `extensionCreatedState` of the following type:

```
typedef enum ExtensionCreatedState
{
        UNKNOWN = 0,
        CREATED = 1,
        NOTCREATED = 2,
} ExtensionCreatedState;
```
When the MetadataCache is invalidated, `ExtensionCreatedState` will be
set to UNKNOWN.
     
### Invalidate MetadataCache when CREATE/DROP/ALTER EXTENSION citus
commands are run.

- Register a callback function, named
`InvalidateDistRelationCacheCallback`, for relcache invalidation during
the shared library initialization for `citus.so`. This callback function
is invoked in all backends whenever the relcache is invalidated in
one of the backends. (This can be triggered by many DDL operations.)

- In the cache invalidation callback, `InvalidateDistRelationCacheCallback`,
invalidate the `MetadataCache` by zeroing it out.
 
- In `CitusHasBeenLoaded`, perform the costly "is Citus loaded" check only
if the `MetadataCache` is not valid.
 
### Downsides

Any relcache invalidation (caused by various DDL operations) will cause the
Citus MetadataCache to get invalidated. Most of the time this will be
unnecessary, but we rely on the assumption that DDL operations on relations
will not be too frequent.
2023-09-05 13:29:35 +03:00
Jelte Fennema bdf085eabb
Add some small improvements to python testing framework (#7159)
1. Adds an `sql_row` function, for when a query returns a single row
   with multiple columns.
2. Include a `notice_handler` for easier debugging
3. Retry dropping replication slots when they are "in use"; this is
   often an ephemeral state and can cause flaky tests
2023-09-05 09:34:56 +02:00
Ivan Vyazmitinov e94bf93152
Fix #6548: 2PC recovery is extremely ineffective on a cluster with multiple DATABASEs (#7174) 2023-09-04 15:28:22 +02:00
Naisila Puka de9af078b0
PG16 - Add reindex database/system tests (#7167)
In PG16, the name in REINDEX DATABASE/SYSTEM is optional.
We already don't propagate these commands automatically.
Testing here with run_command_on_workers.
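
An illustration of the PG16 syntax change being tested (run via run_command_on_workers in the test):

```sql
-- In PG16, the database/system name can be omitted.
REINDEX DATABASE;
REINDEX SYSTEM;
```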

Relevant PG commit:
https://github.com/postgres/postgres/commit/2cbc3c1
2023-09-04 11:31:57 +03:00
Naisila Puka cf71e80bfd
PG16 - Add tests for createdb with ICU_RULES option (#7161)
When we create a database, it already needs to be manually created on
the workers as well.
This new icu_rules option should work like the other options.
Added a test for that.
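
A sketch of the PG16 syntax the test covers (the database name and rules are hypothetical):

```sql
-- Typically needs template0 when overriding locale settings;
-- remember to create the database on the workers manually as well.
CREATE DATABASE rules_db
  WITH LOCALE_PROVIDER = icu, ICU_LOCALE = 'und', ICU_RULES = '&a < g', TEMPLATE = template0;
```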

Relevant PG commit:
https://github.com/postgres/postgres/commit/30a53b7
2023-09-04 11:13:46 +03:00
Naisila Puka 05443a77ad
Adds test for COPY FROM failure in Citus foreign tables (#7160) 2023-09-01 12:20:07 +03:00
Gürkan İndibay b8bded6454
Adds citus_pause_node udf (#7089)
DESCRIPTION: Introduces the citus_pause_node UDF, which enables pausing a
node by node_id.

citus_pause_node takes a node_id parameter, fetches all the shards on that
node, and takes an AccessExclusiveLock on each of those shards. While this
lock is held, inserts are blocked until the citus_pause_node transaction is
closed.
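
A minimal usage sketch, assuming a hypothetical node_id of 2; the locks are released when the transaction ends:

```sql
BEGIN;
SELECT citus_pause_node(2);  -- AccessExclusiveLock on all shards on node 2
-- ... inserts into those shards are blocked here ...
ROLLBACK;                    -- or COMMIT; either way the locks are released
```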

---------

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
2023-09-01 11:39:30 +03:00
Onur Tirtir a830862717 Not undistribute Citus local table when converting it to a reference table / single-shard table 2023-08-29 12:57:28 +03:00
Naisila Puka afab879de3
PG16 - Add COPY FROM default tests (#7143)
Already supported in Citus, adding the same tests as in PG
Relevant PG commit:
https://github.com/postgres/postgres/commit/9f8377f
2023-08-24 15:52:09 +03:00
Naisila Puka 70c8aba967
PG16 - Add tests for CREATE/ALTER TABLE .. STORAGE (#7140)
Relevant PG commits:
https://github.com/postgres/postgres/commit/784cedd
https://github.com/postgres/postgres/commit/b9424d0
2023-08-24 15:26:40 +03:00
Gürkan İndibay 8d3a06c1c7
Adds grant/revoke privileges on database propagation (#7109)
DESCRIPTION: Adds grant/revoke propagation support for database
privileges
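
A minimal sketch of the kind of statement that is now propagated (database and role names are hypothetical):

```sql
GRANT CONNECT, TEMPORARY ON DATABASE my_db TO app_role;
REVOKE TEMPORARY ON DATABASE my_db FROM app_role;
```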

Following the implementation of support for granting and revoking
database privileges, certain tests that issued grants for worker nodes
experienced failures. Those are fixed in this PR as well.
2023-08-24 14:43:19 +03:00
Naisila Puka b8c493f2c4
PG16 - Add GENERIC_PLAN option to EXPLAIN (#7141) 2023-08-23 20:15:54 +03:00
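
A one-line illustration of the option named above (the table name is hypothetical):

```sql
-- Plan a parameterized query without supplying parameter values.
EXPLAIN (GENERIC_PLAN) SELECT * FROM my_table WHERE id = $1;
```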
Naisila Puka c73ef405f5
PG16 - IS JSON predicate and SYSTEM_USER tests (#7137)
Support the IS JSON predicate
Relevant PG commit:
https://github.com/postgres/postgres/commit/6ee30209

SYSTEM_USER
Relevant PG commit:
https://github.com/postgres/postgres/commit/0823d061
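
A short sketch of the two features the tests cover (table and column names are hypothetical):

```sql
SELECT payload IS JSON OBJECT FROM events;  -- new IS JSON predicate
SELECT SYSTEM_USER;                         -- e.g. 'scram-sha-256:alice'
```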
2023-08-23 14:13:56 +03:00
Naisila Puka 36b51d617c
PG16 - Throw meaningful error for stats without a name on Citus tables (#7136)
Relevant PG commit:
624aa2a13b
624aa2a13bd02dd584bb0995c883b5b93b2152df
2023-08-23 10:25:01 +03:00
Gürkan İndibay 371f094b68
Removes pg_send_cancellation (#7135)
DESCRIPTION: Removes pg_send_cancellation and all references
2023-08-21 17:29:44 +03:00
Naisila Puka 682dca1f12
Adds PG16Beta3 support (#6952)
DESCRIPTION: Adds PG16Beta3 support

This is the final commit that adds
PG16 compatibility with Citus's current features.

You can use Citus community with PG16Beta3. This commit:

- Enables PG16 in the configure script.
- Adds PG16 tests to CI using test images that have 16beta3
- Skips wal2json cdc test since wal2json package is not available for PG16 yet
- Fixes an isolation test

Several PG16 Compatibility commits have been merged before this final one.
All these subtasks are done https://github.com/citusdata/citus/issues/7017
See the list below:

1 - 42d956888d
Resolve compilation issues
2 - 0d503dd5ac
Ruleutils and successful CREATE EXTENSION
3 - 907d72e60d
Some test outputs
4 - 7c6b4ce103
Outer join checks, subscription password, crash fixes
5 - 6056cb2c29
get_relation_info hook to avoid crash from adjusted partitioning
6 - b36c431abb
Rework PlannedStmt and Query's Permission Info
7 - ee3153fe50
More test output fixes
8 - 2c50b5f7ff
varnullingrels additions
9 - b2291374b4
More test output fixes
10- a2315fdc67
New options to vacuum and analyze
11- 9fa72545e2
Fix AM dependency and grant's admin option
12- 2d6cf8e79a
One more outer join check

Stay tuned for PG16 new features in Citus :)
2023-08-17 21:02:59 +03:00
Naisila Puka a5ce601c07
Bump PG14 and PG15 versions for CI tests (#7111)
Postgres got minor updates on Aug10, this commit starts using the
images with the latest version for our tests, namely 14.9 and 15.4.

Depends on https://github.com/citusdata/the-process/pull/147

For CI images, we needed to regenerate Pipfile.lock, mainly because of an issue
with pyyaml version: https://github.com/yaml/pyyaml/issues/601

We also needed to remove a failing test in subquery_local_tables.sql.
Relevant PG commit:
b0e390e6d1
b0e390e6d1d68b92e9983840941f8f6d9e083fe0
Issue: https://github.com/citusdata/citus/issues/7119
For joins where consider_join_pushdown is false, we cannot get the
information that we used to get, which prevents distributed planning.
The team has already contacted PG committers about this.
Until then, we remove the test from the schedule.
2023-08-17 11:53:19 +03:00
Naisila Puka a2315fdc67
PG16 compatibility - new options to vacuum and analyze (#7114)
PG16 compatibility - part 10

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb 
part 7 ee3153fe50
part 8 2c50b5f7ff
part 9 b2291374b4

This commit is in the series of PG16 compatibility commits. It:

- Adds buffer_usage_limit to vacuum and analyze
- Adds process_main, skip_database_stats, only_database_stats to vacuum
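
A sketch of the new option syntax (the table name is hypothetical):

```sql
VACUUM (PROCESS_MAIN FALSE, SKIP_DATABASE_STATS, BUFFER_USAGE_LIMIT '256kB') my_table;
ANALYZE (BUFFER_USAGE_LIMIT '256kB') my_table;
```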

Important Note: adding these options is actually required for check-vanilla tests to succeed.
However, in concept, this PR belongs to "PG16 new features",
rather than "PG16 regression tests sanity"

Relevant PG commits:
1cbbee0338
1cbbee03385763b066ae3961fc61f2cd01a0d0d7
4211fbd841
4211fbd8413b26e0abedbe4338aa7cda2cd469b4
a46a7011b2
a46a7011b27188af526047a111969f257aaf4db8

More PG16 compatibility commits are coming soon ...
2023-08-16 16:18:28 +03:00
Naisila Puka b982f2dee6
Changes PROCESS_TOAST default value to true (#7122)
Process toast should be true by default, like in PG.
2023-08-16 14:40:24 +03:00
Naisila Puka b2291374b4
PG16 compatibility - more test output fixes (#7112)
PG16 compatibility - part 9

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
part 8 2c50b5f7ff

This commit is in the series of PG16 compatibility commits. It makes some changes
to our tests in order to be compatible with the following in PG16:

- Fix multi_subquery_in_where_reference_clause test
somehow PG got rid of the outer join
(e.g., EXPLAIN doesn't show outer joins anymore),
hence we can push down the subquery.
Changed the test to use users_reference_table

- Fix unqualified column names for views in PG16 
Relevant PG commit:
47bb9db759
47bb9db75996232ea71fc1e1888ffb0e70579b54

- Fix global_cancel test 
Error wording and detail changed
Relevant PG commit:
2631ebab7b
2631ebab7b18bdc079fd86107c47d6104a6b3c6e

- Fix local_table_join_test with lateral subquery 
Possible relevant PG commit:
ae89129aa3
ae89129aa3555c263b8c3ccc4c0f1ef7e46201aa
I removed the where clause and the limit count error was hit again.
With the where clause the query unexpectedly works.

- Fix test outputs 
Relevant PG commits:
-- 1349d2790b
-- f4c7c410ee
For multi_explain and multi_complex_count_distinct there were too many places
touched so I just added an alternative test output.
For the other tests I modified the problematic parts.

More PG16 compatibility commits are coming soon ...
2023-08-15 13:49:25 +03:00
Naisila Puka 2c50b5f7ff
PG16 compatibility - varnullingrels additions (#7107)
PG16 compatibility - part 8

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50

This commit is in the series of PG16 compatibility commits. PG16 introduced a new field,
varnullingrels, in Var, which is how we represent our partkey in pg_dist_partition.
This commit makes the necessary changes in Citus to support this.
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d

More PG16 compatibility commits are coming soon ...
2023-08-15 13:07:55 +03:00
Naisila Puka ee3153fe50
PG16 compatibility - more test output fixes (#7108)
PG16 compatibility - part 7

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb

This commit is in the series of PG16 compatibility commits. It makes some changes
to our tests in order to be compatible with the following in PG16:

- PG16 removed logic for converting a table to a view 
Relevant PG commit:
b23cd185fd
b23cd185fd5410e5204683933f848d4583e34b35

- Fix changed error message in certificate verification 
Relevant PG commit:
8eda731465
8eda7314652703a2ae30d6c4a69c378f6813a7f2

- Fix backend type order in tests 
Relevant PG commit:
0c679464a8
0c679464a837079acc75ff1d45eaa83f79e05690

- Reduce log level to omit extra NOTICE in create collation in PG16 
Relevant PG commit:
a14e75eb0b
a14e75eb0b6a73821e0d66c0d407372ec8376105
That commit made LOCALE parameter apply regardless of the
provider used, and it printed the following notice:
NOTICE:  using standard form "und-u-ks-level2" for ICU locale "@colStrength=secondary"
We omit this notice to avoid output changes between PG versions.

- Fix columnar_memory test 
TopMemoryContext now has more children contexts
Possible relevant PG commit:
9d3ebba729
9d3ebba729ebaf5882a92f0f5f662a3312037605
memusage is now around 8.5 MB, whereas it was less than 8 MB before.
To avoid differences between PG versions, I changed the test to compare
to less than 9 MB. It still reflects the improvement from 28 MB very
well.

- Alternative test output for GRANTOR values in pg_auth_members 
grantor changed in PG16
Relevant PG commit:
ce6b672e44
ce6b672e4455820a0348214be0da1a024c3f619f

- Remove redundant grouping columns from our tests 
Relevant PG commit:
8d83a5d0a2
8d83a5d0a2673174dc478e707de1f502935391a5

- Fix tests with different order in Filters 
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d

More PG16 compatibility commits are coming soon ...
2023-08-09 18:04:32 +03:00
Naisila Puka 7c6b4ce103
PG16 compatibility - outer join checks, subscription password, crash fixes (#7097)
PG16 compatibility - Part 4

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d

This commit is in the series of PG16 compatibility commits.
It adds some outer join checks to the planner,
the new password_required option to the subscription,
and a crash fix related to PGIOAlignedBlock, see below for more details:

- Fix PGIOAlignedBlock Assert crash in PG16 
Relevant PG commit:
faeedbcefd
faeedbcefd40bfdf314e048c425b6d9208896d90

- Pass planner info as argument to make_simple_restrictinfo 
Pre-PG16, passing plannerInfo to make_simple_restrictinfo
was only needed for placeholder Vars, which is not the case
in this part of the codebase because we are building the
expression from shard intervals, which don't have placeholder
Vars.
However, PG16 counts the baserels appearing in clause_relids
and deletes the rels mentioned in plannerinfo->outer_join_rels.
Hence we access plannerinfo directly;
we will crash if we leave it as NULL.
For reference
2489d76c49 (diff-e045c41eda9686451a7993e91518e40056b3739365e39eb1b70ae438dc1f7c76R207)
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d

- Add outer join checks, root->simple_rel_array

- Fix rebalancer to include the password_required option
Relevant PG commit:
c3afe8cf5a
c3afe8cf5a1e465bd71e48e4bc717f5bfdc7a7d6

More PG16 compatibility commits are coming soon ...
2023-08-04 14:51:28 +03:00