citus

Commit Graph

Author	SHA1	Message	Date
Gürkan İndibay	7794aab38c	Merge branch 'create_alter_database' into alter_database_additional_options	2023-11-20 11:16:21 +03:00
Gürkan İndibay	022b6268bf	Merge branch 'main' into create_alter_database	2023-11-20 11:08:21 +03:00
Naisila Puka	cedcc220bf	Fixes flaky VACUUM (freeze, process toast true) result (#7348 ) https://app.circleci.com/pipelines/github/citusdata/citus/34550/workflows/5b802f66-2666-4623-a209-6d7799f7ee5f/jobs/1229153 ```diff VACUUM (FREEZE, PROCESS_TOAST true) local_vacuum_table; SELECT relfrozenxid::text::integer > :frozenxid AS frozen_performed FROM pg_class WHERE oid=:reltoastrelid::regclass; frozen_performed ------------------ - t + f (1 row) ``` Process toast option in vacuum was introduced in PG14. The failing test was supposed to be a part of `multi_utilities.sql`, but it was included in `pg14.sql` to avoid alternative output for PG13. See `ba62c0a148 (diff-ed03478f693155e2fe092e9ad356bf884dc097f554e8d75eff562d52bbcf7a75L255-L272)` for reference. However, now that we don't support PG13 anymore, we can move this test to `multi_utilities.sql`. Moving the test, plus inserting data before running vacuum freeze such that the freeze is more meaningful and not flaky, fixes the flakiness problem of the test.	2023-11-17 18:58:06 +03:00
Naisila Puka	c88bf5ff1c	Cleanup leftover replication slots in publication test (#7354 )	2023-11-17 15:11:38 +03:00
Japin Li	e14e8667cc	Fix redundant variable declaration (#7353) The `$workerCount` variable is declared twice in src/test/regress/pg_regress_multi.pl.	2023-11-17 13:01:23 +03:00
Naisila Puka	0d1f18862b	Propagates SECURITY LABEL ON ROLE stmt (#7304 ) We propagate `SECURITY LABEL [for provider] ON ROLE rolename IS labelname` to the worker nodes. We also make sure to run the relevant `SecLabelStmt` commands on a newly added node by looking at roles found in `pg_shseclabel`. See official docs for explanation on how this command works: https://www.postgresql.org/docs/current/sql-security-label.html This command stores the role label in the `pg_shseclabel` catalog table. This commit also fixes the regex string in `check_gucs_are_alphabetically_sorted.sh` script such that it escapes the dot. Previously it was looking for all strings starting with "citus" instead of "citus." as it should. To test this feature, I currently make use of a special GUC to control label provider registration in PG_init when creating the Citus extension.	2023-11-16 13:12:30 +03:00
Gürkan İndibay	3a6bf7e9ea	Merge branch 'main' into create_alter_database	2023-11-17 13:40:57 +03:00
gindibay	e56a113572	Adds missing line	2023-11-16 08:09:09 +03:00
gindibay	a948358d63	Fixes multi_test_helpers	2023-11-16 08:01:44 +03:00
gindibay	ef2a47e882	Fixes function after merge	2023-11-16 07:20:59 +03:00
gindibay	a6e16252d1	Removes extra line	2023-11-16 05:34:41 +03:00
gindibay	56bc813bd0	Adds check_database_on_all_nodes	2023-11-16 05:13:10 +03:00
gindibay	ed9021ca90	Merge branch 'main' of https://github.com/citusdata/citus into create_alter_database	2023-11-16 04:37:01 +03:00
gindibay	bb2b7ae9da	Adds logs for test	2023-11-15 23:02:21 +03:00
gindibay	feb609868e	Fixes flag name	2023-11-15 21:12:45 +03:00
gindibay	cd40380d80	resets enable_create_database_propagation	2023-11-15 20:51:40 +03:00
gindibay	11e7c94e2f	Removes extra line in test	2023-11-15 19:32:00 +03:00
gindibay	0f1273f6ad	Updates test output	2023-11-15 19:03:16 +03:00
gindibay	dc48833679	Adds tests for non-distributed database	2023-11-15 18:48:46 +03:00
gindibay	deee8e53dd	Adds grant to public	2023-11-15 17:33:14 +03:00
gindibay	7939267326	Fixes flakiness of test	2023-11-15 17:28:50 +03:00
Gürkan İndibay	bb76a9b4b9	Merge branch 'main' into create_alter_database	2023-11-15 17:18:11 +03:00
gindibay	9a558bdece	Adds datacl propagation	2023-11-15 16:04:26 +03:00
Naisila Puka	c6fbb72c02	Fix flaky multi_prepare_plsql (#7346 ) Simple need of an `ORDER BY` clause Ran into this twice this week already! https://github.com/citusdata/citus/actions/runs/6849701315/attempts/1#summary-18622563506 https://github.com/citusdata/citus/actions/runs/6875051160/attempts/1#summary-18698009952 ```diff SELECT nspname, typname FROM pg_type JOIN pg_namespace ON pg_namespace.oid = pg_type.typnamespace WHERE typname = 'prepare_ddl_type_backup'; nspname \| typname -------------+------------------------- - public \| prepare_ddl_type_backup otherschema \| prepare_ddl_type_backup + public \| prepare_ddl_type_backup (2 rows) ```	2023-11-15 13:28:43 +03:00
Naisila Puka	a960799dfb	Clean up leftover replication slots in tests (#7338 ) This commit fixes the flakiness in `logical_replication` and `citus_non_blocking_split_shard_cleanup` tests. The flakiness was related to leftover replication slots. Below is a flaky example for each test: logical_replication https://github.com/citusdata/citus/actions/runs/6721324131/attempts/1#summary-18267030604 citus_non_blocking_split_shard_cleanup https://github.com/citusdata/citus/actions/runs/6721324131/attempts/1#summary-18267006967 ```diff -- Replication slots should be cleaned up SELECT slot_name FROM pg_replication_slots; slot_name --------------------------------- -(0 rows) + citus_shard_split_slot_19_10_17 +(1 row) ``` The tests by themselves are not flaky: 32 flaky test schedules each with 20 runs run successfully. https://github.com/citusdata/citus/actions/runs/6822020127?pr=7338 The conclusion is that: 1. `multi_tenant_isolation_nonblocking` is the problematic test running before `logical_replication` in the `enterprise_schedule`, so I added a cleanup at the end of `multi_tenant_isolation_nonblocking`. https://github.com/citusdata/citus/actions/runs/6824334614/attempts/1#summary-18560127461 2. `citus_split_shard_by_split_points_negative` is the problematic test running before `citus_non_blocking_split_shards_cleanup` in the split schedule. Also added cleanup line. For details on the investigation of leftover replication slots, please check the PR https://github.com/citusdata/citus/pull/7338	2023-11-14 18:50:54 +03:00
Naisila Puka	cdef2d5224	Random tests refactoring (#7342 ) While investigating replication slots leftovers in PR https://github.com/citusdata/citus/pull/7338, I ran into the following refactoring/cleanup that can be done in our test suite: - Add separate test to remove non default nodes - Remove coordinator removal from `add_coordinator` test Use `remove_coordinator_from_metadata` test where needed - Don't print nodeids in `multi_multiuser_auth` and `multi_poolinfo_usage` tests - Use `startswith` when checking for isolation or failure tests - Add some dependencies accordingly in `run_test.py` for running flaky test schedules	2023-11-14 12:49:15 +03:00
gindibay	c1e9335fb7	Adds distributed check in metadata syncing	2023-11-14 09:01:00 +03:00
gindibay	fcdea98edd	Removes drop in citus_internal_db_command udf	2023-11-13 15:58:09 +03:00
gindibay	3731c45c29	Fixes drop force option	2023-11-13 14:19:19 +03:00
Onur Tirtir	ffa1fa0963	Improve tests for >= pg15 & pg >= 16	2023-11-13 13:40:55 +03:00
Onur Tirtir	5b446b1137	make tests passing	2023-11-13 11:27:29 +03:00
Onur Tirtir	fe24227638	Improve tests for PG <= 14	2023-11-13 11:09:54 +03:00
Onur Tirtir	240313e286	Support role commands from any node (#7278) DESCRIPTION: Adds support for issuing role management commands from worker nodes. It's unlikely to get into a distributed deadlock with role commands, so we don't care much about them at the moment. There were several attempts to reduce the chances of a deadlock, but none of them has been merged into the main branch yet, see: #7325 #7016 #7009	2023-11-10 09:58:51 +00:00
gindibay	7a6afb0beb	Fixes review comments	2023-11-10 07:24:54 +03:00
Gürkan İndibay	0a73cb31b0	Merge branch 'main' into create_alter_database	2023-11-10 14:12:48 +03:00
gindibay	b45543f51b	Merge remote-tracking branch 'origin/create_alter_database' into alter_database_additional_options	2023-11-10 05:38:19 +03:00
Naisila Puka	57ff762c82	Fix VACUUM flakiness in multi_utilities (#7334 ) When I run this test in my local, the size of the table after the DELETE command is around 58785792. Hence, I assume that the diffs suggest that the Vacuum had no effect. The current solution is to run the VACUUM command three times instead of once. Example diff: https://github.com/citusdata/citus/actions/runs/6722231142/attempts/1#summary-18269870674 ```diff insert into local_vacuum_table select i from generate_series(1,1000000) i; delete from local_vacuum_table; VACUUM local_vacuum_table; SELECT CASE WHEN s BETWEEN 20000000 AND 25000000 THEN 22500000 ELSE s END FROM pg_total_relation_size('local_vacuum_table') s ; s ---------- - 22500000 + 58785792 (1 row) ``` See more diff examples in the PR description https://github.com/citusdata/citus/pull/7334	2023-11-09 21:00:24 +03:00
dependabot[bot]	d4663212f4	Bump werkzeug from 2.3.7 to 3.0.1 in /src/test/regress Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.3.7 to 3.0.1. - [Release notes](https://github.com/pallets/werkzeug/releases) - [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst) - [Commits](https://github.com/pallets/werkzeug/compare/2.3.7...3.0.1) --- updated-dependencies: - dependency-name: werkzeug dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2023-11-09 17:14:14 +01:00
gindibay	5f4092db5b	Adds validation for template	2023-11-09 12:11:22 +03:00
Gürkan İndibay	32c67963bd	Merge branch 'main' into create_alter_database	2023-11-09 11:37:58 +03:00
Naisila Puka	0dc41ee5a0	Fix flaky multi_mx_insert_select_repartition test (#7331 ) https://github.com/citusdata/citus/actions/runs/6745019678/attempts/1#summary-18336188930 ```diff insert into target_table SELECT a*2 FROM source_table RETURNING a; -NOTICE: executing the command locally: SELECT bytes FROM fetch_intermediate_results(ARRAY['repartitioned_results_xxxxx_from_4213582_to_0','repartitioned_results_xxxxx_from_4213584_to_0']::text[],'localhost',57638) bytes +NOTICE: executing the command locally: SELECT bytes FROM fetch_intermediate_results(ARRAY['repartitioned_results_3940758121873413_from_4213584_to_0','repartitioned_results_3940758121873413_from_4213582_to_0']::text[],'localhost',57638) bytes ``` The elements in the array passed to `fetch_intermediate_results` are the same, but in the opposite order than expected. To fix this flakiness, we can omit the `"SELECT bytes FROM fetch_intermediate_results..."` line. From the following logs, it is understandable that the intermediate results have been fetched.	2023-11-08 15:15:33 +03:00
gindibay	5f8f1d312d	Fixes message errors	2023-11-08 06:56:21 +03:00
gindibay	ba377ecaab	Fixes tests	2023-11-08 03:23:34 +03:00
gindibay	65660db10d	Fixes review items	2023-11-08 02:02:00 +03:00
Onur Tirtir	21646ca1e9	Fix flaky isolation_get_all_active_transactions.spec test (#7323) Fix the flaky test that results in the following diff by waiting, for up to 5 seconds, until the backend that we want to terminate has really terminated. ```diff --- /__w/citus/citus/src/test/regress/expected/isolation_get_all_active_transactions.out.modified 2023-11-01 16:30:57.648749795 +0000 +++ /__w/citus/citus/src/test/regress/results/isolation_get_all_active_transactions.out.modified 2023-11-01 16:30:57.656749877 +0000 @@ -114,13 +114,13 @@ -------------------- t (1 row) step s3-show-activity: SET ROLE postgres; select count(*) from get_all_active_transactions() where process_id IN (SELECT * FROM selected_pid); count ----- - 0 + 1 (1 row) ```	2023-11-03 09:00:32 +01:00
Onur Tirtir	5e2439a117	Make some more tests re-runable (#7322 ) * multi_mx_create_table * multi_mx_function_table_reference * multi_mx_add_coordinator * create_role_propagation * metadata_sync_helpers * text_search https://github.com/citusdata/citus/pull/7278 requires this.	2023-11-02 18:32:56 +03:00
Jelte Fennema-Nio	85b997a0fb	Fix flaky multi_alter_table_statements (#7321) Sometimes multi_alter_table_statements would fail in CI like this: ```diff -- Verify that DROP NOT NULL works ALTER TABLE lineitem_alter ALTER COLUMN int_column2 DROP NOT NULL; SELECT "Column", "Type", "Modifiers" FROM table_desc WHERE relid='lineitem_alter'::regclass; - Column \| Type \| Modifiers ---------------------------------------------------------------------- - l_orderkey \| bigint \| not null - l_partkey \| integer \| not null - l_suppkey \| integer \| not null - l_linenumber \| integer \| not null - l_quantity \| numeric(15,2) \| not null - l_extendedprice \| numeric(15,2) \| not null - l_discount \| numeric(15,2) \| not null - l_tax \| numeric(15,2) \| not null - l_returnflag \| character(1) \| not null - l_linestatus \| character(1) \| not null - l_shipdate \| date \| not null - l_commitdate \| date \| not null - l_receiptdate \| date \| not null - l_shipinstruct \| character(25) \| not null - l_shipmode \| character(10) \| not null - l_comment \| character varying(44) \| not null - float_column \| double precision \| default 1 - date_column \| date \| - int_column1 \| integer \| - int_column2 \| integer \| - null_column \| integer \| -(21 rows) - +ERROR: schema "alter_table_add_column" does not exist -- COPY should succeed now SELECT master_create_empty_shard('lineitem_alter') as shardid \gset ``` Reading from table_desc apparently has an issue: if the schema gets deleted from one of the items while it is being read, we get such an error. This change fixes that by no longer running multi_alter_table_statements in parallel with alter_table_add_column. This is another instance of the same issue as in #7294	2023-11-02 16:42:45 +03:00
Jelte Fennema-Nio	f171ec98fc	Fix flaky failure_distributed_results (#7307 ) Sometimes in CI we run into this failure: ```diff SELECT resultId, nodeport, rowcount, targetShardId, targetShardIndex FROM partition_task_list_results('test', $$ SELECT * FROM source_table $$, 'target_table') NATURAL JOIN pg_dist_node; -WARNING: connection to the remote node localhost:xxxxx failed with the following error: connection not open +ERROR: connection to the remote node localhost:9060 failed with the following error: connection not open SELECT * FROM distributed_result_info ORDER BY resultId; - resultid \| nodeport \| rowcount \| targetshardid \| targetshardindex ---------------------------------------------------------------------- - test_from_100800_to_0 \| 9060 \| 22 \| 100805 \| 0 - test_from_100801_to_0 \| 57637 \| 2 \| 100805 \| 0 - test_from_100801_to_1 \| 57637 \| 15 \| 100806 \| 1 - test_from_100802_to_1 \| 57637 \| 10 \| 100806 \| 1 - test_from_100802_to_2 \| 57637 \| 5 \| 100807 \| 2 - test_from_100803_to_2 \| 57637 \| 18 \| 100807 \| 2 - test_from_100803_to_3 \| 57637 \| 4 \| 100808 \| 3 - test_from_100804_to_3 \| 9060 \| 24 \| 100808 \| 3 -(8 rows) - +ERROR: current transaction is aborted, commands ignored until end of transaction block -- fetch from worker 2 should fail SAVEPOINT s1; +ERROR: current transaction is aborted, commands ignored until end of transaction block SELECT fetch_intermediate_results('{test_from_100802_to_1,test_from_100802_to_2}'::text[], 'localhost', :worker_2_port) > 0 AS fetched; -ERROR: could not open file "base/pgsql_job_cache/xx_x_xxx/test_from_100802_to_1.data": No such file or directory -CONTEXT: while executing command on localhost:xxxxx +ERROR: current transaction is aborted, commands ignored until end of transaction block ROLLBACK TO SAVEPOINT s1; +ERROR: savepoint "s1" does not exist -- fetch from worker 1 should succeed SELECT fetch_intermediate_results('{test_from_100802_to_1,test_from_100802_to_2}'::text[], 'localhost', 
:worker_1_port) > 0 AS fetched; - fetched ---------------------------------------------------------------------- - t -(1 row) - +ERROR: current transaction is aborted, commands ignored until end of transaction block -- make sure the results read are same as the previous transaction block SELECT count(*), sum(x) FROM read_intermediate_results('{test_from_100802_to_1,test_from_100802_to_2}'::text[],'binary') AS res (x int); - count \| sum ---------------------------------------------------------------------- - 15 \| 863 -(1 row) - +ERROR: current transaction is aborted, commands ignored until end of transaction block ROLLBACk; ``` As outlined in the #7306 I created, the reason for this is related to only having a single connection open to the node. Finding and fixing the full cause is not trivial, so instead this PR starts working around this bug by forcing maximum parallelism. Preferably we'd want this workaround not to be necessary, but that requires spending time to fix this. For now having a less flaky CI is good enough.	2023-11-02 12:31:56 +00:00
Jelte Fennema-Nio	b47c8b3fb0	Fix flaky insert_select_connection_leak (#7302 ) Sometimes in CI insert_select_connection_leak would fail like this: ```diff END; SELECT worker_connection_count(:worker_1_port) - :pre_xact_worker_1_connections AS leaked_worker_1_connections, worker_connection_count(:worker_2_port) - :pre_xact_worker_2_connections AS leaked_worker_2_connections; leaked_worker_1_connections \| leaked_worker_2_connections -----------------------------+----------------------------- - 0 \| 0 + -1 \| 0 (1 row) -- ROLLBACK BEGIN; INSERT INTO target_table SELECT * FROM source_table; INSERT INTO target_table SELECT * FROM source_table; ROLLBACK; SELECT worker_connection_count(:worker_1_port) - :pre_xact_worker_1_connections AS leaked_worker_1_connections, worker_connection_count(:worker_2_port) - :pre_xact_worker_2_connections AS leaked_worker_2_connections; leaked_worker_1_connections \| leaked_worker_2_connections -----------------------------+----------------------------- - 0 \| 0 + -1 \| 0 (1 row) \set VERBOSITY TERSE -- Error on constraint failure BEGIN; INSERT INTO target_table SELECT * FROM source_table; SELECT worker_connection_count(:worker_1_port) AS worker_1_connections, worker_connection_count(:worker_2_port) AS worker_2_connections \gset SAVEPOINT s1; INSERT INTO target_table SELECT a, CASE WHEN a < 50 THEN b ELSE null END FROM source_table; @@ -89,15 +89,15 @@ leaked_worker_1_connections \| leaked_worker_2_connections -----------------------------+----------------------------- 0 \| 0 (1 row) END; SELECT worker_connection_count(:worker_1_port) - :pre_xact_worker_1_connections AS leaked_worker_1_connections, worker_connection_count(:worker_2_port) - :pre_xact_worker_2_connections AS leaked_worker_2_connections; leaked_worker_1_connections \| leaked_worker_2_connections -----------------------------+----------------------------- - 0 \| 0 + -1 \| 0 (1 row) ``` Source: 
https://github.com/citusdata/citus/actions/runs/6718401194/attempts/1#summary-18258258387 A negative number of leaked connections is obviously not possible. For some reason there was a connection open when we checked the initial number of connections, and it was closed afterwards. This could be from the maintenance daemon or maybe from the previous test that had not fully closed its connections just yet. The change in this PR doesn't actually fix the cause of the negative connection count; it simply treats a negative value as fine too, by changing the result to zero for negative values. With this fix we might sometimes miss a leak, because the negative number can cancel out the leak and still result in a 0. But since the negative number only occurs sometimes, we'll still find the leak often enough.	2023-11-02 13:15:43 +01:00
Cédric Villemain	0678a2fd89	Fix #7242, CALL(@0) crash backend (#7288) When executing a prepared CALL, which is not pure SQL but is available with some drivers like npgsql and pgjdbc, Citus entered a code path where a plan is not defined, while trying to increase its cost. The result is a SIGSEGV (signal 11) when plan is a NULL pointer. Fix by only increasing plan cost when plan is not null. However, it is a bit suspicious to get here with a NULL plan, and maybe a better change would be to not call ShardPlacementForFunctionColocatedWithDistTable() with a NULL plan at all (in call.c:134). The bug is hit with, for example: ``` CallableStatement proc = con.prepareCall("{CALL p(?)}"); proc.registerOutParameter(1, java.sql.Types.BIGINT); proc.setInt(1, -100); proc.execute(); ``` where `p(bigint)` is a distributed "function" and the param is the distribution key (also in a distributed table), see #7242 for details. Fixes #7242	2023-11-02 13:15:24 +01:00

3075 Commits (7794aab38ca372dc94802db61c8be960cd740868)