In this release, I tried something different. I experimented with adding
the PR number and title to the changelog right before each changelog
entry. This way, it is easier to track where a particular changelog
entry comes from. After reviews are over, I plan to remove those lines
with PR numbers and titles.
I went through all the PRs merged after the 11.2.0 release and came up
with a list of PRs that may need help with their changelog entries. The
PRs are grouped into several sections below.
The following PRs do not have a changelog entry. If you think that this
is a mistake, please say so in this PR along with a suggestion for what
the changelog item should be.
PR #6846 : fix 3 flaky tests in failure schedule
PR #6844 : Add CPU usage to citus_stat_tenants
PR #6833 : Fix citus_stat_tenants period updating bug
PR #6787 : Add more tests for ddl coverage
PR #6842 : Add build-cdc-* temporary directories to .gitignore
PR #6841 : Add build-cdc-* temporary directories to .gitignore
PR #6840 : Bump Citus to 12.0devel
PR #6824 : Fixes flakiness in multi_metadata_sync test
PR #6811 : Backport identity column improvements to v11.2
PR #6830 : In run_test.py actually return worker_count
PR #6825 : Fixes flakiness in multi_cluster_management test
PR #6816 : Refactor run_test.py
PR #6817 : Explicitly disallow local rels when inserting into dist table
PR #6821 : Rename citus stats tenants
PR #6822 : Add some more tests for initial sql support
PR #6819 : Fix flakyness in
citus_split_shard_by_split_points_deferred_drop
PR #6814 : Make python-regress based tests runnable with run_test.py
PR #6813 : Fix flaky multi_mx_schema_support test
PR #6720 : Convert columnar tap tests to pytest
PR #6812 : Revoke statistics permissions from public and grant them to
pg_monitor
PR #6769 : Citus stats tenants guc
PR #6807 : Fix the incorrect (constant) value passed to pointer-to-bool
parameter, pass a NULL as the value is not used
PR #6797 : Attribute local queries and cached plans on local execution
PR #6796 : Parse the annotation string correctly
PR #6762 : Add logs to citus_stats_tenants
PR #6773 : Add initial sql support for distributed tables that don't
have a shard key
PR #6792 : Disentangle MERGE planning code from the modify-planning code
path
PR #6761 : Citus stats tenants collector view
PR #6791 : Make 8 more tests runnable multiple times via run_test.py
PR #6786 : Refactor some of the planning code to accommodate a new
planning path for MERGE SQL
PR #6789 : Rename AllRelations.. functions to AllDistributedRelations..
PR #6788 : Actually skip arbitrary_configs_router & nested_execution for
AllNullDistKeyDefaultConfig
PR #6783 : Add a config for arbitrary config tests where all the tables
are null-shard-key tables
PR #6784 : Fix attach partition: citus local to null distributed
PR #6782 : Add an arbitrary config test heavily based on
multi_router_planner_fast_path.sql
PR #6781 : Decide what to do with router planner error at one place
PR #6778 : Support partitioning for dist tables with null dist keys
PR #6766 : fix pip lock file
PR #6764 : Make workerCount configurable for regression tests
PR #6745 : Add support for creating distributed tables with a null shard
key
PR #6696 : This implements MERGE phase-III
PR #6767 : Add pytest depedencies to Pipfile
PR #6760 : Decide core distribution params in CreateCitusTable
PR #6759 : Add multi_create_fdw into minimal_schedule
PR #6743 : Replace CITUS_TABLE_WITH_NO_DIST_KEY checks with
HasDistributionKey()
PR #6751 : Stabilize single_node.sql and others that report illegal node
removal
PR #6742 : Refactor CreateDistributedTable()
PR #6747 : Remove unused lock functions
PR #6744 : Fix multiple output version arbitrary config tests
PR #6741 : Stabilize single node tests
PR #6740 : Fix string eval bug in migration files check
PR #6736 : Make run_test.py and create_test.py importable without errors
PR #6734 : Don't blanket ignore flake8 E402 error
PR #6737 : Fixes bookworm packaging pipeline problem
PR #6735 : Fix run_test.py on python 3.9
PR #6733 : MERGE: In deparser, add missing check for RETURNING clause.
PR #6714 : Remove auto_explain workaround in citus explain hook for
ALTER TABLE
PR #6719 : Fix flaky test
PR #6718 : Add more powerfull dependency tracking to run_test.py
PR #6710 : Install non-vulnerable cryptography package
PR #6711 : Support compilation and run tests on latest PG versions
PR #6700 : Add auto-formatting and linting to our python code
PR #6707 : Allow multi_insert_select to run repeatably
PR #6708 : Fix flakyness in failure_create_distributed_table_non_empty
PR #6698 : Miscellaneous cleanup
PR #6704 : Update README for 11.2
PR #6703 : Fix dubious ownership error from git
PR #6690 : Bump Citus to 11.3devel
The following PRs have changelog entries that are too long to fit in a
single line. I expect authors to supply changelog entries in
`DESCRIPTION:` lines that are at most 78 characters. If you want to
supply a multi-line changelog item, you can instead use multiple lines
that each start with `DESCRIPTION:`.
PR #6837 : fixes update propagation bug when
`citus_set_coordinator_host` is called more than once
PR #6738 : Identity column implementation refactorings
PR #6756 : Schedule parallel shard moves in background rebalancer by
removing task dependencies between shard moves across colocation groups.
PR #6793 : Add a GUC to disallow planning the queries that reference
non-colocated tables via router planner
PR #6726 : fix memory leak during altering distributed table with a lot
of partition and shards
PR #6722 : fix memory leak during distribution of a table with a lot of
partitions
PR #6693 : prevent memory leak during ConvertTable with a lot of
partitions
The following PRs had an empty `DESCRIPTION:` line. This generates an
empty changelog line that needs to be removed manually. Please either
provide a short entry, or remove the `DESCRIPTION:` line completely.
PR #6810 : Make CDC decoder an independent extension
PR #6827 : Makefile changes to build CDC in builddir for pgoutput and
wal2json.
---------
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
(cherry picked from commit 934430003e)
We handle colocation groups whose shard group count is less than the
worker node count using a method different from the usual rebalancer;
see #6739.
When deciding whether to use this method, we should have ignored the
nodes that are marked `shouldhaveshards = false`. This PR excludes those
nodes from that decision.
Adds a test such that:
```
coordinator: []
worker 1: [1_1, 1_2]
worker 2: [2_1, 2_2]

(rebalance)

coordinator: []
worker 1: [1_1, 2_1]
worker 2: [1_2, 2_2]
```
If we take the coordinator into account, the rebalancer considers the
first state balanced and does nothing (because shard_count <
worker_count). With this PR, we ignore the coordinator because it has
`shouldhaveshards = false`, so the rebalancer distributes each
colocation group across both workers.
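For illustration, a minimal SQL sketch of the scenario this fix targets
(hostname and port are hypothetical): the coordinator is marked with
`shouldhaveshards = false`, so it should not count toward the node count
used in this decision.
```sql
-- Hypothetical setup: the coordinator should not hold shards.
SELECT citus_set_node_property('coordinator-host', 5432, 'shouldhaveshards', false);

-- With this PR, the rebalancer ignores the coordinator when checking
-- shard_count < worker_count, so each colocation group above gets
-- spread across both workers.
SELECT rebalance_table_shards();
```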
Also, fixes an unrelated flaky test in the same file
(cherry picked from commit 59ccf364df)
We need to break the sequence dependency for a table while creating it
during non-transactional metadata sync, to ensure that creating the
table is idempotent.
**Problem:**
When we send `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)
FROM pg_dist_partition` to the workers during the non-transactional
sync, the table might not yet be in `pg_dist_partition` on the worker,
so its sequence dependency is not broken there.
**Solution:**
We break sequence dependency via `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)`
for each table while creating it on the workers. It is safe to send
since the UDF is a no-op when there is no sequence dependency.
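A minimal sketch of the per-table call described above, as it would be
sent while creating a table on a worker (`dist_table` is a hypothetical
table name):
```sql
-- No-op if dist_table has no sequence dependency, so it is safe to send
-- for every table during non-transactional metadata sync.
SELECT pg_catalog.worker_drop_sequence_dependency('dist_table'::regclass::text);
```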
DESCRIPTION: Fixes a bug related to sequence idempotency at
non-transactional sync.
Fixes https://github.com/citusdata/citus/issues/6888.
(cherry picked from commit 8cb69cfd13)
DESCRIPTION: Fixes memory errors, caught by valgrind, of type
"conditional jump or move depends on uninitialized value"
When running Citus tests under Postgres with valgrind, the test cases
calling into `NonBlockingShardSplit` function produce valgrind errors of
type "conditional jump or move depends on uninitialized value".
The issue is caused by creating an HTAB the wrong way: the HASH_COMPARE
flag should be passed when creating an HTAB with a user-defined
comparison function. In its absence, the HTAB falls back to the built-in
string comparison function, and valgrind detects that the match function
is not assigned to the user-defined function as intended.
Fixes #6835
(cherry picked from commit e7a25d82c9)
This PR updates the tenant stats implementation to set partitionKeyValue
and colocationId in ExecuteLocalTaskListExtended, in addition to
LocallyExecuteTaskPlan. This ensures that tenant stats can be properly
gathered regardless of the code path taken. The changes were initially
made while testing stored procedure calls for tenant stats.
(cherry picked from commit 8782ea1582)
Fixed 3 flaky failure tests that caused flakiness in other tests because
node and group sequence ids changed during node addition and removal.
(cherry picked from commit 3286ec59e9)
This PR adds CPU usage to `citus_stat_tenants` monitor.
CPU usage is tracked in periods, similar to query counts.
(cherry picked from commit 9ba70696f7)
Fixes the bug that caused the citus_stat_tenants periods to be updated
incorrectly.
`TimestampDifferenceExceeds` expects the difference in milliseconds but
was given microseconds; this is fixed.
`tenantStats->lastQueryTime` was also updated during monitoring; now it
is updated only when there are tenant queries.
(cherry picked from commit 8b50e95dc8)
The CDC decoder builds different versions of the CDC base decoders
during the build. Since the source files are copied into temporary
directories, they show up in `git status` as files to be added, so these
directories and the temporary CDC TAP test directory (tmpcheck) are
added to the .gitignore file.
DESCRIPTION:
Makefile changes to build different versions of the CDC decoder for
different base decoders, such as pgoutput and wal2json, with the same
name, and to copy them to the $packagelib/cdc_decoders dir. This lets
users use logical replication slots normally with pgoutput without being
aware of the CDC decoder.
1. Changed src/backend/distributed/cdc/Makefile to set up a build
directory for CDC in the build-cdc-$(DECODER) dir, copy the source files
(.c, .h and Makefile.decoder) into the build dir, and build it for each
base decoder.
2. Copy pgoutput.so and wal2json.so into the above build dir and install
them in the PG packagelibdir/citus_decoders directory.
3. Added a testcase 016_cdc_wal2json.pl for testing the wal2json decoder
using the pg_recv_logical_changes function.
DESCRIPTION: Adds control for background task executors involving a node
### Background and motivation
Nonblocking concurrent task execution via background workers was
introduced in [#6459](https://github.com/citusdata/citus/pull/6459), and
concurrent shard moves in the background rebalancer were introduced in
[#6756](https://github.com/citusdata/citus/pull/6756) - with a hard
dependency that limits to 1 shard move per node. As we know, a shard
move consists of a shard moving from a source node to a target node. The
hard dependency was used because the background task runner didn't have
an option to limit the parallel shard moves per node.
With the motivation of controlling the number of concurrent shard
moves that involve a particular node, either as source or target, this
PR introduces a general new GUC
citus.max_background_task_executors_per_node to be used in the
background task runner infrastructure. So, why do we even want to
control and limit the concurrency? Well, it's all about resource
availability: because the moves involve the same nodes, extra
parallelism won’t make the rebalance complete faster if some resource is
already maxed out (usually CPU or disk). Or, if the cluster is being
used in a production setting, the moves might compete for resources with
production queries much more than if they had been executed
sequentially.
### How does it work?
A new column named nodes_involved is added to the catalog table that
keeps track of the scheduled background tasks,
pg_dist_background_task. It is of type integer[] - to store a list
of node ids. It is NULL by default - the column will be filled by the
rebalancer, but we may not care about the nodes involved in other uses
of the background task runner.
Table "pg_catalog.pg_dist_background_task"
Column | Type
============================================
job_id | bigint
task_id | bigint
owner | regrole
pid | integer
status | citus_task_status
command | text
retry_count | integer
not_before | timestamp with time zone
message | text
+nodes_involved | integer[]
A hashtable named ParallelTasksPerNode keeps track of the number of
parallel running background tasks per node. An entry in the hashtable is
as follows:
```
ParallelTasksPerNodeEntry
{
    node_id  // the node id is used as the hash table key
    counter  // number of concurrent background tasks that involve node node_id
             // the counter limit is citus.max_background_task_executors_per_node
}
```
When the background task runner assigns a runnable task to a new
executor, it increments the counter for each of the nodes involved with
that runnable task. The limit of each counter is
citus.max_background_task_executors_per_node. If the limit is reached
for any of the nodes involved, this runnable task is skipped. And then,
later, when the running task finishes, the background task runner
decrements the counter for each of the nodes involved with the done
task. The following functions take care of these increment-decrement
steps:
```
IncrementParallelTaskCountForNodesInvolved(task)
DecrementParallelTaskCountForNodesInvolved(task)
```
citus.max_background_task_executors_per_node can be changed on the
fly. In the background rebalancer, we simply pass {source_node,
target_node} as the nodesInvolved input to the
ScheduleBackgroundTask function. The rest is taken care of by the
general background task runner infrastructure explained above. See the
background_task_queue_monitor.sql and
background_rebalance_parallel.sql tests for detailed examples.
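A hedged usage sketch of the new GUC and catalog column (the values are
illustrative; the columns follow the table shown above):
```sql
-- Allow at most 2 concurrent background task executors that involve any
-- given node; the GUC can be changed on the fly.
ALTER SYSTEM SET citus.max_background_task_executors_per_node = 2;
SELECT pg_reload_conf();

-- Inspect which nodes each scheduled task involves; NULL means the
-- scheduler did not record any nodes for that task.
SELECT task_id, status, nodes_involved FROM pg_dist_background_task;
```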
#### Note
This PR also adds a hard node dependency if a node is first being used
as a source for a move, and then later as a target. The reason this
should be a hard dependency is that the first move might make space for
the second move. So, we could run out of disk space (or at least
overload the node) if we move the second shard to it before the first
one is moved away.
Fixes https://github.com/citusdata/citus/issues/6716
---------
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Fixes flakiness in multi_metadata_sync test
https://app.circleci.com/pipelines/github/citusdata/citus/31863/workflows/ea937480-a4cc-4646-815c-bb2634361d98/jobs/1074457
```diff
SELECT
logicalrelid, repmodel
FROM
pg_dist_partition
WHERE
logicalrelid = 'mx_test_schema_1.mx_table_1'::regclass
OR logicalrelid = 'mx_test_schema_2.mx_table_2'::regclass;
logicalrelid | repmodel
-----------------------------+----------
- mx_test_schema_1.mx_table_1 | s
mx_test_schema_2.mx_table_2 | s
+ mx_test_schema_1.mx_table_1 | s
(2 rows)
```
This is a simple issue of missing `ORDER BY` clauses. I went ahead and
added some other missing ones in the same file as well. Also, I replaced
existing `ORDER BY logicalrelid` with `ORDER BY logicalrelid::text`, in
order to compare names, not OIDs.
DESCRIPTION: Adds views that monitor statistics on tenant usages
This PR adds `citus_stats_tenants` view that monitors the tenants on the
cluster.
`citus_stats_tenants` shows the node id, colocation id, tenant
attribute, read count in this period and last period, and query count in
this period and last period of the tenant.
The tenant attribute is currently the tenant's distribution column
value; later, when schema-based sharding is introduced, this meaning
might change.
A period is a time bucket that queries are counted by. Read and query
counts for this period can increase until the current period ends; after
that, those counts are moved to the last period's counts, which cannot
change. The period length can be set using `citus.stats_tenants_period`.
`SELECT` queries are counted as _read_ queries; `INSERT`, `UPDATE` and
`DELETE` queries are counted as _write_ queries. So, in the view, read
counts are `SELECT` counts and query counts are the total of `SELECT`,
`INSERT`, `UPDATE` and `DELETE` counts.
The data is stored in shared memory, in a struct named
`MultiTenantMonitor`.
`citus_stats_tenants` shows the data from local tenants.
`citus_stats_tenants` shows up to `citus.stats_tenant_limit` number of
tenants.
The tenants are scored based on the number of queries they run and the
recency of those queries. Every query run increases the tenant's score
by `ONE_QUERY_SCORE`, and after every period ends the scores are halved.
Halving is done lazily.
To retain information longer, the monitor keeps up to 3 times
`citus.stats_tenant_limit` tenants. When the tenant count hits `3 *
citus.stats_tenant_limit`, the last `citus.stats_tenant_limit` tenants
are removed. To see all stored tenants you can use
`citus_stats_tenants(return_all_tenants := true)`.
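A hedged usage sketch using the names as described above (some of these
names were later changed by the rename PR, #6821; the period value is
illustrative):
```sql
-- Period length for the time buckets described above (value illustrative).
SET citus.stats_tenants_period = 60;

-- Top tenants tracked on this node, limited by citus.stats_tenant_limit.
SELECT * FROM citus_stats_tenants;

-- All stored tenants, up to 3 * citus.stats_tenant_limit.
SELECT * FROM citus_stats_tenants(return_all_tenants := true);
```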
- [x] Create collector view that gets data from all nodes. #6761
- [x] Add monitoring log #6762
- [x] Create enable/disable GUC #6769
- [x] Parse the annotation string correctly #6796
- [x] Add local queries and prepared statements #6797
- [x] Rename to citus_stat_statements #6821
- [x] Run pgbench
- [x] Fix role permissions #6812
---------
Co-authored-by: Gokhan Gulbiz <ggulbiz@gmail.com>
Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
Over the last few months run_test.py got more and more complex. This
refactors the code in `run_test.py` to make it easier to understand,
mostly by splitting separate pieces of logic into separate functions.
In CI we would sometimes get this failure:
```diff
-- The original shard is marked for deferred drop with policy_type = 2.
-- The previous shard should be dropped at the beginning of the second split call
SELECT * from pg_dist_cleanup;
record_id | operation_id | object_type | object_name | node_group_id | policy_type
-----------+--------------+-------------+--------------------------------------------------------------------------+---------------+-------------
+ 60 | 778 | 3 | citus_shard_split_slot_18_21216_778 | 16 | 0
512 | 778 | 1 | citus_split_shard_by_split_points_deferred_schema.table_to_split_8981001 | 16 | 2
-(1 row)
+(2 rows)
```
Replication slots sometimes cannot be deleted right away, which is hard
to resolve, but luckily we can easily filter these cleanup records out
by filtering on policy_type.
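A sketch of the stabilized check, filtering on the deferred-drop policy
mentioned in the test comment above:
```sql
-- Only deferred-drop records (policy_type = 2); leftover replication-slot
-- cleanup records no longer show up in the expected output.
SELECT * FROM pg_dist_cleanup WHERE policy_type = 2;
```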
While debugging this issue I learnt that we did not use
`GetNextCleanupRecordId` in all places where we created cleanup
records. This caused test failures when running tests multiple times,
when they set `citus.next_cleanup_record_id`. I tried fixing that by
calling GetNextCleanupRecordId in all places, but that caused many
other tests to fail due to deadlocks. So instead, this addresses the
issue by using `ALTER SEQUENCE ... RESTART` instead of
`citus.next_cleanup_record_id`. In a follow-up PR we should probably get
rid of `citus.next_cleanup_record_id`, since it's only used in one other
file.
DESCRIPTION: Fix an issue that caused some queries with custom
aggregates to fail
While playing around with https://github.com/pgvector/pgvector I noticed
that the AVG query was broken. That's because we treat it like any other
AVG by breaking it down into SUM and COUNT, but there are no SUM/COUNT
functions in this case; there is, however, a perfectly usable
combinefunc. This PR changes our aggregate logic to prefer custom
aggregates with a combinefunc even if they have a common name.
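For illustration, a custom aggregate of this shape (the function and type
names are hypothetical; pgvector's actual definitions may differ): it
defines a combinefunc rather than being decomposable into SUM and COUNT.
```sql
-- avg over a hypothetical 'myvector' type: no SUM/COUNT pair exists, but
-- the combinefunc lets per-worker transition states be merged.
CREATE AGGREGATE avg (myvector) (
    SFUNC = myvector_avg_accum,
    STYPE = double precision[],
    FINALFUNC = myvector_avg_final,
    COMBINEFUNC = myvector_avg_combine,
    PARALLEL = SAFE
);
```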
Co-authored-by: Marco Slot <marco.slot@gmail.com>
DESCRIPTION:
- The CDC decoder is refactored into a separate extension that can be
loaded dynamically without having to reload citus.
- The CDC decoder code can be compiled using the DECODER flag to work
with different decoders like pgoutput and wal2json; by default the base
decoder is "pgoutput".
- The dynamic_library_path config is adjusted dynamically to prefer the
decoders in the cdc_decoders directory during citus init, so that users
can use the replication subscription commands without having to make any
config changes.
DESCRIPTION: Refactor and unify shard move and copy functions
Shard move and copy functions share a lot of code in common. This PR
unifies these functions into one, along with some helper functions. To
preserve the current behavior, we'll introduce and use an enum
parameter, and hardcoded strings for producing error/warning messages.
For some tests, such as upgrade tests and arbitrary config tests, we set
up the citus cluster using Python. This setup is slightly different from
the Perl-based setup script (`multi_regress.pl`); most importantly, it
uses replication factor 1 by default.
This changes our run_test.py script to be able to run a schedule using
python instead of `multi_regress.pl`, for the tests that require it.
For now arbitrary config tests are still not runnable with
`run_test.py`, but this brings us one step closer to being able to do
that.
Fixes #6804
Having as little Perl as possible in our repo seems a worthy goal.
Sadly, Postgres's Perl-based TAP infrastructure was the only way we
could run tests that were hard to do using only SQL commands. This
change adds infrastructure to run such "application style tests" using
Python and converts all our existing Perl TAP tests to this new
infrastructure.
Some of the helper functions that are added in this PR are currently
unused. Most of these will be used by the CDC PR that depends on this.
Some others are there because they were needed by the PgBouncer test
framework that this is based on, and the functions seemed useful enough
to citus testing to keep.
The main features of the test suite are:
1. Application style tests using a programming language that our
developers know how to write.
2. Caching of Citus clusters in-between tests using the ["fixture"
pattern][fixture] from `pytest` to achieve speedy tests. To make this
work in practice any changes made during a test are automatically
undone. Schemas, replication slots, subscriptions, publications are
dropped at the end of each test. And any changes made by `ALTER SYSTEM`
or manually editing of `pg_hba.conf` are undone too.
3. Automatic parallel execution of tests using the `-n auto` flag that's
added by `pytest-xdist`. This improved the speed of tests greatly with
the similar test framework I created for PgBouncer. Right now it doesn't
help much yet though, since this PR only adds two tests (one of which
takes ~10 times longer than the other).
Possible future improvements are:
1. Clean up even more things at the end of each test (e.g. users that
were created). These are fairly easy to add, but I have not done so yet
since they were not needed yet for this PR or the CDC PR. So I would not
be able to test the cleanup easily.
2. Support for query block detection similar to what we can now do using
isolation tests.
[fixture]: https://docs.pytest.org/en/6.2.x/fixture.html
**Motivation**
Some customers experienced **out of memory** or **max allocation block
size** errors during metadata sync when they had a lot of shards,
partitions, indexes, or columns. This PR aims to prevent those two types
of memory failures, boosting the scalability of Citus and unblocking
customers with huge clusters by letting them **add new nodes** and
**upgrade their Citus version above 11.0**, which introduced important
features, e.g. querying from any node.
**Problems**
Memory errors are caused by the fact that we finish all the metadata
sync operations within a single coordinated transaction,
which causes mainly 3 problems:
1. Collecting metadata sync commands without freeing until the end of
the transaction,
2. Each modification causes PG invalidations related to cache memory. PG
stores those invalidations until the end of transaction (for visibility
guarantees) to notify other backends about the invalidations. As we do a
lot of modifications during the metadata syncing within single
coordinated transaction, PG can sometimes exceed max allocation block
size at worker nodes due to huge invalidation messages,
3. Citus has MetadataCacheMemory for fast access to metadata objects. To
see the effects of the modifications inside the same transaction, we
locally process PG invalidations and rebuild many objects without
freeing invalidated ones until the end of transaction for simplicity.
**Solution**
We decided to add nontransactional mode for metadata sync, where we send
each command in separate transaction and reset memory context after each
transaction. User can switch to nontransactional mode via a GUC if they
hit memory problems during the sync. (Default mode is transactional) We
created a common api for both transactional (old mode) and
nontransactional modes to have a uniform code and to not disturb test
coverage by introducing new code paths.
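A hedged sketch of switching modes (the GUC appears as
`citus.metadata_sync_mode` in the DESCRIPTION lines below and as
`citus.metadata_sync_transaction_mode` in the commit list; the value
name and the node address are assumptions):
```sql
-- Switch metadata sync to nontransactional mode before activating a node
-- on a cluster that hits OOM / max allocation block size errors.
SET citus.metadata_sync_mode TO 'nontransactional';
SELECT citus_add_node('worker-1', 5432);
```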
Below items are addressed for the solution:
- [x] **Commit-1** Add a method to send multiple commands to worker list
reusing bare connections. Change will be useful for metadata sync api,
- [x] **Commit-2** Create MetadataSyncContext api to encapsulate both
transactional and nontransactional modes,
- [x] **Commit-3** Let nontransactional sync mode create transaction per
shell table during dropping the shell tables from worker,
- [x] **Commit-4** Add new metadata sync methods which use the
MetadataSyncContext API so that during the sync we can
1. free memory to prevent OOM,
2. use either transactional or nontransactional modes according to the
GUC `citus.metadata_sync_transaction_mode`.
- [x] **Commit-5** Let `ActivateNode` use new metadata sync api,
- [x] **Commit-6** Let `activate_node_snapshot` use new metadata sync
api,
- [x] **Commit-7** Remove unused old metadata sync methods,
- [x] **Commit-8** Drop table, if exists, during table dependency
creation,
- [x] **Commit-9** Do not enforce distributed transaction at
`EnsureCoordinatorInitiatedOperation`,
- [x] **Commit-10** Do not acquire strict lock on separate transaction
to localhost as we already take the lock before,
- [x] **Commit-11** Let `AddNodeMetadata` use the metadata sync API during
`citus_add_node`,
- [x] **Commit-12** Force activated bare connections to close at
transaction end,
- [x] **Commit-13** Add failure tests for nontransactional metadata sync
mode,
- [x] Verify OOM and max allowed allocation block errors do not happen
with nontransactional sync mode.
DESCRIPTION: Fixes memory leak and max allocation block errors during
metadata syncing.
DESCRIPTION: Introduces nontransactional mode for metadata sync.
DESCRIPTION: Introduces the GUC `citus.metadata_sync_mode` to switch
sync modes.
Add new metadata sync methods which use the MetadataSyncContext API so
that during the sync we can
- free memory to prevent OOM,
- use either transactional or nontransactional modes according to the
GUC.
- Create the MetadataSyncContext API to encapsulate both transactional
and nontransactional modes,
- Add a GUC to switch between metadata sync transaction modes.
This pull request proposes a change to the logic used for propagating
identity columns to worker nodes in citus. Instead of creating a
dependent sequence for each identity column and changing its default
value to `nextval(seq)/worker_nextval(seq)`, this update will pass the
identity columns as-is to the worker nodes.
Please note that there are a few limitations to this change.
1. Only bigint identity columns will be allowed in distributed tables to
ensure compatibility with the DDL from any node functionality. Our
current distributed sequence implementation only allows insert
statements from all nodes for bigint sequences.
2. `alter_distributed_table` and `undistribute_table` operations will
not be allowed for tables with identity columns. This is because we do
not have a proper way of keeping sequence states consistent across the
cluster.
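For illustration, a distributed table using the only shape allowed after
this change, a `bigint` identity column (table and column names are
hypothetical):
```sql
CREATE TABLE items (
    item_id bigint GENERATED ALWAYS AS IDENTITY,
    tenant_id bigint NOT NULL,
    payload text
);
-- The identity column is propagated as-is to the workers instead of being
-- rewritten to nextval()/worker_nextval() on a dependent sequence.
SELECT create_distributed_table('items', 'tenant_id');
```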
DESCRIPTION: Prevents using identity columns on data types other than
`bigint` on distributed tables
DESCRIPTION: Prevents using `alter_distributed_table` and
`undistribute_table` UDFs when a table has identity columns
DESCRIPTION: Fixes a bug that prevents enforcing identity column
restrictions on worker nodes
Depends on #6740. Fixes #6694.
DESCRIPTION: This PR removes the task dependencies between shard moves
for which the shards belong to different colocation groups. This change
results in scheduling multiple tasks in the RUNNABLE state. Therefore it
is possible that the background task monitor can run them concurrently.
Previously, all the shard moves planned in a rebalance operation took
dependency on each other sequentially.
For instance, given the following tables and shards
```
colocation group 1          colocation group 2
table1    table2            table3   table4   table5
shard11   shard21           shard31  shard41  shard51
shard12   shard22           shard32  shard42  shard52
```
if the rebalancer planner returned the below set of moves
` {move(shard11), move(shard12), move(shard41), move(shard42)}`
background rebalancer scheduled them such that they depend on each other
sequentially.
```
{move(reftables) if there is any, none}
        |
  move(shard11)
        |
  move(shard12)
        |    <--- move(shard41) depending on move(shard12) is an artificial dependency
  move(shard41)
        |
  move(shard42)
```
This results in artificial dependencies between otherwise independent
moves.
Considering that the shards in different colocation groups can be moved
concurrently, this PR changes the dependency relationship between the
moves as follows:
```
{move(reftables) if there is any, none} {move(reftables) if there is any, none}
| |
move(shard11) move(shard41)
| |
move(shard12) move(shard42)
```
---------
Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>