citus

Commit Graph

Author	SHA1	Message	Date
Naisila Puka	0d503dd5ac	PG16 compatibility: ruleutils and successful CREATE EXTENSION (#7087 ) PG16 compatibility - Part 2 Part 1 provided successful compilation against pg16beta2. `42d956888d` This PR provides ruleutils changes with pg16beta2 and successful CREATE EXTENSION command. Note that more changes are needed in order to have successful regression tests. More commits are coming soon ... For any_value changes, I referred to this commit `8ef94dc1f5` where we did something similar for PG14 support.	2023-08-02 16:04:51 +03:00
Halil Ozan Akgül	c99a93ffa7	Move SQL file changes for citus_shard_sizes fixes into the new 11.3-2 version (#7050 ) This PR moves `citus_shard_sizes` changes from #7003, and #7018 to into a new Citus version, 11.3-2	2023-07-14 17:19:54 +03:00
aykut-bozkurt	609a5465ea	Bump Citus version into 12.1devel (#7061 )	2023-07-14 13:12:30 +03:00
Halil Ozan Akgül	613cced1ae	Use citus_shard_sizes in citus_tables (#7018 ) Fixes #7019 This PR updates citus_tables view to use citus_shard_sizes function, instead of citus_total_relation_size to improve performance.	2023-07-05 11:40:34 +03:00
Jelte Fennema	ac24e11986	Change default rebalance strategy to by_disk_size (#7033 ) DESCRIPTION: Change default rebalance strategy to by_disk_size When introducing rebalancing by disk size we didn't make it the default initially. The main reason was, because we expected some problems with it. We have indeed had some problems/bugs with it over the years, and have fixed all of them. By now we're quite confident in its stability, and that it pretty much always gives better results than by_shard_count. So this PR makes by_disk_size the new default. We don't change the default when some other strategy than by_shard_count is the current default. This is in case someone defined their own rebalance strategy and marked this as the default themselves. Note: It explicitly does nothing during a downgrade, because there's no way of knowing if the rebalance strategy before the upgrade was by_disk_size or by_shard_count. And even in previous versions by_disk_size is considered superior for quite some time.	2023-07-03 11:08:24 +02:00
Xin Li	c10cb50aa9	Support custom cast from / to timestamptz in time partition management UDFs (#6923 ) This is to implement custom cast of table partition column type from / to `timestamptz` in time partition management UDFs, as proposed in ticket #6454 The general idea is for a time partition column with type other than `date`, `timestamp`, or `timestamptz`, users can provide custom bidirectional cast between the column type and `timestamptz`, the UDFs then will be able to create and drop time partitions for such tables. Fixes #6454 --------- Signed-off-by: Xin Li <xin@swirldslabs.com> Co-authored-by: Marco Slot <marco.slot@microsoft.com> Co-authored-by: Ahmet Gedemenli <afgedemenli@gmail.com>	2023-06-19 17:49:05 +03:00
Halil Ozan Akgül	04f6868ed2	Add citus_schemas view (#6979 ) DESCRIPTION: Adds citus_schemas view The citus_schemas view will be created in public schema if it exists, if not the view will be created in pg_catalog. Need to: - [x] Add tests - [x] Fix tests	2023-06-16 14:21:58 +03:00
Onur Tirtir	dbdf04e8ba	Rename pg_dist tenant_schema to pg_dist_schema (#7001 )	2023-06-14 12:12:15 +03:00
Halil Ozan Akgül	772d194357	Changes citus_shard_sizes view's Shard Name Column to Shard Id (#7003 ) citus_shard_sizes view had a shard name column we use to extract shard id. This PR changes the column to shard id so we don't do unnecessary string operation.	2023-06-13 16:36:35 +03:00
Gokhan Gulbiz	e0ccd155ab	Make citus_stat_tenants work with schema-based tenants. (#6936 ) DESCRIPTION: Enabling citus_stat_tenants to support schema-based tenants. This pull request modifies the existing logic to enable tenant monitoring with schema-based tenants. The changes made are as follows: - If a query has a partitionKeyValue (which serves as a tenant key/identifier for distributed tables), Citus annotates the query with both the partitionKeyValue and colocationId. This allows for accurate tracking of the query. - If a query does not have a partitionKeyValue, but its colocationId belongs to a distributed schema, Citus annotates the query with only the colocationId. The tenant monitor can then easily look up the schema to determine if it's a distributed schema and make a decision on whether to track the query. --------- Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>	2023-06-13 14:11:45 +03:00
aykut-bozkurt	213d363bc3	Add citus_schema_distribute/undistribute udfs to convert a schema into a tenant schema / back to a regular schema (#6933 ) * Currently we do not allow any Citus tables other than Citus local tables inside a regular schema before executing `citus_schema_distribute`. * `citus_schema_undistribute` expects only single shard distributed tables inside a tenant schema. DESCRIPTION: Adds the udf `citus_schema_distribute` to convert a regular schema into a tenant schema. DESCRIPTION: Adds the udf `citus_schema_undistribute` to convert a tenant schema back to a regular schema. --------- Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2023-06-12 18:41:31 +03:00
Halil Ozan Akgül	7e486345f1	Fix citus_table_type column in citus_tables and citus_shards views for single shard tables (#6971 ) `citus_table_type` column of `citus_tables` and `citus_shards` will show "schema" for tenants schema tables and "distributed" for single shard tables that are not in a tenant schema.	2023-06-06 16:20:11 +03:00
Onur Tirtir	246b054a7d	Add support for schema-based-sharding via a GUC (#6866 ) DESCRIPTION: Adds citus.enable_schema_based_sharding GUC that allows sharding the database based on schemas when enabled. * Refactor the logic that automatically creates Citus managed tables * Refactor CreateSingleShardTable() to allow specifying colocation id instead * Add support for schema-based-sharding via a GUC ### What this PR is about: Add citus.enable_schema_based_sharding GUC to enable schema-based sharding. Each schema created while this GUC is ON will be considered as a tenant schema. Later on, regardless of whether the GUC is ON or OFF, any table created in a tenant schema will be converted to a single shard distributed table (without a shard key). All the tenant tables that belong to a particular schema will be co-located with each other and will have a shard count of 1. We introduce a new metadata table --pg_dist_tenant_schema-- to do the bookkeeping for tenant schemas: ```sql psql> \d pg_dist_tenant_schema Table "pg_catalog.pg_dist_tenant_schema" ┌───────────────┬─────────┬───────────┬──────────┬─────────┐ │ Column │ Type │ Collation │ Nullable │ Default │ ├───────────────┼─────────┼───────────┼──────────┼─────────┤ │ schemaid │ oid │ │ not null │ │ │ colocationid │ integer │ │ not null │ │ └───────────────┴─────────┴───────────┴──────────┴─────────┘ Indexes: "pg_dist_tenant_schema_pkey" PRIMARY KEY, btree (schemaid) "pg_dist_tenant_schema_unique_colocationid_index" UNIQUE, btree (colocationid) psql> table pg_dist_tenant_schema; ┌───────────┬───────────────┐ │ schemaid │ colocationid │ ├───────────┼───────────────┤ │ 41963 │ 91 │ │ 41962 │ 90 │ └───────────┴───────────────┘ (2 rows) ``` Colocation id column of pg_dist_tenant_schema can never be NULL even for the tenant schemas that don't have a tenant table yet. This is because, we assign colocation ids to tenant schemas as soon as they are created. That way, we can keep associating tenant schemas with particular colocation groups even if all the tenant tables of a tenant schema are dropped and recreated later on. When a tenant schema is dropped, we delete the corresponding row from pg_dist_tenant_schema. In that case, we delete the corresponding colocation group from pg_dist_colocation as well. ### Future work for 12.0 release: We're building schema-based sharding on top of the infrastructure that adds support for creating distributed tables without a shard key (https://github.com/citusdata/citus/pull/6867). However, not all the operations that can be done on distributed tables without a shard key necessarily make sense (in the same way) in the context of schema-based sharding. For example, we need to think about what happens if user attempts altering schema of a tenant table. We will tackle such scenarios in a future PR. We will also add a new UDF --citus.schema_tenant_set() or such-- to allow users to use an existing schema as a tenant schema, and another one --citus.schema_tenant_unset() or such-- to stop using a schema as a tenant schema in future PRs.	2023-05-26 10:49:58 +03:00
Onur Tirtir	e7abde7e81	Prevent downgrades when there is a single-shard table in the cluster (#6908 ) Also add a few tests for Citus/PG upgrade/downgrade scenarios.	2023-05-16 09:44:28 +02:00
Halil Ozan Akgül	9ba70696f7	Add CPU usage to citus_stat_tenants (#6844 ) This PR adds CPU usage to `citus_stat_tenants` monitor. CPU usage is tracked in periods, similar to query counts.	2023-04-12 16:23:00 +03:00
Onur Tirtir	0194657c5d	Bump Citus to 12.0devel (#6840 )	2023-04-10 12:05:18 +03:00
Naisila Puka	84f2d8685a	Adds control for background task executors involving a node (#6771 ) DESCRIPTION: Adds control for background task executors involving a node ### Background and motivation Nonblocking concurrent task execution via background workers was introduced in [#6459](https://github.com/citusdata/citus/pull/6459), and concurrent shard moves in the background rebalancer were introduced in [#6756](https://github.com/citusdata/citus/pull/6756) - with a hard dependency that limits to 1 shard move per node. As we know, a shard move consists of a shard moving from a source node to a target node. The hard dependency was used because the background task runner didn't have an option to limit the parallel shard moves per node. With the motivation of controlling the number of concurrent shard moves that involve a particular node, either as source or target, this PR introduces a general new GUC citus.max_background_task_executors_per_node to be used in the background task runner infrastructure. So, why do we even want to control and limit the concurrency? Well, it's all about resource availability: because the moves involve the same nodes, extra parallelism won’t make the rebalance complete faster if some resource is already maxed out (usually cpu or disk). Or, if the cluster is being used in a production setting, the moves might compete for resources with production queries much more than if they had been executed sequentially. ### How does it work? A new column named nodes_involved is added to the catalog table that keeps track of the scheduled background tasks, pg_dist_background_task. It is of type integer[] - to store a list of node ids. It is NULL by default - the column will be filled by the rebalancer, but we may not care about the nodes involved in other uses of the background task runner. Table "pg_catalog.pg_dist_background_task" Column \| Type ============================================ job_id \| bigint task_id \| bigint owner \| regrole pid \| integer status \| citus_task_status command \| text retry_count \| integer not_before \| timestamp with time zone message \| text +nodes_involved \| integer[] A hashtable named ParallelTasksPerNode keeps track of the number of parallel running background tasks per node. An entry in the hashtable is as follows: ParallelTasksPerNodeEntry { node_id // The node is used as the hash table key counter // Number of concurrent background tasks that involve node node_id // The counter limit is citus.max_background_task_executors_per_node } When the background task runner assigns a runnable task to a new executor, it increments the counter for each of the nodes involved with that runnable task. The limit of each counter is citus.max_background_task_executors_per_node. If the limit is reached for any of the nodes involved, this runnable task is skipped. And then, later, when the running task finishes, the background task runner decrements the counter for each of the nodes involved with the done task. The following functions take care of these increment-decrement steps: IncrementParallelTaskCountForNodesInvolved(task) DecrementParallelTaskCountForNodesInvolved(task) citus.max_background_task_executors_per_node can be changed in the fly. In the background rebalancer, we simply give {source_node, target_node} as the nodesInvolved input to the ScheduleBackgroundTask function. The rest is taken care of by the general background task runner infrastructure explained above. Check background_task_queue_monitor.sql and background_rebalance_parallel.sql tests for detailed examples. #### Note This PR also adds a hard node dependency if a node is first being used as a source for a move, and then later as a target. The reason this should be a hard dependency is that the first move might make space for the second move. So, we could run out of disk space (or at least overload the node) if we move the second shard to it before the first one is moved away. Fixes https://github.com/citusdata/citus/issues/6716	2023-04-06 14:12:39 +03:00
Gokhan Gulbiz	fa00fc6e3e	Add upgrade/downgrade paths between v11.2.2 and v11.3.1 (#6820 ) DESCRIPTION: PR description that will go into the change log, up to 78 characters --------- Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>	2023-04-06 12:46:09 +03:00
Halil Ozan Akgül	52ad2d08c7	Multi tenant monitoring (#6725 ) DESCRIPTION: Adds views that monitor statistics on tenant usages This PR adds `citus_stats_tenants` view that monitors the tenants on the cluster. `citus_stats_tenants` shows the node id, colocation id, tenant attribute, read count in this period and last period, and query count in this period and last period of the tenant. Tenant attribute currently is the tenant's distribution column value, later when schema based sharding is introduced, this meaning might change. A period is a time bucket the queries are counted by. Read and query counts for this period can increase until the current period ends. After that those counts are moved to last period's counts, which cannot change. The period length can be set using 'citus.stats_tenants_period'. `SELECT` queries are counted as _read_ queries, `INSERT`, `UPDATE` and `DELETE` queries are counted as _write_ queries. So in the view read counts are `SELECT` counts and query counts are `SELECT`, `INSERT`, `UPDATE` and `DELETE` count. The data is stored in shared memory, in a struct named `MultiTenantMonitor`. `citus_stats_tenants` shows the data from local tenants. `citus_stats_tenants` show up to `citus.stats_tenant_limit` number of tenants. The tenants are scored based on the number of queries they run and the recency of those queries. Every query ran increases the score of tenant by `ONE_QUERY_SCORE`, and after every period ends the scores are halved. Halving is done lazily. To retain information a longer the monitor keeps up to 3 times `citus.stats_tenant_limit` tenants. When the tenant count hits `3 * citus.stats_tenant_limit`, last `citus.stats_tenant_limit` tenants are removed. To see all stored tenants you can use `citus_stats_tenants(return_all_tenants := true)` - [x] Create collector view that gets data from all nodes. #6761 - [x] Add monitoring log #6762 - [x] Create enable/disable GUC #6769 - [x] Parse the annotation string correctly #6796 - [x] Add local queries and prepared statements #6797 - [x] Rename to citus_stat_statements #6821 - [x] Run pgbench - [x] Fix role permissions #6812 --------- Co-authored-by: Gokhan Gulbiz <ggulbiz@gmail.com> Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2023-04-05 17:44:17 +03:00
aykutbozkurt	fe00b3263a	PR #6728 / commit - 10 Do not acquire strict lock on separate transaction to localhost as we already take the lock before. But make sure that caller has the ExclusiveLock.	2023-03-30 11:06:16 +03:00
aykutbozkurt	8feb8c634a	PR #6728 / commit - 3 Let nontransactional sync mode create transaction per shell table during dropping the shell tables from worker.	2023-03-30 10:53:20 +03:00
Gokhan Gulbiz	e71bfd6074	Identity column implementation refactorings (#6738 ) This pull request proposes a change to the logic used for propagating identity columns to worker nodes in citus. Instead of creating a dependent sequence for each identity column and changing its default value to `nextval(seq)/worker_nextval(seq)`, this update will pass the identity columns as-is to the worker nodes. Please note that there are a few limitations to this change. 1. Only bigint identity columns will be allowed in distributed tables to ensure compatibility with the DDL from any node functionality. Our current distributed sequence implementation only allows insert statements from all nodes for bigint sequences. 2. `alter_distributed_table` and `undistribute_table` operations will not be allowed for tables with identity columns. This is because we do not have a proper way of keeping sequence states consistent across the cluster. DESCRIPTION: Prevents using identity columns on data types other than `bigint` on distributed tables DESCRIPTION: Prevents using `alter_distributed_table` and `undistribute_table` UDFs when a table has identity columns DESCRIPTION: Fixes a bug that prevents enforcing identity column restrictions on worker nodes Depends on #6740 Fixes #6694	2023-03-30 10:41:01 +03:00
Marco Slot	b09d239809	Propagate CREATE PUBLICATION statements	2023-03-29 00:59:12 +02:00
rajeshkt78	85b8a2c7a1	CDC implementation for Citus using Logical Replication (#6623 ) Description: Implementing CDC changes using Logical Replication to avoid re-publishing events multiple times by setting up replication origin session, which will add "DoNotReplicateId" to every WAL entry. - shard splits - shard moves - create distributed table - undistribute table - alter distributed tables (for some cases) - reference table operations The citus decoder which will be decoding WAL events for CDC clients, ignores any WAL entry with replication origin that is not zero. It also maps the shard names to distributed table names.	2023-03-28 16:00:21 +05:30
Onur Tirtir	483b51392f	Bump Citus to 11.3devel (#6690 )	2023-02-06 10:23:25 +00:00
Hanefi Onaldi	47ff03123b	Improve rebalance reporting for retried tasks (#6683 ) If there is a problem with an ongoing rebalance, we did not show details on background tasks that are stuck in runnable state. Similar to how we show details for errored tasks, we now show details on tasks that are being retried. Earlier we showed the following output when a task was stuck: ``` ┌────────────────────────────┐ │ { ↵│ │ "tasks": [ ↵│ │ ], ↵│ │ "task_state_counts": {↵│ │ "done": 13, ↵│ │ "blocked": 2, ↵│ │ "runnable": 1 ↵│ │ } ↵│ │ } │ └────────────────────────────┘ ``` Now we show details like the following: ``` +----------------------------------------------------------------------- \| { \| "tasks": [ \| { \| "state": "runnable", \| "command": "SELECT pg_catalog.citus_move_shard_placement(1 \| "message": "ERROR: Moving shards to a node that shouldn't \| "retried": 2, \| "task_id": 3 \| } \| ], \| "task_state_counts": { \| "blocked": 1, \| "runnable": 1 \| } \| } +----------------------------------------------------------------------- ```	2023-01-31 15:26:52 +03:00
Ahmet Gedemenli	b3b135867e	Remove shardstate from placement insert functions (#6615 )	2023-01-18 09:52:38 +01:00
Hanefi Onaldi	f21dfd5fae	Rebalance Progress Reporting API (#6576 ) citus_job_list() lists all background jobs by simply showing the records in pg_dist_background_job. citus_job_status(job_id bigint, raw boolean default false) shows the status of a single background job by appending a jsonb details column to the associated row from pg_dist_background_job. If the raw argument is set, machine readable sizes are used instead of human readable alternatives. citus_rebalance_status(raw boolean default false) shows the status of the last rebalance operation. If the raw argument is set, machine readable sizes are used instead of human readable alternatives.	2023-01-16 16:17:31 +03:00
Ahmet Gedemenli	e5fef40c06	Introduce citus_move_shard_placement UDF with nodeid	2023-01-12 16:57:51 +03:00
Ahmet Gedemenli	e19c545fbf	Introduce citus_copy_shard_placement UDF with nodeid	2023-01-12 16:57:51 +03:00
Ahmet Gedemenli	235047670d	Drop SHARD_STATE_TO_DELETE (#6494 ) DESCRIPTION: Drop `SHARD_STATE_TO_DELETE` and use the cleanup records instead Drops the shard state that is used to mark shards as orphaned. Now we insert cleanup records into `pg_dist_cleanup` so "orphaned" shards will be dropped either by maintenance daemon or internal cleanup calls. With this PR, we make the "cleanup orphaned shards" functions to be no-op, as they would not be needed anymore. This PR includes some naming changes about placement functions. We don't need functions that filter orphaned shards, as there will be no orphaned shards anymore. We will also be introducing a small script with this PR, for users with orphaned shards. We'll basically delete the orphaned shard entries from `pg_dist_placement` and insert cleanup records into `pg_dist_cleanup` for each one of them, during Citus upgrade. We also have a lot of flakiness fixes in this PR. Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2023-01-03 14:38:16 +03:00
Nils Dijk	b5b73d78c3	add prepare and finish pg upgrade functions to 11.2-1 (#6560 ) Fixes a missed include in #6315. While adding the cluster clock we have added some extra steps to `citus_prepare_pg_upgrade` and `citus_finish_pg_upgrade`. These changes were not added to the citus upgrade and downgrade scripts, this allowed for a syntax error to slip in. This PR adds the new versions of both UDF's to the upgrade script while adding the old version to the downgrade script. This exposed the syntax error which is also solved.	2022-12-14 12:34:22 +01:00
aykut-bozkurt	1ad1a0a336	add citus_task_wait udf to wait on desired task status (#6475 ) We already have citus_job_wait to wait until the job reaches the desired state. That PR adds waiting on task state to allow more granular waiting. It can be used for Citus operations. Moreover, it is also useful for testing purposes. (wait until a task reaches specified state) Related to #6459.	2022-12-12 22:41:03 +03:00
Ahmet Gedemenli	cb02d62369	Unique names for replication artifacts (#6529 ) DESCRIPTION: Create replication artifacts with unique names We're creating replication objects with generic names. This disallows us to enable parallel shard moves, as two operations might use the same objects. With this PR, we'll create below objects with operation specific names, by appending OparationId to the names. * Subscriptions * Publications * Replication Slots * Users created for subscriptions	2022-12-06 15:48:16 +03:00
Teja Mupparti	e14dc5d45d	Address the issues/comments from the original PR# 6315 1) Regular users fail to use clock UDF with permission issue. 2) Clock functions were declared as STABLE, whereas by definition they are VOLATILE. By design, any clock/time functions will return different results for each call even within a single SQL statement. Note: UDF citus_get_transaction_clock() is a misnomer as it internally calls the clock tick which always returns different results for every invocation in the same transaction.	2022-12-05 11:06:21 -08:00
Marco Slot	666696c01c	Deprecate citus.replicate_reference_tables_on_activate, make it always off (#6474 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-11-04 16:21:10 +01:00
Teja Mupparti	01103ce05d	This implements a new UDF citus_get_cluster_clock() that returns a monotonically increasing logical clock. Clock guarantees to never go back in value after restarts, and makes best attempt to keep the value close to unix epoch time in milliseconds. Also, introduces a new GUC "citus.enable_cluster_clock", when true, every distributed transaction is stamped with logical causal clock and persisted in a catalog pg_dist_commit_transaction.	2022-10-28 10:15:08 -07:00
aykut-bozkurt	162c8a5160	Drop worker_fetch_foreign_file/worker_repartition_cleanup only if they exist when upgrading Citus (#6441 ) We should not introduce breaking sql changes to upgrade files after they are released. We did that for worker_fetch_foreign_file in v9.0.0 and worker_repartition_cleanup in v9.2.0. Later when we try to drop those udfs, they were missing for some clients unexpectedly due to breaking change in an old upgrade script. For that case, the fix is to add DROP IF EXISTS for those 2 udfs in 11.0-4--11.1-1.	2022-10-21 14:32:42 +03:00
Ahmet Gedemenli	96912d9ba1	Add status column to get_rebalance_progress() (#6403 ) DESCRIPTION: Adds status column to get_rebalance_progress() Introduces a new column named `status` for the function `get_rebalance_progress()`. For each ongoing shard move, this column will reveal information about that shard move operation's current status. For now, candidate status messages could be one of the below. * Not Started * Setting Up * Copying Data * Catching Up * Creating Constraints * Final Catchup * Creating Foreign Keys * Completing * Completed	2022-10-17 16:55:31 +03:00
Jelte Fennema	0cee79a7ab	Actually enable improved blocked process detection (#6426 ) In #6405 I added better improved blocked process detection for isolation tests. But when cleaning up unnecessary code I cleaned up a bit too much. This actually includes the new function definition in our migrations.	2022-10-13 09:50:37 +02:00
Jelte Fennema	6277ffd69e	Reduce isolation flakyness by improving blocked process detection (#6405 ) Sometimes our CI randomly fails on a test in a way similar to this: ```diff step s2-drop: DROP TABLE cancel_table; - + <waiting ...> +step s2-drop: <... completed> starting permutation: s1-timeout s1-begin s1-sleep10000 s1-rollback s1-reset s1-drop ``` Source: https://app.circleci.com/pipelines/github/citusdata/citus/26524/workflows/5415b84f-13a3-482f-bef9-648314c79a67/jobs/756377 I tried to fix that already in #6252 by disabling the maintenance daemon during isolation tests. But it seems that hasn't fixed all cases of these errors. This is another attempt at fixing these issues that seems to have better results. What it does is that it starts using the pInterestingPids parameter that citus_isolation_test_session_is_blocked receives. With this change we start filter out block-edges that are not caused by any of these pids. In passing this change also makes it possible to run `isolation_create_distributed_table_concurrently` with `check-isolation-base`	2022-10-12 16:35:09 +02:00
Ahmet Gedemenli	e36890ce55	Add source_lsn and target_lsn fields into get_rebalance_progress (#6364 ) DESCRIPTION: Adds source_lsn and target_lsn fields into get_rebalance_progress Adding two fields named `source_lsn` and `target_lsn` to the function `get_rebalance_progress`. Target lsn data is fetched in `GetShardStatistics`, by expanding the query sent to workers (joining with pg_subscription_rel and pg_stat_subscription). Then put into the hashmap, for each shard. Source lsn data is fetched in `BuildWorkerShardStatististicsHash`, in the loop that iterate each node, by sending a pg_current_wal_lsn query to each node. Then put into the hashmap, for each node.	2022-10-05 11:12:24 +03:00
Ahmet Gedemenli	d0fa10a98c	Bump Citus to 11.2devel (#6385 )	2022-09-30 14:47:42 +03:00
Jelte Fennema	f13b140621	Show citus_copy_shard_placement progress in get_rebalance_progress (#6322 ) DESCRIPTION: Show citus_copy_shard_placement progress in get_rebalance_progress When rebalancing to a new node that does not have reference tables yet the rebalancer will first copy the reference tables to the nodes. Depending on the size of the reference tables, this might take a long time. However, there's no indication of what's happening at this stage of the rebalance. This PR improves this situation by also showing the progress of any citus_copy_shard_placement calls when calling get_rebalance_progress.	2022-09-13 08:59:52 +00:00
Nils Dijk	cda3686d86	Feature: run rebalancer in the background (#6215 ) DESCRIPTION: Add a rebalancer that uses background tasks for its execution Based on the baclground jobs and tasks introduced in #6296 we implement a new rebalancer on top of the primitives of background execution. This allows the user to initiate a rebalance and let Citus execute the long running steps in the background until completion. Users can invoke the new background rebalancer with `SELECT citus_rebalance_start();`. It will output information on its job id and how to track progress. Also it returns its job id for automation purposes. If you simply want to wait till the rebalance is done you can use `SELECT citus_rebalance_wait();` A running rebalance can be canelled/stopped with `SELECT citus_rebalance_stop();`.	2022-09-12 20:46:53 +03:00
Marco Slot	48f7d6c279	Show local managed tables in citus_tables and hide tables owned by extensions (#6321 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-09-12 17:49:17 +03:00
Marco Slot	ba2fe3e3c4	Remove do_repair option from citus_copy_shard_placement (#6299 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-09-09 15:44:30 +02:00
Nils Dijk	00a94c7f13	Implement infrastructure to run sql jobs in the background (#6296 ) DESCRIPTION: Add infrastructure to run long running management operations in background This infrastructure introduces the primitives of jobs and tasks. A task consists of a sql statement and an owner. Tasks belong to a Job and can depend on other tasks from the same job. When there are either runnable or running tasks we would like to make sure a bacgrkound task queue monitor process is running. A Task could be in running state while there is actually no monitor present due to a database restart or failover. Once the monitor starts it will reset any running task to its runnable state. To make sure only one background task queue monitor is ever running at once it will acquire an advisory lock that self conflicts. Once a task is done it will find all tasks depending on this task. After checking that the task doesn't have unmet dependencies it will transition the task from blocked to runnable state for the task to be picked up on a subsequent task start. Currently only one task can be running at a time. This can be improved upon in later releases without changes to the higher level API. The initial goal for this background tasks is to allow a rebalance to run in the background. This will be implemented in a subsequent PR.	2022-09-09 16:11:19 +03:00
Jelte Fennema	e29db74a19	Don't override postgres C symbols with our own (#6300 ) When introducing our overrides of pg_cancel_backend and pg_terminate_backend we accidentally did that in such a way that we cannot call the original pg_cancel_backend and pg_terminate_backend from C anymore. This happened because we defined the exact same symbols in our shared library as postgres does in its own binary. This fixes that by using a different names for the C function than for the SQL function. Making this work in all upgrade and downgrade scenarios is not trivial though, because we actually need to remove the C function definition. Postgres errors in two different times when the symbol that a C function wants to call is not defined in the library it expects it in: 1. When creating the SQL function definition 2. When calling the SQL function Item 1 causes an issue when creating our extension for the first time. We then go execute all the migrations that we have. So if the 11.0 migration contains a SQL function definition that still references the pg_cancel_backend symbol, that migration will fail. This issue is solved by actually changing the SQL definition in the old migration. This is not enough to fix all issues though. Item 2 causes an issue after an upgrade to 11.1, because it won't have the new definition of the SQL function. This is solved by recreating the SQL functions in the migration to 11.1. That way it gets the new definition. Then finally there's the case of downgrades. To continue to make our pg_cancel_backend SQL function work after downgrading, we will need to make a patch release for 11.0 that includes the new citus_cancel_backend symbol. This is done in a separate commit.	2022-09-07 11:27:05 +02:00
Nitish Upreti	d7404a9446	'Deferred Drop' and robust 'Shard Cleanup' for Splits. (#6258 ) DESCRIPTION: This PR adds support for 'Deferred Drop' and robust 'Shard Cleanup' for Splits. Common Infrastructure This PR introduces new common infrastructure so as any operation that wants robust cleanup of resources can register with the cleaner and have the resources cleaned appropriately based on a specified policy. 'Shard Split' is the first consumer using this new infrastructure. Note : We only support adding 'shards' as resources to be cleaned-up right now but the framework will be extended to support other resources in future. Deferred Drop for Split Deferred Drop Support ensures that shards undergoing split are not dropped inline as part of operation but dropped later when no active read queries are running on shard. This helps with : Avoids any potential deadlock scenarios that can cause long running Split operation to rollback. Avoids Split operation blocking writes and then getting blocked (due to running queries on the shard) when trying to drop shards. Deferred drop is the new default behavior going forward. Shard Cleaner Extension Shard Cleaner is a background task responsible for deferred drops in case of 'Move' operations. The cleaner has been extended to ensure robust cleanup of shards (dummy shards and split children) in case of a failure based on the new infrastructure mentioned above. The cleaner also handles deferred drop for 'Splits'. TESTING: New test ''citus_split_shard_by_split_points_deferred_drop' to test deferred drop support. New test 'failure_split_cleanup' to test shard cleanup with failures in different stages. Update 'isolation_blocking_shard_split and isolation_non_blocking_shard_split' for deferred drop. Added non-deferred drop version of existing tests : 'citus_split_shard_no_deferred_drop' and 'citus_non_blocking_splits_no_deferred_drop'	2022-09-06 12:11:20 -07:00

1 2 3 4 5 ...

266 Commits (0d503dd5ac5547ca71cd0147e53236d8d8a22fce)