citus

Commit Graph

Author	SHA1	Message	Date
Marco Slot	72d8fde28b	Use intermediate results for re-partition joins	2022-02-23 19:40:21 +01:00
Onder Kalaci	dffcafc096	Use global pids in citus_lock_waits	2022-02-21 17:46:34 +01:00
Onder Kalaci	331af3dce8	Dumping wait edges becomes optionally scan all backends Before this commit, dumping wait edges can only be used for distributed deadlock detection purposes. With this commit, we open the possibility that we can use it for any backend.	2022-02-21 17:37:07 +01:00
Halil Ozan Akgul	f6cd4d0f07	Overrides pg_cancel_backend and pg_terminate_backend to accept global pid	2022-02-21 16:41:35 +03:00
Nils Dijk	ea86f9f94e	Add support for TEXT SEARCH CONFIGURATION objects (#5685 ) DESCRIPTION: Implement TEXT SEARCH CONFIGURATION propagation The change adds support to Citus for propagating TEXT SEARCH CONFIGURATION objects. TSConfig objects cannot always be created in one create statement, and instead require a create statement followed by many alter statements to get turned into the object they should represent. To support this we add functionality to the worker to create or replace objects based on a list of statements. When the lists of the local object and the remote object correspond 1:1 we skip the creation of the object and simply mark it distributed. This is especially important for TSConfig objects as initdb pre-populates databases with a dozen configurations (for many different languages). When the user creates a new TSConfig based on the copy of an existing configuration there is no direct link to the object copied from. Since there is no link we can't simply rely on propagating the dependencies to the worker and send a qualified	2022-02-17 13:12:46 +01:00
Halil Ozan Akgul	8ee02b29d0	Introduce global PID	2022-02-08 16:49:38 +03:00
Burak Velioglu	f88cc230bf	Handle tables and objects as metadata. Update UDFs accordingly With this commit we've started to propagate sequences and shell tables within the object dependency resolution. So, ensuring any dependencies for any object will consider shell tables and sequences as well. Separate logics for both shell tables and sequences have been removed. Since both shell tables and sequences logic were implemented as a part of the metadata handling before that logic, we were propagating them while syncing table metadata. With this commit we've divided metadata (which means anything except shards thereafter) syncing logic into multiple parts and implemented it either as a part of ActivateNode. You can check the functions called in ActivateNode to check definition of different metadata. Definitions of start_metadata_sync_to_node and citus_activate_node have also been updated. citus_activate_node will basically create an active node with all metadata and reference table shards. start_metadata_sync_to_node will be same with citus_activate_node except replicating reference tables. stop_metadata_sync_to_node will remove all the metadata. All of those UDFs need to be called by superuser.	2022-01-31 16:20:15 +03:00
Teja Mupparti	54862f8c22	(1) Functions will be delegated even when present in the scope of an explicit BEGIN/COMMIT transaction block or in a UDF calling another UDF. (2) Prohibit/Limit the delegated function not to do a 2PC (or any work on a remote connection). (3) Have a safety net to ensure the (2) i.e. we should block the connections from the delegated procedure or make sure that no 2PC happens on the node. (4) Such delegated functions are restricted to use only the distributed argument value. Note: To limit the scope of the project we are considering only Functions(not procedures) for the initial work. DESCRIPTION: Introduce a new flag "force_delegation" in create_distributed_function(), which will allow a function to be delegated in an explicit transaction block. Fixes #3265 Once the function is delegated to the worker, on that node during the planning distributed_planner() TryToDelegateFunctionCall() CheckDelegatedFunctionExecution() EnableInForceDelegatedFuncExecution() Save the distribution argument (Constant) ExecutorStart() CitusBeginScan() IsShardKeyValueAllowed() Ensure to not use non-distribution argument. ExecutorRun() AdaptiveExecutor() StartDistributedExecution() EnsureNoRemoteExecutionFromWorkers() Ensure all the shards are local to the node in the remoteTaskList. NonPushableInsertSelectExecScan() InitializeCopyShardState() EnsureNoRemoteExecutionFromWorkers() Ensure all the shards are local to the node in the placementList. This also fixes a minor issue: Properly handle expressions+parameters in distribution arguments	2022-01-19 16:43:33 -08:00
Marco Slot	33bfa0b191	Hide shards from application_name's with a specific prefix	2022-01-18 15:20:55 +04:00
Önder Kalacı	5305aa4246	Do not drop sequences when dropping metadata (#5584 ) Dropping sequences means we need to recreate and hence losing the sequence. With this commit, we keep the existing sequences such that resyncing wouldn't drop the sequence. We do that by breaking the dependency of the sequence from the table.	2022-01-06 09:48:34 +01:00
Önder Kalacı	c9127f921f	Avoid round trips while fixing index names (#5549 ) With this commit, fix_partition_shard_index_names() works significantly faster. For example, 32 shards, 365 partitions, 5 indexes drop from ~120 seconds to ~44 seconds 32 shards, 1095 partitions, 5 indexes drop from ~600 seconds to ~265 seconds `queryStringList` can be really long, because it may contain #partitions * #indexes entries. Before this change, we were actually going through the executor where each command in the query string triggers 1 round trip per entry in queryStringList. The aim of this commit is to avoid the round-trips by creating a single query string. I first simply tried sending `q1;q2;..;qn` . However, the executor is designed to handle `q1;q2;..;qn` type of query executions via the infrastructure mentioned above (e.g., by tracking the query indexes in the list and doing 1 statement per round trip). One another option could have been to change the executor such that only track the query index when `queryStringList` is provided not with queryString including multiple `;`s . That is (a) more work (b) could cause weird edge cases with failure handling (c) felt like coding a special case in to the executor	2021-12-27 10:29:37 +01:00
Hanefi Onaldi	29e4516642	Introduce citus_check_cluster_node_health UDF This UDF coordinates connectivity checks accross the whole cluster. This UDF gets the list of active readable nodes in the cluster, and coordinates all connectivity checks in sequential order. The algorithm is: for sourceNode in activeReadableWorkerList: c = connectToNode(sourceNode) for targetNode in activeReadableWorkerList: result = c.execute( "SELECT citus_check_connection_to_node(targetNode.name, targetNode.port") emit sourceNode.name, sourceNode.port, targetNode.name, targetNode.port, result - result -> true -> connection attempt from source to target succeeded - result -> false -> connection attempt from source to target failed - result -> NULL -> connection attempt from the current node to source node failed I suggest you use the following query to get an overview on the connectivity: SELECT bool_and(COALESCE(result, false)) FROM citus_check_cluster_node_health(); Whenever this query returns false, there is a connectivity issue, check in detail.	2021-12-15 01:41:51 +03:00
Burak Velioglu	ed8e32de5e	Sync pg_dist_object on an update and propagate while syncing to a new node Before that PR we were updating citus.pg_dist_object metadata, which keeps the metadata related to objects on Citus, only on the coordinator node. In order to allow using those object from worker nodes (or erroring out with proper error message) we've started to propagate that metedata to worker nodes as well.	2021-12-06 19:25:50 +03:00
Hanefi Onaldi	56e9b1b968	Introduce UDF to check worker connectivity citus_check_connection_to_node runs a simple query on a remote node and reports whether this attempt was successful. This UDF will be used to make sure each worker node can connect to all the worker nodes in the cluster. parameters: nodename: required nodeport: optional (default: 5432) return value: boolean success	2021-12-03 02:30:28 +03:00
Onder Kalaci	549edcabb6	Allow disabling node(s) when multiple failures happen As of master branch, Citus does all the modifications to replicated tables (e.g., reference tables and distributed tables with replication factor > 1), via 2PC and avoids any shardstate=3. As a side-effect of those changes, handling node failures for replicated tables change. With this PR, when one (or multiple) node failures happen, the users would see query errors on modifications. If the problem is intermitant, that's OK, once the node failure(s) recover by themselves, the modification queries would succeed. If the node failure(s) are permenant, the users should call `SELECT citus_disable_node(...)` to disable the node. As soon as the node is disabled, modification would start to succeed. However, now the old node gets behind. It means that, when the node is up again, the placements should be re-created on the node. First, use `SELECT citus_activate_node()`. Then, use `SELECT replicate_table_shards(...)` to replicate the missing placements on the re-activated node.	2021-12-01 10:19:48 +01:00
Onur Tirtir	73f06323d8	Introduce dependencies from columnarAM to columnar metadata objects During pg upgrades, we have seen that it is not guaranteed that a columnar table will be created after metadata objects got created. Prior to changes done in this commit, we had such a dependency relationship in `pg_depend`: ``` columnar_table ----> columnarAM ----> citus extension ^ ^ \| \| columnar.storage_id_seq -------------------- \| \| columnar.stripe ------------------------------- ``` Since `pg_upgrade` just knows to follow topological sort of the objects when creating database dump, above dependency graph doesn't imply that `columnar_table` should be created before metadata objects such as `columnar.storage_id_seq` and `columnar.stripe` are created. For this reason, with this commit we add new records to `pg_depend` to make columnarAM depending on all rel objects living in `columnar` schema. That way, `pg_upgrade` will know it needs to create those before creating `columnarAM`, and similarly, before creating any tables using `columnarAM`. Note that in addition to inserting those records via installation script, we also do the same in `citus_finish_pg_upgrade()`. This is because, `pg_upgrade` rebuilds catalog tables in the new cluster and that means, we must insert them in the new cluster too.	2021-11-23 13:14:00 +03:00
Hanefi Onaldi	3d9cec70fd	Update migration paths from 10.2 to 11.0 (#5459 ) We recently introduced a set of patches to 10.2, and introduced 10.2-4 migration version. This migration version only resides on `release-10.2` branch, and is missing on our default branch. This creates a problem because we do not have a valid migration path from 10.2 to latest 11.0. To remedy this issue, I copied the relevant migration files from `release-10.2` branch, and renamed some of our migration files on default branch to make sure we have a linear upgrade path.	2021-11-11 13:55:28 +03:00
Marco Slot	78866df13c	Remove master_append_table_to_shard UDF	2021-11-08 10:43:24 +01:00
Halil Ozan Akgul	c0785d570c	Remove EnsureSuperUser from start and stop metadata sync to node	2021-11-01 18:01:49 +03:00
Ahmet Gedemenli	67dca4363d	Dont auto-undistribute user-added citus local tables (#5314 ) * Disable auto-undistribute for user-added citus local tables	2021-10-28 12:10:26 +03:00
Marco Slot	dafba6c242	Deprecate master_get_table_metadata UDF	2021-10-21 12:08:05 +02:00
Marco Slot	096660d61d	Remove master_apply_delete_command	2021-10-18 22:29:37 +02:00
Naisila Puka	d0390af72d	Add fix_partition_shard_index_names udf to fix currently broken names (#5291 ) * Add udf to include shardId in broken partition shard index names * Address reviews: rename index such that operations can be done on it * More comprehensive index tests * Final touches and formatting	2021-10-07 19:34:52 +03:00
Hanefi Onaldi	a74409f24c	Bump Citus to 11.0devel	2021-10-01 22:21:22 +03:00
Önder Kalacı	c2311b4c0c	Make (columnar.stripe) first_row_number index a unique constraint (#5324 ) * Make (columnar.stripe) first_row_number index a unique constraint Since stripe_first_row_number_idx is required to scan a columnar table, we need to make sure that it is created before doing anything with columnar tables during pg upgrades. However, a plain btree index is not a dependency of a table, so pg_upgrade cannot guarantee that stripe_first_row_number_idx gets created when creating columnar.stripe, unless we make it a unique "constraint". To do that, drop stripe_first_row_number_idx and create a unique constraint with the same name to keep the code change at minimum. * Add more pg upgrade tests for columnar * Fix a logic error in uprade_columnar_after test Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>	2021-09-30 10:51:56 +03:00
Onur Tirtir	77a2dd68da	Revoke read access to columnar.chunk from unprivileged user (#5313 ) Since this could expose chunk min/max values to unprivileged users.	2021-09-22 16:23:02 +03:00
Naisila Puka	a69abe3be0	Fixes bug about int and smallint sequences on MX (#5254 ) * Introduce worker_nextval udf for int&smallint column defaults * Fix current tests and add new ones for worker_nextval	2021-09-09 23:41:07 +03:00
Hanefi Onaldi	9ae912a8c8	Prevent C-style comments in all directories (#5250 )	2021-09-09 11:54:58 +03:00
Burak Velioglu	c3895f35cd	Add helper UDFs for easy time partition management - get_missing_time_partition_ranges: Gets the ranges of missing partitions for the given table, interval and range unless any existing partition conflicts with calculated missing ranges. - create_time_partitions: Creates partitions by getting range values from get_missing_time_partition_ranges. - drop_old_time_partitions: Drops partitions of the table older than given threshold.	2021-09-03 23:03:13 +03:00
Naisila Puka	acb5ae6ab6	Skip dropping shards when we know it's a partition (#5176 )	2021-08-31 17:41:37 +03:00
Onder Kalaci	482b8096e9	Introduce citus_internal_update_relation_colocation update_distributed_table_colocation can be called by the relation owner, and internally it updates pg_dist_partition. With this commit, update_distributed_table_colocation uses an internal UDF to access pg_dist_partition. As a result, this operation can now be done by regular users on MX.	2021-08-03 11:44:58 +02:00
Onder Kalaci	c8368e7929	Introduce citus_internal_delete_shard_metadata With this function, the owner of the table is allowed to remove shard metadata. This is going to be useful for tenant-isolation.	2021-07-19 13:25:05 +02:00
Onder Kalaci	2c349e6dfd	Use current user to sync metadata Before this commit, we always synced the metadata with superuser. However, that creates various edge cases such as visibility errors or self distributed deadlocks or complicates user access checks. Instead, with this commit, we use the current user to sync the metadata. Note that, `start_metadata_sync_to_node` still requires super user because accessing certain metadata (like pg_dist_node) always require superuser (e.g., the current user should be a superuser). However, metadata syncing operations regarding the distributed tables can now be done with regular users, as long as the user is the owner of the table. A table owner can still insert non-sense metadata, however it'd only affect its own table. So, we cannot do anything about that.	2021-07-16 13:25:27 +02:00
Hanefi Onaldi	efc5776451	Remove public schema dependency for 10.1 upgrades This commit contains a subset of the changes that should be cherry picked to 10.1 releases.	2021-07-09 02:08:22 +03:00
Hanefi Onaldi	8e9cc229ff	Remove public schema dependency for 10.0 upgrades This commit contains a subset of the changes that should be cherry picked to 10.0 releases.	2021-07-09 02:08:22 +03:00
Marco Slot	b14955c2bd	Fix PG upgrade scripts for 10.0	2021-07-05 14:38:20 +02:00
Ahmet Gedemenli	8bae58fdb7	Add parameter to cleanup metadata (#5055 ) * Add parameter to cleanup metadata * Set clear metadata default to true * Add test for clearing metadata * Separate test file for start/stop metadata syncing * Fix stop_sync bug for secondary nodes * Use PreventInTransactionBlock * DRemovedebuggiing logs * Remove relation not found logs from mx test * Revert localGroupId when doing stop_sync * Move metadata sync test to mx schedule * Add test with name that needs to be quoted * Add test for views and matviews * Add test for distributed table with custom type * Add comments to test * Add test with stats, indexes and constraints * Fix matview test * Add test for dropped column * Add notice messages to stop_metadata_sync * Add coordinator check to stop metadat sync * Revert local_group_id only if clearMetadata is true * Add a final check to see the metadata is sane * Remove the drop verbosity in test * Remove table description tests from sync test * Add stop sync to coordinator test * Change the order in stop_sync * Add test for hybrid (columnar+heap) partitioned table * Change error to notice for stop sync to coordinator * Sync at the end of the test to prevent any failures * Add test case in a transaction block * Remove relation not found tests	2021-07-01 16:23:53 +03:00
Onur Tirtir	18fe0311c0	Move rest of the schema changes to 10.2-1	2021-06-16 20:43:41 +03:00
Halil Ozan Akgul	db03afe91e	Bump citus version to 10.2devel	2021-06-16 17:44:05 +03:00
Jelte Fennema	7015049ea5	Add citus_cleanup_orphaned_shards UDF Sometimes the background daemon doesn't cleanup orphaned shards quickly enough. It's useful to have a UDF to trigger this removal when needed. We already had a UDF like this but it was only used during testing. This exposes that UDF to users. As a safety measure it cannot be run in a transaction, because that would cause the background daemon to stop cleaning up shards while this transaction is running.	2021-06-04 11:23:07 +02:00
Jelte Fennema	4c20bf7a36	Remove pg_dist_rebalence_strategy_enterprise_check (#5014 ) This is not necessary anymore now that the rebalancer is open source.	2021-06-01 06:16:46 -07:00
SaitTalhaNisanci	a20cc3b36a	Only consider shard state 1 in citus shards (#4970 )	2021-05-28 11:33:48 +03:00
Jelte Fennema	10f06ad753	Fetch shard size on the fly for the rebalance monitor Without this change the rebalancer progress monitor gets the shard sizes from the `shardlength` column in `pg_dist_placement`. This column needs to be updated manually by calling `citus_update_table_statistics`. However, `citus_update_table_statistics` could lead to distributed deadlocks while database traffic is on-going (see #4752). To work around this we don't use `shardlength` column anymore. Instead for every rebalance we now fetch all shard sizes on the fly. Two additional things this does are: 1. It adds tests for the rebalance progress function. 2. If a shard move cannot be done because a source or target node is unreachable, then we error in stop the rebalance, instead of showing a warning and continuing. When using the by_disk_size rebalance strategy it's not safe to continue with other moves if a specific move failed. It's possible that the failed move made space for the next move, and because the failed move never happened this space now does not exist. 3. Adds two new columns to the result of `get_rebalancer_progress` which shows the size of the shard on the source and target node. Fixes #4930	2021-05-20 16:38:17 +02:00
Nils Dijk	a6c2d2a4c4	Feature: alter database owner (#4986 ) DESCRIPTION: Add support for ALTER DATABASE OWNER This adds support for changing the database owner. It achieves this by marking the database as a distributed object. By marking the database as a distributed object it will look for its dependencies and order the user creation commands (enterprise only) before the alter of the database owner. This is mostly important when adding new nodes. By having the database marked as a distributed object it can easily understand for which `ALTER DATABASE ... OWNER TO ...` commands to propagate by resolving the object address of the database and verifying it is a distributed object, and hence should propagate changes of owner ship to all workers. Given the ownership of the database might have implications on subsequent commands in transactions we force sequential mode for transactions that have a `ALTER DATABASE ... OWNER TO ...` command in them. This will fail the transaction with meaningful help when the transaction already executed parallel statements. By default the feature is turned off since roles are not automatically propagated, having it turned on would cause hard to understand errors for the user. It can be turned on by the user via setting the `citus.enable_alter_database_owner`.	2021-05-20 13:27:44 +02:00
Jelte Fennema	cbbd10b974	Implement an improvement threshold in the rebalancer (#4927 ) Every move in the rebalancer algorithm results in an improvement in the balance. However, even if the improvement in the balance was very small the move was still chosen. This is especially problematic if the shard itself is very big and the move will take a long time. This changes the rebalancer algorithm to take the relative size of the balance improvement into account when choosing moves. By default a move will not be chosen if it improves the balance by less than half of the size of the shard. An extra argument is added to the rebalancer functions so that the user can decide to lower the default threshold if the ignored move is wanted anyway.	2021-05-11 14:24:59 +02:00
SaitTalhaNisanci	6b1904d37a	When moving a shard to a new node ensure there is enough space (#4929 ) * When moving a shard to a new node ensure there is enough space * Add WairForMiliseconds time utility * Add more tests and increase readability * Remove the retry loop and use a single udf for disk stats * Address review * address review Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2021-05-06 17:28:02 +03:00
Ahmet Gedemenli	332c5ce4ad	Fix worker partitioned size functions (#4922 )	2021-04-26 10:29:46 +03:00
Ahmet Gedemenli	e445e3d39c	Introduce 3 partitioned size udfs (#4899 ) * Introduce 3 partitioned size udfs * Add tests for new partition size udfs * Fix type incompatibilities * Convert UDFs into pure sql functions * Fix function comment	2021-04-13 17:36:27 +03:00
Onur Tirtir	fe5c985e1d	Remove HAS_TABLEAM config since we dropped pg11 support (#4862 ) * Remove HAS_TABLEAM config * Drop columnar_ensure_objects_exist * Not call columnar_ensure_objects_exist in citus_finish_pg_upgrade	2021-04-13 10:51:26 +03:00
Halil Ozan Akgul	a5038046f9	Adds shard_count parameter to create_distributed_table	2021-03-29 16:22:49 +03:00

1 2

88 Commits (72d8fde28ba55711ef5896eff8f806637d7077bf)