citus

Commit Graph

Author	SHA1	Message	Date
aykutbozkurt	f2f0ec9dda	PR #6728 / commit - 12 Force activated bare connections to close at transaction end.	2023-03-30 11:06:16 +03:00
aykutbozkurt	35dbdae5a4	PR #6728 / commit - 11 Let AddNodeMetadata use metadatasync api during node addition.	2023-03-30 11:06:16 +03:00
aykutbozkurt	1fb3de14df	PR #6728 / commit - 6 Let `activate_node_snapshot` use new metadata sync api.	2023-03-30 10:53:22 +03:00
Ahmet Gedemenli	03f1bb70b7	Rebalance shard groups with placement count less than worker count (#6739 ) DESCRIPTION: Adds logic to distribute unbalanced shards If the number of shard placements (for a colocation group) is less than the number of workers, it means that some of the workers will remain empty. With this PR, we consider these shard groups as a colocation group, in order to make them be distributed evenly as much as possible across the cluster. Example: ```sql create table t1 (a int primary key); create table t2 (a int primary key); create table t3 (a int primary key); set citus.shard_count =1; select create_distributed_table('t1','a'); select create_distributed_table('t2','a',colocate_with=>'t1'); select create_distributed_table('t3','a',colocate_with=>'t2'); create table tb1 (a bigint); create table tb2 (a bigint); select create_distributed_table('tb1','a'); select create_distributed_table('tb2','a',colocate_with=>'tb1'); select citus_add_node('localhost',9702); select rebalance_table_shards(); ``` Here we have two colocation groups, each with one shard group. Both shard groups are placed on the first worker node. When we add a new worker node and try to rebalance table shards, the rebalance planner considers it well balanced and does nothing. With this PR, the rebalancer tries to distribute these shard groups evenly across the cluster as much as possible. For this example, with this PR, the rebalancer moves one of the shard groups to the second worker node. fixes: #6715	2023-03-06 14:14:27 +03:00
Marco Slot	64e3fee89b	Remove shardstate leftovers (#6627 ) Remove ShardState enum and associated logic. Co-authored-by: Marco Slot <marco.slot@gmail.com> Co-authored-by: Ahmet Gedemenli <afgedemenli@gmail.com>	2023-01-19 11:43:58 +03:00
Jelte Fennema	92689a8362	Make GPIDs work with pg_dist_poolinfo (#6588 ) The original implementation of GPIDs didn't work correctly when using `pg_dist_poolinfo` together with PgBouncer. The reason is that it assumed that once a connection was made to a worker, the originating GPID should stay the same for ever. But when pg_dist_poolinfo is used this isn't the case, because the same connection on the worker might be used by different backends of the coordinator. This fixes that issue by updating the GPID whenever a new application name is set on a connection. This is the only thing that's needed, because PgBouncer already sets the application name correctly on the server connection whenever a client is updated.	2023-01-13 14:39:19 +00:00
Ahmet Gedemenli	235047670d	Drop SHARD_STATE_TO_DELETE (#6494 ) DESCRIPTION: Drop `SHARD_STATE_TO_DELETE` and use the cleanup records instead Drops the shard state that is used to mark shards as orphaned. Now we insert cleanup records into `pg_dist_cleanup` so "orphaned" shards will be dropped either by maintenance daemon or internal cleanup calls. With this PR, we make the "cleanup orphaned shards" functions to be no-op, as they would not be needed anymore. This PR includes some naming changes about placement functions. We don't need functions that filter orphaned shards, as there will be no orphaned shards anymore. We will also be introducing a small script with this PR, for users with orphaned shards. We'll basically delete the orphaned shard entries from `pg_dist_placement` and insert cleanup records into `pg_dist_cleanup` for each one of them, during Citus upgrade. We also have a lot of flakiness fixes in this PR. Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2023-01-03 14:38:16 +03:00
aykut-bozkurt	8be4ce546e	fix vanilla test status on CI (#6555) - Because of the make command used for vanilla tests, test status is always shown as success on CI. As a fix, I added `&& false` at the end of the command that copies the diff file, so the command fails when check-vanilla fails. ```make check-vanilla: all $(pg_regress_multi_check) --vanillatest || (cp $(vanilla_diffs_file) $(citus_abs_srcdir)/regression.diffs && false) ``` - I also fixed some vanilla tests that fail because recently added clock-related operators show up in some queries.	2022-12-13 11:15:47 +03:00
Ahmet Gedemenli	0e92244bfe	Cleanup for shard moves (#6472) DESCRIPTION: Extend cleanup process for replication artifacts This PR adds new cleanup record types for: * Subscriptions * Replication slots * Publications * Users created for subscriptions We add records for these object types to `pg_dist_cleanup` during the creation phase. Once the operation is done, whether it succeeded or failed, we iterate over those records and drop the objects. With this PR we will not be dropping any of these objects during the operation. In short, we will always be deferring the drop. One thing that's worth mentioning is that we sort cleanup records before processing (dropping) them, because of dependency relations among those objects, e.g. a subscription might depend on a publication. Therefore, we always drop subscriptions before publications. We have some renames in this PR: * `TryDropOrphanedShards` -> `TryDropOrphanedResources` * `DropOrphanedShardsForCleanup` -> `DropOrphanedResourcesForCleanup` * `run_try_drop_marked_shards` -> `run_try_drop_marked_resources` as these functions now process replication artifacts as well. This PR drops the function `DropAllLogicalReplicationLeftovers` and all its usages, since we now rely on the deferred drop mechanism.	2022-11-30 15:38:05 +03:00
Jelte Fennema	76137e967f	Create all foreign keys quickly at the end of a shard move (#6148 ) Previously we would create foreign keys to reference table in an extra fast way at the end of a shard move. This uses that same logic to also do it for foreign keys between distributed tables. Fixes #6141	2022-09-09 09:58:33 +02:00
Onder Kalaci	149771792b	Remove useless version compats most likely leftover from earlier versions	2022-07-29 10:31:55 +02:00
aykut-bozkurt	67ac3da2b0	Added citus_depended_objects UDF and HideCitusDependentObjects GUC to hide Citus-dependent objects from pg meta queries (#6055). Use the RecurseObjectDependencies API to find whether an object is Citus-dependent. Make vanilla tests runnable to see whether the citus_depended function is working correctly.	2022-07-25 16:43:34 +03:00
Jelte Fennema	184c7c0bce	Make enterprise features open source (#6008 ) This PR makes all of the features open source that were previously only available in Citus Enterprise. Features that this adds: 1. Non blocking shard moves/shard rebalancer (`citus.logical_replication_timeout`) 2. Propagation of CREATE/DROP/ALTER ROLE statements 3. Propagation of GRANT statements 4. Propagation of CLUSTER statements 5. Propagation of ALTER DATABASE ... OWNER TO ... 6. Optimization for COPY when loading JSON to avoid double parsing of the JSON object (`citus.skip_jsonb_validation_in_copy`) 7. Support for row level security 8. Support for `pg_dist_authinfo`, which allows storing different authentication options for different users, e.g. you can store passwords or certificates here. 9. Support for `pg_dist_poolinfo`, which allows using connection poolers in between coordinator and workers 10. Tracking distributed query execution times using citus_stat_statements (`citus.stat_statements_max`, `citus.stat_statements_purge_interval`, `citus.stat_statements_track`). This is disabled by default. 11. Blocking tenant_isolation 12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`	2022-06-16 00:23:46 -07:00
Gledis Zeneli	beef392f5a	Fix memory error with citus_add_node reported by valgrind test (#5967) The error occurs because the jsonb datum in pg_dist_metadata_node.metadata is 0 in some scenarios. This is likely due to not copying the data when receiving a datum from a tuple, and pg deciding to deallocate that memory when the table that the tuple was from is closed. Also fix another place in the code that might have been susceptible to this issue. I tested on both multi-vg and multi-1-vg and the tests were successful.	2022-05-28 00:22:00 +03:00
Jeff Davis	b6a5617ea8	PG15: handle pg_analyze_and_rewrite_* renaming. From PG commit 791b1b71da.	2022-05-02 10:12:03 -07:00
Jeff Davis	3799f95742	PG15: Value -> String, Integer, Float. Handle PG commit 639a86e36a.	2022-05-02 10:12:03 -07:00
Jeff Davis	26f5e20580	PG15: update integer parsing APIs. Account for PG commits 3c6f8c011f and cfc7191dfe.	2022-05-02 10:12:03 -07:00
Jeff Davis	1c1ef7ab8d	PG15: Handle extra argument to RelationCreateStorage. Account for PG commit 9c08aea6a309. Introduce RelationCreateStorage_compat.	2022-05-02 10:12:03 -07:00
Jeff Davis	f944722c6a	PG15: Use RelationGetSmgr() instead of RelationOpenSmgr(). Handle PG commit f10f0ae420.	2022-05-02 10:12:03 -07:00
Onder Kalaci	b0b91bab04	Rename metadata sync to node metadata sync where applicable	2022-04-07 17:51:31 +02:00
Marco Slot	9476f377b5	Remove old re-partitioning functions	2022-04-04 18:11:52 +02:00
Onder Kalaci	9043a1ed3f	Only hide shards from client backends and pg bg workers The aim of hiding shards is to hide shards from client applications. Certain bg workers (such as pg_cron or the Citus maintenance daemon) should be treated like client applications because users can run queries from such bg workers. These bg workers should follow the same application_name checks as client backends. Certain other bg workers, such as logical replication or postgres' parallel workers, should never hide shards. They are internal operations. Similarly, other backend types such as the walsender, checkpointer, or autovacuum should never hide shards.	2022-03-30 16:56:12 +02:00
Halil Ozan Akgül	333bcc7948	Global PID Helper Functions (#5768 ) * Introduces citus_nodename_for_nodeid and citus_nodeport_for_nodeid functions * Introduces citus_nodeid_for_gpid and citus_pid_for_gpid functions * Add tests	2022-03-09 13:15:59 +03:00
Onder Kalaci	c32b2de1a7	Improve citus_lock_waits 1) Remove useless columns 2) Show backends that are blocked on a DDL even before gpid is assigned 3) One minor bugfix, where we clear distributedCommandOriginator properly.	2022-03-07 11:10:44 +01:00
Marco Slot	43e4dd3808	Add a citus.internal_reserved_connections setting	2022-03-02 19:13:53 +01:00
Gledis Zeneli	b825232ecb	Handle rebalance / replication when a node is disabled (Fix #5664) (#5729) The issue in question is caused when rebalance / replication call `FullShardPlacementList`, which returns all shard placements (including those on nodes disabled with `citus_disable_node`). Eventually, `FindFillStateForPlacement` looks for the state across active workers and fails to find a state for the placements on the disabled workers, causing a seg fault shortly after. Approach: * `ActivePlacementHash` was not using the status of the shard placement's node to determine if the node is active. Initially, I just fixed that. * Additionally, I refactored the code which handles active shards in replication / rebalance to: * use a single function to determine if a shard placement is active. * do the active shard filtering before calling `RebalancePlacementUpdates` and `ReplicationPlacementUpdates`, so test methods like `shard_placement_rebalance_array` and `shard_placement_replication_array`, which have different shard placement active requirements, can do their own filtering while using the same rebalance / replicate logic that `rebalance_table_shards` and `replicate_table_shards` use. Fix #5664	2022-02-25 19:54:30 +03:00
Onder Kalaci	95d5918967	Properly set worker_query and use	2022-02-21 18:22:33 +01:00
Onder Kalaci	331af3dce8	Dumping wait edges can optionally scan all backends Before this commit, dumping wait edges could only be used for distributed deadlock detection purposes. With this commit, we open the possibility of using it for any backend.	2022-02-21 17:37:07 +01:00
Burak Velioglu	fa6866ed36	Start to propagate functions to worker nodes with the CREATE FUNCTION command, together with their dependencies. If the function depends on any nondistributable object, the function will be created only locally. The parameterless version of create_distributed_function becomes obsolete with this change; it will be deprecated from the code in a subsequent PR.	2022-02-18 13:56:51 +03:00
Burak Velioglu	f88cc230bf	Handle tables and objects as metadata. Update UDFs accordingly With this commit we've started to propagate sequences and shell tables within the object dependency resolution. So, ensuring any dependencies for any object will consider shell tables and sequences as well. Separate logics for both shell tables and sequences have been removed. Since both shell tables and sequences logic were implemented as a part of the metadata handling before that logic, we were propagating them while syncing table metadata. With this commit we've divided metadata (which means anything except shards thereafter) syncing logic into multiple parts and implemented it either as a part of ActivateNode. You can check the functions called in ActivateNode to check definition of different metadata. Definitions of start_metadata_sync_to_node and citus_activate_node have also been updated. citus_activate_node will basically create an active node with all metadata and reference table shards. start_metadata_sync_to_node will be same with citus_activate_node except replicating reference tables. stop_metadata_sync_to_node will remove all the metadata. All of those UDFs need to be called by superuser.	2022-01-31 16:20:15 +03:00
Marco Slot	ee3b50b026	Disallow remote execution from queries on shards	2022-01-07 17:46:21 +01:00
Hanefi Onaldi	13fff9c37a	Remove NOOP tuplestore_donestoring calls PostgreSQL has not needed calls to this function since the 7.4 release, and it is a NOOP. For more details, check the PostgreSQL commit below: commit dd04e958c8b03c0f0512497651678c7816af3198 Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun Mar 9 03:34:10 2003 +0000 tuplestore_donestoring() isn't needed anymore, but provide a no-op macro definition so as not to create compatibility problems. diff --git a/src/include/utils/tuplestore.h b/src/include/utils/tuplestore.h index b46babacd1..76fe9fb428 100644 --- a/src/include/utils/tuplestore.h +++ b/src/include/utils/tuplestore.h @@ -17,7 +17,7 @@ * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $Id: tuplestore.h,v 1.8 2003/03/09 02:19:13 tgl Exp $ + * $Id: tuplestore.h,v 1.9 2003/03/09 03:34:10 tgl Exp $ * ------------------------------------------------------------------------- */ @@ -41,6 +41,9 @@ extern Tuplestorestate *tuplestore_begin_heap(bool randomAccess, extern void tuplestore_puttuple(Tuplestorestate *state, void *tuple); +/* tuplestore_donestoring() used to be required, but is no longer used */ +#define tuplestore_donestoring(state) ((void) 0) + /* backwards scan is only allowed if randomAccess was specified 'true' */ extern void *tuplestore_gettuple(Tuplestorestate *state, bool forward, bool *should_free);	2021-12-14 18:55:02 +03:00
Önder Kalacı	31c8f279ac	Add helper UDFs to inspect object dependencies (#5293) - citus_get_all_dependencies_for_object: emulate what Citus would qualify as dependency when adding a new node - citus_get_dependencies_for_object: emulate what Citus would qualify as dependency when creating an object Example use: ```SQL -- find all the dependencies of table test SELECT pg_identify_object(t.classid, t.objid, t.objsubid) FROM (SELECT * FROM pg_get_object_address('table', '{test}', '{}')) as addr JOIN LATERAL citus_get_all_dependencies_for_object(addr.classid, addr.objid, addr.objsubid) as t(classid oid, objid oid, objsubid int) ON TRUE ORDER BY 1; ```	2021-10-18 14:46:49 +03:00
Halil Ozan Akgul	347ae2928f	Introduces stats_compat macro for MemoryContextMethods->stats stats function now have a new bool print_to_stderr parameter This new macro gives us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions Existing print_to_stderr parameter is set to true to keep current behavior Relevant PG commit: 43620e328617c1f41a2a54c8cee01723064e3ffa	2021-09-03 15:27:24 +03:00
Halil Ozan Akgul	4bc0c80bba	Adds index_delete_tuples instead of compute_xid_horizon_for_tuples Relevant PG commit: d168b666823b6e0bcf60ed19ce24fb5fb91b8ccf	2021-09-03 15:27:24 +03:00
Sait Talha Nisanci	e7ed16c296	Do not include to-be-deleted shards while finding shard placements. Ignore orphaned shards in more places. Only use active shard placements in RouterInsertTaskList. Use IncludingOrphanedPlacements in some more places. Fix comment. Add tests.	2021-06-28 13:05:31 +03:00
Jelte Fennema	ca00b63272	Avoid two race conditions in the rebalance progress monitor (#5050) The first and main issue was that we were putting absolute pointers into shared memory for the `steps` field of the `ProgressMonitorData`. This pointer was being overwritten every time a process requested the monitor steps, which is the only reason why this even worked in the first place. To quote a part of a relevant stack overflow answer: > First of all, putting absolute pointers in shared memory segments is > terrible terible idea - those pointers would only be valid in the > process that filled in their values. Shared memory segments are not > guaranteed to attach at the same virtual address in every process. > On the contrary - they attach where the system deems it possible when > `shmaddr == NULL` is specified on call to `shmat()` Source: https://stackoverflow.com/a/10781921/2570866 In this case a race condition occurred when a second process overwrote the pointer in between the first process's write and read of the steps field. This issue is fixed by not storing the pointer in shared memory anymore. Instead we now calculate its position every time we need it. The second race condition I have not been able to trigger, but I found it while investigating this. This issue was that we published the handle of the shared memory segment before we initialized the data in the steps. This means that during initialization of the data, a call to `get_rebalance_progress()` could read partial data in an unsynchronized manner.	2021-06-21 14:03:42 +00:00
Jelte Fennema	1a83628195	Use "orphaned shards" naming in more places We were not very consistent in how we named these shards.	2021-06-04 11:39:19 +02:00
Jelte Fennema	b1cad26ebc	Move CheckCitusVersion to the top of each function Previously this was usually done after argument parsing. This can cause SEGFAULTs if the number or type of arguments changes in a new version. By checking that Citus version is correct before doing any argument parsing we protect against these types of issues. Issues like this have occurred in pg_auto_failover, so it's not just a theoretical issue. The main reason why these calls were not at the top of functions is really just historical. It was because in the past we didn't allow statements before declarations. Thus having this check before the argument parsing would have only been possible if we first declared all variables. In addition to moving existing CheckCitusVersion calls it also adds these calls to rebalancer related functions (they were missing there).	2021-06-01 17:43:46 +02:00
SaitTalhaNisanci	eaa7d2bada	Not block maintenance daemon (#4972) It was possible to block the maintenance daemon by taking a SHARE ROW EXCLUSIVE lock on pg_dist_placement. Until the lock was released, the maintenance daemon would be blocked. We should not block the maintenance daemon in any case, hence we now try to get the pg_dist_placement lock without waiting; if we cannot get it, we don't try to drop the old placements.	2021-05-17 03:22:35 -07:00
Nils Dijk	c91f8d8a15	Feature: localhost guc (#4836 ) DESCRIPTION: introduce `citus.local_hostname` GUC for connections to the current node Citus once in a while needs to connect to itself for some systems operations. This used to be hardcoded to `localhost`. The hardcoded hostname causes some issues, for example in environments where `sslmode=verify-full` is required. It is not always desirable or even feasible to get `localhost` as an alt name on the certificate. By introducing a GUC to use when connecting to the current instance the user has more control what network path is used and what hostname is required to be present in the server certificate.	2021-05-12 16:59:44 +02:00
Jelte Fennema	cbbd10b974	Implement an improvement threshold in the rebalancer (#4927 ) Every move in the rebalancer algorithm results in an improvement in the balance. However, even if the improvement in the balance was very small the move was still chosen. This is especially problematic if the shard itself is very big and the move will take a long time. This changes the rebalancer algorithm to take the relative size of the balance improvement into account when choosing moves. By default a move will not be chosen if it improves the balance by less than half of the size of the shard. An extra argument is added to the rebalancer functions so that the user can decide to lower the default threshold if the ignored move is wanted anyway.	2021-05-11 14:24:59 +02:00
Jelte Fennema	50357db957	Simplify code that tests the shard rebalancer algorithm (#4925 ) This modifies the test code to use sane defaults instead of requiring all values to be specified in the test.	2021-05-03 15:47:19 +02:00
SaitTalhaNisanci	03832f353c	Drop postgres 11 support	2021-03-25 09:20:28 +03:00
Onder Kalaci	e65e72130d	Rename use -> shouldUse Because setting the flag doesn't necessarily mean that we'll use 2PC. If connections are read-only, we will not use 2PC. In other words, we'll use 2PC only for connections that modified any placements.	2021-03-12 08:29:43 +00:00
Onder Kalaci	6a7ed7b309	Do not trigger 2PC for reads on local execution Before this commit, Citus used 2PC no matter what kind of local query execution happens. For example, if the coordinator has shards (and the workers as well), even a simple SELECT query could start 2PC: ```SQL WITH cte_1 AS (SELECT * FROM test LIMIT 10) SELECT count(*) FROM cte_1; ``` In this query, the local execution of the shards (and also intermediate result reads) triggers the 2PC. To prevent that, Citus now distinguishes local reads and local writes. And, Citus switches to 2PC only if a modification happens. This may still lead to unnecessary 2PCs when there is a local modification and remote SELECTs only. Though, we handle that separately via #4587.	2021-03-12 08:29:43 +00:00
Hanefi Onaldi	353b080474	Fix Semmle errors (#4636 ) Co-authored-by: Halil Ozan Akgül <hozanakgul@gmail.com>	2021-02-08 18:37:44 +03:00
Onur Tirtir	941c8fbf32	Automatically undistribute citus local tables when no more fkeys with reference tables (#4538 )	2021-01-22 18:15:41 +03:00
Hadi Moshayedi	bc01c795a2	Reland #4419	2021-01-19 07:48:47 -08:00
Marco Slot	011283122b	Add the shard rebalancer implementation	2021-01-07 16:51:55 +01:00


156 Commits (201d976a3bc4ac5450183afa3f91063c223fe337)