citus

Commit Graph

Author	SHA1	Message	Date
Jeff Davis	b6a5617ea8	PG15: handle pg_analyze_and_rewrite_* renaming. From PG commit 791b1b71da.	2022-05-02 10:12:03 -07:00
Jeff Davis	3799f95742	PG15: Value -> String, Integer, Float. Handle PG commit 639a86e36a.	2022-05-02 10:12:03 -07:00
Jeff Davis	26f5e20580	PG15: update integer parsing APIs. Account for PG commits 3c6f8c011f and cfc7191dfe.	2022-05-02 10:12:03 -07:00
Jeff Davis	1c1ef7ab8d	PG15: Handle extra argument to RelationCreateStorage. Account for PG commit 9c08aea6a309. Introduce RelationCreateStorage_compat.	2022-05-02 10:12:03 -07:00
Jeff Davis	f944722c6a	PG15: Use RelationGetSmgr() instead of RelationOpenSmgr(). Handle PG commit f10f0ae420.	2022-05-02 10:12:03 -07:00
Onder Kalaci	b0b91bab04	Rename metadata sync to node metadata sync where applicable	2022-04-07 17:51:31 +02:00
Marco Slot	9476f377b5	Remove old re-partitioning functions	2022-04-04 18:11:52 +02:00
Onder Kalaci	9043a1ed3f	Only hide shards from client backends and pg bg workers The aim of hiding shards is to hide shards from client applications. Certain bg workers (such as pg_cron or Citus maintanince daemon) should be treated like client applications because users can run queries from such bg workers. And, these bg workers should follow the similar application_name checks as client backeends. Certain other bg workers, such as logical replication or postgres' parallel workers, should never hide shards. They are internal operations. Similarly the other backend types like the walsender or checkpointer or autovacuum should never hide shards.	2022-03-30 16:56:12 +02:00
Halil Ozan Akgül	333bcc7948	Global PID Helper Functions (#5768 ) * Introduces citus_nodename_for_nodeid and citus_nodeport_for_nodeid functions * Introduces citus_nodeid_for_gpid and citus_pid_for_gpid functions * Add tests	2022-03-09 13:15:59 +03:00
Onder Kalaci	c32b2de1a7	Improve citus_lock_waits 1) Remove useless columns 2) Show backends that are blocked on a DDL even before gpid is assigned 3) One minor bugfix, where we clear distributedCommandOriginator properly.	2022-03-07 11:10:44 +01:00
Marco Slot	43e4dd3808	Add a citus.internal_reserved_connections setting	2022-03-02 19:13:53 +01:00
Gledis Zeneli	b825232ecb	Handle rebalance / replication when a node is disabled (Fix #5664 ) (#5729 ) The issue in question is caused when rebalance / replication call `FullShardPlacementList` which returns all shard placements (including those in disabled nodes with `citus_disable_node`). Eventually, `FindFillStateForPlacement` looks for the state across active workers and fails to find a state for the placements which are in the disabled workers causing a seg fault shortly after. Approach: * `ActivePlacementHash` was not using the status of the shard placement's node to determine if the node it is active. Initially, I just fixed that. * Additionally, I refactored the code which handles active shards in replication / rebalance to: * use a single function to determine if a shard placement is active. * do the shard active shard filtering before calling `RebalancePlacementUpdates` and `ReplicationPlacementUpdates`, so test methods like `shard_placement_rebalance_array` and `shard_placement_replication_array` which have different shard placement active requirements can do their own filtering while using the same rebalance / replicate logic that `rebalance_table_shards` and `replicate_table_shards` use. Fix #5664	2022-02-25 19:54:30 +03:00
Onder Kalaci	95d5918967	Properly set worker_query and use	2022-02-21 18:22:33 +01:00
Onder Kalaci	331af3dce8	Dumping wait edges becomes optionally scan all backends Before this commit, dumping wait edges can only be used for distributed deadlock detection purposes. With this commit, we open the possibility that we can use it for any backend.	2022-02-21 17:37:07 +01:00
Burak Velioglu	fa6866ed36	Start to propagate functions to worker nodes with CREATE FUNCTION command together with it's dependencies. If the function depends on any nondistributable object, function will be created only locally. Parameterless version of create_distributed_function becomes obsolete with this change, it will deprecated from the code with a subsequent PR.	2022-02-18 13:56:51 +03:00
Burak Velioglu	f88cc230bf	Handle tables and objects as metadata. Update UDFs accordingly With this commit we've started to propagate sequences and shell tables within the object dependency resolution. So, ensuring any dependencies for any object will consider shell tables and sequences as well. Separate logics for both shell tables and sequences have been removed. Since both shell tables and sequences logic were implemented as a part of the metadata handling before that logic, we were propagating them while syncing table metadata. With this commit we've divided metadata (which means anything except shards thereafter) syncing logic into multiple parts and implemented it either as a part of ActivateNode. You can check the functions called in ActivateNode to check definition of different metadata. Definitions of start_metadata_sync_to_node and citus_activate_node have also been updated. citus_activate_node will basically create an active node with all metadata and reference table shards. start_metadata_sync_to_node will be same with citus_activate_node except replicating reference tables. stop_metadata_sync_to_node will remove all the metadata. All of those UDFs need to be called by superuser.	2022-01-31 16:20:15 +03:00
Marco Slot	ee3b50b026	Disallow remote execution from queries on shards	2022-01-07 17:46:21 +01:00
Hanefi Onaldi	13fff9c37a	Remove NOOP tuplestore_donestoring calls PostgreSQL does not need calling this function since 7.4 release, and it is a NOOP. For more details, check PostgreSQL commit below : commit dd04e958c8b03c0f0512497651678c7816af3198 Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun Mar 9 03:34:10 2003 +0000 tuplestore_donestoring() isn't needed anymore, but provide a no-op macro definition so as not to create compatibility problems. diff --git a/src/include/utils/tuplestore.h b/src/include/utils/tuplestore.h index b46babacd1..76fe9fb428 100644 --- a/src/include/utils/tuplestore.h +++ b/src/include/utils/tuplestore.h @@ -17,7 +17,7 @@ * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $Id: tuplestore.h,v 1.8 2003/03/09 02:19:13 tgl Exp $ + * $Id: tuplestore.h,v 1.9 2003/03/09 03:34:10 tgl Exp $ * ------------------------------------------------------------------------- / @@ -41,6 +41,9 @@ extern Tuplestorestate tuplestore_begin_heap(bool randomAccess, extern void tuplestore_puttuple(Tuplestorestate state, void tuple); +/ tuplestore_donestoring() used to be required, but is no longer used / +#define tuplestore_donestoring(state) ((void) 0) + / backwards scan is only allowed if randomAccess was specified 'true' / extern void tuplestore_gettuple(Tuplestorestate state, bool forward, bool should_free);	2021-12-14 18:55:02 +03:00
Önder Kalacı	31c8f279ac	Add helper UDFs to inspect object dependencies (#5293 ) - citus_get_all_dependencies_for_object: emulate what Citus would qualify as dependency when adding a new node - citus_get_dependencies_for_object: emulate what Citus would qualify as dependency when creating an object Example use: ```SQL -- find all the depedencies of table test SELECT pg_identify_object(t.classid, t.objid, t.objsubid) FROM (SELECT * FROM pg_get_object_address('table', '{test}', '{}')) as addr JOIN LATERAL citus_get_all_dependencies_for_object(addr.classid, addr.objid, addr.objsubid) as t(classid oid, objid oid, objsubid int) ON TRUE ORDER BY 1; ```	2021-10-18 14:46:49 +03:00
Halil Ozan Akgul	347ae2928f	Introduces stats_compat macro for MemoryContextMethods->stats stats function now have a new bool print_to_stderr parameter This new macro gives us the ability to use this new parameter for PG14 and it doesn't give the parameter for previous versions Existing print_to_stderr parameter is set to true to keep current behavior Relevant PG commit: 43620e328617c1f41a2a54c8cee01723064e3ffa	2021-09-03 15:27:24 +03:00
Halil Ozan Akgul	4bc0c80bba	Adds index_delete_tuples instead of compute_xid_horizon_for_tuples Relevant PG commit: d168b666823b6e0bcf60ed19ce24fb5fb91b8ccf	2021-09-03 15:27:24 +03:00
Sait Talha Nisanci	e7ed16c296	Not include to-be-deleted shards while finding shard placements Ignore orphaned shards in more places Only use active shard placements in RouterInsertTaskList Use IncludingOrphanedPlacements in some more places Fix comment Add tests	2021-06-28 13:05:31 +03:00
Jelte Fennema	ca00b63272	Avoid two race conditions in the rebalance progress monitor (#5050 ) The first and main issue was that we were putting absolute pointers into shared memory for the `steps` field of the `ProgressMonitorData`. This pointer was being overwritten every time a process requested the monitor steps, which is the only reason why this even worked in the first place. To quote a part of a relevant stack overflow answer: > First of all, putting absolute pointers in shared memory segments is > terrible terible idea - those pointers would only be valid in the > process that filled in their values. Shared memory segments are not > guaranteed to attach at the same virtual address in every process. > On the contrary - they attach where the system deems it possible when > `shmaddr == NULL` is specified on call to `shmat()` Source: https://stackoverflow.com/a/10781921/2570866 In this case a race condition occurred when a second process overwrote the pointer in between the first process its write and read of the steps field. This issue is fixed by not storing the pointer in shared memory anymore. Instead we now calculate it's position every time we need it. The second race condition I have not been able to trigger, but I found it while investigating this. This issue was that we published the handle of the shared memory segment, before we initialized the data in the steps. This means that during initialization of the data, a call to `get_rebalance_progress()` could read partial data in an unsynchronized manner.	2021-06-21 14:03:42 +00:00
Jelte Fennema	1a83628195	Use "orphaned shards" naming in more places We were not very consistent in how we named these shards.	2021-06-04 11:39:19 +02:00
Jelte Fennema	b1cad26ebc	Move CheckCitusVersion to the top of each function Previously this was usually done after argument parsing. This can cause SEGFAULTs if the number or type of arguments changes in a new version. By checking that Citus version is correct before doing any argument parsing we protect against these types of issues. Issues like this have occurred in pg_auto_failover, so it's not just a theoretical issue. The main reason why these calls were not at the top of functions is really just historical. It was because in the past we didn't allow statements before declarations. Thus having this check before the argument parsing would have only been possible if we first declared all variables. In addition to moving existing CheckCitusVersion calls it also adds these calls to rebalancer related functions (they were missing there).	2021-06-01 17:43:46 +02:00
SaitTalhaNisanci	eaa7d2bada	Not block maintenance daemon (#4972 ) It was possible to block maintenance daemon by taking an SHARE ROW EXCLUSIVE lock on pg_dist_placement. Until the lock is released maintenance daemon would be blocked. We should not block the maintenance daemon under any case hence now we try to get the pg_dist_placement lock without waiting, if we cannot get it then we don't try to drop the old placements.	2021-05-17 03:22:35 -07:00
Nils Dijk	c91f8d8a15	Feature: localhost guc (#4836 ) DESCRIPTION: introduce `citus.local_hostname` GUC for connections to the current node Citus once in a while needs to connect to itself for some systems operations. This used to be hardcoded to `localhost`. The hardcoded hostname causes some issues, for example in environments where `sslmode=verify-full` is required. It is not always desirable or even feasible to get `localhost` as an alt name on the certificate. By introducing a GUC to use when connecting to the current instance the user has more control what network path is used and what hostname is required to be present in the server certificate.	2021-05-12 16:59:44 +02:00
Jelte Fennema	cbbd10b974	Implement an improvement threshold in the rebalancer (#4927 ) Every move in the rebalancer algorithm results in an improvement in the balance. However, even if the improvement in the balance was very small the move was still chosen. This is especially problematic if the shard itself is very big and the move will take a long time. This changes the rebalancer algorithm to take the relative size of the balance improvement into account when choosing moves. By default a move will not be chosen if it improves the balance by less than half of the size of the shard. An extra argument is added to the rebalancer functions so that the user can decide to lower the default threshold if the ignored move is wanted anyway.	2021-05-11 14:24:59 +02:00
Jelte Fennema	50357db957	Simplify code that tests the shard rebalancer algorithm (#4925 ) This modifies the test code to use sane defaults instead of requiring all values to be specified in the test.	2021-05-03 15:47:19 +02:00
SaitTalhaNisanci	03832f353c	Drop postgres 11 support	2021-03-25 09:20:28 +03:00
Onder Kalaci	e65e72130d	Rename use -> shouldUse Because setting the flag doesn't necessarily mean that we'll use 2PC. If connections are read-only, we will not use 2PC. In other words, we'll use 2PC only for connections that modified any placements.	2021-03-12 08:29:43 +00:00
Onder Kalaci	6a7ed7b309	Do not trigger 2PC for reads on local execution Before this commit, Citus used 2PC no matter what kind of local query execution happens. For example, if the coordinator has shards (and the workers as well), even a simple SELECT query could start 2PC: ```SQL WITH cte_1 AS (SELECT * FROM test LIMIT 10) SELECT count(*) FROM cte_1; ``` In this query, the local execution of the shards (and also intermediate result reads) triggers the 2PC. To prevent that, Citus now distinguishes local reads and local writes. And, Citus switches to 2PC only if a modification happens. This may still lead to unnecessary 2PCs when there is a local modification and remote SELECTs only. Though, we handle that separately via #4587.	2021-03-12 08:29:43 +00:00
Hanefi Onaldi	353b080474	Fix Semmle errors (#4636 ) Co-authored-by: Halil Ozan Akgül <hozanakgul@gmail.com>	2021-02-08 18:37:44 +03:00
Onur Tirtir	941c8fbf32	Automatically undistribute citus local tables when no more fkeys with reference tables (#4538 )	2021-01-22 18:15:41 +03:00
Hadi Moshayedi	bc01c795a2	Reland #4419	2021-01-19 07:48:47 -08:00
Marco Slot	011283122b	Add the shard rebalancer implementation	2021-01-07 16:51:55 +01:00
Marco Slot	47c1b19174	Revert "Do metadata sync in a separate background worker." This reverts commit `4df723cf9b`.	2021-01-07 10:30:04 +01:00
Marco Slot	d9f175532b	Revert "Trigger metadata sync at transaction commit" This reverts commit `a2c73bef27`.	2021-01-07 10:30:00 +01:00
Hadi Moshayedi	a2c73bef27	Trigger metadata sync at transaction commit	2020-12-24 08:28:38 -08:00
Hadi Moshayedi	4df723cf9b	Do metadata sync in a separate background worker.	2020-12-24 08:25:55 -08:00
Onur Tirtir	5ed9197041	Implement infra to get foreign key connected relations (#4439 ) On top of our foreign key graph, implement the infrastructure to get list of relations that are connected to input relation via a foreign key graph. We need this to support cascading create_citus_local_table & undistribute_table operations. Also add regression tests to see what our foreign key graph is able to capture currently.	2020-12-24 16:42:40 +03:00
Onder Kalaci	629ecc3dee	Add the infrastructure to count the number of client backends Considering the adaptive connection management improvements that we plan to roll soon, it makes it very helpful to know the number of active client backends. We are doing this addition to simplify yhe adaptive connection management for single node Citus. In single node Citus, both the client backends and Citus parallel queries would compete to get slots on Postgres' `max_connections` on the same Citus database. With adaptive connection management, we have the counters for Citus parallel queries. That helps us to adaptively decide on the remote executions pool size (e.g., throttle connections if necessary). However, we do not have any counters for the total number of client backends on the database. For single node Citus, we should consider all the client backends, not only the remote connections that Citus does. Of course Postgres internally knows how many client backends are active. However, to get that number Postgres iterates over all the backends. For examaple, see [pg_stat_get_db_numbackends](`8e90ec5580/src/backend/utils/adt/pgstatfuncs.c (L1240)`) where Postgres iterates over all the backends. For our purpuses, we need this information on every connection establishment. That's why we cannot affort to do this kind of iterattion.	2020-11-25 19:19:24 +01:00
Nils Dijk	caabbf4b84	Table access method support for distributed tables	2020-10-16 12:02:25 -07:00
SaitTalhaNisanci	366461ccdb	Introduce cache entry/table utilities (#4132 ) Introduce table entry utility functions Citus table cache entry utilities are introduced so that we can easily extend existing functionality with minimum changes, specifically changes to these functions. For example IsNonDistributedTableCacheEntry can be extended for citus local tables without the need to scan the whole codebase and update each relevant part. * Introduce utility functions to find the type of tables A table type can be a reference table, a hash/range/append distributed table. Utility methods are created so that we don't have to worry about how a table is considered as a reference table etc. This also makes it easy to extend the table types. * Add IsCitusTableType utilities * Rename IsCacheEntryCitusTableType -> IsCitusTableTypeCacheEntry * Change citus table types in some checks	2020-09-02 22:26:05 +03:00
Jelte Fennema	451ea04508	Rename ForceXxx functions to to XxxOrError This clearer naming was suggested in https://github.com/citusdata/citus/pull/4001	2020-09-01 11:19:17 +02:00
Sait Talha Nisanci	d68bfc5687	Improve error for index operator class parameters The error message when index has opclassopts is improved and the commit from postgres side is also included for future reference. Also some minor style related changes are applied.	2020-08-04 15:38:13 +03:00
Sait Talha Nisanci	1112b254a7	adapt recently added code for pg13 This commit mostly adds pg_get_triggerdef_command to our ruleutils_13. This doesn't add anything extra for ruleutils 13 so it is basically a copy of the change on ruleutils_12	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	62879ee8c1	introduce planner_compat and pg_plan_query_compat macros As the new planner and pg_plan_query_compat methods expect the query string as well, macros are defined to be compatible in different versions of postgres. Relevant commit on Postgres: 6aba63ef3e606db71beb596210dd95fa73c44ce2 Command on Postgres: git log --all --grep="pg_plan_query"	2020-08-04 15:10:22 +03:00
Sait Talha Nisanci	0819b79631	introduce list compat macros Pass the list to lnext API lnext API now expects the list as well. The commit on Postgres that introduced the change: 1cff1b95ab6ddae32faa3efe0d95a820dbfdc164 lnext_compat and list_delete_cell_compat macros are introduced so that we can use these macros in the codebase without having to use #if directives in the codebase. Related commit on postgres: 1cff1b95ab6ddae32faa3efe0d95a820dbfdc164 Command to search in postgres: git log --all --grep="list_delete_cell" add ListCellAndListWrapper When iterating a list in separate function calls, we need both the list and the current cell starting from PG13, therefore ListCellAndListWrapper is added to store both as a wrapper. Use ListCellAndListWrapper in foreign key test udfs As we iterate a list in these udfs using a functionContext, we need to use the wrapper to be able to access both the list and the current cell.	2020-08-04 15:10:22 +03:00
SaitTalhaNisanci	ef841115de	Fix int32 overflow and use PG macros for INT32_XX (#4061 ) * Use CalculateUniformHashRangeIndex in HashPartitionId INT32_MIN definition can change among different platforms hence it is possible to get overflow, we would see crashes because of this in debian distros. We have already solved a similar problem with introducing CalculateUniformHashRangeIndex method, hence to solve it we can use the same method, this also removes some duplication and has a single place to decide that. * Use PG_INT32_XX instead of INT32_XX to be safer	2020-07-23 18:30:08 +03:00

1 2 3

142 Commits (a1151c2395dab15f6803ced0d80d5b0a597175b5)