citus

Commit Graph

Author	SHA1	Message	Date
Teja Mupparti	01103ce05d	This implements a new UDF citus_get_cluster_clock() that returns a monotonically increasing logical clock. Clock guarantees to never go back in value after restarts, and makes best attempt to keep the value close to unix epoch time in milliseconds. Also, introduces a new GUC "citus.enable_cluster_clock", when true, every distributed transaction is stamped with logical causal clock and persisted in a catalog pg_dist_commit_transaction.	2022-10-28 10:15:08 -07:00
Gokhan Gulbiz	e87eda6496	Introduce a new GUC to propagate local settings to new connections in rebalancer (#6396 ) DESCRIPTION: Introduce ```citus.propagate_session_settings_for_loopback_connection``` GUC to propagate local settings to new connections. Fixes: #5289	2022-10-18 12:50:30 +03:00
Önder Kalacı	8b624b5c9d	Detect remotely closed sockets and add a single connection retry in the executor (#6404 ) PostgreSQL 15 exposes WL_SOCKET_CLOSED in WaitEventSet API, which is useful for detecting closed remote sockets. In this patch, we use this new event and try to detect closed remote sockets in the executor. When a closed socket is detected, the executor now has the ability to retry the connection establishment. Note that, the executor can retry connection establishments only for the connection that has not been used. Basically, this patch is mostly useful for preventing the executor to fail if a cached connection is closed because of the worker node restart (or worker failover). In other words, the executor cannot retry connection establishment if we are in a distributed transaction AND any command has been sent over the connection. That requires more sophisticated retry mechanisms. For now, fixing the above use case is enough. Fixes #5538 Earlier discussions: #5908, #6259 and #6283 ### Summary of the current approach regards to earlier trials As noted, we explored some alternatives before getting into this. https://github.com/citusdata/citus/pull/6283 is simple, but lacks an important property. We should be checking for `WL_SOCKET_CLOSED` _before_ sending anything over the wire. Otherwise, it becomes very tricky to understand which connection is actually safe to retry. For example, in the current patch, we can safely check `transaction->transactionState == REMOTE_TRANS_NOT_STARTED` before restarting a connection. #6259 does what we intent here (e.g., check for sending any command). However, as @marcocitus noted, it is very tricky to handle `WaitEventSets` in multiple places. And, the executor is designed such that it reacts to the events. So, adding anything `pre-executor` seemed too ugly. In the end, I converged into this patch. This patch relies on the simplicity of #6283 and also does a very limited handling of `WaitEventSets`, just for our purpose. Just before we add any connection to the execution, we check if the remote session has already closed. With that, we do a brief interaction of multiple wait event processing, but with different purposes. The new wait event processing we added does not even consider cancellations. We let that handled by the main event processing loop. Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-10-14 15:08:49 +02:00
Jelte Fennema	24e06af6d2	Reuse connections for Splits and Logical Replication (#6314 ) In Split, Logical replication logic and ShardCleaner we call `SendCommandListToWorkerOutsideTransaction` and `SendOptionalCommandListToWorkerOutsideTransaction` frequently. This opens new connection for each of those calls, even though we already have a perfectly good connection lying around. This PR adds two new APIs `SendCommandListToWorkerOutsideTransactionWithConnection` and `SendOptionalCommandListToWorkerOutsideTransactionWithConnection` that allow sending a list of queries in a transaction over an existing connection. We also update the callers (Split, ShardCleaner, Logical Replication) to use these new APIs instead. Co-authored-by: Nitish Upreti <niupre@microsoft.com> Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>	2022-09-26 13:37:40 +02:00
Marco Slot	639588bee0	Remove unused functions (#6220 ) Co-authored-by: Marco Slot <marco.slot@gmail.com>	2022-08-22 11:53:25 +03:00
Sameer Awasekar	e236711eea	Introduce Non-Blocking Shard Split Workflow	2022-08-04 16:32:38 +02:00
Marco Slot	6d6e44166f	Avoid catalog read via superuser() call in DecrementSharedConnectionCounter	2022-07-29 14:05:41 +02:00
Onder Kalaci	149771792b	Remove useless version compats most likely leftover from earlier versions	2022-07-29 10:31:55 +02:00
Onder Kalaci	0a5112964d	Call relation access hash clean-up irrespective of remote transaction state Mainly because local-only transactions should be cleaned up	2022-07-28 11:27:59 +02:00
Onder Kalaci	d67cf907a2	Detach relation access tracking from connection management	2022-07-28 11:27:59 +02:00
Naisila Puka	7d6410c838	Drop postgres 12 support (#6040 ) * Remove if conditions with PG_VERSION_NUM < 13 * Remove server_above_twelve(&eleven) checks from tests * Fix tests * Remove pg12 and pg11 alternative test output files * Remove pg12 specific normalization rules * Some more if conditions in the code * Change RemoteCollationIdExpression and some pg12/pg13 comments * Remove some more normalization rules	2022-07-20 17:49:36 +03:00
Onder Kalaci	483a3a5875	PG 15 Compat: Resolve compile issues + shmem requests Similar to #5897, one more step for running Citus with PG 15. This PR at least make Citus run with PG 15. I have not tried running the tests with PG 15. Shmem changes are based on `4f2400cb3f` Compile breaks are mostly due to #6008	2022-07-15 10:11:39 +02:00
Jelte Fennema	184c7c0bce	Make enterprise features open source (#6008 ) This PR makes all of the features open source that were previously only available in Citus Enterprise. Features that this adds: 1. Non blocking shard moves/shard rebalancer (`citus.logical_replication_timeout`) 2. Propagation of CREATE/DROP/ALTER ROLE statements 3. Propagation of GRANT statements 4. Propagation of CLUSTER statements 5. Propagation of ALTER DATABASE ... OWNER TO ... 6. Optimization for COPY when loading JSON to avoid double parsing of the JSON object (`citus.skip_jsonb_validation_in_copy`) 7. Support for row level security 8. Support for `pg_dist_authinfo`, which allows storing different authentication options for different users, e.g. you can store passwords or certificates here. 9. Support for `pg_dist_poolinfo`, which allows using connection poolers in between coordinator and workers 10. Tracking distributed query execution times using citus_stat_statements (`citus.stat_statements_max`, `citus.stat_statements_purge_interval`, `citus.stat_statements_track`). This is disabled by default. 11. Blocking tenant_isolation 12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`	2022-06-16 00:23:46 -07:00
Marco Slot	7abcfac61f	Add caching for functions that check the backend type	2022-05-20 19:02:37 +02:00
Marco Slot	ad5214b50c	Allow distributed execution from run_command_on_* functions	2022-05-20 15:26:47 +02:00
Onder Kalaci	338752d96e	Guard against hard wait event set errors Similar to https://github.com/citusdata/citus/pull/5158, but this time instead of the executor, use this in all the remaining places.	2022-03-14 14:35:56 +01:00
Onder Kalaci	953951007c	Move wait event error checks to connection manager	2022-03-14 14:35:56 +01:00
Marco Slot	43e4dd3808	Add a citus.internal_reserved_connections setting	2022-03-02 19:13:53 +01:00
Halil Ozan Akgul	8ee02b29d0	Introduce global PID	2022-02-08 16:49:38 +03:00
Teja Mupparti	f31bce5b48	Fixes the issue seen in https://github.com/citusdata/citus-enterprise/issues/745 With this commit, rebalancer backends are identified by application_name = citus_rebalancer and the regular internal backends are identified by application_name = citus_internal	2022-02-03 09:40:46 -08:00
Onder Kalaci	b26eeaecd3	Use a fixed application_name while connecting to remote nodes Citus heavily relies on application_name, see `IsCitusInitiatedRemoteBackend()`. But if the user set the application name, such as export PGAPPNAME=test_name, Citus uses that name while connecting to the remote node. With this commit, we ensure that Citus always connects with the "citus" user name to the remote nodes.	2022-01-27 10:46:25 +01:00
Onur Tirtir	3cc44ed8b3	Tell other backends it's safe to ignore the backend that concurrently built the shell table index (#5520 ) In addition to starting a new transaction, we also need to tell other backends --including the ones spawned for connections opened to localhost to build indexes on shards of this relation-- that concurrent index builds can safely ignore us. Normally, DefineIndex() only does that if index doesn't have any predicates (i.e.: where clause) and no index expressions at all. However, now that we already called standard process utility, index build on the shell table is finished anyway. The reason behind doing so is that we cannot guarantee not grabbing any snapshots via adaptive executor, and the backends creating indexes on local shards (if any) might block on waiting for current xact of the current backend to finish, which would cause self deadlocks that are not detectable.	2022-01-10 10:23:09 +03:00
Onder Kalaci	7cb1d6ae06	Improve metadata connections With https://github.com/citusdata/citus/pull/5493 we introduced metadata specific connections. With this connection we guarantee that there is a single metadata connection. But note that this connection can be used for any other operation. In other words, this connection is not only reserved for metadata operations. However, as https://github.com/citusdata/citus-enterprise/issues/715 showed us that the logic has a flaw. We allowed ineligible connections to be picked as metadata connections: such as exclusively claimed connections or not fully initialized connections. With this commit, we make sure that we only consider eligable connections for metadata operations.	2022-01-07 10:36:32 +01:00
Onder Kalaci	fc98f83af2	Add citus.grep_remote_commands Simply applies ```SQL SELECT textlike(command, citus.grep_remote_commands) ``` And, if returns true, the command is logged. Else, the log is ignored. When citus.grep_remote_commands is empty string, all commands are logged.	2021-12-17 11:47:40 +01:00
Hanefi Onaldi	13fff9c37a	Remove NOOP tuplestore_donestoring calls PostgreSQL does not need calling this function since 7.4 release, and it is a NOOP. For more details, check PostgreSQL commit below : commit dd04e958c8b03c0f0512497651678c7816af3198 Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun Mar 9 03:34:10 2003 +0000 tuplestore_donestoring() isn't needed anymore, but provide a no-op macro definition so as not to create compatibility problems. diff --git a/src/include/utils/tuplestore.h b/src/include/utils/tuplestore.h index b46babacd1..76fe9fb428 100644 --- a/src/include/utils/tuplestore.h +++ b/src/include/utils/tuplestore.h @@ -17,7 +17,7 @@ * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $Id: tuplestore.h,v 1.8 2003/03/09 02:19:13 tgl Exp $ + * $Id: tuplestore.h,v 1.9 2003/03/09 03:34:10 tgl Exp $ * ------------------------------------------------------------------------- / @@ -41,6 +41,9 @@ extern Tuplestorestate tuplestore_begin_heap(bool randomAccess, extern void tuplestore_puttuple(Tuplestorestate state, void tuple); +/ tuplestore_donestoring() used to be required, but is no longer used / +#define tuplestore_donestoring(state) ((void) 0) + / backwards scan is only allowed if randomAccess was specified 'true' / extern void tuplestore_gettuple(Tuplestorestate state, bool forward, bool should_free);	2021-12-14 18:55:02 +03:00
Onder Kalaci	d405993b57	Make sure to use a dedicated metadata connection With this commit, we make sure to use a dedicated connection per node for all the metadata operations within the same transaction. This is needed because the same metadata (e.g., metadata includes the distributed table on the workers) can be modified accross multiple connections. With this connection we guarantee that there is a single metadata connection. But note that this connection can be used for any other operation. In other words, this connection is not only reserved for metadata operations.	2021-11-26 14:36:28 +01:00
Halil Ozan Akgul	c0eb67b24f	Skip forceCloseAtTransactionEnd connections only if BEGIN was not sent on them	2021-11-01 17:43:04 +03:00
Philip Dubé	cc50682158	Fix typos. Spurred spotting "connectios" in logs	2021-10-25 13:54:09 +00:00
Onder Kalaci	575bb6dde9	Drop support for Inactive Shard placements Given that we do all operations via 2PC, there is no way for any placement to be marked as INACTIVE.	2021-10-22 18:03:35 +02:00
Halil Ozan Akgul	43d5853b6d	Fixes function names in comments	2021-10-06 09:24:43 +03:00
Sait Talha Nisanci	e7ed16c296	Not include to-be-deleted shards while finding shard placements Ignore orphaned shards in more places Only use active shard placements in RouterInsertTaskList Use IncludingOrphanedPlacements in some more places Fix comment Add tests	2021-06-28 13:05:31 +03:00
Jelte Fennema	b1cad26ebc	Move CheckCitusVersion to the top of each function Previously this was usually done after argument parsing. This can cause SEGFAULTs if the number or type of arguments changes in a new version. By checking that Citus version is correct before doing any argument parsing we protect against these types of issues. Issues like this have occurred in pg_auto_failover, so it's not just a theoretical issue. The main reason why these calls were not at the top of functions is really just historical. It was because in the past we didn't allow statements before declarations. Thus having this check before the argument parsing would have only been possible if we first declared all variables. In addition to moving existing CheckCitusVersion calls it also adds these calls to rebalancer related functions (they were missing there).	2021-06-01 17:43:46 +02:00
SaitTalhaNisanci	87e3a5e24a	Use 2PC when using a node connection (#4997 )	2021-05-21 14:58:53 +03:00
Nils Dijk	c91f8d8a15	Feature: localhost guc (#4836 ) DESCRIPTION: introduce `citus.local_hostname` GUC for connections to the current node Citus once in a while needs to connect to itself for some systems operations. This used to be hardcoded to `localhost`. The hardcoded hostname causes some issues, for example in environments where `sslmode=verify-full` is required. It is not always desirable or even feasible to get `localhost` as an alt name on the certificate. By introducing a GUC to use when connecting to the current instance the user has more control what network path is used and what hostname is required to be present in the server certificate.	2021-05-12 16:59:44 +02:00
Onder Kalaci	5482d5822f	Keep more statistics about connection establishment times When DEBUG4 enabled, Citus now prints per connection establishment time.	2021-04-16 14:56:31 +02:00
SaitTalhaNisanci	b453563e88	Warm up connections params hash (#4872 ) ConnParams(AuthInfo and PoolInfo) gets a snapshot, which will block the remote connectinos to localhost. And the release of snapshot will be blocked by the snapshot. This leads to a deadlock. We warm up the conn params hash before starting a new transaction so that the entries will already be there when we start a new transaction. Hence GetConnParams will not get a snapshot.	2021-04-12 13:08:38 +03:00
Marco Slot	fbc2147e11	Replace MAX_PUT_COPY_DATA_BUFFER_SIZE by citus.remote_copy_flush_threshold GUC	2021-03-16 06:00:38 +01:00
Marco Slot	1646fca445	Add GUC to set maximum connection lifetime	2021-03-16 01:57:57 +01:00
Philip Dubé	4e22f02997	Fix various typos due to zealous repetition	2021-03-04 19:28:15 +00:00
Onder Kalaci	fc9a23792c	COPY uses adaptive connection management on local node With #4338, the executor is smart enough to failover to local node if there is not enough space in max_connections for remote connections. For COPY, the logic is different. With #4034, we made COPY work with the adaptive connection management slightly differently. The cause of the difference is that COPY doesn't know which placements are going to be accessed hence requires to get connections up-front. Similarly, COPY decides to use local execution up-front. With this commit, we change the logic for COPY on local nodes: Try to reserve a connection to local host. This logic follows the same logic (e.g., citus.local_shared_pool_size) as the executor because COPY also relies on TryToIncrementSharedConnectionCounter(). If reservation to local node fails, switch to local execution Apart from this, if local execution is disabled, we follow the exact same logic for multi-node Citus. It means that if we are out of the connection, we'd give an error.	2021-02-04 09:45:07 +01:00
Onur Tirtir	253c19062a	Rename IsCitusInitiatedBackend to IsCitusInitiatedRemoteBackend (#4562 )	2021-01-23 01:07:43 +03:00
Onur Tirtir	941c8fbf32	Automatically undistribute citus local tables when no more fkeys with reference tables (#4538 )	2021-01-22 18:15:41 +03:00
Onder Kalaci	c546ec5e78	Local node connection management When Citus needs to parallelize queries on the local node (e.g., the node executing the distributed query and the shards are the same), we need to be mindful about the connection management. The reason is that the client backends that are running distributed queries are competing with the client backends that Citus initiates to parallelize the queries in order to get a slot on the max_connections. In that regard, we implemented a "failover" mechanism where if the distributed queries cannot get a connection, the execution failovers the tasks to the local execution. The failover logic is follows: - As the connection manager if it is OK to get a connection - If yes, we are good. - If no, we fail the workerPool and the failure triggers the failover of the tasks to local execution queue The decision of getting a connection is follows: /* * For local nodes, solely relying on citus.max_shared_pool_size or * max_connections might not be sufficient. The former gives us * a preview of the future (e.g., we let the new connections to establish, * but they are not established yet). The latter gives us the close to * precise view of the past (e.g., the active number of client backends). * * Overall, we want to limit both of the metrics. The former limit typically * kics in under regular loads, where the load of the database increases in * a reasonable pace. The latter limit typically kicks in when the database * is issued lots of concurrent sessions at the same time, such as benchmarks. */	2020-12-03 14:16:13 +03:00
Onur Tirtir	03bcccdee0	Fix hostname length check in StartNodeUserDatabaseConnection (#4363 ) Copying string before hostname length check makes the check useless	2020-11-30 20:00:35 +03:00
Onur Tirtir	7f3d1182ed	Handle invalid connection hash entries (#4362 ) If MemoryContextAlloc errors out -e.g. during an OOM-, ConnectionHashEntry->connections stays as NULL. With this commit, we add isValid flag to ConnectionHashEntry that should be set to true right after we allocate & initialize ConnectionHashEntry->connections list properly, and we check it before accesing to ConnectionHashEntry->connections.	2020-11-30 19:44:03 +03:00
Onur Tirtir	f80f4839ad	Remove unused functions that cppcheck found	2020-10-19 13:50:52 +03:00
Onur Tirtir	ba208eae4d	Record non-distributed table accesses in local executor (#4139 )	2020-09-07 18:19:08 +03:00
Sait Talha Nisanci	1112b254a7	adapt recently added code for pg13 This commit mostly adds pg_get_triggerdef_command to our ruleutils_13. This doesn't add anything extra for ruleutils 13 so it is basically a copy of the change on ruleutils_12	2020-08-04 15:18:27 +03:00
Sait Talha Nisanci	01632c56a0	Change utils/hashutils.h to common/hashfn.h for PG >= 13 Commit on postgres side: 05d8449e73694585b59f8b03aaa087f04cc4679a Command on postgres side: git log --all --grep="hashutils" include common/hashfn.h for pg >= 13 tag_hash was moved from hsearch.h to hashutils.h then to hashfn.h Commits on Postgres side: 9341c783cc42ffae5860c86bdc713bd47d734ffd	2020-08-04 15:10:22 +03:00
Onder Kalaci	eeb8c81de2	Implement shared connection count reservation & enable `citus.max_shared_pool_size` for COPY With this patch, we introduce `locally_reserved_shared_connections.c/h` files which are responsible for reserving some space in shared memory counters upfront. We sometimes need to reserve connections, but not necessarily establish them. For example: - COPY command should reserve connections as it cannot know which connections it needs in which order. COPY establishes connections as any input data hits the workers. For example, for router COPY command, it only establishes 1 connection. As discussed here (https://github.com/citusdata/citus/pull/3849#pullrequestreview-431792473), COPY needs to reserve connections up-front, otherwise we can end up with resource starvation/un-detected deadlocks.	2020-08-03 18:51:40 +02:00

1 2 3 4 5

201 Commits (01103ce05d4767b1fb6ca50bb5c95ec37dc953d6)