citus

Commit Graph

Author	SHA1	Message	Date
Jelte Fennema	c70cf963c4	Also reset transactions at connection shutdown (#6685) In #6314 I refactored the connection cleanup to be simpler to understand and use. However, by doing so I introduced a use-after-free possibility (that valgrind luckily picked up): In the `ShouldShutdownConnection` path of `AfterXactHostConnectionHandling` we free connections without removing the `transactionNode` from the dlist that it might be part of. Before the refactoring this wasn't a problem, because the dlist would be completely reset quickly after in `ResetGlobalVariables` (without reading or writing the dlist entries). The refactoring changed this by moving the `dlist_delete` call to `ResetRemoteTransaction`, which in turn was called in the `!ShouldShutdownConnection` path of `AfterXactHostConnectionHandling`. Thus this `!ShouldShutdownConnection` path would now delete from the `dlist`, but the `ShouldShutdownConnection` path would not. Thus to remove itself the deleting path would sometimes update nodes in the list that were freed right before. There are two ways of fixing this: 1. Call `dlist_delete` from both paths. 2. Call `dlist_delete` from neither of the paths. This commit implements the second approach, and #6684 implements the first. We need to choose which approach we prefer. To make calling `dlist_delete` from both paths actually work, we also need to use a slightly different check to determine if we need to call `dlist_delete`. Various regression tests showed that there can be cases where the `transactionState` is something else than `REMOTE_TRANS_NOT_STARTED` but the connection was not added to the `InProgressTransactions` list. One example of such a case is when running `TransactionStateMachine` without calling `StartRemoteTransactionBegin` beforehand. In those cases the connection won't be added to `InProgressTransactions`, but the `transactionState` is changed to `REMOTE_TRANS_SENT_COMMAND`. Sidenote: This bug already existed in 11.1, but valgrind didn't catch it back then. My guess is that this happened because #6314 was merged after the initial release branch was cut. Fixes #6638 (cherry picked from commit `f061dbb253`)	2023-02-03 18:33:42 +03:00
Jelte Fennema	006f6aceaf	Reuse connections for Splits and Logical Replication (#6314) In Split, Logical replication logic and ShardCleaner we call `SendCommandListToWorkerOutsideTransaction` and `SendOptionalCommandListToWorkerOutsideTransaction` frequently. This opens a new connection for each of those calls, even though we already have a perfectly good connection lying around. This PR adds two new APIs `SendCommandListToWorkerOutsideTransactionWithConnection` and `SendOptionalCommandListToWorkerOutsideTransactionWithConnection` that allow sending a list of queries in a transaction over an existing connection. We also update the callers (Split, ShardCleaner, Logical Replication) to use these new APIs instead. Co-authored-by: Nitish Upreti <niupre@microsoft.com> Co-authored-by: Onder Kalaci <onderkalaci@gmail.com> (cherry picked from commit `24e06af6d2`)	2022-09-26 16:53:38 +02:00
Sameer Awasekar	e236711eea	Introduce Non-Blocking Shard Split Workflow	2022-08-04 16:32:38 +02:00
Onder Kalaci	149771792b	Remove useless version compats most likely leftover from earlier versions	2022-07-29 10:31:55 +02:00
Marco Slot	7abcfac61f	Add caching for functions that check the backend type	2022-05-20 19:02:37 +02:00
Marco Slot	ad5214b50c	Allow distributed execution from run_command_on_* functions	2022-05-20 15:26:47 +02:00
Onder Kalaci	338752d96e	Guard against hard wait event set errors Similar to https://github.com/citusdata/citus/pull/5158, but this time instead of the executor, use this in all the remaining places.	2022-03-14 14:35:56 +01:00
Onder Kalaci	953951007c	Move wait event error checks to connection manager	2022-03-14 14:35:56 +01:00
Halil Ozan Akgul	8ee02b29d0	Introduce global PID	2022-02-08 16:49:38 +03:00
Teja Mupparti	f31bce5b48	Fixes the issue seen in https://github.com/citusdata/citus-enterprise/issues/745 With this commit, rebalancer backends are identified by application_name = citus_rebalancer and the regular internal backends are identified by application_name = citus_internal	2022-02-03 09:40:46 -08:00
Onur Tirtir	3cc44ed8b3	Tell other backends it's safe to ignore the backend that concurrently built the shell table index (#5520) In addition to starting a new transaction, we also need to tell other backends --including the ones spawned for connections opened to localhost to build indexes on shards of this relation-- that concurrent index builds can safely ignore us. Normally, DefineIndex() only does that if index doesn't have any predicates (i.e.: where clause) and no index expressions at all. However, now that we already called standard process utility, index build on the shell table is finished anyway. The reason behind doing so is that we cannot guarantee not grabbing any snapshots via adaptive executor, and the backends creating indexes on local shards (if any) might block on waiting for current xact of the current backend to finish, which would cause self deadlocks that are not detectable.	2022-01-10 10:23:09 +03:00
Onder Kalaci	7cb1d6ae06	Improve metadata connections With https://github.com/citusdata/citus/pull/5493 we introduced metadata specific connections. With this connection we guarantee that there is a single metadata connection. But note that this connection can be used for any other operation. In other words, this connection is not only reserved for metadata operations. However, https://github.com/citusdata/citus-enterprise/issues/715 showed us that the logic has a flaw. We allowed ineligible connections to be picked as metadata connections: such as exclusively claimed connections or not fully initialized connections. With this commit, we make sure that we only consider eligible connections for metadata operations.	2022-01-07 10:36:32 +01:00
Onder Kalaci	d405993b57	Make sure to use a dedicated metadata connection With this commit, we make sure to use a dedicated connection per node for all the metadata operations within the same transaction. This is needed because the same metadata (e.g., metadata includes the distributed table on the workers) can be modified across multiple connections. With this connection we guarantee that there is a single metadata connection. But note that this connection can be used for any other operation. In other words, this connection is not only reserved for metadata operations.	2021-11-26 14:36:28 +01:00
Halil Ozan Akgul	c0eb67b24f	Skip forceCloseAtTransactionEnd connections only if BEGIN was not sent on them	2021-11-01 17:43:04 +03:00
Philip Dubé	cc50682158	Fix typos. Spurred spotting "connectios" in logs	2021-10-25 13:54:09 +00:00
Onder Kalaci	5482d5822f	Keep more statistics about connection establishment times When DEBUG4 is enabled, Citus now prints the per-connection establishment time.	2021-04-16 14:56:31 +02:00
SaitTalhaNisanci	b453563e88	Warm up connections params hash (#4872) ConnParams(AuthInfo and PoolInfo) gets a snapshot, which will block the remote connections to localhost. And the release of snapshot will be blocked by the snapshot. This leads to a deadlock. We warm up the conn params hash before starting a new transaction so that the entries will already be there when we start a new transaction. Hence GetConnParams will not get a snapshot.	2021-04-12 13:08:38 +03:00
Marco Slot	1646fca445	Add GUC to set maximum connection lifetime	2021-03-16 01:57:57 +01:00
Philip Dubé	4e22f02997	Fix various typos due to zealous repetition	2021-03-04 19:28:15 +00:00
Onur Tirtir	253c19062a	Rename IsCitusInitiatedBackend to IsCitusInitiatedRemoteBackend (#4562)	2021-01-23 01:07:43 +03:00
Onur Tirtir	941c8fbf32	Automatically undistribute citus local tables when no more fkeys with reference tables (#4538)	2021-01-22 18:15:41 +03:00
Onur Tirtir	03bcccdee0	Fix hostname length check in StartNodeUserDatabaseConnection (#4363) Copying string before hostname length check makes the check useless	2020-11-30 20:00:35 +03:00
Onur Tirtir	7f3d1182ed	Handle invalid connection hash entries (#4362) If MemoryContextAlloc errors out -e.g. during an OOM-, ConnectionHashEntry->connections stays as NULL. With this commit, we add isValid flag to ConnectionHashEntry that should be set to true right after we allocate & initialize ConnectionHashEntry->connections list properly, and we check it before accessing ConnectionHashEntry->connections.	2020-11-30 19:44:03 +03:00
Onur Tirtir	f80f4839ad	Remove unused functions that cppcheck found	2020-10-19 13:50:52 +03:00
Onder Kalaci	eeb8c81de2	Implement shared connection count reservation & enable `citus.max_shared_pool_size` for COPY With this patch, we introduce `locally_reserved_shared_connections.c/h` files which are responsible for reserving some space in shared memory counters upfront. We sometimes need to reserve connections, but not necessarily establish them. For example: - COPY command should reserve connections as it cannot know which connections it needs in which order. COPY establishes connections as any input data hits the workers. For example, for router COPY command, it only establishes 1 connection. As discussed here (https://github.com/citusdata/citus/pull/3849#pullrequestreview-431792473), COPY needs to reserve connections up-front, otherwise we can end up with resource starvation/un-detected deadlocks.	2020-08-03 18:51:40 +02:00
Onder Kalaci	a2f53dff74	Make FindAvailableConnection() more strict With adaptive connection management, we might have some connections which are not fully initialized. Those connections should not be qualified as available.	2020-07-23 15:59:50 +02:00
Jelte Fennema	c6f5d5fe88	Add some asserts to pass static analysis (#3805)	2020-04-29 11:19:11 +02:00
Onder Kalaci	bc54c5125f	Increase the default value of citus.node_connection_timeout The previous default was 5 seconds, and we change it to 30 seconds. The main motivation for this is that for busy clusters, 5 seconds can be too aggressive. Especially with connection throttling, the servers might be kept busy for a really long time, and users may see the connection errors more frequently. We've done some sanity checks, for really quick queries (like `SELECT count(*) from table`), 30 seconds is a decent value even if users execute 300 distributed queries on the coordinator. We've verified this on Hyperscale(Citus).	2020-04-24 15:16:42 +02:00
Marco Slot	8b83306a27	Issue worker messages with the same log level	2020-04-14 21:08:25 +02:00
Onder Kalaci	aa6b641828	Throttle connections to the worker nodes With this commit, we're introducing a new infrastructure to throttle connections to the worker nodes. This infrastructure is useful for multi-shard queries, router queries are not affected by this. The goal is to prevent establishing more than citus.max_shared_pool_size number of connections per worker node in total, across sessions. To do that, we've introduced a new connection flag OPTIONAL_CONNECTION. The idea is that some connections are optional such as the second (and further connections) for the adaptive executor. A single connection is enough to finish the distributed execution, the others are useful to execute the query faster. Thus, they can be considered optional connections. When an optional connection is not allowed to the adaptive executor, it simply skips it and continues the execution with the already established connections. However, it'll keep retrying to establish optional connections, in case some slots are open again.	2020-04-14 10:27:48 +02:00
Philip Dubé	ab0b59ad3b	GetConnParams: Set runtimeParamStart before setting keywords/values to avoid out of bounds access	2020-04-10 13:14:06 +00:00
Onder Kalaci	9b29a32d7a	Remove all references for side channel connections We don't need any side channel connections. That is actually problematic in the sense that it creates extra connections. Say, citus.max_adaptive_executor_pool_size equals 1, Citus ends up using one extra connection for the intermediate results. Thus, not obeying citus.max_adaptive_executor_pool_size. In this PR, we remove the following entities from the codebase to allow further commits to implement not requiring extra connection for the intermediate results: - The connection flag REQUIRE_SIDECHANNEL - The function GivePurposeToConnection - The ConnectionPurpose struct and related fields	2020-04-07 17:06:55 +02:00
Jelte Fennema	2aabe3e2ef	Mark all connections for shutdown when citus.node_conninfo chan… (#3642) We cache connections between nodes in our connection management code. This is good for speed. For security this can be a problem though. If the user changes settings related to TLS encryption they want those to be applied to future queries. This is especially important when they did not have TLS enabled before and now they want to enable it. This can normally be achieved by changing citus.node_conninfo. However, because connections are not reopened there will still be old connections that might not be encrypted at all. This commit changes that by marking all connections to be shut down at the end of their current transaction. This way running transactions will succeed, even if placement requires connections to be reused for this transaction. But after this transaction completes any future statements will use a connection created with the new connection options. If a connection is requested and a connection is found that is marked for shutdown, then we don't return this connection. Instead a new one is created. This is needed to make sure that if there are no running transactions, then the next statement will not use an old cached connection, since connections are only actually shut down at the end of a transaction.	2020-03-24 15:31:41 +01:00
Onder Kalaci	7b4eb9611b	Properly terminate connections at the end of the session Citus coordinator (or MX nodes) caches `citus.max_cached_conns_per_worker` connections per node. This means that those connections are not terminated after each statement. Instead, they are cached to avoid the cost of re-establishment. This is crucial for OLTP performance. The problem with that approach is that we never properly handle the termination of those cached connections. For instance, when a session on the coordinator disconnects, you'd see the following logs on the workers: ``` 2020-03-20 09:13:39.454 CET [64028] LOG: could not receive data from client: Connection reset by peer ``` With this patch, we're terminating the cached connections properly at the end of the connection.	2020-03-20 17:34:34 +01:00
Philip Dubé	20abc4d2b5	Replace foreach with foreach_ptr/foreach_oid (#3544)	2020-02-27 16:54:49 +01:00
Jelte Fennema	8de8b62669	Convert unsafe APIs to safe ones	2020-02-25 15:39:27 +01:00
Philip Dubé	52042d4a00	Prefer instr_time to TimestampTz when we want CLOCK_MONOTONIC	2020-02-19 00:34:17 +00:00
Philip Dubé	c252811884	dont: don't, wont: won't, acylic: acyclic	2020-02-05 17:32:22 +00:00
Philip Dubé	fdcc413559	Code cleanup of adaptive_executor, connection_management, placement_connection adaptive_executor: sort includes, use foreach_ptr, remove lies from FinishDistributedExecution docs connection_management: rename msecs, which isn't milliseconds placement_connection: small typos	2020-01-17 17:44:47 +00:00
Philip Dubé	73c06fae3b	Introduce GetDistributeObjectOps to organize dispatch of logic dependent on node/object type	2020-01-09 18:24:29 +00:00
SaitTalhaNisanci	7ff4ce2169	Add adaptive executor support for repartition joins (#3169) * WIP * wip * add basic logic to run a single job with repartitioning joins with adaptive executor * fix some warnings and return in ExecuteDependedTasks if there is none * Add the logic to run depended jobs in adaptive executor The execution of depended tasks logic is changed. With the current logic: - All tasks are created from the top level task list. - At one iteration: - CurTasks whose dependencies are executed are found. - CurTasks are executed in parallel with adaptive executor main logic. - The iteration is repeated until all tasks are completed. * Separate adaptive executor repartitioning logic * Remove duplicate parts * cleanup directories and schemas * add basic repartition tests for adaptive executor * Use the first placement to fetch data In task tracker, when there are replicas, we try to fetch from a replica for which a map task is succeeded. TaskExecution is used for this, however TaskExecution is not used in adaptive executor. So we cannot use the same thing as task tracker. Since adaptive executor fails when a map task fails (There is no retry logic yet). We know that if we try to execute a fetch task, all of its map tasks already succeeded, so we can just use the first one to fetch from. * fix clean directories logic * do not change the search path while creating a udf * Enable repartition joins with adaptive executor with only enable_repartition_joins guc * Add comments to adaptive_executor_repartition * don't run adaptive executor repartition test in parallel with other tests * execute cleanup only in the top level execution * do cleanup only in the top level execution * not begin a transaction if repartition query is used * use new connections for repartition specific queries New connections are opened to send repartition specific queries. The opened connections will be closed at the FinishDistributedExecution. While sending repartition queries no transaction is begun so that we can see all changes. * error if a modification was done prior to repartition execution * not start a transaction if a repartition query and sql task, and clean temporary files and schemas at each subplan level * fix cleanup logic * update tests * add missing function comments * add test for transaction with DDL before repartition query * do not close repartition connections in adaptive executor * rollback instead of commit in repartition join test * use close connection instead of shutdown connection * remove unnecessary connection list, ensure schema owner before removing directory * rename ExecuteTaskListRepartition * put fetch query string in planner not executor as we currently support only replication factor = 1 with adaptive executor and repartition query and we know the query string in the planner phase in that case * split adaptive executor repartition to DAG execution logic and repartition logic * apply review items * apply review items * use an enum for remote transaction state and fix cleanup for repartition * add outside transaction flag to find connections that are unclaimed instead of always opening a new transaction * fix style * wip * rename removejobdir to partition cleanup * do not close connections at the end of repartition queries * do repartition cleanup in pg catch * apply review items * decide whether to use transaction or not at execution creation * rename isOutsideTransaction and add missing comment * not error in pg catch while doing cleanup * use replication factor of the creation time, not current time to decide if task tracker should be chosen * apply review items * apply review items * apply review item	2019-12-17 19:09:45 +03:00
Marco Slot	2f568ad5a5	Forbid using connections that sent intermediate results for data access and vice versa	2019-12-17 11:49:13 +01:00
SaitTalhaNisanci	a0fe8646e0	add IsHoldOffCancellationReceived utility function (#3290)	2019-12-12 17:32:59 +03:00
Jelte Fennema	1d8dde232f	Automatically convert useless declarations using regex replace (#3181) * Add declaration removal to CI * Convert declarations	2019-11-21 13:47:29 +01:00
SaitTalhaNisanci	306d159072	refactor AfterXactHostConnectionHandling (#3202)	2019-11-19 14:50:23 +03:00
SaitTalhaNisanci	b9b7fd7660	add IsLoggableLevel utility function (#3149) * add IsLoggableLevel utility function * add function comment for IsLoggableLevel * put ApplyLogRedaction to logutils	2019-11-15 14:59:13 +03:00
Jelte Fennema	1b2c438e69	Rename variables to not shadow globals in RHEL6 (#3194) Fixes #2839	2019-11-15 12:12:24 +01:00
Önder Kalacı	960cd02c67	Remove real time router executors (#3142) * Remove unused executor codes All of the code of the real-time executor. Some functions in router executor still remain there because there are common functions. We'll move them to accurate places in the follow-up commits. * Move GUCs to transaction mngnt and remove unused struct * Update test output * Get rid of references of real-time executor from code * Warn if real-time executor is picked * Remove lots of unused connection codes * Removed unused code for connection restrictions Real-time and router executors cannot handle re-using of the existing connections within a transaction block. Adaptive executor and COPY can re-use the connections. So, there is no reason to keep the code around for applying the restrictions in the placement connection logic.	2019-11-05 12:48:10 +01:00
SaitTalhaNisanci	94a7e6475c	Remove copyright years (#2918) * Update year as 2012-2019 * Remove copyright years	2019-10-15 17:44:30 +03:00
Marco Slot	35bef0f3db	Avoid caching connections from backends that service internal connections	2019-09-28 08:32:10 +02:00
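
The use-after-free described in c70cf963c4 comes from freeing a list member that is still linked: unlinking a node writes to both of its neighbours, so a later delete can touch freed memory. The following is a minimal sketch of the "call `dlist_delete` from both paths" option listed in that message, and it is not the Citus code. `Node`, `Conn`, `inList`, and every function name here are hypothetical stand-ins for PostgreSQL's `dlist` and the Citus connection struct; the explicit `inList` flag in particular is an assumption standing in for whatever membership check the real code uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list; a stand-in, not PostgreSQL's ilist.h. */
typedef struct Node
{
	struct Node *prev;
	struct Node *next;
} Node;

/* Hypothetical connection that embeds its list node. */
typedef struct Conn
{
	Node transactionNode;
	bool inList; /* assumption: explicit membership flag */
} Conn;

static void list_init(Node *head)
{
	head->prev = head;
	head->next = head;
}

static void list_push_tail(Node *head, Conn *conn)
{
	Node *node = &conn->transactionNode;

	assert(!conn->inList);
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
	conn->inList = true;
}

/* Unlinking writes to BOTH neighbours, so neither may already be freed. */
static void list_unlink(Conn *conn)
{
	if (!conn->inList)
	{
		return; /* never linked, or already unlinked */
	}
	conn->transactionNode.prev->next = conn->transactionNode.next;
	conn->transactionNode.next->prev = conn->transactionNode.prev;
	conn->inList = false;
}

/* Shutdown path: always unlink before freeing the memory. */
static void conn_close(Conn *conn)
{
	list_unlink(conn);
	free(conn);
}

static size_t list_length(const Node *head)
{
	size_t count = 0;

	for (const Node *it = head->next; it != head; it = it->next)
	{
		count++;
	}
	return count;
}
```

Because `list_unlink` is a no-op for a connection that was never linked, both the "keep" path and the "shut down" path can call it unconditionally, which mirrors the message's remark that a state check alone is not a reliable membership test.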


89 Commits (54424583c514184ea662acd733b5abecac937198)
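
The hostname fix in 03bcccdee0 turns on ordering: if a bounded copy runs first, the destination is presumably already truncated to fit, so a length check on it can never fail. A minimal sketch of the check-then-copy order follows, assuming a fixed-size buffer. `HOSTNAME_BUF_LEN` and `copy_hostname_checked` are hypothetical names chosen for illustration and are not taken from the Citus source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define HOSTNAME_BUF_LEN 256 /* assumption: illustrative buffer size */

/*
 * Hypothetical helper: validate the source length first, then copy.
 * Returns false and leaves dst untouched when src does not fit.
 */
static bool copy_hostname_checked(char dst[HOSTNAME_BUF_LEN], const char *src)
{
	assert(dst != NULL && src != NULL);

	size_t len = strlen(src);
	if (len >= HOSTNAME_BUF_LEN)
	{
		return false; /* too long: reject instead of silently truncating */
	}
	memcpy(dst, src, len + 1); /* len + 1 copies the terminating NUL */
	return true;
}
```

Checking the source string rather than the destination is the design choice that matters here: the source still carries its full length, while a truncated destination does not.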