citus

Commit Graph

Author	SHA1	Message	Date
Onder Kalaci	a94184fff8	Prevent overflow of memory accesses during deadlock detection In the distributed deadlock detection design, we concluded that prepared transactions cannot be part of a distributed deadlock. The idea is that (a) when the transaction is prepared it has already acquired all the locks, so it cannot be part of a deadlock, and (b) even if some other processes are blocked on the prepared transaction, prepared transactions will eventually be committed (or rolled back) and the system will continue operating. With the above in mind, we probably made a mistake in the memory allocations. For each backend initialized, we keep a `BackendData` struct. The bug we introduced is that we assumed there would only be `MaxBackends` backends. However, `MaxBackends` doesn't include the prepared transactions and auxiliary processes. When you check Postgres' `InitProcGlobal` you'd see that `TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;` This commit aligns the number of procs processed with that total.	2018-09-17 16:23:29 +03:00
Onder Kalaci	d657759c97	Views to provide some insight about the distributed transactions on Citus MX With this commit, we implement two views that are very similar to pg_stat_activity, but show queries that are involved in distributed queries: - citus_dist_stat_activity: Shows all the distributed queries - citus_worker_stat_activity: Shows all the queries on the shards that are initiated by distributed queries. Both views have the same columns in their outputs. In very basic terms, both views are meant to provide some useful insight into the distributed transactions within the cluster. As the names reveal, both views are similar to pg_stat_activity. Also note that these views can be pretty useful on Citus MX clusters. Note that when the views are queried from the worker nodes, they will not show the distributed transactions that are initiated from the coordinator node. The reason is that the worker nodes do not know the host/port of the coordinator. Thus, it is advisable to query the views from the coordinator. If we bucket the columns that the views return, we end up with the following: - Hostnames and ports: - query_hostname, query_hostport: The node that the query is running on - master_query_host_name, master_query_host_port: The node in the cluster that initiated the query. Note that for the citus_dist_stat_activity view, query_hostname-query_hostport is always the same as master_query_host_name-master_query_host_port. The distinction is mostly relevant for citus_worker_stat_activity. For example, on Citus MX, a user starts a transaction on Node-A, which starts worker transactions on Node-B and Node-C. In that case, the query hostnames would be Node-B and Node-C whereas the master_query_host_name would be Node-A. - Distributed transaction related things: This is mostly the process_id, distributed transactionId and distributed transaction number. - pg_stat_activity columns: These two views get all the columns from pg_stat_activity. We're basically joining pg_stat_activity with get_all_active_transactions on process_id.	2018-09-10 21:33:27 +03:00
Onder Kalaci	5bea95009b	Skip autovacuum processes for distributed deadlock detection An autovacuum process cancels itself if any modification starts on the table in order to avoid blocking your regular Postgres sessions. That's normal and expected. Thus, any locks held by an autovacuum process cannot be involved in a distributed deadlock since they'll be released if needed.	2017-11-15 14:32:16 +02:00
Onder Kalaci	c65c153a46	Skip speculative locks for distributed deadlock detection These locks are held for a very short duration and cannot contribute to a deadlock. Speculative locks are used by Postgres for an internal notification mechanism among transactions.	2017-11-15 12:43:45 +02:00
Onder Kalaci	94921a2be1	Skip page-level locks on distributed deadlock detection Short-term share/exclusive page-level locks are used for read/write access. Locks are released immediately after each index row is fetched or inserted. Since those locks cannot lead to any deadlocks, it's safe to ignore them in the distributed deadlock detection.	2017-11-09 10:37:23 +02:00
Onder Kalaci	68ca8cb7f0	Skip relation extension locks We should skip processes blocked on a relation extension lock, since those locks are held for a short duration while the relation is actually extended on the disk and are released as soon as the extension is done. Thus, recording such waits on our lock graphs could lead to detecting false distributed deadlocks.	2017-09-28 10:09:09 +03:00
Marco Slot	dbf18df995	Don't error out if BuildGlobalWaitGraph fails to connect	2017-08-23 19:08:03 +02:00
Marco Slot	641420d79f	Remove source node argument from dump_local_wait_edges	2017-08-23 13:14:00 +02:00
Marco Slot	bd6bf29983	Don't add procs multiple times in BuildWaitGraphForSourceNode	2017-08-21 16:48:30 +02:00
Onder Kalaci	a333c9f16c	Add infrastructure for distributed deadlock detection This commit adds all the necessary pieces to do the distributed deadlock detection. Each distributed transaction is already assigned a distributed transaction id, introduced with `3369f3486f`. The dependencies among the distributed transactions are gathered with `80ea233ec1`. With this commit, we implement a DFS (depth-first search) on the dependency graph and search for cycles. Finding a cycle reveals a distributed deadlock. Once we find the deadlock, we examine the path on which the cycle exists and cancel the youngest distributed transaction. Note that we're not yet enabling the deadlock detection by default with this commit.	2017-08-12 13:28:37 +03:00
Brian Cloutier	9d93fb5551	Create citus.use_secondary_nodes GUC This GUC has two settings, 'always' and 'never'. When it's set to 'never' all behavior stays exactly as it was prior to this commit. When it's set to 'always' only SELECT queries are allowed to run, and only secondary nodes are used when processing those queries. Add some helper functions: - WorkerNodeIsSecondary(), checks the noderole of the worker node - WorkerNodeIsReadable(), returns whether we're currently allowed to read from this node - ActiveReadableNodeList(), some functions (namely, the ones on the SELECT path) don't require working with Primary Nodes. They should call this function instead of ActivePrimaryNodeList(), because the latter will error out in contexts where we're not allowed to write to nodes. - ActiveReadableNodeCount(), like the above, replaces ActivePrimaryNodeCount(). - EnsureModificationsCanRun(), error out if we're not currently allowed to run queries which modify data. (Either we're in read-only mode or use_secondary_nodes is set) Some parts of the code were switched over to use readable nodes instead of primary nodes: - Deadlock detection - DistributedTableSize, - the router, real-time, and task tracker executors - ShardPlacement resolution	2017-08-10 17:37:17 +03:00
Onder Kalaci	b5ea3ab6a3	Improve locking semantics for backend management We use the backend shared memory lock to prevent new backends from becoming part of a new distributed transaction, or an existing backend from leaving a distributed transaction, while we're reading all the backends' data. The primary goal is to provide a consistent view of the current distributed transactions while doing the deadlock detection.	2017-08-09 17:17:12 +03:00
Marco Slot	80ea233ec1	Add function for dumping global wait edges	2017-07-25 16:52:32 +02:00
Marco Slot	81198a1d02	Add function for dumping local wait edges	2017-07-25 16:52:32 +02:00

14 Commits (a94184fff8801825f0bb6a3d990574b44a02fc1f)
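
The cycle search described in commit a333c9f16c (a depth-first search over the transaction dependency graph, followed by cancelling the youngest transaction on the cycle) can be illustrated with a minimal Python sketch. This is not the Citus implementation, which is written in C. The names `find_cycle` and `pick_victim`, the dictionary-based graph, and the use of a start timestamp to decide which transaction is youngest are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical representation: wait_graph maps a distributed transaction id
# to the ids of the transactions it is waiting on.


def find_cycle(wait_graph: Dict[int, List[int]]) -> Optional[List[int]]:
    """Return the transaction ids on one cycle, or None if there is no cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color: Dict[int, int] = {}
    path: List[int] = []

    def visit(node: int) -> Optional[List[int]]:
        color[node] = GRAY
        path.append(node)
        for nxt in wait_graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # nxt is already on the current path, so we closed a cycle.
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle is not None:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(wait_graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle is not None:
                return cycle
    return None


def pick_victim(cycle: List[int], start_time: Dict[int, int]) -> int:
    """Pick the youngest transaction on the cycle (latest start time)."""
    return max(cycle, key=lambda txn: start_time[txn])
```

For example, with `wait_graph = {1: [2], 2: [3], 3: [1], 4: [1]}`, transactions 1, 2 and 3 wait on each other in a cycle while transaction 4 is only blocked behind it, so only one of 1, 2 or 3 would be picked for cancellation.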