mirror of https://github.com/citusdata/citus.git
fix 0x7f counter values after dropping a database, allow no params to reset udf
also comment the tests and mention potential fixes for some caveats in tech readme.
pull/7917/head
parent 8673bfc1b7
commit abfc9a4982
@@ -2721,13 +2721,19 @@ Citus keeps track of several stat counters and exposes them via the `citus_stat_counters` view.
 Details about the implementation and its caveats can be found in the header comment of [stat_counters.c](/src/backend/distributed/stat_counters.c). However, at a high level:
 1. We allocate a shared memory array of length `MaxBackends` so that each backend has its own counter slot, which reduces contention while incrementing the counters at runtime.
-2. We also allocate a shared hash, whose entries correspond to different databases,
-   Then, when a backend exits, it first aggregates its counters into the relevant entry in the shared hash, and then it resets its own counters because the same counter slot might be reused by another backend later.
+2. We also allocate a shared hash whose entries correspond to different databases. Then, when a backend exits, it first aggregates its counters into the relevant entry in the shared hash, and then resets its own counters because the same counter slot might be reused by another backend later.
+
+   Note that today we use the regular shared hash table API (`ShmemInitHash()`) for this, but we should consider switching to `dshash_table()` once using many databases with Citus becomes "practically" possible, because the performance of the regular shared hash table API is expected to degrade when the number of entries in the hash table gets large.
 3. So, when `citus_stat_counters` is queried, we first aggregate the counters from the shared memory array and then add them to the counters aggregated so far in the relevant shared hash entry for the database.
-   This means that if we weren't aggregating the counters into the shared hash when exiting, counters seen in `citus_stat_counters` could drift backwards in time. Note that `citus_stat_counters` might observe the counters for a backend twice, or perhaps miss them, if the backend was concurrently exiting. However, the next call to `citus_stat_counters` will see the correct values, so we can live with that for now.
+   This means that if we weren't aggregating the counters into the shared hash when exiting, counters seen in `citus_stat_counters` could drift backwards in time.
+
+   Note that `citus_stat_counters` might observe the counters for a backend twice, or perhaps miss them, if the backend was concurrently exiting. However, the next call to `citus_stat_counters` will see the correct values, so we can live with that for now. If one day we decide this is too undesirable, we could enforce blocking between `citus_stat_counters` queries and the saving of counters into the shared hash entry; however, that would also make exiting backends wait for any active `citus_stat_counters` queries to finish, so this needs to be considered carefully.
-4. Finally, when `citus_stat_counters_reset()` is called, we reset the shared hash entry for the relevant database and also reset the relevant slots in the shared memory array for the provided database. Note that there is a chance that `citus_stat_counters_reset()` might partially fail to reset the counters of a backend slot under some rare circumstances, but this should be very rare and we choose to ignore it for the sake of lock-free counter increments.
+4. Finally, when `citus_stat_counters_reset()` is called, we reset the shared hash entry for the relevant database and also reset the relevant slots in the shared memory array for the provided database.
+
+   Note that there is a chance that `citus_stat_counters_reset()` might partially fail to reset the counters of a backend slot under some rare circumstances, but this should be very rare and we choose to ignore it for the sake of lock-free counter increments.
+5. Also, today neither `citus_stat_counters` nor `citus_stat_counters_reset()` explicitly excludes the backend slots that belong to exited backends. Instead, they consider any "not unused" backend slot whose `PGPROC` points to a valid database oid, which doesn't guarantee that the slot is actively used. In practice this is not a problem for either operation, for the reasons mentioned in the relevant functions. However, if we ever decide that this slows these operations down, we can consider explicitly excluding exited backend slots by checking the `BackendData`'s `activeBackend` field.
-5. As of today, we don't persist stat counters on server shutdown. Although it seems quite straightforward to do so, we skipped it for v1. Once we decide to persist the counters, one can check the relevant functions we have for `citus_stat_statements`, namely `CitusQueryStatsShmemShutdown()` and `CitusQueryStatsShmemStartup()`. And since it has been quite a while since we wrote those two functions, we should also check `pgstat_write_statsfile()` and `pgstat_read_statsfile()` in Postgres to double-check whether we're missing anything (it seems we're missing a few things).
-6. Note that today we don't evict the entries of the said hash table that point to dropped databases, because the wrapper view filters them out anyway (thanks to the LEFT JOIN) and we don't expect a performance hit when reading from / writing to the hash table because of that, unless users have a lot of databases that are dropped and recreated frequently. If some day we decide this is a problem, we can let the Citus maintenance daemon do the eviction for us periodically.
+6. As of today, we don't persist stat counters on server shutdown. Although it seems quite straightforward to do so, we skipped it for v1. Once we decide to persist the counters, one can check the relevant functions we have for `citus_stat_statements`, namely `CitusQueryStatsShmemShutdown()` and `CitusQueryStatsShmemStartup()`. And since it has been quite a while since we wrote those two functions, we should also check `pgstat_write_statsfile()` and `pgstat_read_statsfile()` in Postgres to double-check whether we're missing anything (it seems we're missing a few things).
+7. Note that today we don't evict the entries of the said hash table that point to dropped databases, because the wrapper view filters them out anyway (thanks to the LEFT JOIN) and we don't expect a performance hit when reading from / writing to the hash table because of that, unless users have a lot of databases that are dropped and recreated frequently. If some day we decide this is a problem, we can let the Citus maintenance daemon do the eviction for us periodically.
+
+   The reason we don't just use a shared hash table for the counters is that doing a hash lookup on every increment could be more expensive. Plus, using a single counter slot for all backends connected to the same database could lead to contention, because that definitely requires taking a lock. In today's implementation, incrementing a counter doesn't require acquiring any sort of lock.
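To make the counter design in the readme hunk above concrete, here is a minimal single-file sketch of the slot-per-backend scheme. This is not the actual Citus shared-memory code: `CounterSlot`, `MAX_BACKENDS_SKETCH`, `NUM_COUNTERS`, and the function names are hypothetical stand-ins, and a plain static array stands in for the shared memory segment and the per-database shared hash.

#include <stdint.h>
#include <string.h>

#define MAX_BACKENDS_SKETCH 64  /* stand-in for MaxBackends */
#define NUM_COUNTERS 4          /* e.g. single-shard / multi-shard execution counters */

typedef struct CounterSlot
{
    uint32_t databaseId;        /* 0 (InvalidOid) means the slot is unused */
    uint64_t counters[NUM_COUNTERS];
} CounterSlot;

/* in Citus this array lives in shared memory, one slot per backend */
static CounterSlot slots[MAX_BACKENDS_SKETCH];

/* write path: a backend touches only its own slot, so no lock is taken */
static void
IncrementCounter(int myBackendSlot, int counterId)
{
    slots[myBackendSlot].counters[counterId]++;
}

/*
 * exit path: fold this backend's counters into the per-database totals
 * (the shared hash entry in Citus) and zero the slot, because another
 * backend may reuse the same slot later
 */
static void
SaveAndResetOnExit(int myBackendSlot, uint64_t *savedDbCounters)
{
    for (int c = 0; c < NUM_COUNTERS; c++)
    {
        savedDbCounters[c] += slots[myBackendSlot].counters[c];
    }
    memset(slots[myBackendSlot].counters, 0,
           sizeof(slots[myBackendSlot].counters));
}

/*
 * read path (citus_stat_counters): sum the live slots that belong to the
 * database, then add the totals saved by already-exited backends
 */
static uint64_t
AggregateCounter(uint32_t databaseId, int counterId,
                 const uint64_t *savedDbCounters)
{
    uint64_t total = savedDbCounters[counterId];
    for (int i = 0; i < MAX_BACKENDS_SKETCH; i++)
    {
        if (slots[i].databaseId == databaseId)
        {
            total += slots[i].counters[counterId];
        }
    }
    return total;
}

Because each backend writes only to its own slot, increments stay lock-free; the cost is that readers must scan all `MaxBackends` slots and tolerate the exit-time race the readme describes.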
@@ -1,8 +1,8 @@
-CREATE OR REPLACE FUNCTION pg_catalog.citus_stat_counters_reset(database_oid oid)
+CREATE OR REPLACE FUNCTION pg_catalog.citus_stat_counters_reset(database_oid oid DEFAULT 0)
 RETURNS VOID
 LANGUAGE C STRICT PARALLEL SAFE
 AS 'MODULE_PATHNAME', $$citus_stat_counters_reset$$;
-COMMENT ON FUNCTION pg_catalog.citus_stat_counters_reset(oid) IS 'Resets Citus stat counters for the given database OID.';
+COMMENT ON FUNCTION pg_catalog.citus_stat_counters_reset(oid) IS 'Resets Citus stat counters for the given database OID or for the current database if nothing or 0 is provided.';

 -- Rather than using an explicit superuser() check in the function, we use
 -- the GRANT system to REVOKE access to it when creating the extension.
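Given the `DEFAULT 0` added above and the `InvalidOid` handling in the C function further down, the following calls are equivalent ways to reset the counters of the current database (a usage sketch; the explicit-OID form is what the regression tests used before this commit):

-- explicit OID, as before
SELECT citus_stat_counters_reset(oid) FROM pg_database WHERE datname = current_database();

-- 0 is interpreted as "the current database"
SELECT citus_stat_counters_reset(0);

-- omitting the argument also works now, thanks to DEFAULT 0
SELECT citus_stat_counters_reset();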
@@ -192,6 +192,8 @@ static Size SharedBackendStatsSlotArrayShmemSize(void);
 /* helper functions for citus_stat_counters() */
 static void CollectActiveBackendStatsIntoHTAB(Oid databaseId, HTAB *databaseStats);
 static void CollectSavedBackendStatsIntoHTAB(Oid databaseId, HTAB *databaseStats);
+static DatabaseStatsHashEntry * DatabaseStatsHashEntryFindOrCreate(Oid databaseId,
+                                                                   HTAB *databaseStats);
 static void StoreDatabaseStatsIntoTupStore(HTAB *databaseStats,
                                            Tuplestorestate *tupleStore,
                                            TupleDesc tupleDescriptor);
@@ -201,8 +203,8 @@ static bool ResetActiveBackendStats(Oid databaseId);
 static void ResetSavedBackendStats(Oid databaseId, bool force);

 /* saved backend stats */
-static SavedBackendStatsHashEntry * SavedBackendStatsHashEntryAllocIfNotExists(Oid
-                                                                               databaseId);
+static SavedBackendStatsHashEntry * SavedBackendStatsHashEntryCreateIfNotExists(Oid
+                                                                                databaseId);


 /* sql exports */
@@ -313,6 +315,16 @@ citus_stat_counters_reset(PG_FUNCTION_ARGS)
     PG_ENSURE_ARGNOTNULL(0, "database_id");
     Oid databaseId = PG_GETARG_OID(0);

+    /*
+     * If the database id is InvalidOid, then we assume that
+     * the caller wants to reset the stat counters for the
+     * current database.
+     */
+    if (databaseId == InvalidOid)
+    {
+        databaseId = MyDatabaseId;
+    }
+
     /* just to be on the safe side */
     if (!EnsureStatCountersShmemInitDone())
     {
@@ -464,7 +476,7 @@ SaveBackendStatsIntoSavedBackendStatsHash(void)
     LWLockAcquire(*SharedSavedBackendStatsHashLock, LW_EXCLUSIVE);

     dbSavedBackendStatsEntry =
-        SavedBackendStatsHashEntryAllocIfNotExists(databaseId);
+        SavedBackendStatsHashEntryCreateIfNotExists(databaseId);

     LWLockRelease(*SharedSavedBackendStatsHashLock);
@@ -639,15 +651,8 @@ CollectActiveBackendStatsIntoHTAB(Oid databaseId, HTAB *databaseStats)
             continue;
         }

-        bool found = false;
-        DatabaseStatsHashEntry *dbStatsEntry = (DatabaseStatsHashEntry *)
-            hash_search(databaseStats, &procDatabaseId,
-                        HASH_ENTER, &found);
-        if (!found)
-        {
-            MemSet(dbStatsEntry->counters, 0, sizeof(StatCounters));
-            dbStatsEntry->resetTimestamp = 0;
-        }
+        DatabaseStatsHashEntry *dbStatsEntry =
+            DatabaseStatsHashEntryFindOrCreate(procDatabaseId, databaseStats);

         BackendStatsSlot *backendStatsSlot =
             &SharedBackendStatsSlotArray[backendSlotIdx];
@@ -689,9 +694,8 @@ CollectSavedBackendStatsIntoHTAB(Oid databaseId, HTAB *databaseStats)

     if (dbSavedBackendStatsEntry)
     {
-        DatabaseStatsHashEntry *dbStatsEntry = (DatabaseStatsHashEntry *)
-            hash_search(databaseStats, &databaseId,
-                        HASH_ENTER, NULL);
+        DatabaseStatsHashEntry *dbStatsEntry =
+            DatabaseStatsHashEntryFindOrCreate(databaseId, databaseStats);

         SpinLockAcquire(&dbSavedBackendStatsEntry->mutex);
@@ -715,11 +719,9 @@ CollectSavedBackendStatsIntoHTAB(Oid databaseId, HTAB *databaseStats)
     SavedBackendStatsHashEntry *dbSavedBackendStatsEntry = NULL;
     while ((dbSavedBackendStatsEntry = hash_seq_search(&hashSeqStatus)) != NULL)
     {
-        DatabaseStatsHashEntry *dbStatsEntry = (DatabaseStatsHashEntry *)
-            hash_search(databaseStats,
-                        &dbSavedBackendStatsEntry->databaseId, HASH_ENTER,
-                        NULL);
+        DatabaseStatsHashEntry *dbStatsEntry =
+            DatabaseStatsHashEntryFindOrCreate(dbSavedBackendStatsEntry->databaseId,
+                                               databaseStats);

         SpinLockAcquire(&dbSavedBackendStatsEntry->mutex);
@@ -740,6 +742,29 @@ CollectSavedBackendStatsIntoHTAB(Oid databaseId, HTAB *databaseStats)
 }


+/*
+ * DatabaseStatsHashEntryFindOrCreate creates a new entry in the databaseStats
+ * hash table for the given database id if it doesn't already exist and
+ * initializes it, or just returns the existing entry if it does.
+ */
+static DatabaseStatsHashEntry *
+DatabaseStatsHashEntryFindOrCreate(Oid databaseId, HTAB *databaseStats)
+{
+    bool found = false;
+    DatabaseStatsHashEntry *dbStatsEntry = (DatabaseStatsHashEntry *)
+        hash_search(databaseStats, &databaseId,
+                    HASH_ENTER, &found);
+
+    if (!found)
+    {
+        MemSet(dbStatsEntry->counters, 0, sizeof(StatCounters));
+        dbStatsEntry->resetTimestamp = 0;
+    }
+
+    return dbStatsEntry;
+}
+
+
 /*
  * StoreDatabaseStatsIntoTupStore stores the database stats from the
  * databaseStats hash table into the given tuple store.
@@ -866,7 +891,7 @@ ResetSavedBackendStats(Oid databaseId, bool force)
     LWLockAcquire(*SharedSavedBackendStatsHashLock, LW_EXCLUSIVE);

     dbSavedBackendStatsEntry =
-        SavedBackendStatsHashEntryAllocIfNotExists(databaseId);
+        SavedBackendStatsHashEntryCreateIfNotExists(databaseId);

     LWLockRelease(*SharedSavedBackendStatsHashLock);
@@ -906,9 +931,9 @@ ResetSavedBackendStats(Oid databaseId, bool force)


 /*
- * SavedBackendStatsHashEntryAllocIfNotExists allocates a new entry in the
+ * SavedBackendStatsHashEntryCreateIfNotExists creates a new entry in the
  * saved backend stats hash table for the given database id if it doesn't
- * already exist.
+ * already exist and initializes it.
  *
  * Assumes that the caller has exclusive access to the hash table since it
  * performs HASH_ENTER_NULL.
@@ -917,7 +942,7 @@ ResetSavedBackendStats(Oid databaseId, bool force)
  * we're out of (shared) memory.
  */
 static SavedBackendStatsHashEntry *
-SavedBackendStatsHashEntryAllocIfNotExists(Oid databaseId)
+SavedBackendStatsHashEntryCreateIfNotExists(Oid databaseId)
 {
     bool found = false;
     SavedBackendStatsHashEntry *dbSavedBackendStatsEntry =
@@ -113,7 +113,9 @@ SELECT stats_reset IS NOT NULL FROM citus_stat_counters WHERE name = current_database();
 (1 row)

 -- multi_1_schedule has this test on an individual line, so there cannot be any other backends
--- that can update the stat counters other than us.
+-- that can update the stat counters other than us, except the Citus Maintenance Daemon. But
+-- the Citus Maintenance Daemon is not supposed to update the query related stats, so we can
+-- still expect the query related stats to be 0.
 --
 -- So, no one could have incremented query related stats so far.
 SELECT query_execution_single_shard = 0, query_execution_multi_shard = 0 FROM citus_stat_counters;
@@ -180,7 +182,8 @@ SELECT query_execution_single_shard = 1 FROM (SELECT (citus_stat_counters(oid)).* FROM pg_database WHERE datname = current_database()) q;
 t
 (1 row)

-SELECT citus_stat_counters_reset(oid) FROM pg_database WHERE datname = current_database();
+-- reset the stat counters for the current database by providing nothing to citus_stat_counters_reset()
+SELECT citus_stat_counters_reset();
 citus_stat_counters_reset
 ---------------------------------------------------------------------

@@ -324,7 +327,8 @@ SET citus.enable_stat_counters TO true;
 -- make sure that the GUC is disabled
 SET citus.enable_stat_counters TO false;
 SET client_min_messages TO NOTICE;
-SELECT citus_stat_counters_reset(oid) FROM pg_database WHERE datname = current_database();
+-- reset the stat counters for the current database by providing 0 to citus_stat_counters_reset()
+SELECT citus_stat_counters_reset(0);
 citus_stat_counters_reset
 ---------------------------------------------------------------------

@@ -45,7 +45,7 @@ test: single_shard_table_udfs
 test: schema_based_sharding
 test: citus_schema_distribute_undistribute
 # Don't parallelize stat_counters with others because we don't want statistics
-# to be updated by other tests concurrently.
+# to be updated by other tests concurrently, except the Citus Maintenance Daemon.
 #
 # Also, this needs to be the first test that calls citus_stat_counters()
 # because it checks the value of the stats_reset column before calling the function.
@@ -61,7 +61,9 @@ SELECT citus_stat_counters_reset(oid) FROM pg_database WHERE datname = current_database();
 SELECT stats_reset IS NOT NULL FROM citus_stat_counters WHERE name = current_database();

 -- multi_1_schedule has this test on an individual line, so there cannot be any other backends
--- that can update the stat counters other than us.
+-- that can update the stat counters other than us, except the Citus Maintenance Daemon. But
+-- the Citus Maintenance Daemon is not supposed to update the query related stats, so we can
+-- still expect the query related stats to be 0.
 --
 -- So, no one could have incremented query related stats so far.
 SELECT query_execution_single_shard = 0, query_execution_multi_shard = 0 FROM citus_stat_counters;
@@ -91,7 +93,8 @@ SET citus.enable_stat_counters TO true;
 SELECT * FROM dist_table WHERE a = 1;
 SELECT query_execution_single_shard = 1 FROM (SELECT (citus_stat_counters(oid)).* FROM pg_database WHERE datname = current_database()) q;

-SELECT citus_stat_counters_reset(oid) FROM pg_database WHERE datname = current_database();
+-- reset the stat counters for the current database by providing nothing to citus_stat_counters_reset()
+SELECT citus_stat_counters_reset();

 SELECT query_execution_single_shard = 0 FROM (SELECT (citus_stat_counters(oid)).* FROM pg_database WHERE datname = current_database()) q;
@@ -167,7 +170,8 @@ SET citus.enable_stat_counters TO false;

 SET client_min_messages TO NOTICE;

-SELECT citus_stat_counters_reset(oid) FROM pg_database WHERE datname = current_database();
+-- reset the stat counters for the current database by providing 0 to citus_stat_counters_reset()
+SELECT citus_stat_counters_reset(0);

 -- No one could have incremented query related stats and connection_reused so far.
 SELECT query_execution_single_shard = 0, query_execution_multi_shard = 0, connection_reused = 0 FROM citus_stat_counters WHERE name = current_database();