The deadlock scenario is described below:
1. pgss_store is called and acquires the lock pgss->lock.
2. An error occurs, usually out of memory while accessing the internal
hash tables used to store internal data, in the functions
pgss_store_query_info and pgss_get_entry.
3. elog() is called to report the out-of-memory error.
4. Our pgsm_emit_log_hook is called; it calls pgss_store_error, which in
turn calls pgss_store.
5. pgss_store tries to acquire the already-held lock pgss->lock, and a
deadlock happens.
To fix the problem, this commit makes two modifications worth mentioning:
1. We now pass the HASH_ENTER_NULL flag to hash_search instead of
HASH_ENTER. According to the PostgreSQL sources, this prevents the
function from reporting an error when out of memory; it returns NULL
instead, so we can handle the error ourselves.
2. In pgss_store, if an error happens after pgss->lock is acquired, we
only set a flag. After releasing the lock, we check whether the flag is
set and report the error accordingly.
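The "set a flag, report after unlock" pattern can be sketched roughly as
follows. This is a standalone illustration, not the code from this commit:
lock_acquire, lock_release, table_insert, report_error and store_entry are
hypothetical stand-ins for the LWLock calls, hash_search with
HASH_ENTER_NULL, elog() and pgss_store respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for pgss->lock: a non-reentrant lock as a flag. */
static bool lock_held = false;
static int errors_reported = 0;

static void lock_acquire(void)
{
    /* Acquiring a lock we already hold is the deadlock described above. */
    assert(!lock_held);
    lock_held = true;
}

static void lock_release(void)
{
    lock_held = false;
}

/*
 * Hypothetical stand-in for hash_search(..., HASH_ENTER_NULL, ...):
 * returns NULL on out of memory instead of raising an error.
 */
static void *table_insert(bool simulate_oom)
{
    static int slot;
    return simulate_oom ? NULL : (void *) &slot;
}

/*
 * Hypothetical stand-in for elog(). The log hook may re-enter the store
 * function, so this must never run while the lock is held.
 */
static void report_error(const char *msg)
{
    assert(!lock_held);
    errors_reported++;
    fprintf(stderr, "WARNING: %s\n", msg);
}

static bool store_entry(bool simulate_oom)
{
    bool out_of_memory = false;

    lock_acquire();
    if (table_insert(simulate_oom) == NULL)
        out_of_memory = true;   /* only remember the failure */
    lock_release();

    /* Report only after the lock has been released. */
    if (out_of_memory)
    {
        report_error("out of memory");
        return false;
    }
    return true;
}
```

Calling store_entry(true) returns false and reports one warning, and the
lock is no longer held at the time of the report.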
Added a new view 'pg_stat_monitor_hook_stats' that provides execution
time statistics for all hooks installed by the module. The fields are
described below:
- hook: The hook function name.
- min_time: The fastest execution time recorded for the given hook.
- max_time: The slowest execution time recorded for the given hook.
- total_time: Total execution time taken by all calls to the hook.
- avg_time: Average execution time of a call to the hook.
- ncalls: Total number of calls to the hook.
- load_comparison: The percentage of time taken by an individual hook
compared to every other hook.
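As a rough illustration of how these fields relate to each other, here is
a standalone sketch. The HookStats struct and the hook_stats_* helpers are
hypothetical names, not the module's code, and load_comparison is assumed
here to be a hook's share of the total time summed over all hooks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-hook accumulator mirroring the view's columns. */
typedef struct HookStats
{
    const char   *hook;
    double        min_time;
    double        max_time;
    double        total_time;
    unsigned long ncalls;
} HookStats;

/* Fold one measured execution time into the accumulator. */
static void hook_stats_record(HookStats *s, double elapsed)
{
    if (s->ncalls == 0 || elapsed < s->min_time)
        s->min_time = elapsed;
    if (s->ncalls == 0 || elapsed > s->max_time)
        s->max_time = elapsed;
    s->total_time += elapsed;
    s->ncalls++;
}

/* avg_time: total_time divided by ncalls (0 if never called). */
static double hook_stats_avg(const HookStats *s)
{
    return s->ncalls ? s->total_time / (double) s->ncalls : 0.0;
}

/*
 * load_comparison (assumed definition): percentage of the summed
 * total_time of all hooks that was spent in hook i.
 */
static double hook_stats_load(const HookStats *all, size_t n, size_t i)
{
    double sum = 0.0;
    size_t k;

    assert(i < n);
    for (k = 0; k < n; k++)
        sum += all[k].total_time;
    return sum > 0.0 ? 100.0 * all[i].total_time / sum : 0.0;
}
```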
To enable benchmarking, the code must be compiled with the -DBENCHMARK
flag. This replaces each hook function with a function of the same name
plus a '_benchmark' suffix, e.g. hook_function_benchmark.
The hook_function_benchmark function calls the original function and
measures how long it took to execute, then updates the statistics for
that hook.
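The wrapper idea can be sketched as below. This is only an illustration
under assumptions: hook_function, its int signature, installed_hook and
the two counters are hypothetical, and the standard C clock() (CPU time)
is used here for portability rather than whatever timer the module uses.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical per-hook statistics. */
static double hook_total_time = 0.0;
static unsigned long hook_ncalls = 0;

/* Hypothetical original hook. */
static int hook_function(int x)
{
    return x * 2;
}

/* Wrapper: call the original hook, time it, update the statistics. */
static int hook_function_benchmark(int x)
{
    clock_t start;
    clock_t end;
    int     result;

    start = clock();
    result = hook_function(x);
    end = clock();

    assert(start != (clock_t) -1 && end != (clock_t) -1);
    hook_total_time += (double) (end - start) / CLOCKS_PER_SEC;
    hook_ncalls++;
    return result;
}

/* With -DBENCHMARK the wrapper is installed instead of the original. */
#ifdef BENCHMARK
#define INSTALLED_HOOK hook_function_benchmark
#else
#define INSTALLED_HOOK hook_function
#endif

static int (*installed_hook)(int) = INSTALLED_HOOK;
```

The caller only goes through installed_hook, so the benchmark build needs
no changes at the call sites.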
Add application name to the key used to identify queries in the hash
table. This allows different applications to have separate entries in
the pg_stat_monitor view when they issue the same query.
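A minimal sketch of the effect on the key, assuming a simplified layout:
EntryKey, APPNAME_LEN, key_init and key_equal are hypothetical, and the
real key contains more fields than shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define APPNAME_LEN 64   /* hypothetical fixed size for the name */

/* Simplified, hypothetical hash key: query id plus application name. */
typedef struct EntryKey
{
    uint64_t queryid;
    char     application_name[APPNAME_LEN];
} EntryKey;

static void key_init(EntryKey *key, uint64_t queryid,
                     const char *application_name)
{
    assert(key != NULL && application_name != NULL);

    /* Zero everything first so a bytewise comparison is well defined. */
    memset(key, 0, sizeof(*key));
    key->queryid = queryid;
    strncpy(key->application_name, application_name, APPNAME_LEN - 1);
}

/* Keys are compared bytewise, as a binary-key hash table would do. */
static bool key_equal(const EntryKey *a, const EntryKey *b)
{
    return memcmp(a, b, sizeof(EntryKey)) == 0;
}
```

With this key, the same query id issued from "psql" and from "pgbench"
yields two distinct keys and therefore two separate entries.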
Jira: PG-141
There is a lock conflict, so LW_EXCLUSIVE is used instead of LW_SHARED.
This needs to be investigated again to check whether a shared lock can be
used.