bgworkers with backend connection should handle SIGTERM properly (#6552)

Fixes task executor SIGTERM handling.

Problem:
When task executors receive SIGTERM, their default handler
`bgworker_die`, which is set at worker startup, logs a FATAL error. The
handler does not release locks before logging the error, which
sometimes causes the monitor to hang, e.g. the monitor waits forever
for a lock at pg_stat flush after calling proc_exit.

Solution:
Because executors have a backend connection, they should handle SIGTERM
the way normal backends do. Normal backends use the `die` handler, which
sets the ProcDiePending flag; the next CHECK_FOR_INTERRUPTS call then
handles it gracefully by releasing any locks before termination.
hotfix/6440
aykut-bozkurt 2022-12-12 16:44:36 +03:00 committed by GitHub
parent f6b8990fc7
commit 3da6e3e743
3 changed files with 10 additions and 5 deletions

@@ -1613,6 +1613,8 @@ CitusBackgroundJobExecutorErrorCallback(void *arg)
 void
 CitusBackgroundTaskExecutor(Datum main_arg)
 {
+	/* handles SIGTERM similar to backends */
+	pqsignal(SIGTERM, die);
 	BackgroundWorkerUnblockSignals();
 	/* Set up a dynamic shared memory segment. */

@@ -304,3 +304,6 @@ s/LOG: duration: [0-9].[0-9]+ ms/LOG: duration: xxxx ms/g
 s/"Total Cost": [0-9].[0-9]+/"Total Cost": xxxx/g
 s/(NOTICE: issuing SET LOCAL application_name TO 'citus_rebalancer gpid=)[0-9]+/\1xxxxx/g
+# PG13 changes bgworker sigterm message, we can drop that line with PG13 drop
+s/(FATAL: terminating).*Citus Background Task Queue Executor.*(due to administrator command)\+/\1 connection \2 \+/g

@@ -454,13 +454,13 @@ SELECT pg_sleep(2); -- wait enough to show that tasks are terminated
 SELECT task_id, status, retry_count, message FROM pg_dist_background_task
 WHERE task_id IN (:task_id1, :task_id2)
 ORDER BY task_id; -- show that all tasks are runnable by retry policy after termination signal
-task_id | status | retry_count | message
+task_id | status | retry_count | message
 ---------------------------------------------------------------------
-21 | runnable | 1 | FATAL: terminating background worker "Citus Background Task Queue Executor: regression/postgres for (13/21)" due to administrator command+
-| | | CONTEXT: Citus Background Task Queue Executor: regression/postgres for (13/21) +
+21 | runnable | 1 | FATAL: terminating connection due to administrator command +
+| | | CONTEXT: Citus Background Task Queue Executor: regression/postgres for (13/21)+
 | | |
-22 | runnable | 1 | FATAL: terminating background worker "Citus Background Task Queue Executor: regression/postgres for (14/22)" due to administrator command+
-| | | CONTEXT: Citus Background Task Queue Executor: regression/postgres for (14/22) +
+22 | runnable | 1 | FATAL: terminating connection due to administrator command +
+| | | CONTEXT: Citus Background Task Queue Executor: regression/postgres for (14/22)+
 | | |
 (2 rows)