pg_stat_monitor

mirror of https://github.com/percona/pg_stat_monitor.git synced 2026-02-04 22:16:20 +00:00

Author	SHA1	Message	Date
Naeem Akhter	8647a52856	PG-292: Automate the Q/A and implement tap testcases. This commit brings the following changes: 1) Implementation of tap based testing using perl language for different scenarios that could not be covered under traditional SQL based diff testing, or require server start/shutdown. At this point of time, tap testing is only enabled for Postgres 14 & 13, for rest of back branches it will be done at a later time as there is substantial change in number of columns and their names. 2) Changes to github action workflows for Postgres 14 & 13 to accommodate the requirements for tap testing. 3) Similarly, minor changes to Makefile are also done. 4) Testing of supported GUCs using tap tests for different possible configuration. 5) pg_stat_monitor_reset_errors testing using the tap testcases. 6) Insufficient shared space and buffer overflow testing via tap testcases. 7) Some sql scripts under 'scripts' folder to generate some work load required for tap test cases. 8) Everything under 't' folder is specific to perl based test cases. It houses perl files and folders for some expected files and result folder. 9) 90%+ code coverage for LOC and functions. 10) PG-339 Fix by diego, change in pgsm_errors.c.	2022-01-27 23:49:38 +05:00
Ibrar Ahmed	363f4ab2bd	Merge pull request #175 from darkfronza/PG-338_fix_query_call_count PG-338: Fix query call count	2022-01-24 18:45:30 +05:00
Ibrar Ahmed	9ebf66ec04	Merge pull request #177 from darkfronza/PG-311_fix_cumulative_cpu_system_time PG-311: Fix cpu_time and system_time accumulation.	2022-01-24 18:44:39 +05:00
Diego Fronza	868551c1b6	PG-311: Fix cpu_time and system_time accumulation. Fixed a small typo introduced by commit `eaa9c08e24`: PG-316: Convert blocks timings to average per buckets. The e->counters.sysinfo.utime = and e->counters.sysinfo.stime = operations should be += to accumulate the results and calculate the average correctly.	2022-01-21 16:07:28 -03:00
Diego Fronza	52fb1fbc7d	PG-338: Fix counters regression test. After fixing the problem with utility statements, this whole block: do $$ declare n integer:= 1; begin loop PERFORM a,b,c,d FROM t1, t2, t3, t4 WHERE t1.a = t2.b AND t3.c = t4.d ORDER BY a; exit when n = 1000; n := n + 1; end loop; end $$; Is only processed once, as those are nested statements, in order to match the 1000 statements the GUC pg_stat_monitor.track must be set to 'all' and then back to the default of 'top' when done testing it.	2022-01-21 13:44:05 -03:00
Diego Fronza	2c14e1dc08	PG-338: Fix query call count (utilities). There was a missing increment/decrement to exec_nested_level in pgss_ProcessUtility hook, due to this, some utility statements could end up being processed more than once, as PostgreSQL may recurse into this hook for sub-statements or when processing a query string containing multiple semicolon-separated statements.	2022-01-21 13:35:14 -03:00
Ibrar Ahmed	74780a30af	Merge pull request #174 from nastena1606/PG-287-Doc-delete-pgsm PG-287 Added uninstall steps to README	2022-01-21 02:59:38 +05:00
Anastasia Alexadrova	0470546069	PG-287 Added uninstall steps to README modified: README.md	2022-01-20 11:12:46 +02:00
Ibrar Ahmed	37b05decda	Merge pull request #172 from Naeem-Akhter/sql_new_testcases PG-267: Add testcase to test histogram, unique application name and insert error.	2022-01-20 01:39:49 +05:00
Ibrar Ahmed	e6f89725e2	Merge pull request #173 from LenzGr/update-forum docs: Updated Forum URL, removed Discord link	2022-01-20 01:39:32 +05:00
Lenz Grimmer	63abff2fe5	docs: Updated Forum URL, removed Discord link Updated the Forum URL to point to the dedicated pg_stat_monitor forum in `README.md` and `CONTRIBUTING.md`, removed link to the Discord channel to ensure that conversations are focused to one location. Updated the table of contents in the README. Signed-off-by: Lenz Grimmer <lenz.grimmer@percona.com>	2022-01-19 15:35:35 +01:00
Naeem Akhter	433472ef12	PG-267 Add testcase to test histogram. This commit adds following three sql based testcases: 1) Test unique application name set by user. 2) Histogram function is working properly as desired. 3) Error on insert is shown with proper message.	2022-01-19 18:04:02 +05:00
Ibrar Ahmed	26000f40b6	Merge pull request #170 from darkfronza/PG-329_fix_errors_view_sql_master PG-329: Fix creation of pg_stat_monitor_errors view on SQL files.	2022-01-18 19:30:40 +05:00
Ibrar Ahmed	86b098f0f2	Merge pull request #171 from nastena1606/PG-308-Readme-link-to-view-doc PG-308 README Updates	2022-01-18 19:30:22 +05:00
Diego Fronza	e38d79365b	PG-329: Fix creation of pg_stat_monitor_errors view on SQL files. After the split into multiple pg_stat_monitor--1.0.XX.sql.in sql files, where XX is the PostgreSQL version, the errors view was not added to the relevant files; this commit fixes that.	2022-01-17 15:29:40 -03:00
Ibrar Ahmed	09e0f26511	Merge pull request #169 from nastena1606/PG-298-Readme-add-report-bug PG-298 Add 'Report a bug' section to readme	2022-01-13 22:11:31 +05:00
Ibrar Ahmed	cc8506233d	Merge pull request #165 from darkfronza/PG-296_fix_application_name_master PG-296: Fix application name.	2022-01-13 22:09:20 +05:00
Ibrar Ahmed	eb9e1c1d49	Merge pull request #167 from darkfronza/PG-325_fix_deadlock_master PG-325: Fix deadlock.	2022-01-13 22:08:29 +05:00
Anastasia Alexadrova	1ad5a06d46	PG-308 README Updates Added a link to pg_stat_monitor view reference in the Overview Added PG 14 to supported versions Updated links in the Documentation section modified: README.md	2022-01-13 16:47:00 +02:00
Anastasia Alexadrova	dcfde045a0	PG-298 Add 'Report a bug' section to readme modified: README.md	2022-01-13 10:53:49 +02:00
Diego Fronza	d25dcadadd	PG-325: Fix deadlock. If a query exceeds pg_stat_monitor.pgsm_query_max_len, then it's truncated before we save it into the query buffer (SaveQueryText). When reading the query back, on pg_stat_monitor_internal, we allocate a buffer for the query with length = pg_stat_monitor.pgsm_query_max_len. The problem is that the read_query function adds a '\0' to the end of the buffer when reading a query, thus if a query has been truncated, for example, to 1024, when reading it back read_query will store the '\0' at position 1025, an out of array bounds position. Then, when we call pfree to release the buffer, PostgreSQL notices the buffer overrun and triggers an error assertion. The assertion calls our error hook, which attempts to acquire the shared pgss->lock before pg_stat_monitor_internal has released it, leading to a deadlock. To avoid the problem we add 1 more byte for the extra '\0' during the palloc call for query_text and parent_query_text. Also, we release the lock before calling pfree, so that if PostgreSQL finds a problem in pfree we won't deadlock again and the error is reported correctly.	2022-01-12 17:23:02 -03:00
Diego Fronza	cb554f78dd	PG-296: Fix application name. If a backend would change the application name during execution, pg_stat_monitor would then fail to read the updated value, as it caches the result in order to avoid calling the expensive functions pgstat_fetch_stat_numbackends() and pgstat_fetch_stat_local_beentry(). A workaround was found, we can just read an exported GUC from PostgreSQL backend itself, namely application_name, from utils/guc.h, thus saving us from having to call those expensive functions.	2022-01-12 14:06:50 -03:00
Vadim Yalovets	b1c54b877b	Merge pull request #164 from adivinho/DISTPG-353-pg_stat_monitor-PG-Debian-Requirement-the-Description-is-overflowing DISTPG-353 PG Debian Requirement: the Description is overflowing	2022-01-07 14:55:49 +02:00
Vadim Yalovets	bb2cfa3c59	DISTPG-353 PG Debian Requirement: the Description is overflowing	2022-01-07 14:45:29 +02:00
Ibrar Ahmed	caeb825619	Merge pull request #162 from darkfronza/PG-320_remove_state_column_master PG-320: Removal of state columns from pgsm view.	2022-01-06 21:08:54 +05:00
Diego Fronza	8a94129848	PG-320: Removal of state columns from pgsm view. We are concerned with finished queries in pg_stat_monitor, so state and state_code columns were removed from the pg_stat_monitor_view as we only list finished queries on it. Removed state regression as it is not necessary anymore. Also, this allowed us to remove the call to pgss_store on ExecutorStart, which just updated query state to EXEC, thus saving some CPU.	2022-01-06 12:19:05 -03:00
Ibrar Ahmed	eaa9c08e24	PG-316: Convert blocks timings to average per buckets.	2022-01-05 19:49:47 +00:00
Ibrar Ahmed	63908af3f3	PG-311: Aggregate the cpu_user_time, cpu_sys_time columns per bucket.	2022-01-05 16:48:04 +00:00
Ibrar Ahmed	51ad08c42f	PG-306: Converted bucket_start_time datatype from TEXT to TIMESTAMP.	2022-01-05 16:41:17 +00:00
Vadim Yalovets	9a35b5d1c0	Merge pull request #145 from adivinho/DISTPG-349-pg_stat_monitor-Remove-dependency-on-Percona-PostgreSQL DISTPG-349 remove dependency on Percona PostgreSQL	2022-01-04 18:23:49 +02:00
Ibrar Ahmed	c05335a326	Merge pull request #159 from darkfronza/PG-299_fix_conflicts PG-299: Fix conflicts between devel and master.	2022-01-04 02:03:01 +05:00
Diego Fronza	78c97088cf	PG-299: Fix conflicts between devel and master. Updated sql files (pg_stat_monitor_settings view). Using right variable name and level checking on pgss_store: key.toplevel = ((exec_nested_level + plan_nested_level) == 0);	2022-01-03 11:01:01 -03:00
Ibrar Ahmed	d3fe5edc80	Merge branch 'darkfronza-devel' into devel	2021-12-30 19:33:39 +00:00
Diego Fronza	eb4087be4e	PG-291: Fix query call count. The issue with wrong query call count was taking place during transition to a new bucket, the process is briefly described below: 1. Scan for pending queries in previous bucket. 2. Add pending queries to the new bucket id. 3. Remove pending queries from previous bucket id. The problem is that when switching to a new bucket, we reset query statistics for a given entry being processed, so, for example, if the pending query had a call count of 10 (9 of which were finished, 10th is the pending one), if we move this query to the new bucket, the entry will have its stats reset, clearing the query call count to zero. To solve the problem, whenever a pending query is detected, if the entry has a call count > 1, we mark it as finished, and don't remove it from the previous bucket in order to keep its statistics, then we move just the pending query (10th in the example) to the new bucket id. Another issue is that when moving an entry to a new bucket, we missed copying the query position from the previous entry, which is used to locate the query text in the query buffer: hash_entry_dealloc():291 new_entry->query_pos = old_entry->query_pos;	2021-12-30 09:49:32 -03:00
Diego Fronza	57839c7664	PG-295: Fix top_query regression test. The issue is that between changing GUC "track" from track='top' to track='all' the queries are executing using previous state of track='top', to fix that we sleep 1 second after calling pg_reload_conf() to ensure that queries will run with new settings.	2021-12-30 09:49:32 -03:00
Diego Fronza	fd1691626c	PG-293: Disable pgsm_track_planning. This GUC must be disabled by default, it incurs a small performance penalty in the PostgreSQL TPS, users can enable it at any time if they wish to.	2021-12-30 09:49:32 -03:00
Diego Fronza	a702f24465	PG-293: Update regression tests (extract_comments). guc: Add the new GUC variable to the output. tags: Handle both cases, enable/disable extracting query comments.	2021-12-30 09:49:32 -03:00
Diego Fronza	6042795930	PG-293: Add pg_stat_monitor.extract_comments GUC. This new GUC allows the user to enable/disable extracting query comments.	2021-12-30 09:49:32 -03:00
Diego Fronza	30a328f381	PG-293: Update regression tests. cmd_type: Added missing DROP TABLE t2; guc: Adjusted to match the updated settings view, which now display boolean values as 'yes' and 'no', also added the 'options' column to the output. guc_1: Handle PostgreSQL versions <= 12 which don't have the track_planning feature. rows.out: Added missing DROP TABLE t2. Also removed the line 'ERROR: relation "t2" already exists' since we fixed the problem in cmd_type regression. top_query: Handling both track = 'top' and track = 'all' cases. top_query_1: On PostgreSQL >= 14 the sub query from the procedure is stored as (select $1 + $2), whereas on PG <= 13 it is stored as SELECT (select $1 + $2).	2021-12-30 09:49:30 -03:00
Diego Fronza	b6838049b6	PG-293: Add pg_stat_monitor.track GUC. This new GUC allows users to select which statements are tracked by pg_stat_monitor: - 'top': Default, track only top level queries. - 'all': Track top along with sub/nested queries. - 'none': Disable query monitoring. To avoid redundancy, now users disable pg_stat_monitor by setting pg_stat_monitor.track = 'none', similar to pg_stat_statements. This new GUC is an enumeration, so the pg_stat_monitor_settings view was adjusted to add a new column 'options' which lists the possible values for the field. The "value" and "default_value" columns in the pg_stat_monitor_settings view were adjusted to be of type text, so we can better display the enumeration values. Also the boolean types now have their values displayed as either 'yes' or 'no' to easily distinguish them from the integer types.	2021-12-30 09:48:27 -03:00
Diego Fronza	6f7f44b744	PG-286: Fix deadlock. Can't call elog() function from inside the pgsm_log as the pgss_hash lock could be already acquired in exclusive mode, since elog() triggers the psmg_emit_log hook, when it calls pgss_store it will try to acquire the pgss_hash lock again, leading to a deadlock.	2021-12-30 09:48:27 -03:00
Diego Fronza	b702145ac3	PG-286: Update regression tests. As the query normalization and query cleaning is always done in the right place (pgss_store), no more parsed queries have a trailing semicolon ';' at the end. Also, on error regression test, after fixing some problems with utility related queries, we now have two entries for the RAISE WARNING case, the first entry is the utility query itself, the second entry is the error message logged by emit_log_hook. Some queries have the order adjusted due to the fix introduced by the previous commits.	2021-12-30 09:48:27 -03:00
Diego Fronza	007445a0d5	PG-286: Several improvements. This commit introduces several improvements: 1. Removal of pgss_store_query and pgss_store_utility functions: To store a query, we just use pgss_store(), this makes the code more uniform. 2. Always pass the query length to the pgss_store function using parse state from PostgreSQL to avoid calculating query length again. 3. Always clean the query (extra spaces, update query location) in pgss_store. 4. Normalize queries right before adding them to the query buffer, but only if user asked for query normalization. 5. Correctly handle utility queries among different PostgreSQL versions: - A word about how utility functions are handled on PG 13 and later versions: - On PostgreSQL <= 13, we have to compute a query ID, on later versions we can call EnableQueryId() to inform Postmaster we want to enable query ID computation. - On PostgreSQL <= 13, post_parse hook is called after process utility hook, on PostgreSQL >= 14, post_parse hook is called before process utility functions. - Based on that information, on PostgreSQL <= 13 / process utility, we pass 0 as queryid to the pgss_store function, then we calculate a queryid after cleaning the query (CleanQueryText) using pgss_hash_string. - On PostgreSQL 14 onward, post_parse() is called before pgss_ProcessUtility, we Clear queryId for prepared statements related utility, on process utility hook, we save the query ID for passing it to the pgss_store function, but mark the query ID with zero to avoid instrumenting it again on executor hooks.	2021-12-30 09:48:26 -03:00
Diego Fronza	a071516a0f	PG-286: Check for NULL return on hash_search before using object. Check if hash_search() function returns NULL before attempting to use the object in hash_entry_alloc().	2021-12-30 09:47:06 -03:00
Diego Fronza	59c321ebc5	PG-286: Reduce calls to pgstat_fetch_stat_numbackends(). After a couple of CPU profiling sessions with perf, it was detected that the function pgstat_fetch_stat_numbackends() is very expensive, reading the implementation on PostgreSQL's backend_status.c just confirmed that. We use that function on pg_stat_monitor to retrieve the application name and IP address of the client, we now cache the results in order to avoid calling it for every query being processed.	2021-12-30 09:47:06 -03:00
Diego Fronza	d32dea0daa	PG-286: Fix query buffer overflow management. If pgsm_overflow_target is ON (default, 1) and the query buffer overflows, we now dump the buffer and keep track of how many times pg_stat_monitor changed bucket since then. If an overflow happens again before pg_stat_monitor cycles through pgsm_max_buckets buckets (default 10), then we don't dump the buffer again, but instead report an error, this ensures that only one dump file of size pgsm_query_shared_buffer will be on disk at any time, avoiding slowing down queries to the pg_stat_monitor view. As soon as pg_stat_monitor cycles through all buckets, we remove the dump file and reset the counter (pgss->n_bucket_cycles).	2021-12-30 09:47:06 -03:00
Diego Fronza	1b51defc68	PG-286: Small performance improvements. pgss_ExecutorEnd: Avoid unnecessary memset(plan_info, 0, ...). We only use this object if the condition below is true, in which case we already initialize all the fields in the object, also we now store the plan string length (plan_info.plan_len) to avoid calling strlen on it again later: if (queryDesc->operation == CMD_SELECT && PGSM_QUERY_PLAN) { ... here we initialize plan_info If the condition is false, then we pass a NULL PlanInfo* to the pgss_store to avoid more unnecessary processing. pgss_planner_hook: Similar, avoid memset(plan_info, 0, ...) this object is not used here, so we pass NULL to pgss_store. pg_get_application_name: Remove call to strlen, snprintf already gives us the calculated string length, so we just return it. pg_get_client_addr: Cache localhost, avoid calling ntohl(inet_addr("127.0.0.1")) all the time. pgss_update_entry: Make use of PlanInfo->plan_len, avoiding a call to strlen again. intarray_get_datum: Init the string by setting the first byte to '\0'.	2021-12-30 09:47:06 -03:00
Diego Fronza	ad1187b9da	PG-286: Avoid duplicate queries in text buffer. The memory area reserved for query text (pgsm_query_shared_buffer) was divided evenly for each bucket, this allowed the same query, e.g. "SELECT 1", to be duplicated in different buckets, thus wasting space. This commit fixes the query text duplication by adding a new hash table whose only purpose is to verify if a given query is already added to the buffer (by using the queryID). This allows different buckets that share the same query to point to a unique entry in the query buffer (pgss_qbuf). When pg_stat_monitor moves to a new bucket id, by avoiding adding a query that already exists in the buffer it can also save some CPU time.	2021-12-30 09:47:04 -03:00
Diego Fronza	8ea02b0f2a	PG-228: Fix hash table creation flags on PG <= 13. Before PostgreSQL 14, HASH_STRINGS flag was not available when creating a hash table with ShmemInitHash(). Use HASH_BLOBS for previous PostgreSQL versions.	2021-12-30 09:46:15 -03:00
Diego Fronza	6f353a5596	PG-228: Add severity to the internal message logging API. Add support to include the severity of messages added to the pg_stat_monitor_errors view.	2021-12-30 09:46:15 -03:00
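
The off-by-one behind PG-325 (commit d25dcadadd above) can be illustrated with a minimal sketch in plain C. This is not the extension's code: `copy_query_text` is a hypothetical helper, and `malloc` stands in for PostgreSQL's `palloc`. It only shows why a buffer that holds up to `max_len` bytes of query text needs `max_len + 1` bytes once a terminating `'\0'` is appended.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the read path described in PG-325:
 * copy at most max_len bytes of a stored query into a new buffer.
 * The buffer is max_len + 1 bytes so the terminator always fits,
 * even when the stored text was truncated to exactly max_len bytes.
 */
static char *
copy_query_text(const char *stored, size_t stored_len, size_t max_len)
{
    size_t  n = stored_len < max_len ? stored_len : max_len;
    char   *buf = malloc(max_len + 1);  /* +1 for the trailing '\0' */

    if (buf == NULL)
        return NULL;

    memcpy(buf, stored, n);
    buf[n] = '\0';                      /* n <= max_len, so in bounds */
    assert(strlen(buf) <= max_len);
    return buf;
}
```

With an allocation of only `max_len` bytes, the write to `buf[n]` would land one byte past the end whenever `n == max_len`, which is the overrun the commit message describes.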

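The nesting-level bookkeeping mentioned in PG-338 (commit 2c14e1dc08 above) can be modelled in a few lines. The sketch below is a simplified simulation, not the actual hook: `process_utility`, `store_if_top_level` and `stored_calls` are hypothetical names, and only `exec_nested_level` is taken from the commit message. It assumes statements are recorded only at nesting level zero.

```c
#include <assert.h>

/* Counter named after the one in the commit message; the rest is illustrative. */
static int exec_nested_level = 0;
static int stored_calls = 0;

/* Record a statement only when it runs at the top level. */
static void
store_if_top_level(void)
{
    if (exec_nested_level == 0)
        stored_calls++;
}

/*
 * Simulated utility hook: depth is how many times the server would
 * recurse into the hook for sub-statements.
 */
static void
process_utility(int depth)
{
    assert(depth >= 0);
    store_if_top_level();

    exec_nested_level++;                /* entering nested processing */
    if (depth > 0)
        process_utility(depth - 1);     /* recursion sees level > 0 */
    exec_nested_level--;                /* restore on the way out */
}
```

Without the increment and decrement around the recursive call, every recursion would see level zero and the same statement would be counted once per level, which matches the symptom the commit describes.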

429 Commits