citus

Commit Graph

Author	SHA1	Message	Date
Mehmet Yilmaz	216ce4efbc	Normalize PG18 error messages by removing subscription prefix and connection failure details Improve logical replication connection-failure messages. 0d8bd0a72ea284ffb1d1154efbe799241cc5edc6	2025-11-10 07:43:44 +00:00
Mehmet YILMAZ	b2356f1c85	PG18: Make EXPLAIN ANALYZE output stable by routing through explain_filter and hiding footers (#8325 ) PostgreSQL 18 adds a new line to text EXPLAIN with ANALYZE (`Index Searches: N`). That extra line both creates noise and bumps psql’s `(N rows)` footer. This PR keeps ANALYZE (so statements still execute) while removing the version-specific churn in our regress outputs. ### What changed * Use `explain_filter(...)` instead of raw text EXPLAIN * In `local_shard_execution.sql` and `local_shard_execution_replicated.sql`, replace direct: ```sql EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) <stmt>; ``` with: ```sql \pset footer off SELECT public.explain_filter('EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) <stmt>'); \pset footer on ``` * Expected files updated accordingly to show the `explain_filter` output block instead of raw EXPLAIN text. * Extend `explain_filter` to drop the PG18 line * Filter now removes any `Index Searches: <number>` line before normalizing numeric fields, preventing the “N” version of the same line from sneaking in. * Keep suite-wide normalizer intact	2025-11-10 10:43:11 +03:00
Mehmet YILMAZ	be2fcda071	PG18 - Normalize PG18 EXPLAIN: hide “Storage … Maximum Storage …” line (#8292 ) fixes #8267 * Extend `src/test/regress/bin/normalize.sed` to drop the new PG18 EXPLAIN instrumentation line: ``` Storage: <Memory|Disk|Memory and Disk> Maximum Storage: <size> ``` which appears under `Materialize`, some `CTE Scan`s, etc. when `ANALYZE` is on. Why * PG18 added storage usage reporting for materialization/tuplestore nodes. It’s useful for humans but creates noisy, non-semantic diffs in regression output. There’s no EXPLAIN flag to suppress it, so we normalize in tests instead. This PR wires that normalization into our sed pipeline. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1eff8279d How * Add a narrowly scoped sed rule that matches only lines starting with `Storage:` (keeping `Sort Method`, `Hash`, `Buffers`, etc. intact). Use ERE compatible with `sed -Ef` and Python `re` (no POSIX character classes), e.g.: ``` /^[ \t]*Storage:[ \t].*$/d ```	2025-11-03 15:59:00 +03:00
Mehmet YILMAZ	dba9379ea5	PG18: Normalize EXPLAIN ANALYZE output to drop “Index Searches” line (#8291 ) fixes #8265 `0fbceae841` PostgreSQL 18 started printing an extra line in `EXPLAIN (ANALYZE …)` for index scans: ``` Index Searches: N ``` normalize.sed: add a rule to remove the PG18-only line ``` /^\s*Index Searches:\s*\d+\s*$/d ```	2025-10-27 14:37:20 +03:00
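
The two PG18 line-drop rules described in this entry and the preceding one can be tried outside the sed pipeline. The following is a minimal Python sketch, assuming the rules are meant to match zero or more leading whitespace characters; the function name, constant names, and sample plan text are hypothetical and are not taken from `normalize.sed`.

```python
import re

# Hypothetical Python equivalents of the two sed delete rules.
# Names are illustrative only; they do not appear in normalize.sed.
INDEX_SEARCHES = re.compile(r"^\s*Index Searches:\s*\d+\s*$")
STORAGE = re.compile(r"^[ \t]*Storage:[ \t].*$")


def drop_pg18_explain_lines(text: str) -> str:
    """Remove PG18-only EXPLAIN ANALYZE lines and keep every other line."""
    kept = [
        line
        for line in text.splitlines()
        if not INDEX_SEARCHES.match(line) and not STORAGE.match(line)
    ]
    return "\n".join(kept)


# Made-up plan fragment, used only to exercise the two rules.
sample = "\n".join([
    "Index Scan using t_pkey on t (actual rows=1 loops=1)",
    "  Index Cond: (id = 1)",
    "  Index Searches: 1",
    "Materialize (actual rows=3 loops=1)",
    "  Storage: Memory  Maximum Storage: 17kB",
    "  Sort Method: quicksort  Memory: 25kB",
])

if __name__ == "__main__":
    print(drop_pg18_explain_lines(sample))
```

Anchoring both patterns at the start of the line is what keeps lines such as `Sort Method: ... Memory: ...` untouched, even though they also contain a colon-separated label.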
Naisila Puka	287abea661	PG18 compatibility - varreturningtype additions (#8231 ) This PR solves the following diffs, originating from the addition of `varreturningtype` field to the `Var` struct in PG18: https://github.com/postgres/postgres/commit/80feb727c Previously we didn't account for this new field (as it's new), so this wouldn't allow the parser to correctly reconstruct the `Var` node structure, but rather it would error out with `did not find '}' at end of input node`: ```diff SELECT column_to_column_name(logicalrelid, partkey) FROM pg_dist_partition WHERE partkey IS NOT NULL ORDER BY 1 LIMIT 1; - column_to_column_name ---------------------------------------------------------------------- - a -(1 row) - +ERROR: did not find '}' at end of input node ``` Solution follows precedent https://github.com/citusdata/citus/pull/7107, when varnullingrels field was added to the `Var` struct in PG16. The solution includes: - Taking care of the `partkey` in `pg_dist_partition` table because it's coming from the `Var` struct. This mainly includes fixing the upgrade script to PG18, by saving all the `partkey` infos before upgrading to PG18 (in `citus_prepare_pg_upgrade`), and then re-generating `partkey` columns in `pg_dist_partition` (using `UPDATE`) after upgrading to PG18 (in `citus_finish_pg_upgrade`). - Adding a normalize rule to fix output differences among PG versions. Note that we need two normalize lines: one for PG15 since it doesn't have `varnullingrels`, and one for PG16/PG17. - Small trick on `metadata_sync_helpers` to use different text when generating the `partkey`, based on the PG version. Fixes #8189	2025-10-09 17:35:03 +03:00
Naisila Puka	f0014cf0df	PG18 compatibility: misc output diffs pt2 (#8234 ) 3 minor changes to reduce some noise from the regression diffs. 1 - Reduce verbosity when ALTER EXTENSION fails PG18 has improved reporting of errors in extension script files Relevant PG commit: https://github.com/postgres/postgres/commit/774171c4f There was more context in PG18, so reducing verbosity ``` ALTER EXTENSION citus UPDATE TO '11.0-1'; ERROR: cstore_fdw tables are deprecated as of Citus 11.0 HINT: Install Citus 10.2 and convert your cstore_fdw tables to the columnar access method before upgrading further CONTEXT: PL/pgSQL function inline_code_block line 4 at RAISE +SQL statement "DO LANGUAGE plpgsql +$$ +BEGIN + IF EXISTS (SELECT 1 FROM pg_dist_shard where shardstorage = 'c') THEN + RAISE EXCEPTION 'cstore_fdw tables are deprecated as of Citus 11.0' + USING HINT = 'Install Citus 10.2 and convert your cstore_fdw tables to the columnar access method before upgrading further'; + END IF; +END; +$$" +extension script file "citus--10.2-5--11.0-1.sql", near line 532 ``` 2 - Fix backend type order in tests for PG18 PG18 added another backend type which messed the order in this test Adding a separate IF condition for PG18 Relevant PG commit: https://github.com/postgres/postgres/commit/18d67a8d7d 3 - Ignore "DEBUG: find_in_path" lines in output Relevant PG commit: https://github.com/postgres/postgres/commit/4f7f7b0375 The new GUC extension_control_path specifies a path to look for extension control files.	2025-10-09 16:50:41 +03:00
Mehmet YILMAZ	4012e5938a	PG18 - normalize PG18 “RESTRICT” FK error wording to legacy form (#8188 ) fixes #8186 `086c84b23d` PG18 emits a more specific message for foreign-key violations when the action is `RESTRICT` (SQLSTATE 23001), e.g. `violates RESTRICT setting of foreign key constraint ...` and `Key (...) is referenced from table ...`. Older versions printed the generic FK text (SQLSTATE 23503), e.g. `violates foreign key constraint ...` and `Key (...) is still referenced from table ...`. This change was causing noisy diffs in our regression tests (e.g., `multi_foreign_key.out`). To keep a single set of expected files across PG15–PG18, this PR adds two normalization rules to the test filter: ```sed # PG18 FK wording -> legacy generic form s/violates RESTRICT setting of foreign key constraint/violates foreign key constraint/g # DETAIL line: "is referenced" -> "is still referenced" s/\<is referenced from table\>/is still referenced from table/g ``` Scope / impact * Test-only change; runtime behavior is unaffected. * Keeps outputs stable across PG15–PG18 without version-splitting expected files. * Rules are narrowly targeted to the FK wording introduced in PG18. CI run with this PR: https://github.com/citusdata/citus/actions/runs/17698469722/job/50300960878#step:5:252	2025-09-17 10:46:36 +03:00
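
The two substitutions in this entry can be mirrored in Python to see why they are safe to apply repeatedly. This is a sketch under the assumption that sed's `\<` and `\>` word anchors correspond to `\b` in Python's `re`; the function name and the sample messages are hypothetical.

```python
import re


def normalize_fk_wording(line: str) -> str:
    """Rewrite PG18 RESTRICT foreign-key wording to the legacy generic form."""
    # ERROR line: drop the "RESTRICT setting of" phrase.
    line = line.replace(
        "violates RESTRICT setting of foreign key constraint",
        "violates foreign key constraint",
    )
    # DETAIL line: "is referenced" -> "is still referenced".
    # Output that already says "is still referenced" does not match this
    # pattern, so running the rule twice changes nothing.
    return re.sub(
        r"\bis referenced from table\b",
        "is still referenced from table",
        line,
    )


if __name__ == "__main__":
    # Made-up messages in the PG18 style.
    print(normalize_fk_wording(
        'ERROR:  update or delete on table "a" violates RESTRICT setting '
        'of foreign key constraint "fk" on table "b"'
    ))
    print(normalize_fk_wording(
        'DETAIL:  Key (id)=(1) is referenced from table "b".'
    ))
```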
Mehmet YILMAZ	86b5bc6a20	Normalize Actual Rows output in regression tests for PG18 compatibility (#8141 ) DESCRIPTION: Normalize Actual Rows output in regression tests for PG18 compatibility PostgreSQL 18 changed `EXPLAIN ANALYZE` to always print fractional row counts (e.g. `1.00` instead of `1`). `95dbd827f2` This caused diffs across multiple output formats in Citus regression tests: * Text EXPLAIN: `actual rows=50.00` vs `actual rows=50` * YAML: `Actual Rows: 1.00` vs `Actual Rows: 1` * XML: `<Actual-Rows>1.00</Actual-Rows>` vs `<Actual-Rows>1</Actual-Rows>` * JSON: `"Actual Rows": 1.00` vs `"Actual Rows": 1` * Placeholders: `rows=N.N` vs `rows=N` This patch extends `normalize.sed` to strip trailing `.0…` from `Actual Rows` in all supported formats and collapses placeholder values back to `N`. With these changes, regression tests produce stable output across PG15–PG18. No functional changes to Citus itself — only test normalization was updated.	2025-08-21 17:47:46 +03:00
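
The per-format handling listed in this entry can be sketched as a small rule table. The patterns below are assumptions written for illustration, not the actual `normalize.sed` rules, and they drop any fractional part rather than only trailing zeros.

```python
import re

# Hypothetical rule table: (pattern, replacement) pairs, one per format.
RULES = [
    # Text EXPLAIN: actual rows=50.00 -> actual rows=50
    (re.compile(r"(actual rows=\d+)\.\d+"), r"\1"),
    # YAML and JSON: Actual Rows: 1.00 / "Actual Rows": 1.00
    (re.compile(r'("?Actual Rows"?: \d+)\.\d+'), r"\1"),
    # XML: <Actual-Rows>1.00</Actual-Rows>
    (re.compile(r"(<Actual-Rows>\d+)\.\d+(</Actual-Rows>)"), r"\1\2"),
    # Already-masked placeholders: rows=N.N -> rows=N
    (re.compile(r"(rows=N)\.N"), r"\1"),
]


def normalize_actual_rows(line: str) -> str:
    """Apply every rule in order; lines without row counts pass through."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line


if __name__ == "__main__":
    for raw in (
        "Seq Scan on t (actual rows=50.00 loops=1)",
        "Actual Rows: 1.00",
        "<Actual-Rows>1.00</Actual-Rows>",
        '"Actual Rows": 1.00,',
        "Custom Scan (actual rows=N.N loops=N)",
    ):
        print(normalize_actual_rows(raw))
```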
Mehmet YILMAZ	41883cea38	PG18 - unify psql headings to ‘List of relations’ (#8119 ) fixes #8110 This patch updates the `normalize.sed` script used in pg18 psql regression tests: - Replaces the headings “List of tables”, “List of indexes”, and “List of sequences” with a single, uniform heading: “List of relations”.	2025-08-13 12:22:23 +03:00
Mehmet YILMAZ	bfc6d1f440	PG18 - Adjust EXPLAIN's output for disabled nodes (#8108 ) fixes #8097	2025-08-12 12:38:19 +03:00
Mehmet YILMAZ	a8900b57e6	PG18 - Strip decimal fractions from actual rows counts in normalize.sed (#8041 ) Fixes #8040 ``` - Custom Scan (Citus Adaptive) (actual rows=0 loops=1) + Custom Scan (Citus Adaptive) (actual rows=0.00 loops=1) ``` Add a normalization rule to the pg_regress `normalize.sed` script that strips any trailing decimal fraction from actual rows= counts (e.g. turning `actual rows=0.00` into `actual rows=0`). This silences noise diffs introduced by the new PostgreSQL 18 beta’s planner output. commit `b06bde5771`	2025-07-17 15:38:06 +03:00
Naisila Puka	3b1c082791	Drops PG14 support (#7753 ) DESCRIPTION: Drops PG14 support 1. Remove "$version_num" != 'xx' from configure file 2. delete all PG_VERSION_NUM = PG_VERSION_XX references in the code 3. Look at pg_version_compat.h file, remove all _compat functions etc defined specifically for PGXX differences 4. delete all PG_VERSION_NUM >= PG_VERSION_(XX+1), PG_VERSION_NUM < PG_VERSION_(XX+1) ifs in the codebase 5. delete ruleutils_xx.c file 6. cleanup normalize.sed file from pg14 specific lines 7. delete all alternative output files for that particular PG version, server_version_ge variable helps here	2025-03-12 12:43:01 +03:00
Naisila Puka	c662e68e44	Remove redundant normalize (#7794 ) Redundant from this commit `acd7b1e690`	2025-03-12 12:25:49 +03:00
Mehmet YILMAZ	3935710c17	PG17 compatibility: Fix Test Failure in local_dist_join_mixed (#7731 ) PostgreSQL 16 adds an extra condition (id IS NOT NULL) to the subquery. This condition is likely used to ensure that no null values are processed in the subquery. Instead of using the condition id IS NOT NULL, PostgreSQL 17 generates the subplan with a trivial condition (WHERE true), indicating that it does not need to explicitly check for non-null values. PostgreSQL 17 likely includes optimizations to handle null checks more efficiently. The WHERE (id IS NOT NULL) condition that was present in PostgreSQL 16 may now be considered redundant by the planner, as it is implicitly handled by the query execution engine. https://github.com/postgres/postgres/commit/b262ad44 ```diff SELECT foo1.id FROM (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo9, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo8, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo7, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo6, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo5, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo4, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo3, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo2, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo10, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo1 WHERE foo1.id = foo9.id AND foo1.id = foo8.id AND foo1.id = foo7.id AND foo1.id = foo6.id AND foo1.id = foo5.id AND foo1.id = foo4.id AND foo1.id = foo3.id AND foo1.id = foo2.id AND foo1.id = foo10.id AND foo1.id = foo1.id ORDER BY 
1; ... -DEBUG: generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE (id IS NOT NULL) +DEBUG: generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE true ... ```	2025-03-12 12:25:49 +03:00
Colm	beb222ea8d	PG17 compatibility: fix multi-1 diffs caused by PG17 optimizer enhancements (#7769 ) This fix ensures that the expected DEBUG error messages from the router planner in `multi_router_planner`, `multi_router_planner_fast_path` and `query_single_shard_table` are present with PG17. In `query_single_shard_table` the diff: ``` SELECT COUNT(*) FROM citus_local_table t1 WHERE t1.b IN ( SELECT b+1 FROM nullkey_c1_t1 t2 WHERE t2.b = t1.a ); -DEBUG: router planner does not support queries that reference non-colocated distributed tables +DEBUG: Local tables cannot be used in distributed queries. ``` occurred because of [this PG17 commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639) which enables the optimizer to pull up a correlated ANY subquery to a join. The fix inhibits subquery pull up by including a volatile function in the predicate involving the ANY subquery, preserving the pre-PG17 optimizer treatment of the query. In the case of `multi_router_planner` and `multi_router_planner_fast_path` the diffs: ``` -- partition_column is null clause does not prune out any shards, -- all shards remain after shard pruning, not router plannable SELECT * FROM articles_hash a WHERE a.author_id is null; -DEBUG: Router planner cannot handle multi-shard select queries +DEBUG: Creating router plan ``` are because of [this PG17 commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b262ad440), which enables the optimizer to detect and remove redundant IS (NOT) NULL expressions. The fix is to adjust the table definition so the column used for distribution is not marked NOT NULL, thus preserving the pre-PG17 query planning behavior. 
Finally, a rule is added to `normalize.sed` to ignore DEBUG logging in CREATE MATERIALIZED VIEW AS statements introduced by [this PG17 commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b4da732fd64); _when creating materialized views, use REFRESH logic to load data_, a consequence of which is that with `client_min_messages` at `DEBUG2` Postgres emits extra detail for CREATE MATERIALIZED VIEW AS statements. ``` CREATE MATERIALIZED VIEW mv_articles_hash_empty AS SELECT * FROM articles_hash WHERE author_id = 1; DEBUG: Creating router plan DEBUG: query has a single distribution column value: 1 +DEBUG: drop auto-cascades to type multi_router_planner.pg_temp_61391 +DEBUG: drop auto-cascades to type multi_router_planner.pg_temp_61391[] ``` The rule can be changed to a normalization, or possibly dropped, when 17 becomes the minimum supported version.	2025-03-12 12:25:49 +03:00
Mehmet YILMAZ	9615b52863	PG17 compatibility: Fix Test Failure in multi_name_lengths multi_create_table_constraints (#7726 ) PG17 removes the outer parentheses from CHECK constraints; we add them back for PG15/PG16 compatibility, e.g. change CHECK other_col >= 100 to CHECK (other_col >= 100). Relevant PG commit: e59fcbd712c777eb2987d7c9ad542a7e817954ec `e59fcbd712` CI link https://github.com/citusdata/citus/actions/runs/11844794788 ```diff SELECT "Constraint", "Definition" FROM table_checks WHERE relid='public.check_example_365068'::regclass; Constraint | Definition -------------------------------------+----------------------------------- - check_example_other_col_check | CHECK (other_col >= 100) - check_example_other_other_col_check | CHECK (abs(other_other_col) >= 100) + check_example_other_col_check | CHECK other_col >= 100 + check_example_other_other_col_check | CHECK abs(other_other_col) >= 100 ``` Co-authored-by: Mehmet YILMAZ <mehmet.yilmaz@microsoft.com>	2025-03-12 12:25:49 +03:00
Colm	b46d311e30	PG17 compatibility: Normalize COPY error messages (#7759 ) A recent Postgres commit (*) that refactored error messages is the cause of the diffs in pg16 regress test when running Citus on Postgres 17. The fix changes the pg16 goldfile and includes a normalization rule for the error messages so pg16 will pass when running with version 16 of Postgres. (*) https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=498ee9ee2f	2025-03-12 12:25:49 +03:00
Parag Jain	3c467e6e02	Support MERGE command for single_shard_distributed Target (#7643 ) This PR has following changes : 1. Enable MERGE command for single_shard_distributed targets.	2024-07-16 08:08:44 -07:00
Karina	21464adfec	Make isolation_update_node test system independent (#7423 ) Test isolation_update_node fails on some systems with the following error: ``` -s2: WARNING: connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Name or service not known +s2: WARNING: connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Temporary failure in name resolution ``` This slightly modifies an already existing [normalization rule](`739c6d26df/src/test/regress/bin/normalize.sed (L217-L218)`) to fix it. Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>	2024-01-17 13:39:07 +00:00
Benjamin O	f9218d9780	Support replacing IPv6 Loopback in `normalize.sed` (#7269 ) I had a test failure issue due to my machine using the IPv6 loopback address. This change to the `normalize.sed` solves that issue.	2023-10-27 16:42:55 +02:00
Naisila Puka	b2291374b4	PG16 compatibility - more test output fixes (#7112 ) PG16 compatibility - part 9 Check out part 1 `42d956888d` part 2 `0d503dd5ac` part 3 `907d72e60d` part 4 `7c6b4ce103` part 5 `6056cb2c29` part 6 `b36c431abb` part 7 `ee3153fe50` part 8 `2c50b5f7ff` This commit is in the series of PG16 compatibility commits. It makes some changes to our tests in order to be compatible with the following in PG16: - Fix multi_subquery_in_where_reference_clause test somehow PG got rid of the outer join (e.g., explain doesn't show outer joins), hence we can pushdown the subquery. Changing to users_reference_table - Fix unqualified column names for views in PG16 Relevant PG commit: `47bb9db759` 47bb9db75996232ea71fc1e1888ffb0e70579b54 - Fix global_cancel test Error wording and detail changed Relevant PG commit: `2631ebab7b` 2631ebab7b18bdc079fd86107c47d6104a6b3c6e - Fix local_table_join_test with lateral subquery Possible relevant PG commit: `ae89129aa3` ae89129aa3555c263b8c3ccc4c0f1ef7e46201aa I removed the where clause and the limit count error was hit again. With the where clause the query unexpectedly works. - Fix test outputs Relevant PG commits: -- `1349d2790b` -- `f4c7c410ee` For multi_explain and multi_complex_count_distinct there were too many places touched so I just added an alternative test output. For the other tests I modified the problematic parts. More PG16 compatibility commits are coming soon ...	2023-08-15 13:49:25 +03:00
Naisila Puka	2c50b5f7ff	PG16 compatibility - varnullingrels additions (#7107 ) PG16 compatibility - part 8 Check out part 1 `42d956888d` part 2 `0d503dd5ac` part 3 `907d72e60d` part 4 `7c6b4ce103` part 5 `6056cb2c29` part 6 `b36c431abb` part 7 `ee3153fe50` This commit is in the series of PG16 compatibility commits. PG16 introduced a new entry varnullingrels to Var, which represents our partkey in pg_dist_partition. This commit does the necessary changes in Citus to support this. Relevant PG commit: `2489d76c49` 2489d76c4906f4461a364ca8ad7e0751ead8aa0d More PG16 compatibility commits are coming soon ...	2023-08-15 13:07:55 +03:00
Naisila Puka	ee3153fe50	PG16 compatibility - more test output fixes (#7108 ) PG16 compatibility - part 7 Check out part 1 `42d956888d` part 2 `0d503dd5ac` part 3 `907d72e60d` part 4 `7c6b4ce103` part 5 `6056cb2c29` part 6 `b36c431abb` This commit is in the series of PG16 compatibility commits. It makes some changes to our tests in order to be compatible with the following in PG16: - PG16 removed logic for converting a table to a view Relevant PG commit: `b23cd185fd` b23cd185fd5410e5204683933f848d4583e34b35 - Fix changed error message in certificate verification Relevant PG commit: `8eda731465` 8eda7314652703a2ae30d6c4a69c378f6813a7f2 - Fix backend type order in tests Relevant PG commit: `0c679464a8` 0c679464a837079acc75ff1d45eaa83f79e05690 - Reduce log level to omit extra NOTICE in create collation in PG16 Relevant PG commit: `a14e75eb0b` a14e75eb0b6a73821e0d66c0d407372ec8376105 That commit made LOCALE parameter apply regardless of the provider used, and it printed the following notice: NOTICE: using standard form "und-u-ks-level2" for ICU locale "@colStrength=secondary" We omit this notice to omit output change between pg versions. - Fix columnar_memory test TopMemoryContext now has more children contexts Possible relevant PG commit: `9d3ebba729` 9d3ebba729ebaf5882a92f0f5f662a3312037605 memusage is now around 8.5 MB, whereas it was less than 8MB before. To avoid differences between PG versions, I changed the test to compare to less than 9 MB. It still reflects very well the improvement from 28MB. - Alternative test output for GRANTOR values in pg_auth_members grantor changed in PG16 Relevant PG commit: `ce6b672e44` ce6b672e4455820a0348214be0da1a024c3f619f - Remove redundant grouping columns from our tests Relevant PG commit: `8d83a5d0a2` 8d83a5d0a2673174dc478e707de1f502935391a5 - Fix tests with different order in Filters Relevant PG commit: `2489d76c49` 2489d76c4906f4461a364ca8ad7e0751ead8aa0d More PG16 compatibility commits are coming soon ...	
2023-08-09 18:04:32 +03:00
Naisila Puka	7c6b4ce103	PG16 compatibility - outer join checks, subscription password, crash fixes (#7097 ) PG16 compatibility - Part 4 Check out part 1 `42d956888d` part 2 `0d503dd5ac` part 3 `907d72e60d` This commit is in the series of PG16 compatibility commits. It adds some outer join checks to the planner, the new password_required option to the subscription, and a crash fix related to PGIOAlignedBlock, see below for more details: - Fix PGIOAlignedBlock Assert crash in PG16 Relevant PG commit: `faeedbcefd` faeedbcefd40bfdf314e048c425b6d9208896d90 - Pass planner info as argument to make_simple_restrictinfo Pre PG16 passing plannerInfo to make_simple_restrictinfo was only needed for placeholder Vars, which is not the case in this part of the codebase because we are building the expression from shard intervals which don't have placeholder vars. However, PG16 is counting baserels appearing in clause_relids and is deleting the rels mentioned in plannerinfo->outer_join_rels Hence directly accessing plannerinfo. We will crash if we leave it as NULL. For reference `2489d76c49 (diff-e045c41eda9686451a7993e91518e40056b3739365e39eb1b70ae438dc1f7c76R207)` Relevant PG commit: `2489d76c49` 2489d76c4906f4461a364ca8ad7e0751ead8aa0d - Add outer join checks, root->simple_rel_array - fix rebalancer to include password_required option Relevant PG commit: `c3afe8cf5a` c3afe8cf5a1e465bd71e48e4bc717f5bfdc7a7d6 More PG16 compatibility commits are coming soon ...	2023-08-04 14:51:28 +03:00
Naisila Puka	69af3e8509	Drop PG13 Support Phase 2 - Remove PG13 specific paths/tests (#7007 ) This commit is the second and last phase of dropping PG13 support. It consists of the following: - Removes all PG_VERSION_13 & PG_VERSION_14 from codepaths - Removes pg_version_compat entries and columnar_version_compat entries specific for PG13 - Removes alternative pg13 test outputs - Removes PG13 normalize lines and fix the test outputs based on that It is a continuation of `5bf163a27d`	2023-06-21 14:18:23 +03:00
Teja Mupparti	58da8771aa	This pull request introduces support for nonroutable merge commands in the following scenarios: 1) For distributed tables that are not colocated. 2) When joining on a non-distribution column for colocated tables. 3) When merging into a distributed table using reference or citus-local tables as the data source. This is accomplished primarily through the implementation of the following two strategies. Repartition: Plan the source query independently, execute the results into intermediate files, and repartition the files to co-locate them with the merge-target table. Subsequently, compile a final merge query on the target table using the intermediate results as the data source. Pull-to-coordinator: Execute the plan that requires evaluation at the coordinator, run the query on the coordinator, and redistribute the resulting rows to ensure colocation with the target shards. Direct the MERGE SQL operation to the worker nodes' target shards, using the intermediate files colocated with the data as the data source.	2023-06-19 12:23:40 -07:00
Naisila Puka	5bf163a27d	Remove PG13 from CI and Configure (#7002 ) DESCRIPTION: Drops PG13 Support This commit is the first phase of dropping PG13 support. It consists of the following: - Removes pg13 from CI tests Among other things, Citus upgrade tests should now use PG14. Earliest Citus version supporting PG14 is 10.2. We also pick 11.3 version for upgrade_pg_dist_cleanup tests. Therefore, we run the citus upgrade tests with versions 10.2 and 11.3. - Removes pg13 from configure script - Remove upgrade_columnar_metapage upgrade tests We populate first_row_number column of columnar.stripe table during citus 10.1-10.2 upgrade. Given that we start from citus 10.2.0, which is the oldest version supporting PG14, we don't have that upgrade path anymore. Hence we remove these tests. - Removes upgrade_pg_dist_object_test and upgrade_partition_constraints tests These upgrade tests require the citus old version to be less than 10.0. Given that we drop support for PG13, we run upgrade tests with PG14, which starts with 10.2. So we remove these upgrade tests. - Documents that upgrade_post_11 should upgrade from version less than 11 In this way we make sure we run citus_finalize_upgrade_to_citus11 script - Adds needed alternative output for upgrade_citus_finish_citus_upgrade Given that we use 11.3 as the citus old version as well, we add this alternative output because pg_catalog.citus_finish_citus_upgrade() makes sense if last_upgrade_major_version < 11. See below for reference: pg_catalog.citus_finish_citus_upgrade(): ... IF last_upgrade_major_version < 11 THEN PERFORM citus_finalize_upgrade_to_citus11(); performed_upgrade := true; END IF; IF NOT performed_upgrade THEN RAISE NOTICE 'already at the latest distributed schema version (%)', last_upgrade_version_string; RETURN; END IF; ... And that's it :) The second phase of dropping PG13 support will consist in removing all the PG13 specific compilation paths/tests in the Citus repo. Will be done soon.	
2023-06-15 14:54:06 +03:00
Gokhan Gulbiz	e0ccd155ab	Make citus_stat_tenants work with schema-based tenants. (#6936 ) DESCRIPTION: Enabling citus_stat_tenants to support schema-based tenants. This pull request modifies the existing logic to enable tenant monitoring with schema-based tenants. The changes made are as follows: - If a query has a partitionKeyValue (which serves as a tenant key/identifier for distributed tables), Citus annotates the query with both the partitionKeyValue and colocationId. This allows for accurate tracking of the query. - If a query does not have a partitionKeyValue, but its colocationId belongs to a distributed schema, Citus annotates the query with only the colocationId. The tenant monitor can then easily look up the schema to determine if it's a distributed schema and make a decision on whether to track the query. --------- Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>	2023-06-13 14:11:45 +03:00
Halil Ozan Akgül	321fcfcdb5	Add Support for Single Shard Tables in update_distributed_table_colocation (#6924 ) Adds Support for Single Shard Tables in `update_distributed_table_colocation`. This PR changes checks that make sure tables should be hash distributed table to hash or single shard distributed tables.	2023-05-29 11:47:50 +03:00
Emel Şimşek	f9a5be59b9	Run replicate_reference_tables background task as superuser. (#6930 ) DESCRIPTION: Fixes a bug in background shard rebalancer where the replicate reference tables task fails if the current user is not a superuser. This change is to be backported to earlier releases. We should fix the permissions for replicate_reference_tables on main branch such that it can be run by non-superuser roles. Fixes #6925. Fixes #6926.	2023-05-18 23:46:32 +03:00
Hanefi Onaldi	06e6f8e428	Normalize columnar version in tests (#6917 ) When we bump columnar version, some tests fail because of the output change. Instead of changing those lines every time, I think it is better to normalize it in tests.	2023-05-08 16:10:55 +03:00
Halil Ozan Akgül	52ad2d08c7	Multi tenant monitoring (#6725 ) DESCRIPTION: Adds views that monitor statistics on tenant usages This PR adds `citus_stats_tenants` view that monitors the tenants on the cluster. `citus_stats_tenants` shows the node id, colocation id, tenant attribute, read count in this period and last period, and query count in this period and last period of the tenant. Tenant attribute currently is the tenant's distribution column value, later when schema based sharding is introduced, this meaning might change. A period is a time bucket the queries are counted by. Read and query counts for this period can increase until the current period ends. After that those counts are moved to last period's counts, which cannot change. The period length can be set using 'citus.stats_tenants_period'. `SELECT` queries are counted as _read_ queries, `INSERT`, `UPDATE` and `DELETE` queries are counted as _write_ queries. So in the view read counts are `SELECT` counts and query counts are `SELECT`, `INSERT`, `UPDATE` and `DELETE` counts. The data is stored in shared memory, in a struct named `MultiTenantMonitor`. `citus_stats_tenants` shows the data from local tenants. `citus_stats_tenants` shows up to `citus.stats_tenant_limit` number of tenants. The tenants are scored based on the number of queries they run and the recency of those queries. Every query run increases the score of the tenant by `ONE_QUERY_SCORE`, and after every period ends the scores are halved. Halving is done lazily. To retain information longer, the monitor keeps up to 3 times `citus.stats_tenant_limit` tenants. When the tenant count hits `3 * citus.stats_tenant_limit`, the last `citus.stats_tenant_limit` tenants are removed. To see all stored tenants you can use `citus_stats_tenants(return_all_tenants := true)` - [x] Create collector view that gets data from all nodes. 
#6761 - [x] Add monitoring log #6762 - [x] Create enable/disable GUC #6769 - [x] Parse the annotation string correctly #6796 - [x] Add local queries and prepared statements #6797 - [x] Rename to citus_stat_statements #6821 - [x] Run pgbench - [x] Fix role permissions #6812 --------- Co-authored-by: Gokhan Gulbiz <ggulbiz@gmail.com> Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2023-04-05 17:44:17 +03:00
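
The scoring scheme described in this entry (a fixed score increment per query, lazy halving per elapsed period, and eviction once three times the limit is reached) can be illustrated with a short sketch. This is not the `MultiTenantMonitor` implementation: the class layout, the value of `ONE_QUERY_SCORE`, the integer period index, and the reading of "last tenants" as "lowest-scored tenants" are all assumptions made for illustration.

```python
from dataclasses import dataclass

ONE_QUERY_SCORE = 1_000_000  # assumed value, chosen only for this sketch


@dataclass
class Tenant:
    score: int = 0
    last_period: int = 0  # period index at which the score was last updated


class TenantMonitor:
    """Toy model of tenant scoring with lazy halving and bounded storage."""

    def __init__(self, limit: int) -> None:
        self.limit = limit  # stands in for citus.stats_tenant_limit
        self.tenants: dict[str, Tenant] = {}

    def _catch_up(self, tenant: Tenant, period: int) -> None:
        # Lazy halving: apply one halving per period elapsed since the last
        # update, only when the tenant is actually touched.
        elapsed = period - tenant.last_period
        if elapsed > 0:
            tenant.score >>= elapsed
            tenant.last_period = period

    def record_query(self, key: str, period: int) -> None:
        tenant = self.tenants.setdefault(key, Tenant(last_period=period))
        self._catch_up(tenant, period)
        tenant.score += ONE_QUERY_SCORE
        if len(self.tenants) >= 3 * self.limit:
            self._evict(period)

    def _evict(self, period: int) -> None:
        # Bring every score up to date, then drop the `limit` lowest scores.
        for tenant in self.tenants.values():
            self._catch_up(tenant, period)
        ranked = sorted(self.tenants, key=lambda k: self.tenants[k].score)
        for key in ranked[: self.limit]:
            del self.tenants[key]

    def top(self, period: int) -> list[str]:
        for tenant in self.tenants.values():
            self._catch_up(tenant, period)
        ranked = sorted(
            self.tenants, key=lambda k: self.tenants[k].score, reverse=True
        )
        return ranked[: self.limit]


if __name__ == "__main__":
    monitor = TenantMonitor(limit=1)
    monitor.record_query("a", period=0)
    monitor.record_query("a", period=0)
    monitor.record_query("b", period=0)
    print(monitor.top(period=0))
```

Keeping more tenants than are displayed means a tenant that briefly falls out of the top set does not lose its history immediately, which is the motivation the entry gives for the factor of three.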
Teja Mupparti	1e42cd3da0	Support MERGE on distributed tables with restrictions This implements the phase - II of MERGE sql support Support routable query where all the tables in the merge-sql are distributed, co-located, and both the source and target relations are joined on the distribution column with a constant qual. This should be a Citus single-task query. Below is an example. SELECT create_distributed_table('t1', 'id'); SELECT create_distributed_table('s1', 'id', colocate_with => 't1'); MERGE INTO t1 USING s1 ON t1.id = s1.id AND t1.id = 100 WHEN MATCHED THEN UPDATE SET val = s1.val + 10 WHEN MATCHED THEN DELETE WHEN NOT MATCHED THEN INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src) Basically, MERGE checks to see if There are a minimum of two distributed tables (source and a target). All the distributed tables are indeed colocated. MERGE relations are joined on the distribution column MERGE .. USING .. ON target.dist_key = source.dist_key The query should touch only a single shard i.e. JOIN AND with a constant qual MERGE .. USING .. ON target.dist_key = source.dist_key AND target.dist_key = <> If any of the conditions are not met, it raises an exception. (cherry picked from commit `44c387b978`) This implements MERGE phase3 Support pushdown query where all the tables in the merge-sql are Citus-distributed, co-located, and both the source and target relations are joined on the distribution column. This will generate multiple tasks which execute independently after pushdown. 
SELECT create_distributed_table('t1', 'id'); SELECT create_distributed_table('s1', 'id', colocate_with => ‘t1’); MERGE INTO t1 USING s1 ON t1.id = s1.id WHEN MATCHED THEN UPDATE SET val = s1.val + 10 WHEN MATCHED THEN DELETE WHEN NOT MATCHED THEN INSERT (id, val, src) VALUES (s1.id, s1.val, s1.src) *The only exception for both the phases II and III is, UPDATEs and INSERTs must be done on the same shard-group as the joined key; for example, below scenarios are NOT supported as the key-value to be inserted/updated is not guaranteed to be on the same node as the id distribution-column. MERGE INTO target t USING source s ON (t.customer_id = s.customer_id) WHEN NOT MATCHED THEN - - INSERT(customer_id, …) VALUES (<non-local-constant-key-value>, ……); OR this scenario where we update the distribution column itself MERGE INTO target t USING source s On (t.customer_id = s.customer_id) WHEN MATCHED THEN UPDATE SET customer_id = 100; (cherry picked from commit `fa7b8949a8`)	2023-03-16 13:43:08 -07:00
Marco Slot	64e3fee89b	Remove shardstate leftovers (#6627 ) Remove ShardState enum and associated logic. Co-authored-by: Marco Slot <marco.slot@gmail.com> Co-authored-by: Ahmet Gedemenli <afgedemenli@gmail.com>	2023-01-19 11:43:58 +03:00
Ahmet Gedemenli	b3b135867e	Remove shardstate from placement insert functions (#6615 )	2023-01-18 09:52:38 +01:00
Ahmet Gedemenli	235047670d	Drop SHARD_STATE_TO_DELETE (#6494 ) DESCRIPTION: Drop `SHARD_STATE_TO_DELETE` and use the cleanup records instead Drops the shard state that is used to mark shards as orphaned. Now we insert cleanup records into `pg_dist_cleanup` so "orphaned" shards will be dropped either by maintenance daemon or internal cleanup calls. With this PR, we make the "cleanup orphaned shards" functions to be no-op, as they would not be needed anymore. This PR includes some naming changes about placement functions. We don't need functions that filter orphaned shards, as there will be no orphaned shards anymore. We will also be introducing a small script with this PR, for users with orphaned shards. We'll basically delete the orphaned shard entries from `pg_dist_placement` and insert cleanup records into `pg_dist_cleanup` for each one of them, during Citus upgrade. We also have a lot of flakiness fixes in this PR. Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>	2023-01-03 14:38:16 +03:00
Naisila Puka	e937935935	Clean up normalize file (#6578 )	2022-12-26 12:08:27 +03:00
aykut-bozkurt	3da6e3e743	bgworkers with backend connection should handle SIGTERM properly (#6552 ) Fixes task executor SIGTERM handling. Problem: When task executors are sent SIGTERM, their default handler `bgworker_die`, which is set at worker startup, logs a FATAL error. But they do not release locks there before logging the error, which sometimes causes the monitor to hang, e.g. the monitor waits for the lock forever at pg_stat flush after calling proc_exit. Solution: Because executors have a connection to the backend, they should handle SIGTERM like normal backends. Normal backends use the `die` handler, in which they set the ProcDiePending flag, and the next CHECK_FOR_INTERRUPTS call handles it gracefully by releasing any lock before termination.	2022-12-12 16:44:36 +03:00
Ahmet Gedemenli	cb02d62369	Unique names for replication artifacts (#6529 ) DESCRIPTION: Create replication artifacts with unique names We're creating replication objects with generic names. This prevents us from enabling parallel shard moves, as two operations might use the same objects. With this PR, we'll create the objects below with operation-specific names, by appending the OperationId to the names. * Subscriptions * Publications * Replication Slots * Users created for subscriptions	2022-12-06 15:48:16 +03:00
Jelte Fennema	68de2ce601	Include gpid in all internal application names (#6431 ) When debugging issues it's quite useful to see the originating gpid in the application_name of a query on a worker. This already happens for most queries, but not for queries created by the rebalancer or by run_command_on_worker. This adds a gpid to those two application_names too. Note, that if the GPID of the new application_names is different than the current GPID of the backend the backend will continue to keep the old gpid as its actual GPID. This PR is just meant to make sure that the application_name is as useful as it can be for users to look at. Updating of gpids will be done in a follow-up PR, and adding gpids to all internal connections will make this easier.	2022-11-25 11:16:33 +01:00
Emel Şimşek	8e5ba45b74	Fixes a bug that causes crash when using auto_explain extension with ALTER TABLE...ADD FOREIGN KEY... queries. (#6470 ) Fixes a bug that causes crash when using auto_explain extension with ALTER TABLE...ADD FOREIGN KEY... queries. Those queries trigger a SELECT query on the citus tables as part of the foreign key constraint validation check. At the explain hook, workers try to explain this SELECT query as a distributed query causing memory corruption in the connection data structures. Hence, we will not explain ALTER TABLE...ADD FOREIGN KEY... and the triggered queries on the workers. Fixes #6424.	2022-11-15 17:53:39 +03:00
Teja Mupparti	01103ce05d	This implements a new UDF citus_get_cluster_clock() that returns a monotonically increasing logical clock. Clock guarantees to never go back in value after restarts, and makes best attempt to keep the value close to unix epoch time in milliseconds. Also, introduces a new GUC "citus.enable_cluster_clock", when true, every distributed transaction is stamped with logical causal clock and persisted in a catalog pg_dist_commit_transaction.	2022-10-28 10:15:08 -07:00
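The two properties stated in the commit above (the value never goes backwards, even across restarts, and it stays close to Unix epoch milliseconds) can be illustrated with a small sketch. This is not how `citus_get_cluster_clock()` is implemented; the class name, the injected time source, and the `persisted_value` parameter standing in for whatever state survives a restart are assumptions for the example.

```python
import time


class ClusterClockSketch:
    """Hypothetical monotonic clock that tracks wall-clock milliseconds."""

    def __init__(self, now_ms=None, persisted_value=0):
        # now_ms is injectable so the behaviour can be exercised deterministically.
        self._now_ms = now_ms or (lambda: int(time.time() * 1000))
        # persisted_value models the last value known before a restart.
        self._last = persisted_value

    def next_value(self):
        # Follow the wall clock when it moves forward; otherwise tick by one
        # so the returned value is always strictly greater than the previous one.
        self._last = max(self._now_ms(), self._last + 1)
        return self._last
```

With a wall clock that stalls or jumps backwards, the sketch keeps counting up from the last value and rejoins real time once the wall clock overtakes it.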
Nils Dijk	cda3686d86	Feature: run rebalancer in the background (#6215 ) DESCRIPTION: Add a rebalancer that uses background tasks for its execution Based on the background jobs and tasks introduced in #6296 we implement a new rebalancer on top of the primitives of background execution. This allows the user to initiate a rebalance and let Citus execute the long running steps in the background until completion. Users can invoke the new background rebalancer with `SELECT citus_rebalance_start();`. It will output information on its job id and how to track progress. Also it returns its job id for automation purposes. If you simply want to wait till the rebalance is done you can use `SELECT citus_rebalance_wait();` A running rebalance can be cancelled/stopped with `SELECT citus_rebalance_stop();`.	2022-09-12 20:46:53 +03:00
Hanefi Onaldi	a557a196aa	Add tests for numeric with scale greater than precision	2022-09-07 13:12:04 +03:00
Naisila Puka	35b4ddc355	Pg15 support (#6085 ) * Adjust configure script to allow PG15 * Adds copy of ruleutils_14.c as ruleutils_15.c * Uses get_namespace_name_or_temp in ruleutils_15.c Relevant PG commit: 48c5c9068211e0a04fd9553c8714b2821ed3ad17 * Clean up code using "(expr) ? true : false" in ruleutils_15.c Relevant PG commit: fd0625c7a9c679c0c1e896014b8f49a489c3a245 * Change varno from Index (unsigned int) to int in ruleutils_15.c Relevant PG commit: e3ec3c00d85bd2844ffddee83df2bd67c4f8297f * Adds find_recursive_union to ruleutils_15.c Relevant PG commit: 3f50b82639637c9908afa2087de7588450aa866b * Fix display of SQL-std func's args in INSERT/SELECT in ruleutils_15.c Relevant PG commit: a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759 * Fix ruleutils_15.c's dumping of whole-row Vars in more contexts Relevant PG commit: 43c2175121c829c8591fc5117b725f1f22bfb670 * Fix assorted missing logic for GroupingFunc nodes in ruleutils_15.c Relevant PG commit: 2591ee8ec44d8cbc8e1226550337a64c684746e4 * Adds grammar support for SQL/JSON clauses in ruleutils_15.c Relevant PG commit: f79b803dcc98d707450e158db3638dc67ff8380b * Adds SQL/JSON constructors to ruleutils_15.c Relevant PG commits: f4fb45d15c59d7add2e1b81a9d477d0119a9691a cc7401d5ca498a84d9b47fd2e01cebd8e830e558 * Adds support for MERGE in ruleutils_15.c Relevant PG commit: 7103ebb7aae8ab8076b7e85f335ceb8fe799097c * Add IS JSON predicate to ruleutils_15.c Relevant PG commit: 33a377608fc29cdd1f6b63be561eab0aee5c81f0 * Add SQL/JSON query functions to ruleutils_15.c Relevant PG commit: 1a36bc9dba8eae90963a586d37b6457b32b2fed4 * Adds three different SQL/JSON values to ruleutils_15.c Relevant PG commits: 606948b058dc16bce494270eea577011a602810e 49082c2cc3d8167cca70cfe697afb064710828ca * Adds JSON table functions in ruleutils_15.c Relevant PG commit: 4e34747c88a03ede6e9d731727815e37273d4bc9 * Add PLAN function for JSON table in ruleutils_15.c Relevant PG commit: fadb48b00e02ccfd152baa80942de30205ab3c4f * Remove extra blank lines 
before block-closing braces ruleutils_15.c Relevant PG commit: 24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031 * set_deparse_plan: Reuse variable to appease Coverity ruleutils_15.c Relevant PG commit: e70813fbc4aaca35ec012d5a426706bd54e4acab * Mechanical code beautification ruleutils_15.c Relevant PG commit: 23e7b38bfe396f919fdb66057174d29e17086418 * Rename value_type to item_type in ruleutils_15.c Relevant PG commit: 3ab9a63cb638a1fd99475668e2da9c237495aeda * Show 'AS "?column?"' explicitly when it's important in ruleutils_15.c Relevant PG commit: c7461fc25558832dd347a9c8150b0f1ed85e36e8 * Fix ruleutils_15.c issues with dropped cols in funcs-returning-composite Relevant PG commit: c1d1e8469c77ce6b8e5310955580b4a3eee7fe96 * Change comment regarding functions returning composite in ruleutils_15.c Relevant PG commit: c2fa113ddb1117b1f03e91960f65d5d7d8a90270 * Replace int nodes with bool nodes where needed In PG15, Boolean nodes are added. Pre PG15, internal Boolean values in Create Role commands were represented by Integer nodes. This commit replaces int nodes logic with bool nodes logic where needed. Mostly there are CREATE ROLE logic changes. Relevant PG commit: 941460fcf731a32e6a90691508d5cfa3d1f8eeaf * Handle new option colliculocale in CREATE COLLATION logic In PG15, there is an added option to use ICU as global locale provider. pg_collation has three locale-related fields: collcollate and collctype, which are libc-related fields, and a new one colliculocale, which is the ICU-related field. Only the libc-related fields or the ICU-related field is set, never both. Relevant PG commits: f2553d43060edb210b36c63187d52a632448e1d2 54637508f87bd5f07fb9406bac6b08240283be3b * Add PG15 tests to CI using test images that have 15beta2 (#6093) * Change warning message in pg_signal_backend() Relevant PG commit: 7fa945b857cc1b2964799411f1633468826861ff * Revert "Add missing ifdef for PG 15" This reverts commit `c7b51025ab`. * Fixes tests for ALTER TRIGGER RENAME consistency for part. 
tables Relevant PG commit: 80ba4bb383538a2ee846fece6a7b8da9518b6866 * Prevent creating child triggers on partitions when adding new node Pre PG15, tgisinternal is true for a "child" trigger on a partition cloned from the trigger on the parent. In PG15, tgisinternal is false in that case. However, we don't want to create this trigger on the partition since it will create a conflict when we try to attach the partition to the parent table: ERROR: trigger "..." for relation "{partition_name}" already exists Relevant PG commit: f4566345cf40b068368cb5617e61318da60676ec * Fix tests for generated columns dependency changes In PG15, For GENERATED columns, all dependencies of the generation expression are recorded as NORMAL dependencies of the column itself. This requires CASCADE to drop generated cols with the original col. PRE PG15, dependencies were recorded as AUTO, with which generated columns are silently dropped with the original column. Relevant PG commit: cb02fcb4c95bae08adaca1202c2081cfc81a28b5 * Explicitly cast catalog "char" column to text before concatenation Relevant PG commit: 07eee5a0dc642d26f44d65c4e6263304208e8583 * Remove 'AS "?column?"' from test outputs There were some instances in the following test outputs in planning debug outputs where AS "?column?" is added. We add a normalization rule to remove it as it is not important. cte_inline.out recursive_relation_planning_restriction_pushdown.out Relevant PG commit: c7461fc25558832dd347a9c8150b0f1ed85e36e8 * Use pg_backup_stop(PG15) instead of pg_stop_backup(PG<15) Add an alternative test output because of the change in the backup modes of Postgres. Specifically here, there is a renaming issue: pg_stop_backup PRE PG15 vs pg_backup_stop PG15+ The alternative output can be deleted when we drop support for PG14 Relevant PG commit: 39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a * Adds citus.mitmfifo GUC Previously we were setting this configuration parameter on the fly for the failure tests schedule. 
However, PG15 doesn't allow that anymore: reserved prefixes like "citus" cannot be used to set non-existing GUCs. Relevant PG commit: 88103567cb8fa5be46dc9fac3e3b8774951a2be7 * Handles EXPLAIN output diffs in PG15 - Extra result lines To handle extra "Result" lines in explain outputs, we add an explain method to the multi_test_helpers.sql file - plan_without_result_lines() is added for cases where we want the whole explain output with only "Result" lines removed * Handles EXPLAIN output diffs in PG15, Hash Agg/Join leverage To handle differences in usage of GroupAggregate vs HashAggregate or Merge Join vs Hash join in cases where this detail doesn't seem to matter, we use coordinator_plan(). - coordinator_plan() is updated to remove "Result" lines There are some cases where we have subplans so we add a new function that prints all Task Count lines as well - coordinator_plan_with_subplans() Still not sure of the relevant PG commit Could be db0d67db2401eb6238ccc04c6407a4fd4f985832 but disabling enable_group_by_reordering didn't help. * Handles EXPLAIN output diffs in PG15: enable_group_by_reordering Relevant PG commit db0d67db2401eb6238ccc04c6407a4fd4f985832 * Normalizes Memory Usage, Buckets, Batches for PG15 explain diffs We create a new function in multi_test_helpers, which is similar to explain_merge function in PG15. This explain helper function normalizes Memory Usage, Buckets and Batches, and we use it in the tests which give a different output for PG15. 
* Bump test images to 15beta3 (#6172) * Omit namespace in post-copy errmsg Relevant PG commit: 069d33d0c5a021601245e44df77a0423ddd69359 * Handles EXPLAIN output diffs in PG15: extra arrows&result lines To handle extra "->" arrows resulting from extra Result lines in explain outputs, we add the following explain method to multi_test_helpers.sql file - plan_without_arrows() is added for cases where we want the whole explain output without arrows and without Result lines * Alters public schema's owner to pg_database_owner in PG15 In PG15, public schema is owned by pg_database_owner role. In multi_extension, we drop and recreate the public schema, hence its owner becomes the default user in our tests, postgres. Change that to pg_database_owner for PG15 consistency. This results in alternative test output for public schema grants in the following test: grant_on_schema_propagation.sql Relevant PG commit: b073c3ccd06e4cb845e121387a43faa8c68a7b62 * Add alternative test outputs for change in Insert Select display citus_local_tables_queries.sql coordinator_shouldhaveshards.sql cte_inline.sql insert_select_repartition.sql intermediate_result_pruning.sql local_shard_execution.sql local_shard_execution_replicated.sql multi_deparse_shard_query.sql multi_insert_select.sql multi_insert_select_conflict.sql multi_mx_insert_select_repartition.sql mx_coordinator_shouldhaveshards.sql single_node.sql Relevant PG commit: a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759 * Fixes columnar tap tests for PG15 In PG15, Perl test modules have been moved to a new namespace. Also, postgres node new() and get_new_node() methods have been unified to one method: new() We create separate tap tests for PG13/14 and PG15+ and update the Makefiles accordingly. Relevant PG commits: 201a76183e2056c2217129e12d68c25ec9c559c8 b3b4d8e68ae83f432f43f035c7eb481ef93e1583 * Handles EXPLAIN output diffs in PG15: HashAgg Leverage,alt. 
output Still not sure of the relevant PG commit Could be db0d67db2401eb6238ccc04c6407a4fd4f985832 but disabling enable_group_by_reordering didn't help.	2022-08-24 17:59:17 +02:00
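The EXPLAIN handling described in the commit above comes down to filtering or masking version-dependent lines before the output is compared. The real helpers live in `multi_test_helpers.sql`; the Python below is only a sketch of the idea, and the function names, the regular expressions, and the `xxx` placeholder are assumptions rather than the actual rules.

```python
import re

# A plan node line such as "Result" or "  ->  Result  (details)".
RESULT_LINE = re.compile(r"^\s*(->\s*)?Result\b")


def without_result_lines(plan_lines):
    """Drop plan lines for Result nodes and keep everything else unchanged."""
    return [line for line in plan_lines if not RESULT_LINE.match(line)]


def normalize_hash_stats(line):
    """Mask the numbers that may differ between PostgreSQL versions."""
    line = re.sub(r"(Buckets|Batches): \d+", r"\1: xxx", line)
    return re.sub(r"Memory Usage: \d+kB", "Memory Usage: xxx", line)
```

Dropping only the `Result` lines leaves the indentation and arrows of the child nodes untouched, which is the gap a helper like `plan_without_arrows()` covers.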
Hanefi Onaldi	616b1758c2	Add more normalization rules	2022-08-22 17:16:52 +03:00
Jelte Fennema	78a5013e24	Support changing CPU priorities for backends and shard moves (#6126 ) Intro This adds support to Citus to change the CPU priority values of backends. This is created with two main usecases in mind: 1. Users might want to run the logical replication part of the shard moves or shard splits at a higher speed than they would do by themselves. This might cause some small loss of DB performance for their regular queries, but this is often worth it. During high load it's very possible that the logical replication WAL sender is not able to keep up with the WAL that is generated. This is especially a big problem when the machine is close to running out of disk when doing a rebalance. 2. Users might have certain long running queries that they don't want to impact their regular workload too much. Be very careful!!! Using CPU priorities to control scheduling can be helpful in some cases to control which processes are getting more CPU time than others. However, due to an issue called "[priority inversion][1]" it's possible that using CPU priorities together with the many locks that are used within Postgres causes the exact opposite behavior of what you intended. This is why this PR only allows the PG superuser to change the CPU priority of its own processes. Currently it's not recommended to set `citus.cpu_priority` directly. Currently the only recommended interface for users is the setting called `citus.cpu_priority_for_logical_replication_senders`. This setting controls CPU priority for a very limited set of processes (the logical replication senders). So, the dangers of priority inversion are also limited when using it for this usecase. Background Before reading the rest it's important to understand some basic background regarding process CPU priorities, because they are a bit counter intuitive. A lower priority value means that the process will be scheduled more and whatever it's doing will thus complete faster. The default priority for processes is 0. 
Valid values are from -20 to 19 inclusive. On Linux a larger difference between values of two processes will result in a bigger difference in percentage of scheduling. Handling the usecases Usecase 1 can be achieved by setting `citus.cpu_priority_for_logical_replication_senders` to the priority value that you want it to have. It's necessary to set this both on the workers and the coordinator. Example: ``` citus.cpu_priority_for_logical_replication_senders = -10 ``` Usecase 2 can with this PR be achieved by running the following as superuser. Note that this is only possible as superuser currently due to the dangers mentioned in the "Be very careful!!!" section. And although this is possible it's NOT recommended: ```sql ALTER USER background_job_user SET citus.cpu_priority = 5; ``` OS configuration To actually make these settings work well it's important to run Postgres with a more permissive value for the 'nice' resource limit than Linux will do by default. By default Linux will not allow a process to set its priority lower than it currently is, even if it was lower when the process originally started. This capability is necessary to reset the CPU priority to its original value after a transaction finishes. Depending on how you run Postgres this needs to be done in one of two ways: If you use systemd to start Postgres all you have to do is add a line like this to the systemd service file: ```conf LimitNice=+0 # the + is important, otherwise it's interpreted incorrectly as 20 ``` If that's not the case you'll have to configure `/etc/security/limits.conf` like so, assuming that you are running Postgres as the `postgres` OS user: ``` postgres soft nice 0 postgres hard nice 0 ``` Finally you'd have to add the following line to `/etc/pam.d/common-session` ``` session required pam_limits.so ``` These settings would allow you to change the priority back after setting it to a higher value. 
However, to actually allow you to set priorities even lower than the default priority value you would need to change the values in the config to something lower than 0. So for example: ```conf LimitNice=-10 ``` or ``` postgres soft nice -10 postgres hard nice -10 ``` If you use WSL2 you'll likely have to do another thing. You have to open a new shell, because PAM is only used during login, and WSL2 doesn't actually log you in. You can force a login like this: ``` sudo su $USER --shell /bin/bash ``` Source: https://stackoverflow.com/a/68322992/2570866 [1]: https://en.wikipedia.org/wiki/Priority_inversion	2022-08-16 13:07:17 +03:00
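The 'nice' resource limit mentioned in the commit above is stored by Linux as a raw value from which the permitted nice floor is derived: per getrlimit(2), the ceiling a process may raise its priority to is `20 - rlim_cur`. The sketch below spells out that arithmetic for the values used in the examples; the function names are made up for illustration and nothing here is part of Citus.

```python
def rlimit_from_nice(nice):
    """Raw RLIMIT_NICE value that permits lowering the nice value down to `nice`."""
    return 20 - nice


def lowest_allowed_nice(rlimit_nice):
    """Lowest nice value an unprivileged process may set under this raw limit.

    A result of 20 (raw limit 0) means no valid nice value can be reached by
    lowering, so the priority can never be raised back.
    """
    return max(-20, 20 - rlimit_nice)


def can_restore_priority(rlimit_nice, original_nice=0):
    """True if a process could return to `original_nice` after raising its nice value."""
    return lowest_allowed_nice(rlimit_nice) <= original_nice
```

Under this arithmetic, a nice floor of 0 corresponds to a raw limit of 20 and a floor of -10 to a raw limit of 30, while a raw limit of 0 permits no lowering at all.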
Jelte Fennema	fd07cc9baf	Fix flakiness in create index concurrently isolation tests (#6158 ) This creates consistent test output for isolation tests that involve `CREATE INDEX CONCURRENTLY`. `CREATE INDEX CONCURRENTLY` is sometimes temporarily detected as blocking, even though it will complete without any other queries needing to be run. This change makes sure that we wait until that happens without running any other queries in the meantime. This way we always get consistent output. We do that by using an empty step in the same session as the `CREATE INDEX CONCURRENTLY` command. Doing so forces the isolation tester to wait until the command is finished and not continue with steps from other sessions. This is [the recommended approach by Postgres][1]. There are two separate cases which are addressed in slightly different ways: 1. If `CREATE INDEX CONCURRENTLY` is actually blocked on another session: Add an empty step right after the commit of the blocking session. e.g. `"s2-ddl-create-index-concurrently" "s1-commit" "s2-empty"` 2. If it's not actually blocked on another session: Add [an asterisk marker][2] to make it look like it's blocked (because sometimes this happens randomly) and right after that we add an empty step to trigger waiting. e.g. `"s2-ddl-create-index-concurrently"(*) "s2-empty" "s1-commit"` In passing this also enables isolation tests that were disabled due to a bug that has already been fixed for a while. Fixes #5993 Related to #5910 and #2966 [1]: `5f0adec253/src/test/isolation/README (L197-L204)` [2]: `5f0adec253/src/test/isolation/README (L174-L179)` Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>	2022-08-11 10:29:11 +02:00
Hanefi Onaldi	f944f97d01	Normalize messages from different libpq versions Historically we have been testing with the 'latest' version of libpq when the CI images were built. This has the downside that rebuilding the images often breaks our tests due to different errors returned from libpq. With this change we will actually test with a stable version of libpq that is based on the postgres minor version that we test against. This will make it easier to maintain postgres images over time, as well as running _all_ tests locally, where we change libpq in sync with the postgres server version.	2022-07-26 01:41:34 +03:00
Naisila Puka	7d6410c838	Drop postgres 12 support (#6040 ) * Remove if conditions with PG_VERSION_NUM < 13 * Remove server_above_twelve(&eleven) checks from tests * Fix tests * Remove pg12 and pg11 alternative test output files * Remove pg12 specific normalization rules * Some more if conditions in the code * Change RemoteCollationIdExpression and some pg12/pg13 comments * Remove some more normalization rules	2022-07-20 17:49:36 +03:00

153 Commits (216ce4efbce6d5f601d44f1eac0d3a4177e022ca)