citus

Commit Graph

Author	SHA1	Message	Date
naisila	9bc95faa32	Adds JSON_TABLE() support PG17 has added basic JSON_TABLE() functionality JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. We treat JSON_TABLE the same as correlated functions (e.g., recurring tuples). In the end, for multi-shard JSON_TABLE commands, we apply the same restrictions as reference tables (e.g., cannot perform a lateral outer join when a distributed subquery references a (reference table)/JSON_TABLE etc.) Relevant PG commit: https://github.com/postgres/postgres/commit/de3600452 Onder had previously added json table support for PG15BETA1, but we reverted that commit because json table was reverted in PG15. `ce7f1a530f` Therefore, I referred to that commit for this commit as well, with a few changes due to some differences between PG15/PG17: 1) In PG15Beta1, we had also PLAN clauses for JSON_TABLE, and Onder's commit includes tests for those as well. However, PLAN nodes are not added in PG17. Therefore I didn't include the json_table_select_only test, which had mostly queries involving PLAN. 2) In PG15 timeline (Citus 11.1), we didn't support outer joins where the outer rel is a recurring one and the inner one is a non-recurring one. However, Onur added support for that one in Citus 11.2, therefore I updated the tests from Onder's commit accordingly. 3) PG17 json table has nested paths and columns, therefore I added a test with a distributed table, which is exactly the same as the one in sqljson_jsontable in PG17. https://github.com/postgres/postgres/commit/bb766cde6	2024-12-30 15:10:24 +03:00
Naisila Puka	8a8b2f9a6d	Add tests for inserting with AT LOCAL operator (#7815 ) PG17 has added support for AT LOCAL operator it converts the given time type to time stamp with the session's TimeZone value as time zone. Here we add tests that validate that we can use AT LOCAL at INSERT commands Relevant PG commit: https://github.com/postgres/postgres/commit/97957fdba With the tests, we verify that we evaluate AT LOCAL at the coordinator and then perform the insert remotely.	2024-12-30 12:54:21 +03:00
Mehmet YILMAZ	1a3316281c	Error out for ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION (#7814 ) PG17 added support for ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION. Relevant PG commit: https://github.com/postgres/postgres/commit/5d06e99a3 We currently don't support propagating this command for Citus tables. It is added to future work. This PR disallows `ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION` on all Citus table types (local, distributed, and partitioned distributed) by adding an error check in `ErrorIfUnsupportedAlterTableStmt`. A new regression test verifies that each table type fails with a consistent error message when attempting to set an expression.	2024-12-27 16:02:12 +03:00
Mehmet YILMAZ	62a00b1b34	Error out for ALTER TABLE ... SET ACCESS METHOD DEFAULT (#7803 ) PG17 introduced ALTER TABLE ... SET ACCESS METHOD DEFAULT This PR introduces and enforces an error check preventing ALTER TABLE ... SET ACCESS METHOD DEFAULT on both Citus local tables (added via citus_add_local_table_to_metadata) and distributed/partitioned distributed tables. The regression tests now demonstrate that each table type raises an error advising users to explicitly specify an access method, rather than relying on DEFAULT. This ensures consistent behavior across local and distributed environments in Citus. The reason why we currently don't support this is that we can't simply propagate the command as it is, because the default table access method may be different across Citus cluster nodes. Relevant PG commit: https://github.com/postgres/postgres/commit/d61a6cad6	2024-12-27 15:07:38 +03:00
Naisila Puka	b4cc7219ae	Add tests for FORCE_NULL * and FORCE_NOT_NULL * options for COPY FROM (#7812 ) These options already existed in PG17, and we support them and have tests for them in `multi_copy.sql`. In PG17, their capability was extended to specify ALL columns at once using . Citus performs the COPY correctly, as is validated by the added tests in this PR. Relevant PG commit: https://github.com/postgres/postgres/commit/f6d4c9cf1 Copy-pasting from Postgres documentation what these options do, such that the reviewer may better understand the tests added: `FORCE_NOT_NULL`: Do not match the specified columns' values against the null string. In the default case where the null string is empty, this means that empty values will be read as zero-length strings rather than nulls, even when they are not quoted. If is specified, the option will be applied to all columns. This option is allowed only in `COPY FROM`, and only when using `CSV` format. `FORCE_NULL`: Match the specified columns' values against the null string, even if it has been quoted, and if a match is found set the value to `NULL`. In the default case where the null string is empty, this converts a quoted empty string into `NULL`. If * is specified, the option will be applied to all columns. This option is allowed only in `COPY FROM`, and only when using `CSV` format. `FORCE_NULL` and `FORCE_NOT_NULL` can be used simultaneously on the same column. This results in converting quoted null strings to null values and unquoted null strings to empty strings. Explain it to me like I'm a 5-year-old, for a text column: `FORCE_NULL` looks for empty strings and registers them as `NULL` `FORCE_NOT_NULL` looks for null values and registers them as empty strings.	2024-12-26 16:52:42 +03:00
Naisila Puka	d682bf06f7	Error for COPY FROM ... on_error, log_verbosity with Citus tables (#7811 ) PG17 added the new ON_ERROR option for COPY FROM. When this option is specified, COPY skips soft errors and continues copying. Relevant PG commits: -- https://github.com/postgres/postgres/commit/9e2d87011 -- https://github.com/postgres/postgres/commit/b725b7eec I tried it locally with Citus tables. Without further implementation, it doesn't work correctly. Therefore, we error out for now, and add it to future work. PG17 also added log_verbosity option, which controls the amount of messages emitted during processing. This is currently used in COPY FROM when ON_ERROR option is set to ignore. Therefore, we error out for this option as well. Relevant PG17 commit: https://github.com/postgres/postgres/commit/f5a227895	2024-12-26 15:29:44 +03:00
Naisila Puka	af88c37b56	PG17: ALTER INDEX ALTER COLUMN SET STATISTICS DEFAULT (#7808 ) DESCRIPTION: Propagates ALTER INDEX ALTER COLUMN SET STATISTICS DEFAULT We automatically support this. Adding tests only. We currently don't support ALTER TABLE ALTER COLUMN SET STATISTICS Relevant PG commit: https://github.com/postgres/postgres/commit/4f622503d	2024-12-25 16:58:51 +03:00
Naisila Puka	743f0ebb75	Bump Citus version into 13.0.0 (#7792 ) We are using `release-13.0` branch for both development and release, to deliver PG17 support in Citus. Afterwards, we will (probably) merge this branch into main. Some potential changes for main branch, after we are done working on release-13.0: - Merge changes from `release-13.0` to `main` - Figure out what changes were there on 12.2, move them to 13.1 version. In a nutshell: rename `12.1--12.2` to `13.0--13.1` and fix issues. - Set version to 13.1devel	2024-12-24 11:40:59 +03:00
Mehmet YILMAZ	351b1ca63c	PG17 compatibility: Fix Test Failure in multi_alter_table_add_const (#7733 ) In earlier versions of PostgreSQL, exclusion constraints were not allowed on partitioned tables. This is why the error in your regression test (ERROR: exclusion constraints are not supported on partitioned tables) was raised in PostgreSQL 16. In PostgreSQL 17, exclusion constraints are now allowed on partitioned tables, which is why the error no longer appears when you attempt to add an exclusion constraint. The constraint exclusion mechanism, described in the documentation, relies on CHECK constraints to decide which partitions or child tables need to be queried. [CHECK constraints](https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-CONSTRAINT-EXCLUSION) ```diff -- Check "ADD EXCLUDE" errors out for partitioned table since the postgres does not allow it ALTER TABLE AT_AddConstNoName.citus_local_partitioned_table ADD EXCLUDE(partition_col WITH =); -ERROR: exclusion constraints are not supported on partitioned tables -- Check "ADD CHECK" SET client_min_messages TO DEBUG1; ALTER TABLE AT_AddConstNoName.citus_local_partitioned_table ADD CHECK (dist_col > 0); DEBUG: the constraint name on the shards of the partition is too long, switching to sequential and local execution mode to prevent self deadlocks: longlonglonglonglonglonglonglonglonglonglonglo_537570f5_5_check DEBUG: verifying table "longlonglonglonglonglonglonglonglonglonglonglonglonglonglongabc" DEBUG: verifying table "p1" RESET client_min_messages; SELECT con.conname FROM pg_catalog.pg_constraint con INNER JOIN pg_catalog.pg_class rel ON rel.oid = con.conrelid INNER JOIN pg_catalog.pg_namespace nsp ON nsp.oid = connamespace WHERE rel.relname = 'citus_local_partitioned_table'; conname -------------------------------------------------- + citus_local_partitioned_table_partition_col_excl citus_local_partitioned_table_check -(1 row) +(2 rows) ```	2024-12-23 15:24:46 +03:00
Naisila Puka	f123632f85	Fix pg17 test (#7797 ) Broken from this commit `e3db375149` https://github.com/citusdata/citus/actions/runs/12429202397/attempts/1#summary-34702334056	2024-12-20 20:13:48 +03:00
Mehmet YILMAZ	e3db375149	PG17 compatibility: Fix Test Failure in local_table_join (#7732 ) PostgreSQL 17 seems to have introduced improvements in how correlated subqueries are handled during plan generation. Instead of generating a trivial subplan with WHERE true, it now applies more specific filtering (WHERE (key = 5)), which makes the execution plan more efficient. https://github.com/postgres/postgres/commit/b262ad44 ``` diff -dU10 -w /__w/citus/citus/src/test/regress/expected/local_table_join.out /__w/citus/citus/src/test/regress/results/local_table_join.out --- /__w/citus/citus/src/test/regress/expected/local_table_join.out.modified 2024-11-05 09:53:50.423970699 +0000 +++ /__w/citus/citus/src/test/regress/results/local_table_join.out.modified 2024-11-05 09:53:50.463971296 +0000 @@ -1420,32 +1420,32 @@ ) as subq_1 ) as subq_2; DEBUG: Wrapping relation "custom_pg_type" to a subquery DEBUG: generating subplan 204_1 for subquery SELECT typdefault FROM local_table_join.custom_pg_type WHERE true ERROR: direct joins between distributed and local tables are not supported HINT: Use CTE's or subqueries to select from local tables and use them in joins -- correlated sublinks are not yet supported because of #4470, unless we convert not-correlated table SELECT COUNT(*) FROM distributed_table d1 JOIN postgres_table using(key) WHERE d1.key IN (SELECT key FROM distributed_table WHERE d1.key = key and key = 5); DEBUG: Wrapping relation "postgres_table" to a subquery -DEBUG: generating subplan XXX_1 for subquery SELECT key FROM local_table_join.postgres_table WHERE true +DEBUG: generating subplan 206_1 for subquery SELECT key FROM local_table_join.postgres_table WHERE (key OPERATOR(pg_catalog.=) 5) ``` Co-authored-by: Naisila Puka <37271756+naisila@users.noreply.github.com>	2024-12-19 22:21:51 +03:00
Mehmet YILMAZ	acd7b1e690	PG17 compatibility: Fix Test Failure in local_dist_join_mixed (#7731 ) PostgreSQL 16 adds an extra condition (id IS NOT NULL) to the subquery. This condition is likely used to ensure that no null values are processed in the subquery. Instead of using the condition id IS NOT NULL, PostgreSQL 17 generates the subplan with a trivial condition (WHERE true), indicating that it does not need to explicitly check for non-null values. PostgreSQL 17 likely includes optimizations to handle null checks more efficiently. The WHERE (id IS NOT NULL) condition that was present in PostgreSQL 16 may now be considered redundant by the planner, as it is implicitly handled by the query execution engine. https://github.com/postgres/postgres/commit/b262ad44 ```diff SELECT foo1.id FROM (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo9, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo8, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo7, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo6, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo5, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo4, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo3, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo2, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo10, (SELECT local.id, local.title FROM local, distributed WHERE local.id = distributed.id ) as foo1 WHERE foo1.id = foo9.id AND foo1.id = foo8.id AND foo1.id = foo7.id AND foo1.id = foo6.id AND foo1.id = foo5.id AND foo1.id = foo4.id AND foo1.id = foo3.id AND foo1.id = foo2.id AND foo1.id = foo10.id AND foo1.id = foo1.id ORDER BY 1; ... -DEBUG: generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE (id IS NOT NULL) +DEBUG: generating subplan XXX_10 for subquery SELECT id FROM local_dist_join_mixed.local WHERE true ... ```	2024-12-19 22:21:23 +03:00
Teja Mupparti	a0cd8bd37b	PG17 Compatibility: Support MERGE features in Citus with clean exceptions (#7781 ) - Adapted `pgmerge.sql` tests from PostgreSQL community's `merge.sql` to Citus by converting tables into Citus local tables. - Identified two new PostgreSQL 17 MERGE features (`RETURNING` support and MERGE on updatable views) not yet supported by Citus. - Implemented changes to detect unsupported features and raise clean exceptions, ensuring pgmerge tests pass without diffs. - Addressed breaking changes caused by `MERGE ... WHEN NOT MATCHED BY SOURCE` restructuring, reducing diffs in pgmerge tests. - Segregated unsupported test cases into `merge_unsupported.sql` to maintain clarity and avoid large diffs in test files. - Prepared the Citus MERGE planner to handle new PostgreSQL changes, reducing remaining test discrepancies. All merge tests now pass cleanly, with unsupported cases clearly isolated. Relevant PG commits: c649fa24a https://github.com/postgres/postgres/commit/c649fa24a 0294df2f1 https://github.com/postgres/postgres/commit/0294df2f1 --------- Co-authored-by: naisila <nicypp@gmail.com>	2024-12-19 14:02:24 +03:00
Colm	e91ee245ac	PG17 compatibility: account for identity columns in partitioned tables. (#7785 ) PG17 added support for identity columns in partitioned tables: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=699586315 A consequence is that a table with an identity column cannot be attached as a partition. But Citus on Postgres 17 will generate identity column for the partitions if the parent table has one (or more) identity columns when propagating distributed table DDL to worker nodes, as happens in the `generated_identity` regress test in #7768: ``` CREATE TABLE partitioned_table ( a bigint CONSTRAINT myconname GENERATED BY DEFAULT AS IDENTITY (START WITH 10 INCREMENT BY 10), b bigint GENERATED ALWAYS AS IDENTITY (START WITH 10 INCREMENT BY 10), c int ) PARTITION BY RANGE (c); CREATE TABLE partitioned_table_1_50 PARTITION OF partitioned_table FOR VALUES FROM (1) TO (50); CREATE TABLE partitioned_table_50_500 PARTITION OF partitioned_table FOR VALUES FROM (50) TO (1000); SELECT create_distributed_table('partitioned_table', 'a'); - create_distributed_table ---------------------------------------------------------------------- - -(1 row) - +ERROR: table "partitioned_table_1_50" being attached contains an identity column "a" +DETAIL: The new partition may not contain an identity column. ``` It is the Citus-generated ATTACH PARTITION statement that errors out, because the Citus-generated CREATE TABLE for the partitions included identity column definitions. The fix is straightforward - when propagating the CREATE TABLE ddl for a partition of a table with an identity column, don't include the identity column(s), they will be inherited on attaching the partition. In Citus on Postgres 16 (or less) partitions do not inherit identity; the partitions in the example would not have any identity columns so it was not an issue previously.	2024-12-18 13:18:53 +00:00
Colm	89e0b53644	PG17 compatibility: fix plan diffs in multi_explain (#7780 ) Regress test `multi_explain` has two queries that have a different query plan with PG17. Here is part of the plan diff for the query labelled _Union and left join subquery pushdown_ in `multi_explain.sql` (for the complete diff, search for `multi_explain` [here](https://github.com/citusdata/citus/actions/runs/12158205599/attempts/1)): ``` -> Sort Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), subquery_2.hasdone, events.event_time - -> Hash Left Join - Hash Cond: (users.composite_id = subquery_2.composite_id) - -> HashAggregate - Group Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time + -> Nested Loop Left Join + Join Filter: (users.composite_id = subquery_2.composite_id) + -> Unique + -> Sort + Sort Key: ((users.composite_id).tenant_id), ((users.composite_id).user_id), users.composite_id, ('action=>1'::text), events.event_time -> Append ``` The change is the same in both queries; a hash left join with subquery_1 on the outer and subquery_2 on the inner side of the join is now a nested loop left join with subquery_1 on the outer and subquery_2 on the inner; additionally, the chosen method of uniquifying the UNION in subquery_1 has changed from hashed grouping to sort followed by unique, as shown in the diff above. The PG17 commit that caused this plan change is likely _[Fix MergeAppend to more accurately compute the number of rows that need to be sorted](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9d1a5354f)_ because it impacts the estimated rows counts of UNION paths. Comparing a costed plan of the query between PG16 and PG17 I noticed that with PG16 the rows estimate for the UNION in subquery_1 is 4, whereas with PG17 the rows estimate is 2. A lower rows estimate in the outer side of the join may result in nested loop looking cheaper than hash join for the left outer join, hence the plan change in the two queries where there is a UNION on the outer side of a left outer join. The proposed fix achieves a consistent plan across all supported postgres versions by temporarily disabling nested loop join and sort for the two impacted queries; the postgres optimizer selects hash join for the outer left join and hashed aggregation for the UNION operation. I investigated tweaking the queries, but was not able to arrive at a consistent plan, and I believe the SQL operator (e.g. join, group by, union) implementations are orthogonal to the intent of the test, so this should be a satisfactory solution, particularly as it avoids introducing a second alternative output file for `multi_explain`.	2024-12-17 21:42:15 +00:00
Colm	1c2e78405b	PG17 compatibility: account for MAINTAIN privilege in regress tests (#7774 ) This PR addresses regress tests impacted by the introduction of [the MAINTAIN privilege in PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ecb0fd337). The impacted tests include `generated_identity`, `create_single_shard_table`, `grant_on_sequence_propagation`, `grant_on_foreign_server_propagation`, `single_node_enterprise`, `multi_multiuser_master_protocol`, `multi_alter_table_row_level_security`, `shard_move_constraints` which show the following error: ``` SELECT start_metadata_sync_to_node('localhost', :worker_2_port); - start_metadata_sync_to_node ---------------------------------------------------------------------- - -(1 row) - +ERROR: unrecognized aclright: 16384 ``` and `multi_multiuser_master_protocol`, where the `pg_class.relacl` column has 'm' for MAINTAIN if applicable: ``` relname \| rolname \| relacl ---------------------+-------------+------------------------------------------------------------ trivial_full_access \| full_access \| - trivial_postgres \| postgres \| {postgres=arwdDxt/postgres,full_access=arwdDxt/postgres} + trivial_postgres \| postgres \| {postgres=arwdDxtm/postgres,full_access=arwdDxtm/postgres} ``` The PR updates function `convert_aclright_to_string()` in citus_ruleutils.c to include a case for `ACL_MAINTAIN`. Per the comment on `convert_aclright_to_string()` in citus_ruleutils.c, it is a copy of `convert_aclright_to_string()` in Postgres (where it is in `src/backend/utils/adt/acl.c`), so requires updating to be consistent with Postgres. With this change Citus can recognize the MAINTAIN privilege, and will not emit the `unrecognized aclright` error. The PR also adds an alternative goldfile for `multi_multiuser_master_protocol`. Note that `convert_aclright_to_string()` in Postgres includes access types SET and ALTER SYSTEM on system parameters (aka GUCs), added by [this PG16 commit](https://github.com/postgres/postgres/commit/a0ffa885e). If Citus were to have a requirement to support granting SET and ALTER SYSTEM we would need to update `convert_aclright_to_string()` in citus_ruleutils.c with SET and ALTER SYSTEM.	2024-12-06 13:03:51 +00:00
Colm	7c9280f6e3	PG17 compatibility: fix multi-1 diffs caused by PG17 optimizer enhancements (#7769 ) This fix ensures that the expected DEBUG error messages from the router planner in `multi_router_planner`, `multi_router_planner_fast_path` and `query_single_shard_table` are present with PG17. In `query_single_shard_table` the diff: ``` SELECT COUNT() FROM citus_local_table t1 WHERE t1.b IN ( SELECT b+1 FROM nullkey_c1_t1 t2 WHERE t2.b = t1.a ); -DEBUG: router planner does not support queries that reference non-colocated distributed tables +DEBUG: Local tables cannot be used in distributed queries. ``` occurred because of[ this PG17 commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639) which enables the optimizer to pull up a correlated ANY subquery to a join. The fix inhibits subquery pull up by including a volatile function in the predicate involving the ANY subquery, preserving the pre-PG17 optimizer treatment of the query. In the case of `multi_router_planner` and `multi_router_planner_fast_path` the diffs: ``` -- partition_column is null clause does not prune out any shards, -- all shards remain after shard pruning, not router plannable SELECT FROM articles_hash a WHERE a.author_id is null; -DEBUG: Router planner cannot handle multi-shard select queries +DEBUG: Creating router plan ``` are because of [this PG17 commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b262ad440), which enables the optimizer to detect and remove redundant IS (NOT) NULL expressions. The fix is to adjust the table definition so the column used for distribution is not marked NOT NULL, thus preserving the pre-PG17 query planning behavior. Finallly, a rule is added to `normalize.sed` to ignore DEBUG logging in CREATE MATERIALIZED VIEW AS statements introduced by [this PG17 commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b4da732fd64); _when creating materialized views, use REFRESH logic to load data_, a consequence of which is that with `client_min_messages` at `DEBUG2` Postgres emits extra detail for CREATE MATERIALIZED VIEW AS statements. ``` CREATE MATERIALIZED VIEW mv_articles_hash_empty AS SELECT * FROM articles_hash WHERE author_id = 1; DEBUG: Creating router plan DEBUG: query has a single distribution column value: 1 +DEBUG: drop auto-cascades to type multi_router_planner.pg_temp_61391 +DEBUG: drop auto-cascades to type multi_router_planner.pg_temp_61391[] ``` The rule can be changed to a normalization, or possibly dropped, when 17 becomes the minimum supported version.	2024-12-06 11:55:12 +00:00
Naisila Puka	90d76e8097	Fix bug: alter database shouldn't propagate for undistributed database (#7777 ) Before this fix, the following would happen for a database that is not distributed: ```sql CREATE DATABASE db_to_test; NOTICE: Citus partially supports CREATE DATABASE for distributed databases DETAIL: Citus does not propagate CREATE DATABASE command to workers HINT: You can manually create a database and its extensions on workers. ALTER DATABASE db_to_test with CONNECTION LIMIT 100; NOTICE: issuing ALTER DATABASE db_to_test WITH CONNECTION LIMIT 100; DETAIL: on server postgres@localhost:57638 connectionId: 2 NOTICE: issuing ALTER DATABASE db_to_test WITH CONNECTION LIMIT 100; DETAIL: on server postgres@localhost:57637 connectionId: 1 ERROR: database "db_to_test" does not exist ``` With this fix, we also take care of https://github.com/citusdata/citus/issues/7763 Fixes #7763	2024-12-05 13:47:51 +03:00
Colm	063be46444	PG17 compatibility: Fix check-style, broken by PG17 columnar test fix… (#7776 ) … (`698699d89e`) --------- Co-authored-by: naisila <nicypp@gmail.com>	2024-12-04 15:29:39 +00:00
Colm	698699d89e	PG17 compatibility (#7653 ): Fix test diffs in columnar schedule (#7768 ) This PR fixes diffs in `columnnar_chunk_filtering` and `columnar_paths` tests. In `columnnar_chunk_filtering` an expression `(NOT (SubPlan 1))` changed to `(NOT (ANY (a = (SubPlan 1).col1)))`. This is due to [aPG17 commit](https://github.com/postgres/postgres/commit/fd0398fc) that improved how scalar subqueries (InitPlans) and ANY subqueries (SubPlans) are EXPLAINed in expressions. The fix uses a helper function which converts the PG17 format to the pre-PG17 format. It is done this way because pre-PG17 EXPLAIN does not provide enough context to convert to the PG17 format. The helper function can (and should) be retired when 17 becomes the minimum supported PG. In `columnar_paths`, a merge join changed to a hash join. This is due to [this PG17 commit](`f7816aec23`), which improved the PG optimizer's ability to estimate the size of a CTE scan. The impacted query involves a CTE scan with a point predicate `(a=123)` and before the change the CTE size was estimated to be 5000, but with the change it is correctly (given the data in the table) estimated to be 1, making hash join a more attractive join method. The fix is to have an alternative goldfile for pre-PG17. I tried, but was unable, to force a specific kind of join method using the GUCs (`enable_nestloop`, `enable_hashjoin`, `enable_mergejoin`), but it was not possible to obtain a consistent plan across all supported PG versions (in some cases the join inputs switched sides).	2024-12-03 09:14:47 +00:00
Naisila Puka	26c09f340f	PG17 compatibility: fix some tests outputs (#7765 ) There are two commits in this PR: 1) Remove domain_default column since it has been removed from PG17 Relevant PG commit: `78806a9509` 78806a95095c4fb9230a441925244690d9c07d23 2) pg_stat_statements reset output diff fix pg_stat_statements reset output changed in PG17, fix idea from Relevant PG commits: `6ab1dbd26b` 6ab1dbd26bbf307055d805feaaca16dc3e750d36	2024-12-02 18:03:38 +03:00
Colm	089555c197	PG17 compatibility: fix diff in tableam (#7771 ) Test `tableam` expects that this CREATE TABLE statement: `CREATE TABLE test_partitioned(id int, p int, val int) PARTITION BY RANGE (p) USING fake_am;` will produce this error: `specifying a table access method is not supported on a partitioned table` but as of [this PG commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=374c7a229) it is possible to specify an access method on a partitioned table. This fix moves the CREATE TABLE statement to pg17, and adds an additional test to show parent access method is inherited.	2024-12-02 13:47:19 +00:00
Colm	1d0111aed6	PG17 regress test sanity: fix diffs in union_pushdown. (#7762 ) Preserve the test error message by adjusting the query so that PG17 cannot pull it up to a join. Another instance of a subquery that can be pulled up to a join with PG17 (#7745) This should have been fixed in, but slipped by, #7745	2024-11-22 16:25:22 +00:00
Naisila Puka	12dd9c1f6b	PG17 compatibility: Adjust print_extension_changes function for extra type outputs in PG17 (#7761 ) In PG17, Auto-generated array types, multirange types, and relation rowtypes are treated as dependent objects, hence changing the output of the print_extension_changes function. Relevant PG commit: e5bc9454e527b1cba97553531d8d4992892fdeef `e5bc9454e5` Here we create a table with only the basic extension types in order to avoid printing extra ones for now. This can be removed when we drop PG16 support. https://github.com/citusdata/citus/actions/runs/11960253650/attempts/1#summary-33343972656 ```diff \| table pg_dist_rebalance_strategy + \| type citus.distribution_type[] + \| type citus.pg_dist_object + \| type pg_dist_shard + \| type pg_dist_shard[] + \| type pg_dist_shard_placement + \| type pg_dist_shard_placement[] + \| type pg_dist_transaction + \| type pg_dist_transaction[] \| view citus_dist_stat_activity \| view pg_dist_shard_placement ```	2024-11-22 16:10:02 +03:00
Naisila Puka	65dcc5904d	PG17 compatibility: fix backend type orders in test (#7760 ) This work was already done by @m3hm3t and approved as part of https://github.com/citusdata/citus/pull/7722 I separated it in this PR since the previous one contained other changes which we don't currently want to merge. Relevant PG commit: --------- Co-authored-by: Mehmet YILMAZ <mehmety87@gmail.com>	2024-11-22 01:08:15 +03:00
Colm	7e701befde	PG17 compatibility: add helper function for EXPLAIN diffs in scalar subquery output (#7757 ) PG17 changed how scalar subquery outputs appear in EXPLAIN output (). This commit changes impacted regress goldfiles to the PG17 format, and adds a helper function to covert pre-PG17 plans to the PG17 format. The conversion is required when testing Citus on pgversions prior to 17. The helper function can and should be removed when 17 becomes the minimum supported version. () https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=fd0398fcb	2024-11-21 22:22:30 +03:00
Colm	680c23ffcf	PG17 compatibility: add/fix tests with correlated subqueries that can be pulled to a join (#7745 ) Fix Test Failure in subquery_in_where, set_operations, dml_recursive in PG17 #7741 The test failures are caused by[ this commit in PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639), which enables correlated subqueries to be pulled up to a join. Prior to this, the correlated subquery was implemented as a subplan. In citus, it is not possible to pushdown a correlated subplan, but with a different plan in PG17 the query can be executed, per the test diff from `subquery_in_where`: ``` 37,39c37,41 < DEBUG: generating subplan XXX_1 for CTE event_id: SELECT user_id AS events_user_id, "time" AS events_time, event_type FROM public.events_table < DEBUG: Plan XXX query after replacing subqueries and CTEs: SELECT count() AS count FROM ... < ERROR: correlated subqueries are not supported when the FROM clause contains a CTE or subquery --- > count > --------------------------------------------------------------------- > 0 > (1 row) > ``` This is because with pg17 `= ANY subquery` in the queries can be implemented as a join, instead of as a subplan filter on a table scan. For example, `SELECT FROM test a WHERE x IN (SELECT x FROM test b UNION SELECT y FROM test c WHERE a.x = c.x) ORDER BY 1,2` (from set_operations) has this plan in pg17; note that the subquery is the inner side of a nested loop join: ``` ┌───────────────────────────────────────────────────┐ │ QUERY PLAN │ ├───────────────────────────────────────────────────┤ │ Sort │ │ Sort Key: a.x, a.y │ │ -> Nested Loop │ │ -> Seq Scan on test a │ │ -> Subquery Scan on "ANY_subquery" │ │ Filter: (a.x = "ANY_subquery".x) │ │ -> HashAggregate │ │ Group Key: b.x │ │ -> Append │ │ -> Seq Scan on test b │ │ -> Seq Scan on test c │ │ Filter: (a.x = x) │ └───────────────────────────────────────────────────┘ ``` and this plan in pg16 (and previous pg versions); the subquery is a correlated subplan filter on a table scan: ``` ┌───────────────────────────────────────────────┐ │ QUERY PLAN │ ├───────────────────────────────────────────────┤ │ Sort │ │ Sort Key: a.x, a.y │ │ -> Seq Scan on test a │ │ Filter: (SubPlan 1) │ │ SubPlan 1 │ │ -> HashAggregate │ │ Group Key: b.x │ │ -> Append │ │ -> Seq Scan on test b │ │ -> Seq Scan on test c │ │ Filter: (a.x = x) │ └───────────────────────────────────────────────┘ ``` The fix Modifies the queries causing the test failures so that an ANY subquery is not folded to a join, preserving the expected output of the tests. A similar approach was taken for existing regress tests in the[ postgres commit](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=9f1337639). See the `join `regress test, for example. We also add pg17 specific tests that leverage this improvement in Postgres with Citus distributed planning as well.	2024-11-20 14:51:16 +03:00
Colm	0fed87ada9	PG17 compatibility: Preserve DEBUG output in cte_inline (#7755 ) Regression test cte_inline has the following diff; ``` DEBUG: CTE cte_1 is going to be inlined via distributed planning DEBUG: CTE cte_1 is going to be inlined via distributed planning DEBUG: Creating router plan -DEBUG: query has a single distribution column value: 1 ``` DEBUG message `query has a single distribution column value` does not appear with PG17. This is because PG17 can recognize when a Result node does not need to have an input node, so the predicate on the distribution column is not present in the query plan. Comparing the query plan obtained before PG17: ``` │ Result │ │ One-Time Filter: false │ │ -> GroupAggregate │ │ -> Seq Scan on public.test_table │ │ Filter: (test_table.key = 1) │ ``` with the PG17 query plan: ``` ┌──────────────────────────────────┐ │ QUERY PLAN │ ├──────────────────────────────────┤ │ Result │ │ One-Time Filter: false │ └──────────────────────────────────┘ ``` we see that the Result node in the PG16 plan has an Aggregate node, but the Result node in the PG17 plan does not have any input node; PG17 recognizes it is not needed given a Filter that evaluates to False at compile-time. The Result node is present in both plans because PG in both versions can recognize when a combination of predicates equate to false at compile time; this is the because the successive predicates in the test query (key=6, key=5, key=4, etc) become contradictory when the CTEs are inlined. Here is an example query showing the effect of the CTE inlining: ``` select count(*), key FROM test_table WHERE key = 1 AND key = 2 GROUP BY key; ``` In this case, the WHERE clause obviously evaluates to False. The PG16 query plan for this query is: ``` ┌────────────────────────────────────┐ │ QUERY PLAN │ ├────────────────────────────────────┤ │ GroupAggregate │ │ -> Result │ │ One-Time Filter: false │ │ -> Seq Scan on test_table │ │ Filter: (key = 1) │ └────────────────────────────────────┘ ``` The PG17 query plan is: ``` ┌────────────────────────────────┐ │ QUERY PLAN │ ├────────────────────────────────┤ │ GroupAggregate │ │ -> Result │ │ One-Time Filter: false │ └────────────────────────────────┘ ``` In both plans the PG optimizer is able to derive the predicate 1=2 from the equivalence class { key, 1, 2 } and then constant fold this to False. But, in the PG16 plan the Result node has an input node (a sequential scan on test_table), while in the PG17 plan the Result node does not have any input. This is because PG17 recognizes that when the Result filter resolves to False at compile time it is not necessary to set an input on the Result. I think this is a consequence of this PG17 commit: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b262ad440 which handles redundant IS [NOT] NULL predicates, but also refactored evaluating of predicates to true/false at compile-time, enabling optimizations such as those seen here. Given the reason for the diff, the fix preserves the test output by modifying the query so the predicates are not contradictory when the CTEs are inlined.	2024-11-20 00:14:57 +03:00
Naisila Puka	ed137001a5	PG17 compatibility: add COLLPROVIDER_BUILTIN option and fix tests (#7752 ) In PG17 adds builtin C.UTF-8 locale option, we add it in the code to avoid "unknown collation provider" in vanilla tests. Relevant PG commit: `f69319f2f1` f69319f2f1fb16eda4b535bcccec90dff3a6795e Also in PG17, colliculocale, daticulocale renamed to colllocale, datlocale Here we fix the following tests to avoid alternative output pg15 pg16 multi_mx_create_table multi_schema_support Relevant PG commit: `f696c0cd5f` f696c0cd5f299f1b51e214efc55a22a782cc175d	2024-11-19 12:26:45 +03:00
Mehmet YILMAZ	32a2a31b13	PG17 compatibility: Fix -1/Null diff in attstattarget test output (#7749 ) Changed `attstattarget` in `pg_attribute` to use `NullableDatum`, allowing null representation for default statistics target in PostgreSQL 17. Relevant PG commit: 6a004f1be87d34cfe51acf2fe2552d2b08a79273 `6a004f1be8` ```diff -- verify statistics is set SELECT c.relname, a.attstattarget FROM pg_attribute a JOIN pg_class c ON a.attrelid = c.oid AND c.relname LIKE 'test\_idx%' ORDER BY c.relname, a.attnum; relname \| attstattarget -----------+--------------- test_idx \| 4646 - test_idx2 \| -1 + test_idx2 \| test_idx2 \| 10000 test_idx2 \| 3737 (4 rows) ```	2024-11-17 23:43:39 +03:00
Mehmet YILMAZ	8c0feee74d	PG17 compatibility: Fix -1/Null diff in stxstattarget test output (#7748 ) Changed stxstattarget in pg_statistic_ext to use nullable representation, removing explicit -1 for default statistics target in PostgreSQL 17. Relevant PG commit: 012460ee93c304fbc7220e5b55d9d0577fc766ab `012460ee93` ```diff SELECT stxstattarget, stxrelid::regclass FROM pg_statistic_ext WHERE stxnamespace IN ( SELECT oid FROM pg_namespace WHERE nspname IN ('statistics''TestTarget') ) AND stxname SIMILAR TO '%\_\d+' ORDER BY stxstattarget, stxrelid::regclass ASC; stxstattarget \| stxrelid ---------------+----------------------------------- - -1 \| "statistics'TestTarget".t1_980000 - -1 \| "statistics'TestTarget".t1_980002 ... + \| "statistics'TestTarget".t1_980000 + \| "statistics'TestTarget".t1_980002 ... ```	2024-11-17 22:41:53 +03:00
Parag Jain	6349f2d52d	Support MERGE command for single_shard_distributed Target (#7643 ) This PR has following changes : 1. Enable MERGE command for single_shard_distributed targets. (cherry picked from commit `3c467e6e02`)	2024-07-17 15:11:38 +03:00
Onur Tirtir	d9635609f4	Fix flaky multi_mx_node_metadata.sql test (#7317 ) Fixes the flaky test that results in following diff: ```diff --- /__w/citus/citus/src/test/regress/expected/multi_mx_node_metadata.out.modified 2023-11-01 14:22:12.890476575 +0000 +++ /__w/citus/citus/src/test/regress/results/multi_mx_node_metadata.out.modified 2023-11-01 14:22:12.914476657 +0000 @@ -840,24 +840,26 @@ (1 row) \c :datname - - :master_port SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%'; datname ------------ db_to_drop (1 row) DROP DATABASE db_to_drop; +ERROR: database "db_to_drop" is being accessed by other users SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%'; datname ------------ -(0 rows) + db_to_drop +(1 row) -- cleanup DROP SEQUENCE sequence CASCADE; NOTICE: drop cascades to default value for column a of table reference_table ``` (cherry picked from commit `9867c5b949`)	2024-06-18 16:49:39 +02:00
Jelte Fennema-Nio	4f0053ed6d	Redo #7620 : Fix merge command when insert value does not have source distributed column (#7627 ) Related to issue #7619, #7620 Merge command fails when source query is single sharded and source and target are co-located and insert is not using distribution key of source. Example ``` CREATE TABLE source (id integer); CREATE TABLE target (id integer ); -- let's distribute both table on id field SELECT create_distributed_table('source', 'id'); SELECT create_distributed_table('target', 'id'); MERGE INTO target t USING ( SELECT 1 AS somekey FROM source WHERE source.id = 1) s ON t.id = s.somekey WHEN NOT MATCHED THEN INSERT (id) VALUES (s.somekey) ERROR: MERGE INSERT must use the source table distribution column value HINT: MERGE INSERT must use the source table distribution column value ``` Author's Opinion: If join is not between source and target distributed column, we should not force user to use source distributed column while inserting value of target distributed column. Fix: If user is not using distributed key of source for insertion let's not push down query to workers and don't force user to use source distributed column if it is not part of join. This reverts commit `fa4fc0b372`. Co-authored-by: paragjain <paragjain@microsoft.com> (cherry picked from commit `aaaf637a6b`)	2024-06-18 16:49:39 +02:00
Gürkan İndibay	4e838a471a	Adds null check for node in HasRangeTableRef (#7604 ) DESCRIPTION: Adds null check for node in HasRangeTableRef to prevent errors When executing the query below, users encountered an error due to a null Node object. This PR adds a null check to handle this error. Query: ```sql select ct.conname as constraint_name, a.attname as column_name, fc.relname as foreign_table_name, fns.nspname as foreign_table_schema, fa.attname as foreign_column_name from (SELECT ct.conname, ct.conrelid, ct.confrelid, ct.conkey, ct.contype, ct.confkey, generate_subscripts(ct.conkey, 1) AS s FROM pg_constraint ct ) AS ct inner join pg_class c on c.oid=ct.conrelid inner join pg_namespace ns on c.relnamespace=ns.oid inner join pg_attribute a on a.attrelid=ct.conrelid and a.attnum = ct.conkey[ct.s] left join pg_class fc on fc.oid=ct.confrelid left join pg_namespace fns on fc.relnamespace=fns.oid left join pg_attribute fa on fa.attrelid=ct.confrelid and fa.attnum = ct.confkey[ct.s] where ct.contype='f' and c.relname='table1' and ns.nspname='schemauser' order by fns.nspname, fc.relname, a.attnum ; ``` Error: ``` #0 HasRangeTableRef (node=0x0, varno=varno@entry=0x7ffe18cc3674) at worker/worker_shard_visibility.c:507 507 if (IsA(node, RangeTblRef)) #0 HasRangeTableRef (node=0x0, varno=varno@entry=0x7ffe18cc3674) at worker/worker_shard_visibility.c:507 #1 0x0000561b0aae390e in expression_tree_walker_impl (node=0x561b0d19cc78, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2091 #2 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #3 0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x561b0d19cd68, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674) at nodeFuncs.c:2405 #4 0x0000561b0aae3945 in expression_tree_walker_impl (node=0x561b0d19d0f8, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2111 #5 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #6 0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x561b0d19cb38, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674) at nodeFuncs.c:2405 #7 0x0000561b0aae396d in expression_tree_walker_impl (node=0x561b0d19d198, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2127 #8 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #9 0x0000561b0aae3ef7 in expression_tree_walker_impl (node=0x561b0d183e88, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2464 #10 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #11 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184278, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2460 #12 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #13 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184668, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2460 #14 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #15 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184f68, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2460 #16 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #17 0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x7f2a68010148, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674) at nodeFuncs.c:2405 #18 0x00007f2a7324a0eb in FilterShardsFromPgclass (node=node@entry=0x561b0d185de8, context=context@entry=0x0) at worker/worker_shard_visibility.c:464 #19 0x00007f2a7324a5ff in HideShardsFromSomeApplications (query=query@entry=0x561b0d185de8) at worker/worker_shard_visibility.c:294 #20 0x00007f2a731ed7ac in distributed_planner (parse=0x561b0d185de8, query_string=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=<optimized out>, boundParams=0x0) at planner/distributed_planner.c:237 #21 0x00007f2a7311a52a in pgss_planner (parse=0x561b0d185de8, query_string=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=2048, boundParams=0x0) at pg_stat_statements.c:953 #22 0x0000561b0ab65465 in planner (parse=parse@entry=0x561b0d185de8, query_string=query_string@entry=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:279 #23 0x0000561b0ac53aa3 in pg_plan_query (querytree=querytree@entry=0x561b0d185de8, query_string=query_string@entry=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:904 #24 0x0000561b0ac53b71 in pg_plan_queries (querytrees=0x7f2a68012878, query_string=query_string@entry=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:996 #25 0x0000561b0ac5408e in exec_simple_query ( query_string=query_string@entry=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"...) at postgres.c:1193 #26 0x0000561b0ac56116 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4637 #27 0x0000561b0abab7a7 in BackendRun (port=port@entry=0x561b0d0caf50) at postmaster.c:4464 #28 0x0000561b0abae969 in BackendStartup (port=port@entry=0x561b0d0caf50) at postmaster.c:4192 #29 0x0000561b0abaeaa6 in ServerLoop () at postmaster.c:1782 ``` Fixes #7603	2024-05-28 08:54:40 +03:00
Jelte Fennema-Nio	bac95cc523	Greatly speed up "\d tablename" on servers with many tables (#7577 ) DESCRIPTION: Fix performance issue when using "\d tablename" on a server with many tables We introduce a filter to every query on pg_class to automatically remove shards. This is useful to make sure \d and PgAdmin are not cluttered with shards. However, the way we were introducing this filter was using `securityQuals` which can have negative impact on query performance. On clusters with 100k+ tables this could cause a simple "\d tablename" command to take multiple seconds, because a skipped optimization by Postgres causes a full table scan. This changes the code to introduce this filter in the regular `quals` list instead of in `securityQuals`. Which causes Postgres to use the intended optimization again. For reference, this was initially reported as a Postgres issue by me: https://www.postgresql.org/message-id/flat/4189982.1712785863%40sss.pgh.pa.us#b87421293b362d581ea8677e3bfea920 (cherry picked from commit `a0151aa31d`)	2024-04-17 10:26:50 +02:00
Filip Sedlák	fc09e1cfdc	Log username in the failed connection message (#7432 ) This patch includes the username in the reported error message. This makes debugging easier when certain commands open connections as other users than the user that is executing the command. ``` monitora_snapshot=# SELECT citus_move_shard_placement(102030, 'monitora.db-dev-worker-a', 6005, 'monitora.db-dev-worker-a', 6017); ERROR: connection to the remote node monitora_user@monitora.db-dev-worker-a:6017 failed with the following error: fe_sendauth: no password supplied Time: 40,198 ms ``` (cherry picked from commit `8b48d6ab02`)	2024-04-17 10:26:50 +02:00
eaydingol	db391c0bb7	Change the order in which the locks are acquired (#7542 ) This PR changes the order in which the locks are acquired (for the target and reference tables), when a modify request is initiated from a worker node that is not the "FirstWorkerNode". To prevent concurrent writes, locks are acquired on the first worker node for the replicated tables. When the update statement originates from the first worker node, it acquires the lock on the reference table(s) first, followed by the target table(s). However, if the update statement is initiated in another worker node, the lock requests are sent to the first worker in a different order. This PR unifies the modification order on the first worker node. With the third commit, independent of the node that received the request, the locks are acquired for the modified table and then the reference tables on the first node. The first commit shows a sample output for the test prior to the fix. Fixes #7477 --------- Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com> (cherry picked from commit `8afa2d0386`)	2024-04-17 10:26:50 +02:00
Onur Tirtir	452564c19b	Allow providing "host" parameter via citus.node_conninfo (#7541 ) And when that is the case, directly use it as "host" parameter for the connections between nodes and use the "hostname" provided in pg_dist_node / pg_dist_poolinfo as "hostaddr" to avoid host name lookup. This is to avoid allowing dns resolution (and / or setting up DNS names for each host in the cluster). This already works currently when using IPs in the hostname. The only use of setting host is that you can then use sslmode=verify-full and it will validate that the hostname matches the certificate provided by the node you're connecting too. It would be more flexible to make this a per-node setting, but that requires SQL changes. And we'd like to backport this change, and backporting such a sql change would be quite hard while backporting this change would be very easy. And in many setups, a different hostname for TLS validation is actually not needed. The reason for that is query-from-any node: With query-from-any-node all nodes usually have a certificate that is valid for the same "cluster hostname", either using a wildcard cert or a Subject Alternative Name (SAN). Because if you load balance across nodes you don't know which node you're connecting to, but you still want TLS validation to do it's job. So with this change you can use this same "cluster hostname" for TLS validation within the cluster. Obviously this means you don't validate that you're connecting to a particular node, just that you're connecting to one of the nodes in the cluster, but that should be fine from a security perspective (in most cases). Note to self: This change requires updating https://docs.citusdata.com/en/latest/develop/api_guc.html#citus-node-conninfo-text. DESCRIPTION: Allows overwriting host name for all inter-node connections by supporting "host" parameter in citus.node_conninfo (cherry picked from commit `3586aab17a`)	2024-04-17 10:26:50 +02:00
copetol	2ee43fd00c	Fix segfault when using certain DO block in function (#7554 ) When using a CASE WHEN expression in the body of the function that is used in the DO block, a segmentation fault occured. This fixes that. Fixes #7381 --------- Co-authored-by: Konstantin Morozov <vzbdryn@yahoo.com> (cherry picked from commit `12f56438fc`)	2024-04-17 10:26:50 +02:00
Emel Şimşek	f2d102d54b	Fix crash caused by some form of ALTER TABLE ADD COLUMN statements. (#7522 ) DESCRIPTION: Fixes a crash caused by some form of ALTER TABLE ADD COLUMN statements. When adding multiple columns, if one of the ADD COLUMN statements contains a FOREIGN constraint ommitting the referenced columns in the statement, a SEGFAULT occurs. For instance, the following statement results in a crash: ``` ALTER TABLE lt ADD COLUMN new_col1 bool, ADD COLUMN new_col2 int references rt; ``` Fixes #7520. (cherry picked from commit `fdd658acec`)	2024-04-17 10:26:50 +02:00
Teja Mupparti	a945971f48	Fix the incorrect column count after ALTER TABLE, this fixes the bug #7378 (please read the analysis in the bug for more information) (cherry picked from commit `00068e07c5`)	2024-01-24 11:48:06 -08:00
Onur Tirtir	a4fe969947	Make sure to disallow creating a replicated distributed table concurrently (#7219 ) See explanation in https://github.com/citusdata/citus/issues/7216. Fixes https://github.com/citusdata/citus/issues/7216. DESCRIPTION: Makes sure to disallow creating a replicated distributed table concurrently (cherry picked from commit `111b4c19bc`)	2023-10-24 14:04:37 +03:00
Gürkan İndibay	e5e64b7454	Adds alter database propagation - with and refresh collation (#7172 ) DESCRIPTION: Adds ALTER DATABASE WITH ... and REFRESH COLLATION VERSION support This PR adds supports for basic ALTER DATABASE statements propagation support. Below statements are supported: ALTER DATABASE <database_name> with IS_TEMPLATE <true/false>; ALTER DATABASE <database_name> with CONNECTION LIMIT <integer_value>; ALTER DATABASE <database_name> REFRESH COLLATION VERSION; --------- Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>	2023-09-12 14:09:15 +03:00
Naisila Puka	1da99f8423	PG16 - Don't propagate GRANT ROLE with INHERIT/SET option (#7190 ) We currently don't support propagating these options in Citus Relevant PG commits: https://github.com/postgres/postgres/commit/e3ce2de https://github.com/postgres/postgres/commit/3d14e17 Limitation: We also need to take care of generated GRANT statements by dependencies in attempt to distribute something else. Specifically, this part of the code in `GenerateGrantRoleStmtsOfRole`: ``` grantRoleStmt->admin_opt = membership->admin_option; ``` In PG16, membership also has `inherit_option` and `set_option` which need to properly be part of the `grantRoleStmt`. We can skip for now since #7164 will take care of this soon, and also this is not an expected use-case.	2023-09-12 12:47:37 +03:00
Naisila Puka	c1dc378504	Fix WITH ADMIN FALSE propagation (#7191 )	2023-09-11 15:58:24 +03:00
Onur Tirtir	d628a4c21a	Add citus_schema_move() function (#7180 ) Add citus_schema_move() that can be used to move tenant tables within a distributed schema to another node. The function has two variations as simple wrappers around citus_move_shard_placement() and citus_move_shard_placement_with_nodeid() respectively. They pick a shard that belongs to the given tenant schema and resolve the source node that contain the shards under given tenant schema. Hence their signatures are quite similar to underlying functions: ```sql -- citus_schema_move(), using target node name and node port CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move( schema_id regnamespace, target_node_name text, target_node_port integer, shard_transfer_mode citus.shard_transfer_mode default 'auto') RETURNS void LANGUAGE C STRICT AS 'MODULE_PATHNAME', $$citus_schema_move$$; -- citus_schema_move(), using target node id CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move( schema_id regnamespace, target_node_id integer, shard_transfer_mode citus.shard_transfer_mode default 'auto') RETURNS void LANGUAGE C STRICT AS 'MODULE_PATHNAME', $$citus_schema_move_with_nodeid$$; ```	2023-09-08 12:03:53 +03:00
Naisila Puka	8894c76ec0	PG16 - Add rules option to CREATE COLLATION (#7185 ) Relevant PG commit: https://github.com/postgres/postgres/commit/30a53b7 30a53b7	2023-09-07 13:50:47 +03:00
Naisila Puka	2df88042b3	Add tests with JSON_ARRAYAGG and JSON_OBJECTAGG aggregates (#7186 ) Relevant PG commit: `7081ac46ac` 7081ac46ace8c459966174400b53418683c9fe5c	2023-09-07 13:29:39 +03:00
Naisila Puka	7e5136f2de	Add tests with publications with schema and table of the same schema (#7184 ) Relevant PG commit: https://github.com/postgres/postgres/commit/13a185f 13a185f It was backpatched through PG15 so I added this test in publication.sql instead of pg16.sql	2023-09-06 16:40:36 +03:00

1 2 3 4 5 ...

2213 Commits (9bc95faa32621f50e9e03a44ca0303f5d1dfb38f)