citus

Commit Graph

Author	SHA1	Message	Date
Onur Tirtir	e4ce4d184e	Properly detect no-op shard-key updates via UPDATE / MERGE (#8214 ) DESCRIPTION: Fixes a bug that causes allowing UPDATE / MERGE queries that may change the distribution column value. Fixes: #8087. Probably as of #769, we were not properly checking if UPDATE may change the distribution column. In #769, we had these checks: ```c if (targetEntry->resno != column->varattno) { /* target entry of the form SET some_other_col = <x> / isColumnValueChanged = false; } else if (IsA(setExpr, Var)) { Var newValue = (Var ) setExpr; if (newValue->varattno == column->varattno) { / target entry of the form SET col = table.col / isColumnValueChanged = false; } } ``` However, what we check in "if" and in the "else if" are not so different in the sense they both attempt to verify if SET expr of the target entry points to the attno of given column. So, in Also see this PR comment from #5220: https://github.com/citusdata/citus/pull/5220#discussion_r699230597. In #769, probably we actually wanted to first check whether both SET expr of the target entry and given variable are pointing to the same range var entry, but this wasn't what the "if" was checking, so removed. As a result, in the cases that are mentioned in the linked issue, we were incorrectly concluding that the SET expr of the target entry won't change given column just because it's pointing to the same attno as given variable, regardless of what range var entries the column and the SET expr are pointing to. Then we also started using the same function to check for such cases for update action of MERGE, so we have the same bug there as well. So with this PR, we properly check for such cases by comparing varno as well in TargetEntryChangesValue(). However, then some of the existing tests started failing where the SET expr doesn't directly assign the column to itself but the "where" clause could actually imply that the distribution column won't change. Even before we were not attempting to verify if "where" cluse quals could imply a no-op assignment for the SET expr in such cases but that was not a problem. This is because, for the most cases, we were always qualifying such SET expressions as a no-op update as long as the SET expr's attno is the same as given column's. For this reason, to prevent regressions, this PR also adds some extra logic as well to understand if the "where" clause quals could imply that SET expr for the distribution key is a no-op. Ideally, we should instead use "relation restriction equivalence" mechanism to understand if the "where" clause implies a no-op update. This is because, for instance, right now we're not able to deduce that the update is a no-op when the "where" clause transitively implies a no-op update, as in the case where we're setting "column a" to "column c" and where clause looks like: "column a = column b AND column b = column c". If this means a regression for some users, we can consider doing it that way. Until then, as a workaround, we can suggest adding additional quals to "where" clause that would directly imply equivalence. Also, after fixing TargetEntryChangesValue(), we started successfully deducing that the update action is a no-op for such MERGE queries: ```sql MERGE INTO dist_1 USING dist_1 src ON (dist_1.a = src.b) WHEN MATCHED THEN UPDATE SET a = src.b; ``` However, we then started seeing below error for above query even though now the update is qualified as a no-op update: ``` ERROR: Unexpected column index of the source list ``` This was because of #8180 and #8201 fixed that. In summary, with this PR: We disallow such queries, ```sql -- attno for dist_1.a, dist_1.b: 1, 2 -- attno for dist_different_order_1.a, dist_different_order_1.b: 2, 1 UPDATE dist_1 SET a = dist_different_order_1.b FROM dist_different_order_1 WHERE dist_1.a dist_different_order_1.a; -- attno for dist_1.a, dist_1.b: 1, 2 -- but ON (..) doesn't imply a no-op update for SET expr MERGE INTO dist_1 USING dist_1 src ON (dist_1.a = src.b) WHEN MATCHED THEN UPDATE SET a = src.a; ``` * .. and allow such queries, ```sql MERGE INTO dist_1 USING dist_1 src ON (dist_1.a = src.b) WHEN MATCHED THEN UPDATE SET a = src.b; ``` (cherry picked from commit `5eb1d93be1`) (cherry picked from commit 2fd20b3bb5dcc4d24cdee5985cf97c2e37a2b5e6)	2025-09-30 15:27:38 +03:00
naisila	754c1a3371	Fix HaveRegisteredOrActiveSnapshot() crashes part of `ce7ddc0d3d`	2025-09-19 14:14:57 +03:00
naisila	d736ba2f65	Add check to identify views converted to RTE_SUBQUERY in 15.13, 14.18 Previously, when views were converted to RTE_SUBQUERY the relid would be cleared in PG15. In this patch of PG15, relid is retained. Therefore, we add a check with the "relkind and rtekind" to identify the converted views in 15.13 Part of `c98341e4ed`	2025-09-19 14:14:57 +03:00
Colm	d7c48a9d1e	Remove incorrect assertion from Postgres ruleutils. (#8136 ) DESCRIPTION: Remove an assertion from Postgres ruleutils that was rendered meaningless by a previous Citus commit. Fixes #8123. This has been present since `00068e0`, which changed the code preceding the assert as follows: ``` - while (i < colinfo->num_cols && colinfo->colnames[i] == NULL) - i++; + for (int col_index = 0; col_index < colinfo->num_cols; col_index++) + { + /* + * In the above processing-loops, "i" advances only if + * the column is not new, check if this is a new column. + */ + if (colinfo->is_new_col[col_index]) + i++; + } Assert(i == colinfo->num_cols); Assert(j == nnewcolumns); ``` This commit altered both the loop condition and the incrementing of `i`. After analysis, the assert no longer makes sense. cherry-picked from `bd0558fe39`	2025-09-19 14:14:57 +03:00
Naisila Puka	be7cb46ddf	Fix bug: alter database shouldn't propagate for undistributed database (#7777 ) Before this fix, the following would happen for a database that is not distributed: ```sql CREATE DATABASE db_to_test; NOTICE: Citus partially supports CREATE DATABASE for distributed databases DETAIL: Citus does not propagate CREATE DATABASE command to workers HINT: You can manually create a database and its extensions on workers. ALTER DATABASE db_to_test with CONNECTION LIMIT 100; NOTICE: issuing ALTER DATABASE db_to_test WITH CONNECTION LIMIT 100; DETAIL: on server postgres@localhost:57638 connectionId: 2 NOTICE: issuing ALTER DATABASE db_to_test WITH CONNECTION LIMIT 100; DETAIL: on server postgres@localhost:57637 connectionId: 1 ERROR: database "db_to_test" does not exist ``` With this fix, we also take care of https://github.com/citusdata/citus/issues/7763 Fixes #7763 cherry-picked from `90d76e8097`	2025-09-19 14:14:57 +03:00
Naisila Puka	67cd59bf24	Add GUC for queries with outer joins and pseudoconstant quals (#8163 ) Users can turn on this GUC at their own risk.	2025-08-28 00:17:55 +03:00
ibrahim halatci	d31e3d371f	fix #7715 - add assign hook for CDC library path adjustment (#8025 ) (#8098 ) DESCRIPTION: Automatically updates dynamic_library_path when CDC is enabled fix : #7715 According to the documentation and `pg_settings`, the context of the `citus.enable_change_data_capture` parameter is user. However, changing this parameter — even as a superuser — doesn't work as expected: while the initial copy phase works correctly, subsequent change events are not propagated. This appears to be due to the fact that `dynamic_library_path` is only updated to `$libdir/citus_decoders:$libdir` when the server is restarted and the `_PG_init` function is invoked. To address this, I added an `EnableChangeDataCaptureAssignHook` that automatically updates `dynamic_library_path` at runtime when `citus.enable_change_data_capture` is enabled, ensuring that the CDC decoder libraries are properly loaded. Note that `dynamic_library_path` is already a `superuser`-context parameter in base PostgreSQL, so updating it from within the assign hook should be safe and consistent with PostgreSQL’s configuration model. If there’s any reason this approach might be problematic or if there’s a preferred alternative, I’d appreciate any feedback. cc. @jy-min --------- (cherry picked from commit `743c9bbf87`) Co-authored-by: SongYoungUk <pidaoh@g.skku.edu> Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>	2025-08-05 15:06:55 +03:00
Naisila Puka	ba2088a1d9	Error out for queries with outer joins and pseudoconstant quals in PG<17 (#7937 ) PG15 commit d1ef5631e620f9a5b6480a32bb70124c857af4f1 and PG16 commit 695f5deb7902865901eb2d50a70523af655c3a00 disallow replacing joins with scans in queries with pseudoconstant quals. This commit prevents the set_join_pathlist_hook from being called if any of the join restrictions is a pseudo-constant. So in these cases, citus has no info on the join, never sees that the query has an outer join, and ends up producing an incorrect plan. PG17 fixes this by commit 9e9931d2bf40e2fea447d779c2e133c2c1256ef3 Therefore, we take this extra measure here for PG versions less than 17. hasOuterJoin can never be true when set_join_pathlist_hook is absent.	2025-05-20 16:56:23 +02:00
Cédric Villemain	2779c424ee	Fix #7242 , CALL(@0) crash backend (#7288 ) When executing a prepared CALL, which is not pure SQL but available with some drivers like npgsql and jpgdbc, Citus entered a code path where a plan is not defined, while trying to increase its cost. Thus SIG11 when plan is a NULL pointer. Fix by only increasing plan cost when plan is not null. However, it is a bit suspicious to get here with a NULL plan and maybe a better change will be to not call ShardPlacementForFunctionColocatedWithDistTable() with a NULL plan at all (in call.c:134) bug hit with for example: ``` CallableStatement proc = con.prepareCall("{CALL p(?)}"); proc.registerOutParameter(1, java.sql.Types.BIGINT); proc.setInt(1, -100); proc.execute(); ``` where `p(bigint)` is a distributed "function" and the param the distribution key (also in a distributed table), see #7242 for details Fixes #7242 (cherry picked from commit `0678a2fd89`)	2025-05-20 16:56:23 +02:00
Maksim Melnikov	5ec2501e7b	AddressSanitizer: stack-use-after-scope on distributed_planner.c	2025-04-25 15:05:06 +03:00
manaldush	1bac9cb0c1	AddressSanitizer: stack-use-after-scope on address in CreateBackground(backport to release-12.1) (#7965 ) Backport of #7943 to release-12.1 Fixes #7964	2025-04-25 11:34:42 +00:00
Colm	1a8e2c8479	[Bug Fix] SEGV on query with Left Outer Join (#7787 ) (#7901 ) (#7904 ) DESCRIPTION: Fixes a crash in left outer joins that can happen when there is an an aggregate on a column from the inner side of the join. Fix the SEGV seen in #7787 and #7899; it occurs because a column in the targetlist of a worker subquery can contain a non-empty varnullingrels field if the column is from the inner side of a left outer join. The issue can also occur with the columns in the HAVING clause, and this is also tested in the fix. The issue was triggered by the introduction of the varnullingrels to Vars in Postgres 16 (2489d76c) There is a related issue, #7705, where a non-empty varnullingrels was incorrectly copied into the query tree for the combine query. Here, a non-empty varnullingrels field of a var is incorrectly copied into the query tree for a worker subquery. The regress file from #7705 is used (and renamed) to also test this (#7787). An alternative test output file is required for Postgres 15 because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).	2025-02-18 20:56:36 +00:00
Teja Mupparti	39e890d9ce	For scenarios, such as, Bug 3697586: Server crashes when assigning distributed transaction: Raise an ERROR instead of a crash	2025-01-07 23:03:18 +03:00
Onur Tirtir	cb31a649e9	Avoid re-assigning the global pid for client backends and bg workers when the application_name changes (#7791 ) DESCRIPTION: Fixes a crash that happens because of unsafe catalog access when re-assigning the global pid after application_name changes. When application_name changes, we don't actually need to try re-assigning the global pid for external client backends because application_name doesn't affect the global pid for such backends. Plus, trying to re-assign the global pid for external client backends would unnecessarily cause performing a catalog access when the cached local node id is invalidated. However, accessing to the catalog tables is dangerous in certain situations like when we're not in a transaction block. And for the other types of backends, i.e., the Citus internal backends, we need to re-assign the global pid when the application_name changes because for such backends we simply extract the global pid inherited from the originating backend from the application_name -that's specified by originating backend when openning that connection- and this doesn't require catalog access. (cherry picked from commit `73411915a4`)	2024-12-24 09:34:08 +03:00
Emel Şimşek	0f1aa0e16a	[Bug Fix] Query on distributed tables with window partition may cause… (#7740 ) … segfault #7705 (#7718) This PR is a proposed fix for issue [7705](https://github.com/citusdata/citus/issues/7705). The following is the background and rationale for the fix (please refer to [7705](https://github.com/citusdata/citus/issues/7705) for context); The `varnullingrels `field was introduced to the Var node struct definition in Postgres 16. Its purpose is to associate a variable with the set of outer join relations that can cause the variable to be NULL. The `varnullingrels ` for the variable `"gianluca_camp_test"."start_timestamp"` in the problem query is 3, because the variable "gianluca_camp_test"."start_timestamp" is coming from the inner (nullable) side of an outer join and 3 is the RT index (aka relid) of that outer join. The problem occurs when the Postgres planner attempts to plan the combine query. The format of a combine query is: ``` SELECT <targets> FROM pg_catalog.citus_extradata_container(); ``` There is only one relation in a combine query, so no outer joins are present, but the non-empty `varnullingrels `field causes the Postgres planner to access structures for a non-existent relation. The source of the problem is that, when creating the target list for the combine query, function MasterAggregateMutator() uses copyObject() to construct a Var node before setting the master table ID, and this copies over the non-empty varnullingrels field in the case of the `"gianluca_camp_test"."start_timestamp"` var. The proposed solution is to have MasterAggregateMutator() use makeVar() instead of copyObject(), and only set the fields that make sense for the combine query; var type, collation and type modifier. The `varnullingrels `field can be left empty because there is only one relation in the combine query. A new regress test issue_7705.sql is added to exercise the fix. The issue is not specific to window functions, any target expression that cannot be pushed down and contains at least one column from the inner side of a left outer join (so has a non-empty varnullingrels field) can cause the same issue. More about Citus combine queries [here](https://github.com/citusdata/citus/tree/main/src/backend/distributed#combine-query-planner). More about Postgres varnullingrels [here](https://github.com/postgres/postgres/blob/master/src/backend/optimizer/README). (cherry picked from commit `248ff5d52a`) DESCRIPTION: PR description that will go into the change log, up to 78 characters Co-authored-by: Colm <colm.mchugh@gmail.com>	2024-11-13 19:37:51 +03:00
Emel Şimşek	686d2b46ca	Propagates SECURITY LABEL ON ROLE stmt (#7304 ) (#7735 ) Propagates SECURITY LABEL ON ROLE stmt (https://github.com/citusdata/citus/pull/7304) We propagate `SECURITY LABEL [for provider] ON ROLE rolename IS labelname` to the worker nodes. We also make sure to run the relevant `SecLabelStmt` commands on a newly added node by looking at roles found in `pg_shseclabel`. See official docs for explanation on how this command works: https://www.postgresql.org/docs/current/sql-security-label.html This command stores the role label in the `pg_shseclabel` catalog table. This commit also fixes the regex string in `check_gucs_are_alphabetically_sorted.sh` script such that it escapes the dot. Previously it was looking for all strings starting with "citus" instead of "citus." as it should. To test this feature, I currently make use of a special GUC to control label provider registration in PG_init when creating the Citus extension. (cherry picked from commit `0d1f18862b`) Co-authored-by: Naisila Puka <37271756+naisila@users.noreply.github.com>	2024-11-13 14:21:08 +03:00
Parag Jain	6349f2d52d	Support MERGE command for single_shard_distributed Target (#7643 ) This PR has following changes : 1. Enable MERGE command for single_shard_distributed targets. (cherry picked from commit `3c467e6e02`)	2024-07-17 15:11:38 +03:00
Jelte Fennema-Nio	4f0053ed6d	Redo #7620 : Fix merge command when insert value does not have source distributed column (#7627 ) Related to issue #7619, #7620 Merge command fails when source query is single sharded and source and target are co-located and insert is not using distribution key of source. Example ``` CREATE TABLE source (id integer); CREATE TABLE target (id integer ); -- let's distribute both table on id field SELECT create_distributed_table('source', 'id'); SELECT create_distributed_table('target', 'id'); MERGE INTO target t USING ( SELECT 1 AS somekey FROM source WHERE source.id = 1) s ON t.id = s.somekey WHEN NOT MATCHED THEN INSERT (id) VALUES (s.somekey) ERROR: MERGE INSERT must use the source table distribution column value HINT: MERGE INSERT must use the source table distribution column value ``` Author's Opinion: If join is not between source and target distributed column, we should not force user to use source distributed column while inserting value of target distributed column. Fix: If user is not using distributed key of source for insertion let's not push down query to workers and don't force user to use source distributed column if it is not part of join. This reverts commit `fa4fc0b372`. Co-authored-by: paragjain <paragjain@microsoft.com> (cherry picked from commit `aaaf637a6b`)	2024-06-18 16:49:39 +02:00
Gürkan İndibay	4e838a471a	Adds null check for node in HasRangeTableRef (#7604 ) DESCRIPTION: Adds null check for node in HasRangeTableRef to prevent errors When executing the query below, users encountered an error due to a null Node object. This PR adds a null check to handle this error. Query: ```sql select ct.conname as constraint_name, a.attname as column_name, fc.relname as foreign_table_name, fns.nspname as foreign_table_schema, fa.attname as foreign_column_name from (SELECT ct.conname, ct.conrelid, ct.confrelid, ct.conkey, ct.contype, ct.confkey, generate_subscripts(ct.conkey, 1) AS s FROM pg_constraint ct ) AS ct inner join pg_class c on c.oid=ct.conrelid inner join pg_namespace ns on c.relnamespace=ns.oid inner join pg_attribute a on a.attrelid=ct.conrelid and a.attnum = ct.conkey[ct.s] left join pg_class fc on fc.oid=ct.confrelid left join pg_namespace fns on fc.relnamespace=fns.oid left join pg_attribute fa on fa.attrelid=ct.confrelid and fa.attnum = ct.confkey[ct.s] where ct.contype='f' and c.relname='table1' and ns.nspname='schemauser' order by fns.nspname, fc.relname, a.attnum ; ``` Error: ``` #0 HasRangeTableRef (node=0x0, varno=varno@entry=0x7ffe18cc3674) at worker/worker_shard_visibility.c:507 507 if (IsA(node, RangeTblRef)) #0 HasRangeTableRef (node=0x0, varno=varno@entry=0x7ffe18cc3674) at worker/worker_shard_visibility.c:507 #1 0x0000561b0aae390e in expression_tree_walker_impl (node=0x561b0d19cc78, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2091 #2 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #3 0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x561b0d19cd68, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674) at nodeFuncs.c:2405 #4 0x0000561b0aae3945 in expression_tree_walker_impl (node=0x561b0d19d0f8, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2111 #5 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #6 0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x561b0d19cb38, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674) at nodeFuncs.c:2405 #7 0x0000561b0aae396d in expression_tree_walker_impl (node=0x561b0d19d198, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2127 #8 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #9 0x0000561b0aae3ef7 in expression_tree_walker_impl (node=0x561b0d183e88, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2464 #10 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #11 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184278, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2460 #12 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #13 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184668, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2460 #14 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #15 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184f68, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674) at nodeFuncs.c:2460 #16 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513 #17 0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x7f2a68010148, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674) at nodeFuncs.c:2405 #18 0x00007f2a7324a0eb in FilterShardsFromPgclass (node=node@entry=0x561b0d185de8, context=context@entry=0x0) at worker/worker_shard_visibility.c:464 #19 0x00007f2a7324a5ff in HideShardsFromSomeApplications (query=query@entry=0x561b0d185de8) at worker/worker_shard_visibility.c:294 #20 0x00007f2a731ed7ac in distributed_planner (parse=0x561b0d185de8, query_string=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=<optimized out>, boundParams=0x0) at planner/distributed_planner.c:237 #21 0x00007f2a7311a52a in pgss_planner (parse=0x561b0d185de8, query_string=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=2048, boundParams=0x0) at pg_stat_statements.c:953 #22 0x0000561b0ab65465 in planner (parse=parse@entry=0x561b0d185de8, query_string=query_string@entry=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:279 #23 0x0000561b0ac53aa3 in pg_plan_query (querytree=querytree@entry=0x561b0d185de8, query_string=query_string@entry=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:904 #24 0x0000561b0ac53b71 in pg_plan_queries (querytrees=0x7f2a68012878, query_string=query_string@entry=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:996 #25 0x0000561b0ac5408e in exec_simple_query ( query_string=query_string@entry=0x561b0d009478 "select\n ct.conname as constraint_name,\n a.attname as column_name,\n fc.relname as foreign_table_name,\n fns.nspname as foreign_table_schema,\n fa.attname as foreign_column_name\nfrom\n (S"...) at postgres.c:1193 #26 0x0000561b0ac56116 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4637 #27 0x0000561b0abab7a7 in BackendRun (port=port@entry=0x561b0d0caf50) at postmaster.c:4464 #28 0x0000561b0abae969 in BackendStartup (port=port@entry=0x561b0d0caf50) at postmaster.c:4192 #29 0x0000561b0abaeaa6 in ServerLoop () at postmaster.c:1782 ``` Fixes #7603	2024-05-28 08:54:40 +03:00
Jelte Fennema-Nio	bac95cc523	Greatly speed up "\d tablename" on servers with many tables (#7577 ) DESCRIPTION: Fix performance issue when using "\d tablename" on a server with many tables We introduce a filter to every query on pg_class to automatically remove shards. This is useful to make sure \d and PgAdmin are not cluttered with shards. However, the way we were introducing this filter was using `securityQuals` which can have negative impact on query performance. On clusters with 100k+ tables this could cause a simple "\d tablename" command to take multiple seconds, because a skipped optimization by Postgres causes a full table scan. This changes the code to introduce this filter in the regular `quals` list instead of in `securityQuals`. Which causes Postgres to use the intended optimization again. For reference, this was initially reported as a Postgres issue by me: https://www.postgresql.org/message-id/flat/4189982.1712785863%40sss.pgh.pa.us#b87421293b362d581ea8677e3bfea920 (cherry picked from commit `a0151aa31d`)	2024-04-17 10:26:50 +02:00
Xing Guo	38967491ef	Add missing volatile qualifier. (#7570 ) Variables being modified in the PG_TRY block and read in the PG_CATCH block should be qualified with volatile. The variable waitEventSet is modified in the PG_TRY block (line 1085) and read in the PG_CATCH block (line 1095). The variable relation is modified in the PG_TRY block (line 500) and read in the PG_CATCH block (line 515). Besides, the variable objectAddress doesn't need the volatile qualifier. Ref: C99 7.13.2.1[^1], > All accessible objects have values, and all other components of the abstract machine have state, as of the time the longjmp function was called, except that the values of objects of automatic storage duration that are local to the function containing the invocation of the corresponding setjmp macro that do not have volatile-qualified type and have been changed between the setjmp invocation and longjmp call are indeterminate. [^1]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf DESCRIPTION: Correctly mark some variables as volatile --------- Co-authored-by: Hong Yi <zouzou0208@gmail.com> (cherry picked from commit `ada3ba2507`)	2024-04-17 10:26:50 +02:00
Filip Sedlák	fc09e1cfdc	Log username in the failed connection message (#7432 ) This patch includes the username in the reported error message. This makes debugging easier when certain commands open connections as other users than the user that is executing the command. ``` monitora_snapshot=# SELECT citus_move_shard_placement(102030, 'monitora.db-dev-worker-a', 6005, 'monitora.db-dev-worker-a', 6017); ERROR: connection to the remote node monitora_user@monitora.db-dev-worker-a:6017 failed with the following error: fe_sendauth: no password supplied Time: 40,198 ms ``` (cherry picked from commit `8b48d6ab02`)	2024-04-17 10:26:50 +02:00
sminux	5708fca1ea	fix bad copy-paste rightComparisonLimit (#7547 ) DESCRIPTION: change for #7543 (cherry picked from commit `d59c93bc50`)	2024-04-17 10:26:50 +02:00
LightDB Enterprise Postgres	2a6164d2d9	Fix timeout when underlying socket is changed in a MultiConnection (#7377 ) When there are multiple localhost entries in /etc/hosts like following /etc/hosts: ``` 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 127.0.0.1 localhost ``` multi_cluster_management check will failed: ``` @@ -857,20 +857,21 @@ ERROR: group 14 already has a primary node -- check that you can add secondaries and unavailable nodes to a group SELECT groupid AS worker_2_group FROM pg_dist_node WHERE nodeport = :worker_2_port \gset SELECT 1 FROM master_add_node('localhost', 9998, groupid => :worker_1_group, noderole => 'secondary'); ?column? ---------- 1 (1 row) SELECT 1 FROM master_add_node('localhost', 9997, groupid => :worker_1_group, noderole => 'unavailable'); +WARNING: could not establish connection after 5000 ms ?column? ---------- 1 (1 row) ``` This actually isn't just a problem in test environments, but could occur as well during actual usage when a hostname in pg_dist_node resolves to multiple IPs and one of those IPs is unreachable. Postgres will then automatically continue with the next IP, but Citus should listen for events on the new socket. Not on the old one. Co-authored-by: chuhx43211 <chuhx43211@hundsun.com> (cherry picked from commit `9a91136a3d`)	2024-04-17 10:26:50 +02:00
eaydingol	db391c0bb7	Change the order in which the locks are acquired (#7542 ) This PR changes the order in which the locks are acquired (for the target and reference tables), when a modify request is initiated from a worker node that is not the "FirstWorkerNode". To prevent concurrent writes, locks are acquired on the first worker node for the replicated tables. When the update statement originates from the first worker node, it acquires the lock on the reference table(s) first, followed by the target table(s). However, if the update statement is initiated in another worker node, the lock requests are sent to the first worker in a different order. This PR unifies the modification order on the first worker node. With the third commit, independent of the node that received the request, the locks are acquired for the modified table and then the reference tables on the first node. The first commit shows a sample output for the test prior to the fix. Fixes #7477 --------- Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com> (cherry picked from commit `8afa2d0386`)	2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio	146725fc9b	Replace more spurious strdups with pstrdups (#7441 ) DESCRIPTION: Remove a few small memory leaks In #7440 one instance of a strdup was removed. But there were a few more. This removes the ones that are left over, or adds a comment why strdup is on purpose. (cherry picked from commit `9683bef2ec`)	2024-04-17 10:26:50 +02:00
Marco Slot	94ab1dc240	Replace spurious strdup with pstrdup (#7440 ) Not sure why we never found this using valgrind, but using strdup will cause memory leaks because the pointer is not tracked in a memory context. (cherry picked from commit `72fbea20c4`)	2024-04-17 10:26:50 +02:00
Onur Tirtir	812a2b759f	Improve error message for recursive CTEs (#7407 ) Fixes #2870 (cherry picked from commit `5aedec4242`)	2024-04-17 10:26:50 +02:00
Onur Tirtir	452564c19b	Allow providing "host" parameter via citus.node_conninfo (#7541 ) And when that is the case, directly use it as "host" parameter for the connections between nodes and use the "hostname" provided in pg_dist_node / pg_dist_poolinfo as "hostaddr" to avoid host name lookup. This is to avoid allowing dns resolution (and / or setting up DNS names for each host in the cluster). This already works currently when using IPs in the hostname. The only use of setting host is that you can then use sslmode=verify-full and it will validate that the hostname matches the certificate provided by the node you're connecting too. It would be more flexible to make this a per-node setting, but that requires SQL changes. And we'd like to backport this change, and backporting such a sql change would be quite hard while backporting this change would be very easy. And in many setups, a different hostname for TLS validation is actually not needed. The reason for that is query-from-any node: With query-from-any-node all nodes usually have a certificate that is valid for the same "cluster hostname", either using a wildcard cert or a Subject Alternative Name (SAN). Because if you load balance across nodes you don't know which node you're connecting to, but you still want TLS validation to do it's job. So with this change you can use this same "cluster hostname" for TLS validation within the cluster. Obviously this means you don't validate that you're connecting to a particular node, just that you're connecting to one of the nodes in the cluster, but that should be fine from a security perspective (in most cases). Note to self: This change requires updating https://docs.citusdata.com/en/latest/develop/api_guc.html#citus-node-conninfo-text. DESCRIPTION: Allows overwriting host name for all inter-node connections by supporting "host" parameter in citus.node_conninfo (cherry picked from commit `3586aab17a`)	2024-04-17 10:26:50 +02:00
Karina	9b06d02c43	Fix error in master_disable_node/citus_disable_node (#7492 ) This fixes #7454: master_disable_node() has only two arguments, but calls citus_disable_node() that tries to read three arguments Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com> (cherry picked from commit `683e10ab69`)	2024-04-17 10:26:50 +02:00
copetol	2ee43fd00c	Fix segfault when using certain DO block in function (#7554 ) When using a CASE WHEN expression in the body of the function that is used in the DO block, a segmentation fault occured. This fixes that. Fixes #7381 --------- Co-authored-by: Konstantin Morozov <vzbdryn@yahoo.com> (cherry picked from commit `12f56438fc`)	2024-04-17 10:26:50 +02:00
Emel Şimşek	f2d102d54b	Fix crash caused by some form of ALTER TABLE ADD COLUMN statements. (#7522 ) DESCRIPTION: Fixes a crash caused by some form of ALTER TABLE ADD COLUMN statements. When adding multiple columns, if one of the ADD COLUMN statements contains a FOREIGN constraint ommitting the referenced columns in the statement, a SEGFAULT occurs. For instance, the following statement results in a crash: ``` ALTER TABLE lt ADD COLUMN new_col1 bool, ADD COLUMN new_col2 int references rt; ``` Fixes #7520. (cherry picked from commit `fdd658acec`)	2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio	364e8ece14	Speed up GetForeignKeyOids (#7578 ) DESCRIPTION: Fix performance issue in GetForeignKeyOids on systems with many constraints GetForeignKeyOids was showing up in CPU profiles when distributing schemas on systems with 100k+ constraints. The reason was that this function was doing a sequence scan of pg_constraint to get the foreign keys that referenced the requested table. This fixes that by finding the constraints referencing the table through pg_depend instead of pg_constraint. We're doing this indirection, because pg_constraint doesn't have an index that we can use, but pg_depend does. (cherry picked from commit `a263ac6f5f`)	2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio	62c32067f1	Use an index to get FDWs that depend on extensions (#7574 ) DESCRIPTION: Fix performance issue when distributing a table that depends on an extension When the database contains many objects this function would show up in profiles because it was doing a sequence scan on pg_depend. And with many objects pg_depend can get very large. This starts using an index scan to only look for rows containing FDWs, of which there are expected to be very few (often even zero). (cherry picked from commit `16604a6601`)	2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio	d9069b1e01	Speed up SequenceUsedInDistributedTable (#7579 ) DESCRIPTION: Fix performance issue when creating distributed tables if many already exist This builds on the work to speed up EnsureSequenceTypeSupported, and now does something similar for SequenceUsedInDistributedTable. SequenceUsedInDistributedTable had a similar O(number of citus tables) operation. This fixes that and speeds up creation of distributed tables significantly when many distributed tables already exist. Fixes #7022 (cherry picked from commit `cdf51da458`)	2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio	fa6743d436	Speed up EnsureSequenceTypeSupported (#7575 ) DESCRIPTION: Fix performance issue when creating distributed tables and many already exist EnsureSequenceTypeSupported was doing an O(number of distributed tables) operation. This can become very slow with lots of Citus tables, which now happens much more frequently in practice due to schema based sharding. Partially addresses #7022 (cherry picked from commit `381f31756e`)	2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio	26178fb538	Add missing postgres.h includes After sorting includes in the previous commit some files were now invalid because they were not including postgres.h	2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio	48c62095ff	Actually sort includes after cherry-pick	2024-04-17 10:26:50 +02:00
Nils Dijk	75df19b616	move pg_version_constants.h to toplevel include (#7335 ) In preparation of sorting and grouping all includes we wanted to move this file to the toplevel includes for good grouping/sorting. (cherry picked from commit `0dac63afc0`)	2024-04-17 10:26:50 +02:00
Teja Mupparti	a945971f48	Fix the incorrect column count after ALTER TABLE, this fixes the bug #7378 (please read the analysis in the bug for more information) (cherry picked from commit `00068e07c5`)	2024-01-24 11:48:06 -08:00
Onur Tirtir	a4fe969947	Make sure to disallow creating a replicated distributed table concurrently (#7219 ) See explanation in https://github.com/citusdata/citus/issues/7216. Fixes https://github.com/citusdata/citus/issues/7216. DESCRIPTION: Makes sure to disallow creating a replicated distributed table concurrently (cherry picked from commit `111b4c19bc`)	2023-10-24 14:04:37 +03:00
Nils Dijk	e59ffbf549	Fix leaking of memory and memory contexts in Foreign Constraint Graphs (#7236 ) DESCRIPTION: Fix leaking of memory and memory contexts in Foreign Constraint Graphs Previously, every time we (re)created the Foreign Constraint Relationship Graph, we created a new Memory Context while loosing a reference to the previous context. This old context could still have left over memory in there causing a memory leak. With this patch we statically have one memory context that we lazily initialize the first time we create our foreign constraint relationship graph. On every subsequent creation, beside destroying our previous hashmap we also reset our memory context to remove any left over references.	2023-10-09 13:07:30 +02:00
Gürkan İndibay	e5e64b7454	Adds alter database propagation - with and refresh collation (#7172 ) DESCRIPTION: Adds ALTER DATABASE WITH ... and REFRESH COLLATION VERSION support This PR adds supports for basic ALTER DATABASE statements propagation support. Below statements are supported: ALTER DATABASE <database_name> with IS_TEMPLATE <true/false>; ALTER DATABASE <database_name> with CONNECTION LIMIT <integer_value>; ALTER DATABASE <database_name> REFRESH COLLATION VERSION; --------- Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>	2023-09-12 14:09:15 +03:00
Naisila Puka	1da99f8423	PG16 - Don't propagate GRANT ROLE with INHERIT/SET option (#7190 ) We currently don't support propagating these options in Citus Relevant PG commits: https://github.com/postgres/postgres/commit/e3ce2de https://github.com/postgres/postgres/commit/3d14e17 Limitation: We also need to take care of generated GRANT statements by dependencies in attempt to distribute something else. Specifically, this part of the code in `GenerateGrantRoleStmtsOfRole`: ``` grantRoleStmt->admin_opt = membership->admin_option; ``` In PG16, membership also has `inherit_option` and `set_option` which need to properly be part of the `grantRoleStmt`. We can skip for now since #7164 will take care of this soon, and also this is not an expected use-case.	2023-09-12 12:47:37 +03:00
Naisila Puka	c1dc378504	Fix WITH ADMIN FALSE propagation (#7191 )	2023-09-11 15:58:24 +03:00
Onur Tirtir	d628a4c21a	Add citus_schema_move() function (#7180 ) Add citus_schema_move() that can be used to move tenant tables within a distributed schema to another node. The function has two variations as simple wrappers around citus_move_shard_placement() and citus_move_shard_placement_with_nodeid() respectively. They pick a shard that belongs to the given tenant schema and resolve the source node that contain the shards under given tenant schema. Hence their signatures are quite similar to underlying functions: ```sql -- citus_schema_move(), using target node name and node port CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move( schema_id regnamespace, target_node_name text, target_node_port integer, shard_transfer_mode citus.shard_transfer_mode default 'auto') RETURNS void LANGUAGE C STRICT AS 'MODULE_PATHNAME', $$citus_schema_move$$; -- citus_schema_move(), using target node id CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move( schema_id regnamespace, target_node_id integer, shard_transfer_mode citus.shard_transfer_mode default 'auto') RETURNS void LANGUAGE C STRICT AS 'MODULE_PATHNAME', $$citus_schema_move_with_nodeid$$; ```	2023-09-08 12:03:53 +03:00
Naisila Puka	8894c76ec0	PG16 - Add rules option to CREATE COLLATION (#7185 ) Relevant PG commit: https://github.com/postgres/postgres/commit/30a53b7 30a53b7	2023-09-07 13:50:47 +03:00
Naisila Puka	5c658b4eb7	PG16 - Add citus_truncate_trigger for Citus foreign tables (#7170 ) Since in PG16, truncate triggers are supported on foreign tables, we add the citus_truncate_trigger to Citus foreign tables as well, such that the TRUNCATE command is propagated to the table's single local shard as well. Note that TRUNCATE command was working for foreign tables even before this commit: see https://github.com/citusdata/citus/pull/7170#issuecomment-1706240593 for details This commit also adds tests with user-enabled truncate triggers on Citus foreign tables: both trigger on the shell table and on its single foreign local shard. Relevant PG commit: https://github.com/postgres/postgres/commit/3b00a94	2023-09-05 19:42:39 +03:00
zhjwpku	205b159606	get rid of {Push/Pop}OverrideSearchPath (#7145 )	2023-09-05 17:40:22 +02:00
aykut-bozkurt	8eb3360017	Fixes visibility problems with dependency propagation (#7028 ) Problem: Previously we always used an outside superuser connection to overcome permission issues for the current user while propagating dependencies. That has mainly 2 problems: 1. Visibility issues during dependency propagation, (metadata connection propagates some objects like a schema, and outside transaction does not see it and tries to create it again) 2. Security issues (it is preferrable to use current user's connection instead of extension superuser) Solution (high level): Now, we try to make a smarter decision on whether should we use an outside superuser connection or current user's metadata connection. We prefer using current user's connection if any of the objects, which is already propagated in the current transaction, is a dependency for a target object. We do that since we assume if current user has permissions to create the dependency, then it can most probably propagate the target as well. Our assumption is expected to hold most of the times but it can still be wrong. In those cases, transaction would fail and user should set the GUC `citus.create_object_propagation` to `deferred` to work around it. Solution: 1. We track all objects propagated in the current transaction (we can handle subtransactions), 2. We propagate dependencies via the current user's metadata connection if any dependency is created in the current transaction to address issues listed above. Otherwise, we still use an outside superuser connection. DESCRIPTION: Fixes some object propagation errors seen with transaction blocks. Fixes https://github.com/citusdata/citus/issues/6614 --------- Co-authored-by: Nils Dijk <nils@citusdata.com>	2023-09-05 18:04:16 +03:00

1 2 3 4 5 ...

3373 Commits (e4ce4d184ef792c37c10e79ec20ee49d3d6d9da3)