Commit Graph

7080 Commits (ad791f4b3ceaa3d9fbaa679bbbfb5d6c723138df)

Author SHA1 Message Date
Jelte Fennema-Nio 4f0053ed6d Redo #7620: Fix merge command when insert value does not have source distributed column (#7627)
Related to issue #7619, #7620
Merge command fails when source query is single sharded and source and
target are co-located and insert is not using distribution key of
source.

Example
```
CREATE TABLE source (id integer);
CREATE TABLE target (id integer );

-- let's distribute both table on id field
SELECT create_distributed_table('source', 'id');
SELECT create_distributed_table('target', 'id');

MERGE INTO target t
  USING ( SELECT 1 AS somekey
          FROM source
        WHERE source.id = 1) s
  ON t.id = s.somekey
  WHEN NOT MATCHED
  THEN INSERT (id)
    VALUES (s.somekey)

ERROR:  MERGE INSERT must use the source table distribution column value
HINT:  MERGE INSERT must use the source table distribution column value
```

Author's Opinion: If join is not between source and target distributed
column, we should not force user to use source distributed column while
inserting value of target distributed column.

Fix: If user is not using distributed key of source for insertion let's
not push down query to workers and don't force user to use source
distributed column if it is not part of join.

This reverts commit fa4fc0b372.

Co-authored-by: paragjain <paragjain@microsoft.com>
(cherry picked from commit aaaf637a6b)
2024-06-18 16:49:39 +02:00
Jelte Fennema-Nio 3594bd7ac0 Fix CI issues after Github Actions networking changes (#7624)
For some reason using localhost in our hba file doesn't have the
intended effect anymore in our Github Actions runners. Probably because
of some networking change (IPv6 maybe) or some change in the
`/etc/hosts` file.

Replacing localhost with the equivalent loopback IPv4 and IPv6 addresses
resolved this issue.

(cherry picked from commit 8c9de08b76)
2024-06-18 16:49:39 +02:00
Jelte Fennema-Nio aaaf637a6b
Redo #7620: Fix merge command when insert value does not have source distributed column (#7627)
Related to issue #7619, #7620
Merge command fails when source query is single sharded and source and
target are co-located and insert is not using distribution key of
source.

Example
```
CREATE TABLE source (id integer);
CREATE TABLE target (id integer );

-- let's distribute both table on id field
SELECT create_distributed_table('source', 'id');
SELECT create_distributed_table('target', 'id');

MERGE INTO target t
  USING ( SELECT 1 AS somekey
          FROM source
        WHERE source.id = 1) s
  ON t.id = s.somekey
  WHEN NOT MATCHED
  THEN INSERT (id)
    VALUES (s.somekey)

ERROR:  MERGE INSERT must use the source table distribution column value
HINT:  MERGE INSERT must use the source table distribution column value
```

Author's Opinion: If join is not between source and target distributed
column, we should not force user to use source distributed column while
inserting value of target distributed column.

Fix: If user is not using distributed key of source for insertion let's
not push down query to workers and don't force user to use source
distributed column if it is not part of join.

This reverts commit fa4fc0b372.

Co-authored-by: paragjain <paragjain@microsoft.com>
2024-06-17 14:07:25 +00:00
Jelte Fennema-Nio fa4fc0b372
Revert rebase merge of #7620 (#7626)
Because we want to track PR numbers and to make backporting easy we
(pretty much always) use squash-merges when merging to master. We
accidentally used a rebase merge for PR #7620. This reverts those
changes so we can redo the merge using squash merge.

This reverts all commits from eedb607c to 9e71750fc.
2024-06-17 15:46:00 +02:00
paragjain 9e71750fcd fixing flakyness in test 2024-06-15 14:55:36 -07:00
paragjain e62ae64d00 some more 2024-06-15 14:55:36 -07:00
paragjain 76f68f47c4 removing flakyness from test 2024-06-15 14:55:36 -07:00
Jelte Fennema-Nio d5231c34ab Revert "Try to fix failure"
This reverts commit 89f7217660.
2024-06-15 14:55:36 -07:00
Jelte Fennema-Nio f883cfdd77 Try to fix failure 2024-06-15 14:55:36 -07:00
paragjain 7c8a366ba2 some more 2024-06-15 14:55:36 -07:00
paragjain 06e9c29950 some more 2024-06-15 14:55:36 -07:00
paragjain 493140287a fix some indent 2024-06-15 14:55:36 -07:00
paragjain ec25b433d4 adding update and delete tests 2024-06-15 14:55:36 -07:00
paragjain eedb607cd5 merge command fix 2024-06-15 14:55:36 -07:00
Jelte Fennema-Nio 8c9de08b76
Fix CI issues after Github Actions networking changes (#7624)
For some reason using localhost in our hba file doesn't have the
intended effect anymore in our Github Actions runners. Probably because
of some networking change (IPv6 maybe) or some change in the
`/etc/hosts` file.

Replacing localhost with the equivalent loopback IPv4 and IPv6 addresses
resolved this issue.
2024-06-14 16:20:23 +02:00
Gürkan İndibay 2874d7af46
Updates github checkout actions to v4 (#7611)
Updates checkout plugin for github actions to v4. Can not update the
version for check-sql-snapshots since new plugin causes below error in
the docker image this step is using . Please refer to:
https://github.com/citusdata/citus/actions/runs/9286197994/job/25552373953
Error: 
```
/__e/node20/bin/node: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.27' not found (required by /__e/node20/bin/node)
/__e/node20/bin/node: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by /__e/node20/bin/node)
/__e/node20/bin/node: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by /__e/node20/bin/node)
```
2024-05-31 20:52:17 +03:00
Gürkan İndibay 7e0dc18b22
Bump Citus version to 12.1.4 (#7610) 2024-05-29 11:35:08 +03:00
Gürkan İndibay 0ab42e7a80
Adds null check for node in HasRangeTableRef (#7609)
DESCRIPTION: Adds null check for node in HasRangeTableRef to prevent
errors
2024-05-28 11:03:38 +03:00
Gürkan İndibay 4e838a471a
Adds null check for node in HasRangeTableRef (#7604)
DESCRIPTION: Adds null check for node in HasRangeTableRef to prevent
errors

When executing the query below, users encountered an error due to a null
Node object. This PR adds a null check to handle this error.

Query:
```sql
select
    ct.conname as constraint_name,
    a.attname as column_name,
    fc.relname as foreign_table_name,
    fns.nspname as foreign_table_schema,
    fa.attname as foreign_column_name
from
    (SELECT ct.conname, ct.conrelid, ct.confrelid, ct.conkey, ct.contype,
ct.confkey, generate_subscripts(ct.conkey, 1) AS s
       FROM pg_constraint ct
    ) AS ct
    inner join pg_class c on c.oid=ct.conrelid
    inner join pg_namespace ns on c.relnamespace=ns.oid
    inner join pg_attribute a on a.attrelid=ct.conrelid and a.attnum =
ct.conkey[ct.s]
    left join pg_class fc on fc.oid=ct.confrelid
    left join pg_namespace fns on fc.relnamespace=fns.oid
    left join pg_attribute fa on fa.attrelid=ct.confrelid and fa.attnum =
ct.confkey[ct.s]
where
    ct.contype='f'
    and c.relname='table1'
    and ns.nspname='schemauser'
order by
    fns.nspname, fc.relname, a.attnum
;
```

Error:
```
#0  HasRangeTableRef (node=0x0, varno=varno@entry=0x7ffe18cc3674) at worker/worker_shard_visibility.c:507
507             if (IsA(node, RangeTblRef))
#0  HasRangeTableRef (node=0x0, varno=varno@entry=0x7ffe18cc3674) at worker/worker_shard_visibility.c:507
#1  0x0000561b0aae390e in expression_tree_walker_impl (node=0x561b0d19cc78, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2091
#2  0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#3  0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x561b0d19cd68, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674)
    at nodeFuncs.c:2405
#4  0x0000561b0aae3945 in expression_tree_walker_impl (node=0x561b0d19d0f8, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2111
#5  0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#6  0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x561b0d19cb38, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674)
    at nodeFuncs.c:2405
#7  0x0000561b0aae396d in expression_tree_walker_impl (node=0x561b0d19d198, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2127
#8  0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#9  0x0000561b0aae3ef7 in expression_tree_walker_impl (node=0x561b0d183e88, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2464
#10 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#11 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184278, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2460
#12 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#13 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184668, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2460
#14 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#15 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184f68, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2460
#16 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#17 0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x7f2a68010148, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674)
    at nodeFuncs.c:2405
#18 0x00007f2a7324a0eb in FilterShardsFromPgclass (node=node@entry=0x561b0d185de8, context=context@entry=0x0) at worker/worker_shard_visibility.c:464
#19 0x00007f2a7324a5ff in HideShardsFromSomeApplications (query=query@entry=0x561b0d185de8) at worker/worker_shard_visibility.c:294
#20 0x00007f2a731ed7ac in distributed_planner (parse=0x561b0d185de8, 
    query_string=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=<optimized out>, boundParams=0x0) at planner/distributed_planner.c:237
#21 0x00007f2a7311a52a in pgss_planner (parse=0x561b0d185de8, 
    query_string=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=2048, boundParams=0x0) at pg_stat_statements.c:953
#22 0x0000561b0ab65465 in planner (parse=parse@entry=0x561b0d185de8, 
    query_string=query_string@entry=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0)
    at planner.c:279
#23 0x0000561b0ac53aa3 in pg_plan_query (querytree=querytree@entry=0x561b0d185de8, 
    query_string=query_string@entry=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0)
    at postgres.c:904
#24 0x0000561b0ac53b71 in pg_plan_queries (querytrees=0x7f2a68012878, 
    query_string=query_string@entry=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0)
    at postgres.c:996
#25 0x0000561b0ac5408e in exec_simple_query (
    query_string=query_string@entry=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"...) at postgres.c:1193
#26 0x0000561b0ac56116 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4637
#27 0x0000561b0abab7a7 in BackendRun (port=port@entry=0x561b0d0caf50) at postmaster.c:4464
#28 0x0000561b0abae969 in BackendStartup (port=port@entry=0x561b0d0caf50) at postmaster.c:4192
#29 0x0000561b0abaeaa6 in ServerLoop () at postmaster.c:1782
```


Fixes #7603
2024-05-28 08:54:40 +03:00
Evgeny Nechayev fcc72d8a23
Use macro wrapper to access PGPROC data, which allow to improve compa… (#7607)
DESCRIPTION: Use macro wrapper to access PGPROC data, to improve compatibility with PostgreSQL forks.
2024-05-28 00:39:13 +00:00
Gürkan İndibay 035aa6eada
Bump Citus version to 12.1.3 (#7588) 2024-04-24 11:15:04 +03:00
Gürkan İndibay 553d5ba15d
Adds changelog for 12.1.3 (#7587)
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
2024-04-22 15:38:51 +03:00
Gürkan İndibay 75ff237340 Removes unnecessary package installations in packaging pipelines (#7341)
With the recent changes in packaging images, linux package installations
to execute validate_output is unnecessary now.
In this PR, I removed them to make the pipeline more effective.

- [x] Remove the test warning before merge

(cherry picked from commit 32b0fc23f5)
2024-04-17 10:26:50 +02:00
Gürkan İndibay 40e9e2614d Removes centos 7 for PG 16 in packaging pipelines (#7205)
centos 7 and oracle 7 is not being supported for newer releases by
Postgres. Therefore, getting package download errors in packaging
pipelines.
This PR removes el/7 and ol/7 Postgres 16 pipelines

(cherry picked from commit b0e982d0b5)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio bac95cc523 Greatly speed up "\d tablename" on servers with many tables (#7577)
DESCRIPTION: Fix performance issue when using "\d tablename" on a server
with many tables

We introduce a filter to every query on pg_class to automatically remove
shards. This is useful to make sure \d and PgAdmin are not cluttered
with shards. However, the way we were introducing this filter was using
`securityQuals` which can have negative impact on query performance.

On clusters with 100k+ tables this could cause a simple "\d tablename"
command to take multiple seconds, because a skipped optimization by
Postgres causes a full table scan. This changes the code to introduce
this filter in the regular `quals` list instead of in `securityQuals`.
Which causes Postgres to use the intended optimization again.

For reference, this was initially reported as a Postgres issue by me:

https://www.postgresql.org/message-id/flat/4189982.1712785863%40sss.pgh.pa.us#b87421293b362d581ea8677e3bfea920
(cherry picked from commit a0151aa31d)
2024-04-17 10:26:50 +02:00
Xing Guo 38967491ef Add missing volatile qualifier. (#7570)
Variables being modified in the PG_TRY block and read in the PG_CATCH
block should be qualified with volatile.

The variable waitEventSet is modified in the PG_TRY block (line 1085)
and read in the PG_CATCH block (line 1095).

The variable relation is modified in the PG_TRY block (line 500) and
read in the PG_CATCH block (line 515).

Besides, the variable objectAddress doesn't need the volatile qualifier.

Ref: C99 7.13.2.1[^1],

> All accessible objects have values, and all other components of the
abstract machine have state, as of the time the longjmp function was
called, except that the values of objects of automatic storage duration
that are local to the function containing the invocation of the
corresponding setjmp macro that do not have volatile-qualified type and
have been changed between the setjmp invocation and longjmp call are
indeterminate.

[^1]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

DESCRIPTION: Correctly mark some variables as volatile

---------

Co-authored-by: Hong Yi <zouzou0208@gmail.com>
(cherry picked from commit ada3ba2507)
2024-04-17 10:26:50 +02:00
Filip Sedlák fc09e1cfdc Log username in the failed connection message (#7432)
This patch includes the username in the reported error message.
This makes debugging easier when certain commands open connections
as other users than the user that is executing the command.

```
monitora_snapshot=# SELECT citus_move_shard_placement(102030, 'monitora.db-dev-worker-a', 6005, 'monitora.db-dev-worker-a', 6017);
ERROR:  connection to the remote node monitora_user@monitora.db-dev-worker-a:6017 failed with the following error: fe_sendauth: no password supplied
Time: 40,198 ms
```

(cherry picked from commit 8b48d6ab02)
2024-04-17 10:26:50 +02:00
Karina 7513061057 Make isolation_update_node test system independent (#7423)
Test isolation_update_node fails on some systems with the following error:
```
-s2: WARNING:  connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Name or service not known
+s2: WARNING:  connection to the remote node non-existent:57637 failed with the following error: could not translate host name "non-existent" to address: Temporary failure in name resolution
```

This slightly modifies an already existing [normalization
rule](739c6d26df/src/test/regress/bin/normalize.sed (L217-L218))
to fix it.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
(cherry picked from commit 21464adfec)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio f4af59ab4b Support running isolation_update_node in flaky test detection (#7425)
I noticed in #7423 that `isolation_update_node` could not be run using
flaky test detection. This fixes that.
2024-04-17 10:26:50 +02:00
sminux 5708fca1ea fix bad copy-paste rightComparisonLimit (#7547)
DESCRIPTION: change for #7543
(cherry picked from commit d59c93bc50)
2024-04-17 10:26:50 +02:00
LightDB Enterprise Postgres 2a6164d2d9 Fix timeout when underlying socket is changed in a MultiConnection (#7377)
When there are multiple localhost entries in /etc/hosts like following
/etc/hosts:
```
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1   localhost
```

multi_cluster_management check will failed:
```

@@ -857,20 +857,21 @@
 ERROR:  group 14 already has a primary node
 -- check that you can add secondaries and unavailable nodes to a group
 SELECT groupid AS worker_2_group FROM pg_dist_node WHERE nodeport = :worker_2_port \gset
 SELECT 1 FROM master_add_node('localhost', 9998, groupid => :worker_1_group, noderole => 'secondary');
  ?column?
 ----------
         1
 (1 row)

 SELECT 1 FROM master_add_node('localhost', 9997, groupid => :worker_1_group, noderole => 'unavailable');
+WARNING:  could not establish connection after 5000 ms
  ?column?
 ----------
         1
 (1 row)
```

This actually isn't just a problem in test environments, but could occur
as well during actual usage when a hostname in pg_dist_node
resolves to multiple IPs and one of those IPs is unreachable.
Postgres will then automatically continue with the next IP, but
Citus should listen for events on the new socket. Not on the
old one.

Co-authored-by: chuhx43211 <chuhx43211@hundsun.com>
(cherry picked from commit 9a91136a3d)
2024-04-17 10:26:50 +02:00
eaydingol db391c0bb7 Change the order in which the locks are acquired (#7542)
This PR changes the order in which the locks are acquired (for the
target and reference tables), when a modify request is initiated from a
worker node that is not the "FirstWorkerNode".

To prevent concurrent writes, locks are acquired on the first worker
node for the replicated tables. When the update statement originates
from the first worker node, it acquires the lock on the reference
table(s) first, followed by the target table(s). However, if the update
statement is initiated in another worker node, the lock requests are
sent to the first worker in a different order. This PR unifies the
modification order on the first worker node. With the third commit,
independent of the node that received the request, the locks are
acquired for the modified table and then the reference tables on the
first node.

The first commit shows a sample output for the test prior to the fix.

Fixes #7477

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
(cherry picked from commit 8afa2d0386)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio 146725fc9b Replace more spurious strdups with pstrdups (#7441)
DESCRIPTION: Remove a few small memory leaks

In #7440 one instance of a strdup was removed. But there were a few
more. This removes the ones that are left over, or adds a comment why
strdup is on purpose.

(cherry picked from commit 9683bef2ec)
2024-04-17 10:26:50 +02:00
Marco Slot 94ab1dc240 Replace spurious strdup with pstrdup (#7440)
Not sure why we never found this using valgrind, but using strdup will
cause memory leaks because the pointer is not tracked in a memory
context.

(cherry picked from commit 72fbea20c4)
2024-04-17 10:26:50 +02:00
Onur Tirtir 812a2b759f Improve error message for recursive CTEs (#7407)
Fixes #2870

(cherry picked from commit 5aedec4242)
2024-04-17 10:26:50 +02:00
Onur Tirtir 452564c19b Allow providing "host" parameter via citus.node_conninfo (#7541)
And when that is the case, directly use it as "host" parameter for the
connections between nodes and use the "hostname" provided in
pg_dist_node / pg_dist_poolinfo as "hostaddr" to avoid host name lookup.

This is to avoid allowing dns resolution (and / or setting up DNS names
for each host in the cluster). This already works currently when using
IPs in the hostname. The only use of setting host is that you can then
use sslmode=verify-full and it will validate that the hostname matches
the certificate provided by the node you're connecting too.

It would be more flexible to make this a per-node setting, but that
requires SQL changes. And we'd like to backport this change, and
backporting such a sql change would be quite hard while backporting this
change would be very easy. And in many setups, a different hostname for
TLS validation is actually not needed. The reason for that is
query-from-any node: With query-from-any-node all nodes usually have a
certificate that is valid for the same "cluster hostname", either using
a wildcard cert or a Subject Alternative Name (SAN). Because if you load
balance across nodes you don't know which node you're connecting to, but
you still want TLS validation to do it's job. So with this change you
can use this same "cluster hostname" for TLS validation within the
cluster. Obviously this means you don't validate that you're connecting
to a particular node, just that you're connecting to one of the nodes in
the cluster, but that should be fine from a security perspective (in
most cases).

Note to self: This change requires updating

https://docs.citusdata.com/en/latest/develop/api_guc.html#citus-node-conninfo-text.

DESCRIPTION: Allows overwriting host name for all inter-node connections
by supporting "host" parameter in citus.node_conninfo

(cherry picked from commit 3586aab17a)
2024-04-17 10:26:50 +02:00
Karina 9b06d02c43 Fix error in master_disable_node/citus_disable_node (#7492)
This fixes #7454: master_disable_node() has only two arguments, but
calls citus_disable_node() that tries to read three arguments

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
(cherry picked from commit 683e10ab69)
2024-04-17 10:26:50 +02:00
copetol 2ee43fd00c Fix segfault when using certain DO block in function (#7554)
When using a CASE WHEN expression in the body
of the function that is used in the DO block, a segmentation
fault occured. This fixes that.

Fixes #7381

---------

Co-authored-by: Konstantin Morozov <vzbdryn@yahoo.com>
(cherry picked from commit 12f56438fc)
2024-04-17 10:26:50 +02:00
Emel Şimşek f2d102d54b Fix crash caused by some form of ALTER TABLE ADD COLUMN statements. (#7522)
DESCRIPTION: Fixes a crash caused by some form of ALTER TABLE ADD COLUMN
statements. When adding multiple columns, if one of the ADD COLUMN
statements contains a FOREIGN constraint ommitting the referenced
columns in the statement, a SEGFAULT occurs.

For instance, the following statement results in a crash:

```
  ALTER TABLE lt ADD COLUMN new_col1 bool,
                          ADD COLUMN new_col2 int references rt;

```

Fixes #7520.

(cherry picked from commit fdd658acec)
2024-04-17 10:26:50 +02:00
Karina 82637f3e13 Use expecteddir option in _run_pg_regress() (#7582)
Fix check-arbitrary-configs tests failure with current REL_16_STABLE.
This is the same problem as described in #7573. I missed pg_regress call
in _run_pg_regress() in that PR.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
(cherry picked from commit 41e2af8ff5)
2024-04-17 10:26:50 +02:00
Karina 79616bc7db Use expecteddir option when running vanilla tests (#7573)
In PostgreSQL 16 a new option expecteddir was introduced to pg_regress.
Together with fix in
[196eeb6b](https://github.com/postgres/postgres/commit/196eeb6b) it
causes check-vanilla failure if expecteddir is not specified.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
(cherry picked from commit 41d99249d9)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio 364e8ece14 Speed up GetForeignKeyOids (#7578)
DESCRIPTION: Fix performance issue in GetForeignKeyOids on systems with
many constraints

GetForeignKeyOids was showing up in CPU profiles when distributing
schemas on systems with 100k+ constraints. The reason was that this
function was doing a sequence scan of pg_constraint to get the foreign
keys that referenced the requested table.

This fixes that by finding the constraints referencing the table through
pg_depend instead of pg_constraint. We're doing this indirection,
because pg_constraint doesn't have an index that we can use, but
pg_depend does.

(cherry picked from commit a263ac6f5f)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio 62c32067f1 Use an index to get FDWs that depend on extensions (#7574)
DESCRIPTION: Fix performance issue when distributing a table that
depends on an extension

When the database contains many objects this function would show up in
profiles because it was doing a sequence scan on pg_depend. And with
many objects pg_depend can get very large.

This starts using an index scan to only look for rows containing FDWs,
of which there are expected to be very few (often even zero).

(cherry picked from commit 16604a6601)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio d9069b1e01 Speed up SequenceUsedInDistributedTable (#7579)
DESCRIPTION: Fix performance issue when creating distributed tables if
many already exist

This builds on the work to speed up EnsureSequenceTypeSupported, and now
does something similar for SequenceUsedInDistributedTable.
SequenceUsedInDistributedTable had a similar O(number of citus tables)
operation. This fixes that and speeds up creation of distributed tables
significantly when many distributed tables already exist.

Fixes #7022

(cherry picked from commit cdf51da458)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio fa6743d436 Speed up EnsureSequenceTypeSupported (#7575)
DESCRIPTION: Fix performance issue when creating distributed tables and many already exist

EnsureSequenceTypeSupported was doing an O(number of distributed tables)
operation. This can become very slow with lots of Citus tables, which
now happens much more frequently in practice due to schema based sharding.

Partially addresses #7022

(cherry picked from commit 381f31756e)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio 26178fb538 Add missing postgres.h includes
After sorting includes in the previous commit some files were now
invalid because they were not including postgres.h
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio 48c62095ff Actually sort includes after cherry-pick 2024-04-17 10:26:50 +02:00
Nils Dijk f1b1d7579c Sort includes (#7326)
CHERRY-PICK NOTES: This cherry-pick only includes the scripts, not the
actual changes. These are done in a follow up commit to ease further
backporting.

This change adds a script to programatically group all includes in a
specific order. The script was used as a one time invocation to group
and sort all includes throught our formatted code. The grouping is as
follows:

 - System includes (eg. `#include<...>`)
 - Postgres.h (eg. `#include "postgres.h"`)
- Toplevel imports from postgres, not contained in a directory (eg.
`#include "miscadmin.h"`)
 - General postgres includes (eg . `#include "nodes/..."`)
- Toplevel citus includes, not contained in a directory (eg. `#include
"citus_verion.h"`)
 - Columnar includes (eg. `#include "columnar/..."`)
 - Distributed includes (eg. `#include "distributed/..."`)

Because it is quite hard to understand the difference between toplevel
citus includes and toplevel postgres includes it hardcodes the list of
toplevel citus includes. In the same manner it assumes anything not
prefixed with `columnar/` or `distributed/` as a postgres include.

The sorting/grouping is enforced by CI. Since we do so with our own
script there are not changes required in our uncrustify configuration.

(cherry picked from commit 0620c8f9a6)
2024-04-17 10:26:50 +02:00
Nils Dijk 75df19b616 move pg_version_constants.h to toplevel include (#7335)
In preparation of sorting and grouping all includes we wanted to move
this file to the toplevel includes for good grouping/sorting.

(cherry picked from commit 0dac63afc0)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio a0151aa31d
Greatly speed up "\d tablename" on servers with many tables (#7577)
DESCRIPTION: Fix performance issue when using "\d tablename" on a server
with many tables

We introduce a filter to every query on pg_class to automatically remove
shards. This is useful to make sure \d and PgAdmin are not cluttered
with shards. However, the way we were introducing this filter was using
`securityQuals` which can have negative impact on query performance.

On clusters with 100k+ tables this could cause a simple "\d tablename"
command to take multiple seconds, because a skipped optimization by
Postgres causes a full table scan. This changes the code to introduce
this filter in the regular `quals` list instead of in `securityQuals`.
Which causes Postgres to use the intended optimization again.

For reference, this was initially reported as a Postgres issue by me:

https://www.postgresql.org/message-id/flat/4189982.1712785863%40sss.pgh.pa.us#b87421293b362d581ea8677e3bfea920
2024-04-16 17:26:12 +02:00