Commit Graph

3387 Commits (e2a03ce6a03d18ff26b1409ef1d5f96a0ba9ebf6)

Author SHA1 Message Date
eaydingol e2a03ce6a0 Fix: store the previous shard cost for order verification (#7550)
Store the previous shard cost so that the invariant checking performs as
expected.
2025-01-12 23:43:53 +03:00
Filip Sedlák 657575114c Fail early when shard can't be safely moved to a new node (#7467)
DESCRIPTION: citus_move_shard_placement now fails early when shard
cannot be safely moved

The implementation is quite simplistic -
`citus_move_shard_placement(...)` will fail with an error if there's any
new node in the cluster that doesn't have reference tables yet.

It could have been finer-grained, i.e. erroring only when trying to move
a shard to an unitialized node. Looking at the related functions -
`replicate_reference_tables()` or `citus_rebalance_start()`, I think
it's acceptable behaviour. These other functions also treat "any"
unitialized node as a temporary anomaly.

Fixes #7426

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2025-01-12 22:54:09 +03:00
eaydingol ddef97212a Generate qualified relation name (#7427)
This change refactors the code by using generate_qualified_relation_name
from id instead of using a sequence of functions to generate the
relation name.

Fixes #6602

(cherry picked from commit ee11492a0e)
2025-01-12 22:08:41 +03:00
zhjwpku f42e8556ec [performance improvement] remove duplicate LoadShardList call (#7380)
LoadShardList is called twice, which is not neccessary, and there is no
need to sort the shard placement list since we only want to know the list
length.

(cherry picked from commit 8e979f7ac6)
2025-01-12 22:08:03 +03:00
Karina 01945bd3a3 Fix getting heap tuple size (#7387)
This fixes #7230.

First of all, using HeapTupleHeaderGetDatumLength(heapTuple) is
definetly wrong, it gives a number that's 4 times less than the correct
tuple size (heapTuple.t_len). See

https://github.com/postgres/postgres/blob/REL_16_0/src/include/access/htup_details.h#L455-L456

https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L279

https://github.com/postgres/postgres/blob/REL_16_0/src/include/varatt.h#L225-L226

When I fixed it, the limit_intermediate_size test failed, so I tried to
understand what's going on there. In original commit fd546cf these
queries were supposed to fail. Then in b3af63c three of the queries that
were supposed to fail suddenly worked and tests were changed to pass
without understanding why the output had changed or how to keep test
testing what it had to test. Even comments saying that these queries
should fail were left untouched. Commit message gives no clue about why
exactly test has changed:

> It seems that when we use adaptive executor instead of task tracker,
we
> exceed the intermediate result size less in the test. Therefore
updated
> the tests accordingly.

Then 3fda2c3 also blindly raised the limit for one of the queries to
keep it working:

3fda2c3254 (diff-a9b7b617f9dfd345318cb8987d5897143ca1b723c87b81049bbadd94dcc86570R19)

When in fe3caf3 that HeapTupleHeaderGetDatumLength(heapTuple) call was
finally added, one of those test queries became failing again.

The other two of them now also failing after the fix. I don't understand
how exactly the calculation of "intermediate result size" that is
limited by citus.max_intermediate_result_size had changed through
b3af63c and fe3caf3, but these numbers are now closer to what
they originally were when this limitation was added in
fd546cf. So these queries should fail, like in the original
version of the limit_intermediate_size test.

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
(cherry picked from commit 20dc58cf5d)
2025-01-12 22:06:46 +03:00
Cédric Villemain 1a8349aeaf Fix #7242, CALL(@0) crash backend (#7288)
When executing a prepared CALL, which is not pure SQL but available with
some drivers like npgsql and jpgdbc, Citus entered a code path where a
plan is not defined, while trying to increase its cost. Thus SIG11 when
plan is a NULL pointer.

Fix by only increasing plan cost when plan is not null.

However, it is a bit suspicious to get here with a NULL plan and maybe a
better change will be to not call
ShardPlacementForFunctionColocatedWithDistTable() with a NULL plan at
all (in call.c:134)

bug hit with for example:
```
CallableStatement proc = con.prepareCall("{CALL p(?)}");
proc.registerOutParameter(1, java.sql.Types.BIGINT);
proc.setInt(1, -100);
proc.execute();
```

where `p(bigint)` is a distributed "function" and the param the
distribution key (also in a distributed table), see #7242 for details

Fixes #7242

(cherry picked from commit 0678a2fd89)
2025-01-12 22:06:08 +03:00
Emel Şimşek f18c21c2b4 Send keepalive messages in split decoder periodically to avoid wal receiver timeouts during large shard splits. (#7229)
DESCRIPTION: Send keepalive messages during the logical replication
phase of large shard splits to avoid timeouts.

During the logical replication part of the shard split process, split
decoder filters out the wal records produced by the initial copy. If the
number of wal records is big, then split decoder ends up processing for
a long time before sending out any wal records through pgoutput. Hence
the wal receiver may time out and restarts repeatedly causing our split
driver code catch up logic to fail.

Notes:

1. If the wal_receiver_timeout is set to a very small number e.g. 600ms,
it may time out before receiving the keepalives. My tests show that this
code works best when the` wal_receiver_timeout `is set to 1minute, which
is the default value.

2. Once a logical replication worker time outs, a new one gets launched.
The new logical replication worker sets the pg_stat_subscription columns
to initial values. E.g. the latest_end_lsn is set to 0. Our driver logic
in `WaitForGroupedLogicalRepTargetsToCatchUp` can not handle LSN value
to go back. This is the main reason for it to get stuck in the infinite
loop.

(cherry picked from commit e9035f6d32)
2025-01-12 22:05:33 +03:00
Naisila Puka 3e924db90a
Propagate MERGE ... WHEN NOT MATCHED BY SOURCE (#7807)
DESCRIPTION: Propagates MERGE ... WHEN NOT MATCHED BY SOURCE

It seems like there is not much needed to be done here.
`get_merge_query_def` from `ruleutils_17` is updated with "WHEN NOT
MATCHED BY SOURCE" therefore `deparse_shard_query` parses the merge
query for execution on the shard correctly.

Relevant PG commit:
https://github.com/postgres/postgres/commit/0294df2f1
2025-01-09 00:03:06 +03:00
Naisila Puka 9bad33f89d
PG17 - Propagate EXPLAIN options: MEMORY and SERIALIZE (#7802)
DESCRIPTION: Propagates MEMORY and SERIALIZE options of EXPLAIN

The options for `MEMORY` can be true or false. Default is false.
The options for `SERIALIZE` can be none, text or binary. Default is
none.

I referred to how we added support for WAL option in this PR [Support
EXPLAIN(ANALYZE, WAL)](https://github.com/citusdata/citus/pull/4196).
For the tests however, I used the same tests as Postgres, not like the
tests in the WAL PR. I used exactly the same tests as Postgres does, I
simply distributed the table beforehand. See below the relevant Postgres
commits from where you can see the tests added as well:
- [Add EXPLAIN
(MEMORY)](https://github.com/postgres/postgres/commit/5de890e36)
- [Invent SERIALIZE option for
EXPLAIN.](https://github.com/postgres/postgres/commit/06286709e)

This PR required a lot of copying of Postgres static functions regarding
how `EXPLAIN` works for `MEMORY` and `SERIALIZE` options. Specifically,
these copy-pastes were required for updating `ExplainWorkerPlan()`
function, which is in fact based on postgres' `ExplainOnePlan()`:
```C
/* copied from explain.c to update ExplainWorkerPlan() in citus according to ExplainOnePlan() in postgres */
#define BYTES_TO_KILOBYTES(b)
typedef struct SerializeMetrics
static bool peek_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_buffer_usage(ExplainState *es, const BufferUsage *usage);
static void show_memory_counters(ExplainState *es, const MemoryContextCounters *mem_counters);
static void ExplainIndentText(ExplainState *es);
static void ExplainPrintSerialize(ExplainState *es, SerializeMetrics *metrics);
static SerializeMetrics GetSerializationMetrics(DestReceiver *dest);
```

_Note_: it looks like we were missing some `buffers` option details as
well. I put them together with the memory option, like the code in
Postgres explain.c, as I didn't want to change the copied code. However,
I tested locally and there is no big deal in previous Citus versions,
and you can also see that existing Citus tests with `buffers true`
didn't change. Therefore, I prefer not to backport "buffers" changes to
previous versions.
2025-01-02 12:32:36 +03:00
Naisila Puka 6d2a329da6 EXPLAIN generic_plan NOT supported in Citus (#7825)
We thought we provided support for this in

b8c493f2c4

However the use of parameters in SQL is not supported in Citus. Since
generic plan queries use parameters, we can't support for now.

Relevant PG16 commit https://github.com/postgres/postgres/commit/3c05284

Fixes #7813 with proper error message

(cherry picked from commit 0a6adf4ccc)
2025-01-02 10:53:36 +03:00
Naisila Puka 80678bb07e
Allow configuring sslnegotiation using citus.node_conn_info (#7821)
Relevant PG commit:
https://github.com/postgres/postgres/commit/d39a49c1e

PR similar to https://github.com/citusdata/citus/pull/5203
2024-12-30 21:25:50 +03:00
Naisila Puka 7dc97d3dc9
Disallow infinite values for partition interval in create_time_partitions udf (#7822)
PG17 added +/- infinity values for the interval data type
Relevant PG commit:
https://github.com/postgres/postgres/commit/519fc1bd9
2024-12-30 20:27:28 +03:00
Naisila Puka 9e3c3297fc
Adds JSON_TABLE() support, and SQL/JSON constructor/query functions tests (#7816)
DESCRIPTION: Adds JSON_TABLE() support

PG17 has added basic `JSON_TABLE()` functionality
`JSON_TABLE()` allows `JSON` data to be converted into a relational view
and thus used, for example, in a `FROM` clause, like other tabular data.

We treat `JSON_TABLE` the same as correlated functions (e.g., recurring
tuples). In the end, for multi-shard `JSON_TABLE` commands, we apply the
same restrictions as reference tables (e.g., cannot perform a lateral
outer join when a distributed subquery references a (reference
table)/(json table) etc.)

Relevant PG17 commits:
[basic JSON
table](https://github.com/postgres/postgres/commit/de3600452), [nested
paths in json
table](https://github.com/postgres/postgres/commit/bb766cde6)

Onder had previously added json table support for PG15BETA1, but we
reverted that commit because json table was reverted in PG15.
ce7f1a530f
Previous relevant PG15Beta1 commit:
https://github.com/postgres/postgres/commit/4e34747c8
Therefore, I referred to Onder's commit for this commit as well, with a
few changes due to some differences between PG15/PG17:

1) In PG15Beta1, we had also `PLAN` clauses for `JSON_TABLE`
https://github.com/postgres/postgres/commit/fadb48b00, and Onder's
commit includes tests for those as well. However, `PLAN` nodes are _not_
added in PG17. Therefore, I didn't include the `json_table_select_only`
test, which had mostly queries involving `PLAN`. I only included the
last query from that test.

2) In PG15 timeline (Citus 11.1), we didn't support outer joins where
the outer rel is a recurring one and the inner one is a non-recurring
one. However, [Onur added support for that one in Citus
11.2](https://github.com/citusdata/citus/pull/6512), therefore I updated
the tests from Onder's commit accordingly.

3) PG17 json table has nested paths and columns, therefore I added a
test
with a distributed table, which is exactly the same as the one in
sqljson_jsontable in PG17.
https://github.com/postgres/postgres/commit/bb766cde6

This pull request also adds some basic tests on validation of SQL/JSON
constructor functions JSON(), JSON_SCALAR(), and JSON_SERIALIZE(),
and also SQL/JSON query functions JSON_EXISTS(), JSON_QUERY(), and
JSON_VALUE(). The relevant PG commits are the following:
[JSON(), JSON_SCALAR(),
JSON_SERIALIZE()](https://github.com/postgres/postgres/commit/03734a7fe)
[JSON_EXISTS(), JSON_VALUE(),
JSON_QUERY()](https://github.com/postgres/postgres/commit/6185c9737)
2024-12-30 19:19:07 +03:00
Mehmet YILMAZ 1a3316281c
Error out for ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION (#7814)
PG17 added support for
ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION.
Relevant PG commit: https://github.com/postgres/postgres/commit/5d06e99a3

We currently don't support propagating this command for Citus tables.
It is added to future work.

This PR disallows `ALTER TABLE ... ALTER COLUMN ... SET EXPRESSION` on
all Citus table types (local, distributed, and partitioned distributed)
by adding an error check in `ErrorIfUnsupportedAlterTableStmt`. A new
regression test verifies that each table type fails with a consistent
error message when attempting to set an expression.
2024-12-27 16:02:12 +03:00
Mehmet YILMAZ 62a00b1b34
Error out for ALTER TABLE ... SET ACCESS METHOD DEFAULT (#7803)
PG17 introduced ALTER TABLE ... SET ACCESS METHOD DEFAULT

This PR introduces and enforces an error check preventing ALTER TABLE
... SET ACCESS METHOD DEFAULT on both Citus local tables (added via
citus_add_local_table_to_metadata) and distributed/partitioned
distributed tables. The regression tests now demonstrate that each table
type raises an error advising users to explicitly specify an access
method, rather than relying on DEFAULT. This ensures consistent behavior
across local and distributed environments in Citus.

The reason why we currently don't support this is that we can't simply
propagate the command as it is, because the default table access method
may be different across Citus cluster nodes.

Relevant PG commit:
https://github.com/postgres/postgres/commit/d61a6cad6
2024-12-27 15:07:38 +03:00
Naisila Puka d682bf06f7
Error for COPY FROM ... on_error, log_verbosity with Citus tables (#7811)
PG17 added the new ON_ERROR option for COPY FROM. When this option is
specified, COPY skips soft errors and
continues copying.
Relevant PG commits:
-- https://github.com/postgres/postgres/commit/9e2d87011
-- https://github.com/postgres/postgres/commit/b725b7eec

I tried it locally with Citus tables.
Without further implementation, it doesn't work correctly.
Therefore, we error out for now, and add it to future work.

PG17 also added log_verbosity option, which controls the
 amount of messages emitted during processing. This is
 currently used in COPY FROM when ON_ERROR option is set to
 ignore. Therefore, we error out for this option as well.
Relevant PG17 commit:
https://github.com/postgres/postgres/commit/f5a227895
2024-12-26 15:29:44 +03:00
Naisila Puka 6fed917a53
Adds PG17.1 support - Regression tests sanity (#7661)
This is the final commit that adds
PG17 compatibility with Citus's current capabilities.

You can use Citus community, release-13.0 branch, with PG17.1.

---------

Specifically, this commit:

- Enables PG17 in the configure script.

- Adds PG17 tests to CI using test images that have 17.1

- Fixes an upgrade test: see below for details
In `citus_prepare_upgrade()`, don't drop any_value when upgrading from
PG16+, because PG16+ has its own any_value function. Attempting to do so
results in the error seen in [pg16-pg17
upgrade](https://github.com/citusdata/citus/actions/runs/11768444117/job/32778340003?pr=7661):
```
ERROR:  cannot drop function any_value(anyelement) because it is required by the database system
CONTEXT:  SQL statement "DROP AGGREGATE IF EXISTS pg_catalog.any_value(anyelement)"
```
When 16 becomes the minimum supported Postgres version, the drop
statements can be removed.

---------

Several PG17 Compatibility commits have been merged before this final one.
All these subtasks are done https://github.com/citusdata/citus/issues/7653

See the list below:

Compilation PR: https://github.com/citusdata/citus/pull/7699
Ruleutils PR: https://github.com/citusdata/citus/pull/7725
Sister PR for tests: https://github.com/citusdata/the-process/pull/159

Helpful smaller PRs:
- https://github.com/citusdata/citus/pull/7714
- https://github.com/citusdata/citus/pull/7726
- https://github.com/citusdata/citus/pull/7731
- https://github.com/citusdata/citus/pull/7732
- https://github.com/citusdata/citus/pull/7733
- https://github.com/citusdata/citus/pull/7738
- https://github.com/citusdata/citus/pull/7745
- https://github.com/citusdata/citus/pull/7747
- https://github.com/citusdata/citus/pull/7748
- https://github.com/citusdata/citus/pull/7749
- https://github.com/citusdata/citus/pull/7752
- https://github.com/citusdata/citus/pull/7755
- https://github.com/citusdata/citus/pull/7757
- https://github.com/citusdata/citus/pull/7759
- https://github.com/citusdata/citus/pull/7760
- https://github.com/citusdata/citus/pull/7761
- https://github.com/citusdata/citus/pull/7762
- https://github.com/citusdata/citus/pull/7765
- https://github.com/citusdata/citus/pull/7766
- https://github.com/citusdata/citus/pull/7768
- https://github.com/citusdata/citus/pull/7769
- https://github.com/citusdata/citus/pull/7771
- https://github.com/citusdata/citus/pull/7774
- https://github.com/citusdata/citus/pull/7776
- https://github.com/citusdata/citus/pull/7780
- https://github.com/citusdata/citus/pull/7781
- https://github.com/citusdata/citus/pull/7785
- https://github.com/citusdata/citus/pull/7788
- https://github.com/citusdata/citus/pull/7793
- https://github.com/citusdata/citus/pull/7796

---------

Co-authored-by: Colm <colmmchugh@microsoft.com>
2024-12-24 17:56:51 +03:00
Naisila Puka 743f0ebb75
Bump Citus version into 13.0.0 (#7792)
We are using `release-13.0` branch for both development and release, to
deliver PG17 support in Citus.

Afterwards, we will (probably) merge this branch into main.

Some potential changes for main branch, after we are done working on
release-13.0:
- Merge changes from `release-13.0` to `main`
- Figure out what changes were there on 12.2, move them to 13.1 version.
In a nutshell: rename `12.1--12.2` to `13.0--13.1` and fix issues.
- Set version to 13.1devel
2024-12-24 11:40:59 +03:00
Naisila Puka 61d11ec2d4
PG17 Compatibility - Fix HideCitusDependentObjects function (#7796)
There is a crash when running vanilla tests because of the
`citus.hide_citus_dependent_objects` GUC. We turn on this GUC only for
the pg vanilla tests. This GUC runs the following function
`HideCitusDependentObjectsOnQueriesOfPgMetaTables`. This function
doesn't take into account the new `mergeJoinCondition`. I rewrote the
function such that it checks for merge join conditions as well.

Relevant PG commit:
https://github.com/postgres/postgres/commit/0294df2f1

The crash could be reproduced locally like the following:
```SQL
SET citus.hide_citus_dependent_objects TO on;

CREATE OR REPLACE FUNCTION
    pg_catalog.is_citus_depended_object(oid,oid)
    RETURNS bool
    LANGUAGE C
    AS 'citus', $$is_citus_depended_object$$;

-- try a system catalog
MERGE INTO pg_class c
USING (SELECT 'pg_depend'::regclass AS oid) AS j
ON j.oid = c.oid
WHEN MATCHED THEN
UPDATE SET reltuples = reltuples + 1
RETURNING j.oid;

CREATE VIEW classv AS SELECT * FROM pg_class;

MERGE INTO classv c
USING pg_namespace n
ON n.oid = c.relnamespace
WHEN MATCHED AND c.oid = 'pg_depend'::regclass THEN
UPDATE SET reltuples = reltuples - 1
RETURNING c.oid;
-- crash happens here
```
2024-12-20 17:59:09 +03:00
Teja Mupparti a0cd8bd37b
PG17 Compatibility: Support MERGE features in Citus with clean exceptions (#7781)
- Adapted `pgmerge.sql` tests from PostgreSQL community's `merge.sql` to
Citus by converting tables into Citus local tables.
- Identified two new PostgreSQL 17 MERGE features (`RETURNING` support
and MERGE on updatable views) not yet supported by Citus.
- Implemented changes to detect unsupported features and raise clean
exceptions, ensuring pgmerge tests pass without diffs.
- Addressed breaking changes caused by `MERGE ... WHEN NOT MATCHED BY
SOURCE` restructuring, reducing diffs in pgmerge tests.
- Segregated unsupported test cases into `merge_unsupported.sql` to
maintain clarity and avoid large diffs in test files.
- Prepared the Citus MERGE planner to handle new PostgreSQL changes,
reducing remaining test discrepancies.

All merge tests now pass cleanly, with unsupported cases clearly
isolated.

Relevant PG commits:
c649fa24a
https://github.com/postgres/postgres/commit/c649fa24a
0294df2f1
https://github.com/postgres/postgres/commit/0294df2f1
---------

Co-authored-by: naisila <nicypp@gmail.com>
2024-12-19 14:02:24 +03:00
Colm e91ee245ac
PG17 compatibility: account for identity columns in partitioned tables. (#7785)
PG17 added support for identity columns in partitioned tables:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=699586315
A consequence is that a table with an identity column cannot be attached
as a partition. But Citus on Postgres 17 will generate identity column
for the partitions if the parent table has one (or more) identity
columns when propagating distributed table DDL to worker nodes, as
happens in the `generated_identity` regress test in #7768:
```
 CREATE TABLE partitioned_table (
     a bigint CONSTRAINT myconname GENERATED BY DEFAULT AS IDENTITY (START WITH 10 INCREMENT BY 10),
     b bigint GENERATED ALWAYS AS IDENTITY (START WITH 10 INCREMENT BY 10),
     c int
 )
 PARTITION BY RANGE (c);
 CREATE TABLE partitioned_table_1_50 PARTITION OF partitioned_table FOR VALUES FROM (1) TO (50);
 CREATE TABLE partitioned_table_50_500 PARTITION OF partitioned_table FOR VALUES FROM (50) TO (1000);
 SELECT create_distributed_table('partitioned_table', 'a');
- create_distributed_table
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  table "partitioned_table_1_50" being attached contains an identity column "a"
+DETAIL:  The new partition may not contain an identity column.
```
It is the Citus-generated ATTACH PARTITION statement that errors out,
because the Citus-generated CREATE TABLE for the partitions included
identity column definitions. The fix is straightforward - when
propagating the CREATE TABLE ddl for a partition of a table with an
identity column, don't include the identity column(s), they will be
inherited on attaching the partition. In Citus on Postgres 16 (or less)
partitions do not inherit identity; the partitions in the example would
not have any identity columns so it was not an issue previously.
2024-12-18 13:18:53 +00:00
Colm 1c2e78405b
PG17 compatibility: account for MAINTAIN privilege in regress tests (#7774)
This PR addresses regress tests impacted by the introduction of [the
MAINTAIN privilege in
PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ecb0fd337).
The impacted tests include `generated_identity`,
`create_single_shard_table`, `grant_on_sequence_propagation`,
`grant_on_foreign_server_propagation`, `single_node_enterprise`,
`multi_multiuser_master_protocol`,
`multi_alter_table_row_level_security`, `shard_move_constraints` which
show the following error:
```
SELECT start_metadata_sync_to_node('localhost', :worker_2_port);
- start_metadata_sync_to_node
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  unrecognized aclright: 16384
```

and `multi_multiuser_master_protocol`, where the `pg_class.relacl`
column has 'm' for MAINTAIN if applicable:
```
        relname       |   rolname   |                           relacl                           
 ---------------------+-------------+------------------------------------------------------------
  trivial_full_access | full_access | 
- trivial_postgres    | postgres    | {postgres=arwdDxt/postgres,full_access=arwdDxt/postgres}
+ trivial_postgres    | postgres    | {postgres=arwdDxtm/postgres,full_access=arwdDxtm/postgres}
```

The PR updates function `convert_aclright_to_string()` in
citus_ruleutils.c to include a case for `ACL_MAINTAIN`. Per the comment
on `convert_aclright_to_string()` in citus_ruleutils.c, it is a copy of
`convert_aclright_to_string()` in Postgres (where it is in
`src/backend/utils/adt/acl.c`), so requires updating to be consistent
with Postgres. With this change Citus can recognize the MAINTAIN
privilege, and will not emit the `unrecognized aclright` error. The PR
also adds an alternative goldfile for `multi_multiuser_master_protocol`.

Note that `convert_aclright_to_string()` in Postgres includes access
types SET and ALTER SYSTEM on system parameters (aka GUCs), added by
[this PG16
commit](https://github.com/postgres/postgres/commit/a0ffa885e). If Citus
were to have a requirement to support granting SET and ALTER SYSTEM we
would need to update `convert_aclright_to_string()` in citus_ruleutils.c
with SET and ALTER SYSTEM.
2024-12-06 13:03:51 +00:00
Naisila Puka 90d76e8097
Fix bug: alter database shouldn't propagate for undistributed database (#7777)
Before this fix, the following would happen for a database that is not
distributed:

```sql
CREATE DATABASE db_to_test;
NOTICE:  Citus partially supports CREATE DATABASE for distributed databases
DETAIL:  Citus does not propagate CREATE DATABASE command to workers
HINT:  You can manually create a database and its extensions on workers.

ALTER DATABASE db_to_test with CONNECTION LIMIT 100;
NOTICE:  issuing ALTER DATABASE db_to_test WITH  CONNECTION LIMIT 100;
DETAIL:  on server postgres@localhost:57638 connectionId: 2
NOTICE:  issuing ALTER DATABASE db_to_test WITH  CONNECTION LIMIT 100;
DETAIL:  on server postgres@localhost:57637 connectionId: 1
ERROR:  database "db_to_test" does not exist
```

With this fix, we also take care of
https://github.com/citusdata/citus/issues/7763

Fixes #7763
2024-12-05 13:47:51 +03:00
Colm 30a75eadf1
PG17 compatibility: fix diffs in create_index, privileges vanilla tests (#7766)
PG17 regress sanity (#7653) fix; address diffs in vanilla tests
`create_index` and `privileges`. There is a change from `permission
denied` to `must be owner of`, seen in create_index:
```
@@ -2970,21 +2970,21 @@
 REINDEX TABLE pg_toast.pg_toast_1260;
 ERROR:  permission denied for table pg_toast_1260
 REINDEX INDEX pg_toast.pg_toast_1260_index;
-ERROR:  permission denied for index pg_toast_1260_index
+ERROR:  must be owner of index pg_toast_1260_index
```
and privileges:
```
@@ -2945,41 +2945,43 @@
ERROR:  permission denied for table maintain_test
 REINDEX INDEX maintain_test_a_idx;
-ERROR:  permission denied for index maintain_test_a_idx
+ERROR:  must be owner of index maintain_test_a_idx
 REINDEX SCHEMA reindex_test;

 REINDEX INDEX maintain_test_a_idx;
+ERROR:  must be owner of index maintain_test_a_idx
 REINDEX SCHEMA reindex_test;
```

The fix updates function `RangeVarCallbackForReindexIndex()` in
`index.c` with changes made by the introduction of the [MAINTAIN
privilege in
PG17](https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ecb0fd337)
to the function `RangeVarCallbackForReindexIndex()` in `indexcmds.c`.
The code is under a Postgres 17 version directive, which can be removed
when 17 becomes the oldest supported Postgres version.
2024-12-05 10:03:28 +00:00
Naisila Puka ed137001a5
PG17 compatibility: add COLLPROVIDER_BUILTIN option and fix tests (#7752)
In PG17 adds builtin C.UTF-8 locale option, we add it in the code to
avoid "unknown collation provider" in vanilla tests.

Relevant PG commit:

f69319f2f1
f69319f2f1fb16eda4b535bcccec90dff3a6795e

Also in PG17, colliculocale, daticulocale renamed to colllocale,
datlocale
Here we fix the following tests to avoid alternative output
pg15 pg16 multi_mx_create_table multi_schema_support

Relevant PG commit:

f696c0cd5f
f696c0cd5f299f1b51e214efc55a22a782cc175d
2024-11-19 12:26:45 +03:00
Naisila Puka a93887b3de
PG17 compatibility: Check whether table AM is default (#7747)
PG 17 added support for DEFAULT in ALTER TABLE .. SET ACCESS METHOD

Relevant PG commit:
d61a6cad6418f643a5773352038d0dfe5d3535b8
d61a6cad64

In that case, name in `AlterTableCmd->name` would be null.
Add a null check here to avoid crash.
2024-11-18 18:09:43 +03:00
Naisila Puka 84b52fc908
PG17 compatibility - Check if there are blocks left in columnar_scan_analyze_next_block (#7738)
In PG17, the outer loop in `acquire_sample_rows()` changed
from
`while (BlockSampler_HasMore(&bs))`
to
`while (table_scan_analyze_next_block(scan, stream))`

Relevant PG commit:
041b96802efa33d2bc9456f2ad946976b92b5ae1

041b96802e

It is expected that the `scan_analyze_next_block` function will
check if there are any blocks left. So we add that check in
`columnar_scan_analyze_next_block`

Without this fix, we will have an indefinite loop causing timeout.
Specifically, in our test schedules,
`multi schedule` stuck at `drop_column_partitioned_table` test
`multi-mx` schedule stuck at `start_stop_metadata_sync` test
`columnar schedule` stuck at `columnar_create` test
2024-11-18 17:27:49 +03:00
Naisila Puka c0a5f5c78c
PG17 compatibility: ruleutils (#7725)
PG17 compatibility - Part 2

https://github.com/citusdata/citus/pull/7699 was the first PG17
compatibility PR merged to main branch, which provided ONLY successful
Citus compilation with PG17.0.

This PR, consider it as Part 2, provides ruleutils changes for PG17.
Ruleutils changes is the first thing we should merge, after successful
build. It's the core for deparsing logic in Citus.

# Question: How do we add ruleutils changes?
- We add a new ruleutils file specific to PG17.
- We keep track of the changes in Postgres's ruleutils file from here
https://github.com/postgres/postgres/commits/REL_17_0/src/backend/utils/adt/ruleutils.c
- Per each commit in that history that belongs only to 17.0, we add the
relevant changes to static functions to our ruleutils file for PG17.
It's like a manual commit copying.

# Check the PR's commits for detailed steps
https://github.com/citusdata/citus/pull/7725/commits
2024-11-11 11:55:10 +03:00
Naisila Puka da2624cee8
PG17 compatibility: Resolve compilation issues (#7699)
This PR provides successful compilation against PG17.0.

- Remove ExecFreeExprContext call
Relevant PG commit
d060e921ea5aa47b6265174c32e1128cebdbc3df
d060e921ea

- PG17 uses streaming IO in analyze, fix scan_analyze_next_block function
Relevant PG commit
041b96802efa33d2bc9456f2ad946976b92b5ae1
041b96802e

- Define ObjectClass for PG17+ only since it's removed
Relevant PG commit:
89e5ef7e21812916c9cf9fcf56e45f0f74034656
89e5ef7e21

- Remove ReorderBufferTupleBuf structure.
Relevant PG commit:
08e6344fd6423210b339e92c069bb979ba4e7cd6
08e6344fd6

- Define colliculocale and daticulocale since they have been renamed
Relevant PG commit:
f696c0cd5f299f1b51e214efc55a22a782cc175d
f696c0cd5f

- makeStringConst defined in PG17
Relevant PG commit:
de3600452b61d1bc3967e9e37e86db8956c8f577
de3600452b

- RangeVarCallbackOwnsTable was replaced by RangeVarCallbackMaintainsTable
Relevant PG commit:
ecb0fd33720fab91df1207e85704f382f55e1eb7
ecb0fd3372

- attstattarget is nullable, define pg compatible functions for it
Relevant PG commit:
4f622503d6de975ac87448aea5cea7de4bc140d5
4f622503d6

- stxstattarget is nullable in PG17, write compat functions for it
Relevant PG commit:
012460ee93c304fbc7220e5b55d9d0577fc766ab
012460ee93

- Use ResourceOwner to track WaitEventSet in PG17
Relevant PG commit:
50c67c2019ab9ade8aa8768bfe604cd802fe8591
50c67c2019

- getIdentitySequence now uses Relation instead of relation_id
Relevant PG commit:
509199587df73f06eda898ae13284292f4ae573a
509199587d

- Remove no-op tuplestore_donestoring function
Relevant PG commit:
75680c3d805e2323cd437ac567f0677fdfc7b680
75680c3d80

- MergeAction can have 3 merge kinds (now enum) in PG17, write compat
Relevant PG commit:
0294df2f1f842dfb0eed79007b21016f486a3c6c
0294df2f1f

- EXPLAIN (MEMORY) is added, make changes to ExplainOnePlan
Relevant PG commit:
5de890e3610d5a12cdaea36413d967cf5c544e20
5de890e361

- LIMIT_OPTION_DEFAULT has been removed as it's useless, use LIMIT_OPTION_COUNT
Relevant PG commit:
a6be0600ac3b71dda8277ab0fcbe59ee101ac1ce
a6be0600ac

- write compat for create_foreignscan_path bcs of more arguments in PG17
Relevant PG commit:
9e9931d2bf40e2fea447d779c2e133c2c1256ef3
9e9931d2bf

- pgprocno and lxid have been combined into a struct in PGPROC
Relevant PG commits:
28f3915b73f75bd1b50ba070f56b34241fe53fd1
28f3915b73

ab355e3a88de745607f6dd4c21f0119b5c68f2ad
ab355e3a88

024c521117579a6d356050ad3d78fdc95e44eefa
024c521117

- Simplify CitusNewNode (#7434)
postgres refactored newNode() in PG 17, the main point for doing this is
the original tricks is no longer neccessary for modern compilers[1].
This does the same for Citus.
This should have no backward compatibility issues since it just replaces
palloc0fast with palloc0.
This is good for forward compatibility since palloc0fast no longer
exists in PG 17.
[1]
https://www.postgresql.org/message-id/b51f1fa7-7e6a-4ecc-936d-90a8a1659e7c@iki.fi
(cherry picked from commit 4b295cc)
2024-10-17 15:37:13 +03:00
Naisila Puka 9d364332ac
Rename foreach_ macros to foreach_declared_ macros (#7700)
This is prep work for successful compilation with PG17

PG17added foreach_ptr, foreach_int and foreach_oid macros
Relevant PG commit
14dd0f27d7cd56ffae9ecdbe324965073d01a9ff

14dd0f27d7

We already have these macros, but they are different with the
PG17 ones because our macros take a DECLARED variable, whereas
the PG16 macros declare a locally-scoped loop variable themselves.

Hence I am renaming our macros to foreach_declared_

I am separating this into its own PR since it touches many files. The
main compilation PR is https://github.com/citusdata/citus/pull/7699
2024-10-16 17:01:39 +03:00
Parag Jain 6349f2d52d
Support MERGE command for single_shard_distributed Target (#7643)
This PR has following changes :
1. Enable MERGE command for single_shard_distributed targets.

(cherry picked from commit 3c467e6e02)
2024-07-17 15:11:38 +03:00
Jelte Fennema-Nio 4f0053ed6d Redo #7620: Fix merge command when insert value does not have source distributed column (#7627)
Related to issue #7619, #7620
Merge command fails when source query is single sharded and source and
target are co-located and insert is not using distribution key of
source.

Example
```
CREATE TABLE source (id integer);
CREATE TABLE target (id integer );

-- let's distribute both table on id field
SELECT create_distributed_table('source', 'id');
SELECT create_distributed_table('target', 'id');

MERGE INTO target t
  USING ( SELECT 1 AS somekey
          FROM source
        WHERE source.id = 1) s
  ON t.id = s.somekey
  WHEN NOT MATCHED
  THEN INSERT (id)
    VALUES (s.somekey)

ERROR:  MERGE INSERT must use the source table distribution column value
HINT:  MERGE INSERT must use the source table distribution column value
```

Author's Opinion: If join is not between source and target distributed
column, we should not force user to use source distributed column while
inserting value of target distributed column.

Fix: If user is not using distributed key of source for insertion let's
not push down query to workers and don't force user to use source
distributed column if it is not part of join.

This reverts commit fa4fc0b372.

Co-authored-by: paragjain <paragjain@microsoft.com>
(cherry picked from commit aaaf637a6b)
2024-06-18 16:49:39 +02:00
Gürkan İndibay 4e838a471a
Adds null check for node in HasRangeTableRef (#7604)
DESCRIPTION: Adds null check for node in HasRangeTableRef to prevent
errors

When executing the query below, users encountered an error due to a null
Node object. This PR adds a null check to handle this error.

Query:
```sql
select
    ct.conname as constraint_name,
    a.attname as column_name,
    fc.relname as foreign_table_name,
    fns.nspname as foreign_table_schema,
    fa.attname as foreign_column_name
from
    (SELECT ct.conname, ct.conrelid, ct.confrelid, ct.conkey, ct.contype,
ct.confkey, generate_subscripts(ct.conkey, 1) AS s
       FROM pg_constraint ct
    ) AS ct
    inner join pg_class c on c.oid=ct.conrelid
    inner join pg_namespace ns on c.relnamespace=ns.oid
    inner join pg_attribute a on a.attrelid=ct.conrelid and a.attnum =
ct.conkey[ct.s]
    left join pg_class fc on fc.oid=ct.confrelid
    left join pg_namespace fns on fc.relnamespace=fns.oid
    left join pg_attribute fa on fa.attrelid=ct.confrelid and fa.attnum =
ct.confkey[ct.s]
where
    ct.contype='f'
    and c.relname='table1'
    and ns.nspname='schemauser'
order by
    fns.nspname, fc.relname, a.attnum
;
```

Error:
```
#0  HasRangeTableRef (node=0x0, varno=varno@entry=0x7ffe18cc3674) at worker/worker_shard_visibility.c:507
507             if (IsA(node, RangeTblRef))
#0  HasRangeTableRef (node=0x0, varno=varno@entry=0x7ffe18cc3674) at worker/worker_shard_visibility.c:507
#1  0x0000561b0aae390e in expression_tree_walker_impl (node=0x561b0d19cc78, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2091
#2  0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#3  0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x561b0d19cd68, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674)
    at nodeFuncs.c:2405
#4  0x0000561b0aae3945 in expression_tree_walker_impl (node=0x561b0d19d0f8, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2111
#5  0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#6  0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x561b0d19cb38, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674)
    at nodeFuncs.c:2405
#7  0x0000561b0aae396d in expression_tree_walker_impl (node=0x561b0d19d198, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2127
#8  0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#9  0x0000561b0aae3ef7 in expression_tree_walker_impl (node=0x561b0d183e88, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2464
#10 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#11 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184278, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2460
#12 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#13 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184668, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2460
#14 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#15 0x0000561b0aae3ed3 in expression_tree_walker_impl (node=0x561b0d184f68, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=0x7ffe18cc3674)
    at nodeFuncs.c:2460
#16 0x00007f2a73249f26 in HasRangeTableRef (node=<optimized out>, varno=<optimized out>) at worker/worker_shard_visibility.c:513
#17 0x0000561b0aae3e09 in expression_tree_walker_impl (node=0x7f2a68010148, walker=walker@entry=0x7f2a73249f0a <HasRangeTableRef>, context=context@entry=0x7ffe18cc3674)
    at nodeFuncs.c:2405
#18 0x00007f2a7324a0eb in FilterShardsFromPgclass (node=node@entry=0x561b0d185de8, context=context@entry=0x0) at worker/worker_shard_visibility.c:464
#19 0x00007f2a7324a5ff in HideShardsFromSomeApplications (query=query@entry=0x561b0d185de8) at worker/worker_shard_visibility.c:294
#20 0x00007f2a731ed7ac in distributed_planner (parse=0x561b0d185de8, 
    query_string=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=<optimized out>, boundParams=0x0) at planner/distributed_planner.c:237
#21 0x00007f2a7311a52a in pgss_planner (parse=0x561b0d185de8, 
    query_string=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=2048, boundParams=0x0) at pg_stat_statements.c:953
#22 0x0000561b0ab65465 in planner (parse=parse@entry=0x561b0d185de8, 
    query_string=query_string@entry=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0)
    at planner.c:279
#23 0x0000561b0ac53aa3 in pg_plan_query (querytree=querytree@entry=0x561b0d185de8, 
    query_string=query_string@entry=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0)
    at postgres.c:904
#24 0x0000561b0ac53b71 in pg_plan_queries (querytrees=0x7f2a68012878, 
    query_string=query_string@entry=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0)
    at postgres.c:996
#25 0x0000561b0ac5408e in exec_simple_query (
    query_string=query_string@entry=0x561b0d009478 "select\n    ct.conname as constraint_name,\n    a.attname as column_name,\n    fc.relname as foreign_table_name,\n    fns.nspname as foreign_table_schema,\n    fa.attname as foreign_column_name\nfrom\n    (S"...) at postgres.c:1193
#26 0x0000561b0ac56116 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4637
#27 0x0000561b0abab7a7 in BackendRun (port=port@entry=0x561b0d0caf50) at postmaster.c:4464
#28 0x0000561b0abae969 in BackendStartup (port=port@entry=0x561b0d0caf50) at postmaster.c:4192
#29 0x0000561b0abaeaa6 in ServerLoop () at postmaster.c:1782
```


Fixes #7603
2024-05-28 08:54:40 +03:00
Jelte Fennema-Nio bac95cc523 Greatly speed up "\d tablename" on servers with many tables (#7577)
DESCRIPTION: Fix performance issue when using "\d tablename" on a server
with many tables

We introduce a filter to every query on pg_class to automatically remove
shards. This is useful to make sure \d and PgAdmin are not cluttered
with shards. However, the way we were introducing this filter was using
`securityQuals` which can have negative impact on query performance.

On clusters with 100k+ tables this could cause a simple "\d tablename"
command to take multiple seconds, because a skipped optimization by
Postgres causes a full table scan. This changes the code to introduce
this filter in the regular `quals` list instead of in `securityQuals`.
Which causes Postgres to use the intended optimization again.

For reference, this was initially reported as a Postgres issue by me:

https://www.postgresql.org/message-id/flat/4189982.1712785863%40sss.pgh.pa.us#b87421293b362d581ea8677e3bfea920
(cherry picked from commit a0151aa31d)
2024-04-17 10:26:50 +02:00
Xing Guo 38967491ef Add missing volatile qualifier. (#7570)
Variables being modified in the PG_TRY block and read in the PG_CATCH
block should be qualified with volatile.

The variable waitEventSet is modified in the PG_TRY block (line 1085)
and read in the PG_CATCH block (line 1095).

The variable relation is modified in the PG_TRY block (line 500) and
read in the PG_CATCH block (line 515).

Besides, the variable objectAddress doesn't need the volatile qualifier.

Ref: C99 7.13.2.1[^1],

> All accessible objects have values, and all other components of the
abstract machine have state, as of the time the longjmp function was
called, except that the values of objects of automatic storage duration
that are local to the function containing the invocation of the
corresponding setjmp macro that do not have volatile-qualified type and
have been changed between the setjmp invocation and longjmp call are
indeterminate.

[^1]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

DESCRIPTION: Correctly mark some variables as volatile

---------

Co-authored-by: Hong Yi <zouzou0208@gmail.com>
(cherry picked from commit ada3ba2507)
2024-04-17 10:26:50 +02:00
Filip Sedlák fc09e1cfdc Log username in the failed connection message (#7432)
This patch includes the username in the reported error message.
This makes debugging easier when certain commands open connections
as other users than the user that is executing the command.

```
monitora_snapshot=# SELECT citus_move_shard_placement(102030, 'monitora.db-dev-worker-a', 6005, 'monitora.db-dev-worker-a', 6017);
ERROR:  connection to the remote node monitora_user@monitora.db-dev-worker-a:6017 failed with the following error: fe_sendauth: no password supplied
Time: 40,198 ms
```

(cherry picked from commit 8b48d6ab02)
2024-04-17 10:26:50 +02:00
sminux 5708fca1ea fix bad copy-paste rightComparisonLimit (#7547)
DESCRIPTION: change for #7543
(cherry picked from commit d59c93bc50)
2024-04-17 10:26:50 +02:00
LightDB Enterprise Postgres 2a6164d2d9 Fix timeout when underlying socket is changed in a MultiConnection (#7377)
When there are multiple localhost entries in /etc/hosts like following
/etc/hosts:
```
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1   localhost
```

multi_cluster_management check will failed:
```

@@ -857,20 +857,21 @@
 ERROR:  group 14 already has a primary node
 -- check that you can add secondaries and unavailable nodes to a group
 SELECT groupid AS worker_2_group FROM pg_dist_node WHERE nodeport = :worker_2_port \gset
 SELECT 1 FROM master_add_node('localhost', 9998, groupid => :worker_1_group, noderole => 'secondary');
  ?column?
 ----------
         1
 (1 row)

 SELECT 1 FROM master_add_node('localhost', 9997, groupid => :worker_1_group, noderole => 'unavailable');
+WARNING:  could not establish connection after 5000 ms
  ?column?
 ----------
         1
 (1 row)
```

This actually isn't just a problem in test environments, but could occur
as well during actual usage when a hostname in pg_dist_node
resolves to multiple IPs and one of those IPs is unreachable.
Postgres will then automatically continue with the next IP, but
Citus should listen for events on the new socket. Not on the
old one.

Co-authored-by: chuhx43211 <chuhx43211@hundsun.com>
(cherry picked from commit 9a91136a3d)
2024-04-17 10:26:50 +02:00
eaydingol db391c0bb7 Change the order in which the locks are acquired (#7542)
This PR changes the order in which the locks are acquired (for the
target and reference tables), when a modify request is initiated from a
worker node that is not the "FirstWorkerNode".

To prevent concurrent writes, locks are acquired on the first worker
node for the replicated tables. When the update statement originates
from the first worker node, it acquires the lock on the reference
table(s) first, followed by the target table(s). However, if the update
statement is initiated in another worker node, the lock requests are
sent to the first worker in a different order. This PR unifies the
modification order on the first worker node. With the third commit,
independent of the node that received the request, the locks are
acquired for the modified table and then the reference tables on the
first node.

The first commit shows a sample output for the test prior to the fix.

Fixes #7477

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
(cherry picked from commit 8afa2d0386)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio 146725fc9b Replace more spurious strdups with pstrdups (#7441)
DESCRIPTION: Remove a few small memory leaks

In #7440 one instance of a strdup was removed. But there were a few
more. This removes the ones that are left over, or adds a comment why
strdup is on purpose.

(cherry picked from commit 9683bef2ec)
2024-04-17 10:26:50 +02:00
Marco Slot 94ab1dc240 Replace spurious strdup with pstrdup (#7440)
Not sure why we never found this using valgrind, but using strdup will
cause memory leaks because the pointer is not tracked in a memory
context.

(cherry picked from commit 72fbea20c4)
2024-04-17 10:26:50 +02:00
Onur Tirtir 812a2b759f Improve error message for recursive CTEs (#7407)
Fixes #2870

(cherry picked from commit 5aedec4242)
2024-04-17 10:26:50 +02:00
Onur Tirtir 452564c19b Allow providing "host" parameter via citus.node_conninfo (#7541)
And when that is the case, directly use it as "host" parameter for the
connections between nodes and use the "hostname" provided in
pg_dist_node / pg_dist_poolinfo as "hostaddr" to avoid host name lookup.

This is to avoid allowing dns resolution (and / or setting up DNS names
for each host in the cluster). This already works currently when using
IPs in the hostname. The only use of setting host is that you can then
use sslmode=verify-full and it will validate that the hostname matches
the certificate provided by the node you're connecting too.

It would be more flexible to make this a per-node setting, but that
requires SQL changes. And we'd like to backport this change, and
backporting such a sql change would be quite hard while backporting this
change would be very easy. And in many setups, a different hostname for
TLS validation is actually not needed. The reason for that is
query-from-any node: With query-from-any-node all nodes usually have a
certificate that is valid for the same "cluster hostname", either using
a wildcard cert or a Subject Alternative Name (SAN). Because if you load
balance across nodes you don't know which node you're connecting to, but
you still want TLS validation to do it's job. So with this change you
can use this same "cluster hostname" for TLS validation within the
cluster. Obviously this means you don't validate that you're connecting
to a particular node, just that you're connecting to one of the nodes in
the cluster, but that should be fine from a security perspective (in
most cases).

Note to self: This change requires updating

https://docs.citusdata.com/en/latest/develop/api_guc.html#citus-node-conninfo-text.

DESCRIPTION: Allows overwriting host name for all inter-node connections
by supporting "host" parameter in citus.node_conninfo

(cherry picked from commit 3586aab17a)
2024-04-17 10:26:50 +02:00
Karina 9b06d02c43 Fix error in master_disable_node/citus_disable_node (#7492)
This fixes #7454: master_disable_node() has only two arguments, but
calls citus_disable_node() that tries to read three arguments

Co-authored-by: Karina Litskevich <litskevichkarina@gmail.com>
(cherry picked from commit 683e10ab69)
2024-04-17 10:26:50 +02:00
copetol 2ee43fd00c Fix segfault when using certain DO block in function (#7554)
When using a CASE WHEN expression in the body
of the function that is used in the DO block, a segmentation
fault occured. This fixes that.

Fixes #7381

---------

Co-authored-by: Konstantin Morozov <vzbdryn@yahoo.com>
(cherry picked from commit 12f56438fc)
2024-04-17 10:26:50 +02:00
Emel Şimşek f2d102d54b Fix crash caused by some form of ALTER TABLE ADD COLUMN statements. (#7522)
DESCRIPTION: Fixes a crash caused by some form of ALTER TABLE ADD COLUMN
statements. When adding multiple columns, if one of the ADD COLUMN
statements contains a FOREIGN constraint ommitting the referenced
columns in the statement, a SEGFAULT occurs.

For instance, the following statement results in a crash:

```
  ALTER TABLE lt ADD COLUMN new_col1 bool,
                          ADD COLUMN new_col2 int references rt;

```

Fixes #7520.

(cherry picked from commit fdd658acec)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio 364e8ece14 Speed up GetForeignKeyOids (#7578)
DESCRIPTION: Fix performance issue in GetForeignKeyOids on systems with
many constraints

GetForeignKeyOids was showing up in CPU profiles when distributing
schemas on systems with 100k+ constraints. The reason was that this
function was doing a sequence scan of pg_constraint to get the foreign
keys that referenced the requested table.

This fixes that by finding the constraints referencing the table through
pg_depend instead of pg_constraint. We're doing this indirection,
because pg_constraint doesn't have an index that we can use, but
pg_depend does.

(cherry picked from commit a263ac6f5f)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio 62c32067f1 Use an index to get FDWs that depend on extensions (#7574)
DESCRIPTION: Fix performance issue when distributing a table that
depends on an extension

When the database contains many objects this function would show up in
profiles because it was doing a sequence scan on pg_depend. And with
many objects pg_depend can get very large.

This starts using an index scan to only look for rows containing FDWs,
of which there are expected to be very few (often even zero).

(cherry picked from commit 16604a6601)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio d9069b1e01 Speed up SequenceUsedInDistributedTable (#7579)
DESCRIPTION: Fix performance issue when creating distributed tables if
many already exist

This builds on the work to speed up EnsureSequenceTypeSupported, and now
does something similar for SequenceUsedInDistributedTable.
SequenceUsedInDistributedTable had a similar O(number of citus tables)
operation. This fixes that and speeds up creation of distributed tables
significantly when many distributed tables already exist.

Fixes #7022

(cherry picked from commit cdf51da458)
2024-04-17 10:26:50 +02:00
Jelte Fennema-Nio fa6743d436 Speed up EnsureSequenceTypeSupported (#7575)
DESCRIPTION: Fix performance issue when creating distributed tables and many already exist

EnsureSequenceTypeSupported was doing an O(number of distributed tables)
operation. This can become very slow with lots of Citus tables, which
now happens much more frequently in practice due to schema based sharding.

Partially addresses #7022

(cherry picked from commit 381f31756e)
2024-04-17 10:26:50 +02:00