Commit Graph

4677 Commits (c98341e4ed3967219dd5ae0e264dc41a7f442f1c)

Author SHA1 Message Date
Gürkan İndibay c3579eef06
Adds REASSIGN OWNED BY propagation (#7319)
DESCRIPTION: Adds REASSIGN OWNED BY propagation

This pull request introduces the propagation of the "Reassign owned by"
statement. It accommodates both local and distributed roles for both the
old and new assignments. However, when the old role is a local role, it
undergoes filtering and is not propagated. On the other hand, if the new
role is a local role, the process involves first creating the role on
worker nodes before propagating the "Reassign owned" statement.
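For illustration, a minimal sketch of a now-propagated statement (role names are hypothetical):

```sql
-- With both roles distributed, this statement now also runs on all
-- worker nodes; a local new role would first be created on the workers:
REASSIGN OWNED BY old_owner TO new_owner;
```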
2023-12-28 15:15:58 +03:00
Gürkan İndibay 181b8ab6d5
Adds additional alter database propagation support (#7253)
DESCRIPTION: Adds database connection limit, rename and set tablespace
propagation
In this PR, propagation of the statements below is added:

```sql
alter database <database_name> with allow_connections = <boolean_value>;
alter database <database_name> rename to <database_name2>;
alter database <database_name> set TABLESPACE <table_space_name>;
```

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-12-26 14:55:04 +03:00
Halil Ozan Akgül b877d606c7
Adds 2PC distributed commands from other databases (#7203)
DESCRIPTION: Adds support for 2PC from non-Citus main databases

This PR only adds support for `CREATE USER` queries, other queries need
to be added. But it should be simple because this PR creates the
underlying structure.

The Citus main database is the database where the Citus extension is
created. Non-main databases are all the other databases on the same
node as the Citus main database.

When a `CREATE USER` query is run on a non-main database we:

1. Run `start_management_transaction` on the main database. This
function saves the outer transaction's xid (the non-main database
query's transaction id) and marks the current query as a main db command.
2. Run `execute_command_on_remote_nodes_as_user("CREATE USER
<username>", <username to run the command>)` on the main database. This
function creates the users in the rest of the cluster by running the
query on the other nodes. The user on the current node is created by the
outer query on the non-main db itself, to make sure subsequent commands
in the same transaction can see this user.
3. Run `mark_object_distributed` on the main database. This function
adds the user to `pg_dist_object` on all of the nodes, including the
current one.

This PR also implements transaction recovery for the queries from
non-main databases.
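For illustration, a minimal sketch from the user's perspective (database and role names are hypothetical):

```sql
-- Connected to a non-main database; Citus routes the propagation through
-- the main database using the three steps described above:
BEGIN;
CREATE USER appuser;
-- The local CREATE USER is executed by the outer query itself, so
-- subsequent commands in the same transaction can already see the user:
GRANT CONNECT ON DATABASE mydb TO appuser;
COMMIT;
```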
2023-12-22 19:19:41 +03:00
Jodi-Ann Francis 6801a1ed1e
PG16 update GRANT... ADMIN | INHERIT | SET, and REVOKE
Allowing GRANT ... ADMIN to now also be INHERIT or SET, in support of PostgreSQL 16

GRANT role_name [, ...] TO role_specification [, ...] [ WITH { ADMIN |
INHERIT | SET } { OPTION | TRUE | FALSE } ] [ GRANTED BY
role_specification ]

Fixes: #7148 
Related: #7138

See review changes from https://github.com/citusdata/citus/pull/7164
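A minimal sketch of the newly handled PG16 syntax (role names are hypothetical):

```sql
-- PG16 allows INHERIT and SET in the WITH clause, next to ADMIN:
GRANT manager TO employee WITH INHERIT TRUE;
GRANT manager TO employee WITH SET FALSE;
GRANT manager TO employee WITH ADMIN OPTION GRANTED BY CURRENT_USER;
```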
2023-12-13 15:57:02 -05:00
Naisila Puka dbdde111c1
Add missing order by clause in failure_split_cleanup test (#7363)
https://github.com/citusdata/citus/actions/runs/6903353045/attempts/1#summary-18781959638
```diff
         ARRAY['-100000'],
         ARRAY[:worker_1_node, :worker_2_node],
         'force_logical');
 ERROR:  server closed the connection unexpectedly
 CONTEXT:  while executing command on localhost:9060
     SELECT operation_id, object_type, object_name, node_group_id, policy_type
     FROM pg_dist_cleanup where operation_id = 777 ORDER BY object_name;
  operation_id | object_type |                        object_name                        | node_group_id | policy_type 
 --------------+-------------+-----------------------------------------------------------+---------------+-------------
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981000 |             1 |           0
-          777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             1 |           1
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             2 |           0
+          777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             1 |           1
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981003 |             2 |           1
           777 |           4 | citus_shard_split_publication_1_10_777                    |             2 |           0
 (5 rows)
```

A similar fix was attempted in
c9f2fc892d
There were some more missing ORDER BY clauses, so I added them.
2023-11-24 18:26:06 +03:00
Naisila Puka c019acc01b
Run wal2json cdc test for pg16 as well (#7361)
The pg16 wal2json package is now available, so we're adding the tests
back, basically reverting
f253bb3210

Sister PR https://github.com/citusdata/the-process/pull/153
2023-11-24 14:40:23 +03:00
Nils Dijk 0620c8f9a6
Sort includes (#7326)
This change adds a script to programmatically group all includes in a
specific order. The script was used as a one-time invocation to group
and sort all includes throughout our formatted code. The grouping is as
follows:

- System includes (e.g. `#include <...>`)
- Postgres.h (e.g. `#include "postgres.h"`)
- Toplevel imports from postgres, not contained in a directory (e.g.
  `#include "miscadmin.h"`)
- General postgres includes (e.g. `#include "nodes/..."`)
- Toplevel citus includes, not contained in a directory (e.g. `#include
  "citus_version.h"`)
- Columnar includes (e.g. `#include "columnar/..."`)
- Distributed includes (e.g. `#include "distributed/..."`)

Because it is quite hard to tell the difference between toplevel citus
includes and toplevel postgres includes, the script hardcodes the list
of toplevel citus includes. In the same manner it treats anything not
prefixed with `columnar/` or `distributed/` as a postgres include.

The sorting/grouping is enforced by CI. Since we do so with our own
script, no changes are required in our uncrustify configuration.
2023-11-23 18:19:54 +01:00
Gürkan İndibay 3b556cb5ed
Adds create / drop database propagation support (#7240)
DESCRIPTION: Adds support for propagating `CREATE`/`DROP` database

In this PR, create and drop database support is added.

For CREATE DATABASE:
* "oid" option is not supported
* specifying "strategy" to be different than "wal_log" is not supported
* specifying "template" to be different than "template1" is not
supported

The last two are because those are not saved in `pg_database` and when
activating a node, we cannot assume what parameters were provided when
creating the database.

And "oid" is not supported because whether user specified an arbitrary
oid when creating the database is not saved in pg_database and we want
to avoid from oid collisions that might arise from attempting to use an
auto-assigned oid on workers.

Finally, in case of node activation, GRANTs for the database are also
propagated.
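For illustration, a minimal sketch of statements that are now propagated (database name and option values are hypothetical):

```sql
-- Propagated to all nodes; "oid", a non-default "strategy", and a
-- "template" other than template1 remain unsupported:
CREATE DATABASE analytics_db WITH CONNECTION LIMIT 50;
DROP DATABASE analytics_db;
```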

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-11-21 16:43:51 +03:00
Naisila Puka cedcc220bf
Fixes flaky VACUUM (freeze, process toast true) result (#7348)
https://app.circleci.com/pipelines/github/citusdata/citus/34550/workflows/5b802f66-2666-4623-a209-6d7799f7ee5f/jobs/1229153
```diff
VACUUM (FREEZE, PROCESS_TOAST true) local_vacuum_table;
 SELECT relfrozenxid::text::integer > :frozenxid AS frozen_performed FROM pg_class
 WHERE oid=:reltoastrelid::regclass;
  frozen_performed 
 ------------------
- t
+ f
 (1 row)
```
Process toast option in vacuum was introduced in PG14. The failing test
was supposed to be a part of `multi_utilities.sql`, but it was included
in `pg14.sql` to avoid alternative output for PG13. See
ba62c0a148 (diff-ed03478f693155e2fe092e9ad356bf884dc097f554e8d75eff562d52bbcf7a75L255-L272)
for reference.
However, now that we don't support PG13 anymore, we can move this test
to `multi_utilities.sql`. Moving the test, plus inserting data before
running VACUUM FREEZE so that the freeze is more meaningful, fixes the
flakiness of the test.
2023-11-17 18:58:06 +03:00
Naisila Puka c88bf5ff1c
Cleanup leftover replication slots in publication test (#7354) 2023-11-17 15:11:38 +03:00
Japin Li e14e8667cc
Fix redundant variable declaration (#7353)
The `$workerCount` variable was declared twice in
src/test/regress/pg_regress_multi.pl.
2023-11-17 13:01:23 +03:00
Naisila Puka 0d1f18862b
Propagates SECURITY LABEL ON ROLE stmt (#7304)
We propagate `SECURITY LABEL [for provider] ON ROLE rolename IS
labelname` to the worker nodes.
We also make sure to run the relevant `SecLabelStmt` commands on a
newly added node by looking at roles found in `pg_shseclabel`.

See official docs for explanation on how this command works:
https://www.postgresql.org/docs/current/sql-security-label.html
This command stores the role label in the `pg_shseclabel` catalog table.

This commit also fixes the regex string in
`check_gucs_are_alphabetically_sorted.sh` script such that it escapes
the dot. Previously it was looking for all strings starting with "citus"
instead of "citus." as it should.

To test this feature, I currently make use of a special GUC to control
label provider registration in `_PG_init` when creating the Citus extension.
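For illustration, a minimal sketch of the propagated statement (provider, role, and label are hypothetical):

```sql
-- Propagated to the workers, and replayed on newly added nodes based on
-- the pg_shseclabel catalog:
SECURITY LABEL FOR my_provider ON ROLE app_admin IS 'critical';
```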
2023-11-16 13:12:30 +03:00
Naisila Puka c6fbb72c02
Fix flaky multi_prepare_plsql (#7346)
Simple need of an `ORDER BY` clause

Ran into this twice this week already!

https://github.com/citusdata/citus/actions/runs/6849701315/attempts/1#summary-18622563506

https://github.com/citusdata/citus/actions/runs/6875051160/attempts/1#summary-18698009952

```diff
 SELECT nspname, typname FROM pg_type JOIN pg_namespace ON pg_namespace.oid = pg_type.typnamespace WHERE typname = 'prepare_ddl_type_backup';
    nspname   |         typname         
 -------------+-------------------------
- public      | prepare_ddl_type_backup
  otherschema | prepare_ddl_type_backup
+ public      | prepare_ddl_type_backup
 (2 rows)
```
2023-11-15 13:28:43 +03:00
Naisila Puka a960799dfb
Clean up leftover replication slots in tests (#7338)
This commit fixes the flakiness in `logical_replication` and
`citus_non_blocking_split_shard_cleanup` tests. The flakiness
was related to leftover replication slots.
Below is a flaky example for each test:

logical_replication https://github.com/citusdata/citus/actions/runs/6721324131/attempts/1#summary-18267030604
citus_non_blocking_split_shard_cleanup https://github.com/citusdata/citus/actions/runs/6721324131/attempts/1#summary-18267006967

```diff
 -- Replication slots should be cleaned up
 SELECT slot_name FROM pg_replication_slots;
             slot_name            
 ---------------------------------
-(0 rows)
+ citus_shard_split_slot_19_10_17
+(1 row)
```

The tests by themselves are not flaky: 32 flaky-test schedules, each
with 20 runs, all ran successfully.
https://github.com/citusdata/citus/actions/runs/6822020127?pr=7338

The conclusion is that:
1. `multi_tenant_isolation_nonblocking` is the problematic test running
before `logical_replication` in the `enterprise_schedule`, so I added a
cleanup at the end of `multi_tenant_isolation_nonblocking`.
https://github.com/citusdata/citus/actions/runs/6824334614/attempts/1#summary-18560127461
2. `citus_split_shard_by_split_points_negative` is the problematic test
running before `citus_non_blocking_split_shards_cleanup` in the split
schedule, so I added a cleanup line there as well.

For details on the investigation of leftover replication slots,
please check the PR https://github.com/citusdata/citus/pull/7338
2023-11-14 18:50:54 +03:00
Naisila Puka cdef2d5224
Random tests refactoring (#7342)
While investigating replication slots leftovers
in PR https://github.com/citusdata/citus/pull/7338,
I ran into the following refactoring/cleanup
that can be done in our test suite:

- Add a separate test to remove non-default nodes
- Remove coordinator removal from the `add_coordinator` test;
  use the `remove_coordinator_from_metadata` test where needed
- Don't print nodeids in the `multi_multiuser_auth` and
  `multi_poolinfo_usage` tests
- Use `startswith` when checking for isolation or failure tests
- Add some dependencies accordingly in `run_test.py` for running flaky
  test schedules
2023-11-14 12:49:15 +03:00
Onur Tirtir 240313e286
Support role commands from any node (#7278)
DESCRIPTION: Adds support for issuing role management commands from worker nodes

It's unlikely to get into a distributed deadlock with role commands, so we
don't worry much about that at the moment.
There were several attempts to reduce the chances of a deadlock, but none
of them have been merged into the main branch yet; see:
#7325
#7016
#7009
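As a hedged sketch, this enables something like the following when connected to any worker node (role name is hypothetical):

```sql
-- Previously this required connecting to the coordinator:
CREATE ROLE app_user LOGIN;
ALTER ROLE app_user SET search_path TO app_schema;
```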
2023-11-10 09:58:51 +00:00
Naisila Puka 57ff762c82
Fix VACUUM flakiness in multi_utilities (#7334)
When I run this test locally, the size of the table after the DELETE
command is around 58785792. Hence, I assume the diffs suggest that the
VACUUM had no effect. The current solution is to run the VACUUM
command three times instead of once.

Example diff:
https://github.com/citusdata/citus/actions/runs/6722231142/attempts/1#summary-18269870674
```diff
insert into local_vacuum_table select i from generate_series(1,1000000) i;
 delete from local_vacuum_table;
 VACUUM local_vacuum_table;
 SELECT CASE WHEN s BETWEEN 20000000 AND 25000000 THEN 22500000 ELSE s END
 FROM pg_total_relation_size('local_vacuum_table') s ;
     s     
 ----------
- 22500000
+ 58785792
 (1 row)
```
See more diff examples in the PR description
https://github.com/citusdata/citus/pull/7334
2023-11-09 21:00:24 +03:00
dependabot[bot] d4663212f4 Bump werkzeug from 2.3.7 to 3.0.1 in /src/test/regress
Bumps [werkzeug](https://github.com/pallets/werkzeug) from 2.3.7 to 3.0.1.
- [Release notes](https://github.com/pallets/werkzeug/releases)
- [Changelog](https://github.com/pallets/werkzeug/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/werkzeug/compare/2.3.7...3.0.1)

---
updated-dependencies:
- dependency-name: werkzeug
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-09 17:14:14 +01:00
Nils Dijk 0dac63afc0
move pg_version_constants.h to toplevel include (#7335)
In preparation of sorting and grouping all includes we wanted to move
this file to the toplevel includes for good grouping/sorting.
2023-11-09 15:09:39 +00:00
Naisila Puka 0dc41ee5a0
Fix flaky multi_mx_insert_select_repartition test (#7331)
https://github.com/citusdata/citus/actions/runs/6745019678/attempts/1#summary-18336188930
```diff
     insert into target_table SELECT a*2 FROM source_table RETURNING a;
-NOTICE:  executing the command locally: SELECT bytes FROM fetch_intermediate_results(ARRAY['repartitioned_results_xxxxx_from_4213582_to_0','repartitioned_results_xxxxx_from_4213584_to_0']::text[],'localhost',57638) bytes
+NOTICE:  executing the command locally: SELECT bytes FROM fetch_intermediate_results(ARRAY['repartitioned_results_3940758121873413_from_4213584_to_0','repartitioned_results_3940758121873413_from_4213582_to_0']::text[],'localhost',57638) bytes
```

The elements in the array passed to `fetch_intermediate_results` are the
same, but in the opposite order from what was expected.

To fix this flakiness, we can omit the `"SELECT bytes FROM
fetch_intermediate_results..."` line. From the remaining logs, it is
still clear that the intermediate results have been fetched.
2023-11-08 15:15:33 +03:00
Onur Tirtir 444e6cb7d6
Remove useless variables (#7327)
To fix warnings observed when using different compiler versions.
2023-11-07 16:39:08 +03:00
cvbhjkl e535f53ce5
Fix typo in local_executor.c (#7324)
Fix a typo 'remaning' -> 'remaining' in local_executor.c
2023-11-03 12:14:11 +00:00
Onur Tirtir 21646ca1e9
Fix flaky isolation_get_all_active_transactions.spec test (#7323)
Fix the flaky test that results in the following diff, by waiting (up to
5 seconds) until the backend that we want to terminate actually terminates.

```diff
--- /__w/citus/citus/src/test/regress/expected/isolation_get_all_active_transactions.out.modified	2023-11-01 16:30:57.648749795 +0000
+++ /__w/citus/citus/src/test/regress/results/isolation_get_all_active_transactions.out.modified	2023-11-01 16:30:57.656749877 +0000
@@ -114,13 +114,13 @@
 --------------------
 t                   
 (1 row)
 
 step s3-show-activity: 
  SET ROLE postgres;
  select count(*) from get_all_active_transactions() where process_id IN (SELECT * FROM selected_pid);
 
 count
 -----
-    0
+    1
 (1 row)
```
2023-11-03 09:00:32 +01:00
Onur Tirtir 5e2439a117
Make some more tests re-runnable (#7322)
* multi_mx_create_table
* multi_mx_function_table_reference
* multi_mx_add_coordinator
* create_role_propagation
* metadata_sync_helpers
* text_search

https://github.com/citusdata/citus/pull/7278 requires this.
2023-11-02 18:32:56 +03:00
Jelte Fennema-Nio 85b997a0fb
Fix flaky multi_alter_table_statements (#7321)
Sometimes multi_alter_table_statements would fail in CI like this:

```diff
 -- Verify that DROP NOT NULL works
 ALTER TABLE lineitem_alter ALTER COLUMN int_column2 DROP NOT NULL;
 SELECT "Column", "Type", "Modifiers" FROM table_desc WHERE relid='lineitem_alter'::regclass;
-     Column      |         Type          | Modifiers
----------------------------------------------------------------------
- l_orderkey      | bigint                | not null
- l_partkey       | integer               | not null
- l_suppkey       | integer               | not null
- l_linenumber    | integer               | not null
- l_quantity      | numeric(15,2)         | not null
- l_extendedprice | numeric(15,2)         | not null
- l_discount      | numeric(15,2)         | not null
- l_tax           | numeric(15,2)         | not null
- l_returnflag    | character(1)          | not null
- l_linestatus    | character(1)          | not null
- l_shipdate      | date                  | not null
- l_commitdate    | date                  | not null
- l_receiptdate   | date                  | not null
- l_shipinstruct  | character(25)         | not null
- l_shipmode      | character(10)         | not null
- l_comment       | character varying(44) | not null
- float_column    | double precision      | default 1
- date_column     | date                  |
- int_column1     | integer               |
- int_column2     | integer               |
- null_column     | integer               |
-(21 rows)
-
+ERROR:  schema "alter_table_add_column" does not exist
 -- COPY should succeed now
 SELECT master_create_empty_shard('lineitem_alter') as shardid \gset
 ```

Reading from table_desc apparently has an issue: if the schema of one of
the items gets deleted while the view is being read, we get such an
error.

This change fixes that by not running multi_alter_table_statements in parallel
with alter_table_add_column anymore.

This is another instance of the same issue as in #7294
2023-11-02 16:42:45 +03:00
Jelte Fennema-Nio f171ec98fc
Fix flaky failure_distributed_results (#7307)
Sometimes in CI we run into this failure:

```diff
   SELECT resultId, nodeport, rowcount, targetShardId, targetShardIndex
   FROM partition_task_list_results('test', $$ SELECT * FROM source_table $$, 'target_table')
           NATURAL JOIN pg_dist_node;
-WARNING:  connection to the remote node localhost:xxxxx failed with the following error: connection not open
+ERROR:  connection to the remote node localhost:9060 failed with the following error: connection not open
 SELECT * FROM distributed_result_info ORDER BY resultId;
-       resultid        | nodeport | rowcount | targetshardid | targetshardindex
----------------------------------------------------------------------
- test_from_100800_to_0 |     9060 |       22 |        100805 |                0
- test_from_100801_to_0 |    57637 |        2 |        100805 |                0
- test_from_100801_to_1 |    57637 |       15 |        100806 |                1
- test_from_100802_to_1 |    57637 |       10 |        100806 |                1
- test_from_100802_to_2 |    57637 |        5 |        100807 |                2
- test_from_100803_to_2 |    57637 |       18 |        100807 |                2
- test_from_100803_to_3 |    57637 |        4 |        100808 |                3
- test_from_100804_to_3 |     9060 |       24 |        100808 |                3
-(8 rows)
-
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 -- fetch from worker 2 should fail
 SAVEPOINT s1;
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 SELECT fetch_intermediate_results('{test_from_100802_to_1,test_from_100802_to_2}'::text[], 'localhost', :worker_2_port) > 0 AS fetched;
-ERROR:  could not open file "base/pgsql_job_cache/xx_x_xxx/test_from_100802_to_1.data": No such file or directory
-CONTEXT:  while executing command on localhost:xxxxx
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 ROLLBACK TO SAVEPOINT s1;
+ERROR:  savepoint "s1" does not exist
 -- fetch from worker 1 should succeed
 SELECT fetch_intermediate_results('{test_from_100802_to_1,test_from_100802_to_2}'::text[], 'localhost', :worker_1_port) > 0 AS fetched;
- fetched
----------------------------------------------------------------------
- t
-(1 row)
-
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 -- make sure the results read are same as the previous transaction block
 SELECT count(*), sum(x) FROM
   read_intermediate_results('{test_from_100802_to_1,test_from_100802_to_2}'::text[],'binary') AS res (x int);
- count | sum
----------------------------------------------------------------------
-    15 | 863
-(1 row)
-
+ERROR:  current transaction is aborted, commands ignored until end of transaction block
 ROLLBACk;
```

As outlined in #7306, which I created, the reason for this is related to
only having a single connection open to the node. Finding and fixing the
full cause is not trivial, so instead this PR starts working around
this bug by forcing maximum parallelism. Preferably we'd want
this workaround not to be necessary, but that requires
spending time to fix the root cause. For now, having a less flaky CI is
good enough.
2023-11-02 12:31:56 +00:00
Jelte Fennema-Nio b47c8b3fb0
Fix flaky insert_select_connection_leak (#7302)
Sometimes in CI insert_select_connection_leak would fail like this:

```diff
 END;
 SELECT worker_connection_count(:worker_1_port) - :pre_xact_worker_1_connections AS leaked_worker_1_connections,
        worker_connection_count(:worker_2_port) - :pre_xact_worker_2_connections AS leaked_worker_2_connections;
  leaked_worker_1_connections | leaked_worker_2_connections
 -----------------------------+-----------------------------
-                           0 |                           0
+                          -1 |                           0
 (1 row)

 -- ROLLBACK
 BEGIN;
 INSERT INTO target_table SELECT * FROM source_table;
 INSERT INTO target_table SELECT * FROM source_table;
 ROLLBACK;
 SELECT worker_connection_count(:worker_1_port) - :pre_xact_worker_1_connections AS leaked_worker_1_connections,
        worker_connection_count(:worker_2_port) - :pre_xact_worker_2_connections AS leaked_worker_2_connections;
  leaked_worker_1_connections | leaked_worker_2_connections
 -----------------------------+-----------------------------
-                           0 |                           0
+                          -1 |                           0
 (1 row)

 \set VERBOSITY TERSE
 -- Error on constraint failure
 BEGIN;
 INSERT INTO target_table SELECT * FROM source_table;
 SELECT worker_connection_count(:worker_1_port) AS worker_1_connections,
        worker_connection_count(:worker_2_port) AS worker_2_connections \gset
 SAVEPOINT s1;
 INSERT INTO target_table SELECT a, CASE WHEN a < 50 THEN b ELSE null END  FROM source_table;
@@ -89,15 +89,15 @@
  leaked_worker_1_connections | leaked_worker_2_connections
 -----------------------------+-----------------------------
                            0 |                           0
 (1 row)

 END;
 SELECT worker_connection_count(:worker_1_port) - :pre_xact_worker_1_connections AS leaked_worker_1_connections,
        worker_connection_count(:worker_2_port) - :pre_xact_worker_2_connections AS leaked_worker_2_connections;
  leaked_worker_1_connections | leaked_worker_2_connections
 -----------------------------+-----------------------------
-                           0 |                           0
+                          -1 |                           0
 (1 row)
```

Source:
https://github.com/citusdata/citus/actions/runs/6718401194/attempts/1#summary-18258258387

A negative amount of leaked connections is obviously not possible. For
some reason there was a connection open when we checked the initial
amount of connections that was closed afterwards. This could be from
the maintenance daemon or maybe from the previous test that had not
fully closed its connections just yet.

The change in this PR doesn't actually fix the cause of the negative
count; it simply considers it good as well, by changing the result to
zero for negative values.

With this fix we might sometimes miss a leak, because the negative
number can cancel out the leak and still result in a 0. But since the
negative number only occurs sometimes, we'll still find the leak often
enough.
2023-11-02 13:15:43 +01:00
Cédric Villemain 0678a2fd89
Fix #7242, CALL(@0) crash backend (#7288)
When executing a prepared CALL, which is not pure SQL but available with
some drivers like npgsql and pgjdbc, Citus entered a code path where a
plan is not defined, while trying to increase its cost. Thus we get a
SIGSEGV (signal 11) when the plan is a NULL pointer.

Fix by only increasing the plan cost when the plan is not null.

However, it is a bit suspicious to get here with a NULL plan and maybe a
better change would be to not call
ShardPlacementForFunctionColocatedWithDistTable() with a NULL plan at
all (in call.c:134)

The bug was hit with, for example:
```
CallableStatement proc = con.prepareCall("{CALL p(?)}");
proc.registerOutParameter(1, java.sql.Types.BIGINT);
proc.setInt(1, -100);
proc.execute();
```

where `p(bigint)` is a distributed "function" and the param is the
distribution key (also in a distributed table); see #7242 for details

Fixes #7242
2023-11-02 13:15:24 +01:00
Jelte Fennema-Nio 5a48a1602e
Debug flaky logical_replication test (#7309)
Sometimes in CI our logical_replication test fails like this:

```diff
+++ /__w/citus/citus/src/test/regress/results/logical_replication.out.modified	2023-11-01 14:15:08.562758546 +0000
@@ -40,21 +40,21 @@

 SELECT count(*) from pg_publication;
  count
 -------
      0
 (1 row)

 SELECT count(*) from pg_replication_slots;
  count
 -------
-     0
+     1
 (1 row)

 SELECT count(*) FROM dist;
  count
 -------
```

It's hard to understand what is going on here based just on the wrong
number. So this PR changes the test to show the names of the
subscription, publication and replication slot to make finding the cause
easier.

In passing this also fixes another flaky test in the same file that our
flaky test detection picked up. This is done by waiting for resource
cleanup after the shard move.
2023-11-02 13:15:02 +01:00
Onur Tirtir 9867c5b949
Fix flaky multi_mx_node_metadata.sql test (#7317)
Fixes the flaky test that results in following diff:
```diff
--- /__w/citus/citus/src/test/regress/expected/multi_mx_node_metadata.out.modified	2023-11-01 14:22:12.890476575 +0000
+++ /__w/citus/citus/src/test/regress/results/multi_mx_node_metadata.out.modified	2023-11-01 14:22:12.914476657 +0000
@@ -840,24 +840,26 @@
 (1 row)
 
 \c :datname - - :master_port
 SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
   datname   
 ------------
  db_to_drop
 (1 row)
 
 DROP DATABASE db_to_drop;
+ERROR:  database "db_to_drop" is being accessed by other users
 SELECT datname FROM pg_stat_activity WHERE application_name LIKE 'Citus Met%';
   datname   
 ------------
-(0 rows)
+ db_to_drop
+(1 row)
 
 -- cleanup
 DROP SEQUENCE sequence CASCADE;
 NOTICE:  drop cascades to default value for column a of table reference_table
```
2023-11-02 11:02:34 +00:00
Gürkan İndibay 184c8fc1ee
Enriches statement propagation document (#7267)
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2023-11-02 09:59:34 +00:00
Jelte Fennema-Nio a6e86884f6
Fix flaky isolation_metadata_sync_deadlock (#7312)
Sometimes isolation_metadata_sync_deadlock fails in CI like this:

```diff
diff -dU10 -w /__w/citus/citus/src/test/regress/expected/isolation_metadata_sync_deadlock.out /__w/citus/citus/src/test/regress/results/isolation_metadata_sync_deadlock.out
--- /__w/citus/citus/src/test/regress/expected/isolation_metadata_sync_deadlock.out.modified	2023-11-01 16:03:15.090199229 +0000
+++ /__w/citus/citus/src/test/regress/results/isolation_metadata_sync_deadlock.out.modified	2023-11-01 16:03:15.098199312 +0000
@@ -110,10 +110,14 @@
 t
 (1 row)

 step s2-stop-connection:
  SELECT stop_session_level_connection_to_node();

 stop_session_level_connection_to_node
 -------------------------------------

 (1 row)
+
+teardown failed: ERROR:  localhost:57638 is a metadata node, but is out of sync
+HINT:  If the node is up, wait until metadata gets synced to it and try again.
+CONTEXT:  SQL statement "SELECT master_remove_distributed_table_metadata_from_workers(v_obj.objid, v_obj.schema_name, v_obj.object_name)"
```

Source:
https://github.com/citusdata/citus/actions/runs/6721938040/attempts/1#summary-18268946448

To fix this we now wait for the metadata to be fully synced to all
nodes at the start of the teardown steps.
2023-11-02 10:39:05 +01:00
Onur Tirtir 2cf4c04023
Fix flaky global_cancel.sql test (#7316) 2023-11-01 23:59:41 +01:00
Jelte Fennema-Nio e3c93c303d
Fix flaky citus_non_blocking_split_shard_cleanup (#7311)
Sometimes in CI citus_non_blocking_split_shard_cleanup failed like this:

```diff
--- /__w/citus/citus/src/test/regress/expected/citus_non_blocking_split_shard_cleanup.out.modified	2023-11-01 15:07:14.280551207 +0000
+++ /__w/citus/citus/src/test/regress/results/citus_non_blocking_split_shard_cleanup.out.modified	2023-11-01 15:07:14.292551358 +0000
@@ -106,21 +106,22 @@
 -----------------------------------

 (1 row)

 \c - - - :worker_2_port
 SET search_path TO "citus_split_test_schema";
 -- Replication slots should be cleaned up
 SELECT slot_name FROM pg_replication_slots;
             slot_name
 ---------------------------------
-(0 rows)
+ citus_shard_split_slot_19_10_17
+(1 row)

 -- Publications should be cleanedup
 SELECT count(*) FROM pg_publication;
  count
```

It's expected that the replication slot is sometimes not cleaned up if
we don't wait until resource cleanup completes. This PR starts doing
that wait here.
2023-11-01 16:21:12 +00:00
Jelte Fennema-Nio c9f2fc892d
Fix flaky failure_split_cleanup (#7299)
Sometimes failure_split_cleanup failed in CI like this:

```diff
 ERROR:  server closed the connection unexpectedly
 CONTEXT:  while executing command on localhost:9060
     SELECT operation_id, object_type, object_name, node_group_id, policy_type
     FROM pg_dist_cleanup where operation_id = 777 ORDER BY object_name;
  operation_id | object_type |                        object_name                        | node_group_id | policy_type
 --------------+-------------+-----------------------------------------------------------+---------------+-------------
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981000 |             1 |           0
-          777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             1 |           1
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             2 |           0
+          777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981002 |             1 |           1
           777 |           1 | citus_failure_split_cleanup_schema.table_to_split_8981003 |             2 |           1
           777 |           4 | citus_shard_split_publication_1_10_777                    |             2 |           0
 (5 rows)

     -- we need to allow connection so that we can connect to proxy
```

Source:
https://github.com/citusdata/citus/actions/runs/6717642291/attempts/1#summary-18256014949

It's the common problem where we're missing a column in the ORDER BY
clause. This fixes that by adding node_group_id to the ORDER BY of the
query in question.
2023-11-01 14:08:51 +00:00
Jelte Fennema-Nio c83c556702
Fix flaky isolation_master_update_node (#7303)
Sometimes in CI isolation_master_update_node fails like this:

```diff
 ------------------

 (1 row)

 step s2-abort: ABORT;
 step s1-abort: ABORT;
 FATAL:  terminating connection due to administrator command
 FATAL:  terminating connection due to administrator command
 SSL connection has been closed unexpectedly
+server closed the connection unexpectedly

 master_remove_node
 ------------------

```

This just seems like a random error line. The only way to reasonably fix
this is by adding an extra output file, so that's what this PR does.
2023-11-01 16:44:45 +03:00
Jelte Fennema-Nio 0d83ab57de
Fix flaky multi_cluster_management (#7295)
One of our most flaky and most annoying tests is
multi_cluster_management. It usually fails like this:
```diff
 SELECT citus_disable_node('localhost', :worker_2_port);
  citus_disable_node
 --------------------

 (1 row)

 SELECT public.wait_until_metadata_sync(60000);
+WARNING:  waiting for metadata sync timed out
  wait_until_metadata_sync
 --------------------------

 (1 row)

```

This tries to address that by hardening wait_until_metadata_sync. I
believe the reason for this warning is a race condition in
wait_until_metadata_sync: it's possible for the pre-check to fail, then
have the maintenance daemon send a notification, and only then have the
backend start to listen. I tried to fix it in two ways:
1. First run LISTEN, and only then do the pre-check.
2. If we time out, check again just to make sure that we did not miss
   the notification somehow. And don't show a warning if all metadata is
   synced after the timeout.

It's hard to know for sure that this fixes it because the test is not
repeatable and I could not reproduce it locally. Let's just hope for the
best.

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-11-01 10:46:01 +00:00
Jelte Fennema-Nio 20ae42e7fa
Fix flaky multi_reference_table test (#7294)
Sometimes multi_reference_table failed in CI like this:

```diff
 \c - - - :master_port
 DROP INDEX reference_schema.reference_index_2;
 \c - - - :worker_1_port
 SELECT "Column", "Type", "Modifiers" FROM table_desc WHERE relid='reference_schema.reference_table_ddl_1250019'::regclass;
- Column  |            Type             |  Modifiers
----------------------------------------------------------------------
- value_2 | double precision            | default 25.0
- value_3 | text                        | not null
- value_4 | timestamp without time zone |
- value_5 | double precision            |
-(4 rows)
-
+ERROR:  schema "citus_local_table_queries" does not exist
 \di reference_schema.reference_index_2*
           List of relations
  Schema | Name | Type | Owner | Table
```

Source:
https://github.com/citusdata/citus/actions/runs/6707535961/attempts/2#summary-18226879513

Reading from table_desc apparently has an issue: if the schema of one of
the items gets deleted while the view is being read, we get such an
error.

This change fixes that by not running multi_reference_table in parallel
with citus_local_tables_queries anymore.
2023-11-01 10:12:06 +00:00
Cédric Villemain 37415ef8f5
Allow citus_*_size on index related to a distributed table (#7271)
I just enhanced the existing code to check if the relation is an index
belonging to a distributed table.
If so, the shardId is appended to the relation (index) name and the *_size
functions are executed as before.
There is a change in an extern function:
  `extern StringInfo GenerateSizeQueryOnMultiplePlacements(...)`
It's possible to create a new function and deprecate this one later if
compatibility is an issue.

Fixes https://github.com/citusdata/citus/issues/6496.

DESCRIPTION: Allows using Citus size functions on distributed table
indexes.
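For example, queries along these lines should now work (index name is hypothetical):

```sql
-- The index belongs to a distributed table; per shard placement, the
-- shardId is appended to the index name before sizing, as described above:
SELECT citus_relation_size('my_dist_table_pkey');
SELECT citus_total_relation_size('my_dist_table_pkey');
```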

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-11-01 09:05:51 +00:00
Jelte Fennema-Nio a76a832553
Fix flaky validate_constraint test (#7293)
Sometimes validate constraint would fail like this:

```diff
  validatable_constraint_8000016 | t
 (10 rows)

 DROP TABLE constrained_table;
+ERROR:  deadlock detected
+DETAIL:  Process 16602 waits for ShareRowExclusiveLock on relation 56258 of database 16384; blocked by process 16601.
+Process 16601 waits for AccessShareLock on relation 56120 of database 16384; blocked by process 16602.
+HINT:  See server log for query details.
 DROP TABLE referenced_table CASCADE;
 DROP TABLE referencing_table;
 DROP SCHEMA validate_constraint CASCADE;
-NOTICE:  drop cascades to 3 other objects
+NOTICE:  drop cascades to 4 other objects
 DETAIL:  drop cascades to type constraint_validity
 drop cascades to view constraint_validations_in_workers
 drop cascades to view constraint_validations
+drop cascades to table constrained_table
 SET search_path TO DEFAULT;

```

Source:
https://github.com/citusdata/citus/actions/runs/6708383699?pr=7291

This change fixes that by not running together with the
foreign_key_to_reference_table test anymore. In passing it also
simplifies the dropping of the test's resources.
2023-11-01 09:41:28 +01:00
Emel Şimşek ee8f4bb7e8
Start Maintenance Daemon for Main DB at the server start. (#7254)
DESCRIPTION: This change starts a maintenance daemon at the time of
server start if there is a designated main database.

This is the code flow:

1. User designates a main database:
   `ALTER SYSTEM SET citus.main_db = "myadmindb";`

2. When postmaster starts, in _PG_init, citus calls
   `InitializeMaintenanceDaemonForMainDb`

   This function registers a background worker to run
   `CitusMaintenanceDaemonMain` with `databaseOid = 0`

3. `CitusMaintenanceDaemonMain` takes some special actions when
   databaseOid is 0:
   - Gets the citus.main_db value.
   - Connects to the citus.main_db
   - Now the `MyDatabaseId` is available, so it creates a hash entry for it.
   - Then it follows the same control flow as for a regular db.
2023-10-30 09:44:13 +03:00
Benjamin O f9218d9780
Support replacing IPv6 Loopback in `normalize.sed` (#7269)
I had a test failure issue due to my machine using the IPv6 loopback
address. This change to `normalize.sed` solves that issue.
2023-10-27 16:42:55 +02:00
Naisila Puka 10198b18e8
Technical readme small fixes (#7261) 2023-10-23 13:43:43 +03:00
Naisila Puka 1fe16fa746
Remove unnecessary pre-fastpath code (#7262)
This code was here because we first implemented
`fast path planner` via
[#2606](https://github.com/citusdata/citus/pull/2606)
and then later `deferred pruning` via
[#3369](https://github.com/citusdata/citus/pull/3369).
So, for some years, this code was useful.
2023-10-23 13:01:48 +03:00
zhjwpku 2d1444188c
Fix wrong comments around HasDistributionKey() (#7223)
HasDistributionKey & HasDistributionKeyCacheEntry return true when the
corresponding table has a distribution key; the comments stated the
opposite, which is fixed here.

Signed-off-by: Zhao Junwang <zhjwpku@gmail.com>
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-10-18 10:53:00 +02:00
Onur Tirtir db13afaa7b
Fix flaky columnar_create.sql test (#7266) 2023-10-17 16:58:17 +03:00
Gürkan İndibay 71a4633dad
Fixes typo and renames multi_process_utility (#7259) 2023-10-17 16:39:37 +03:00
Jelte Fennema-Nio 788e09a39a
Add a test for citus_shards where table names have spaces (#7224)
There was a bug reported for previous versions of Citus where
shard_size was returning NULL for tables with spaces in them. It works
fine on the main branch, but I'm still adding a test for this to
the main branch because it seems a good test to have.
2023-10-16 11:38:24 +02:00
Emel Şimşek e9035f6d32
Send keepalive messages in split decoder periodically to avoid wal receiver timeouts during large shard splits. (#7229)
DESCRIPTION: Send keepalive messages during the logical replication
phase of large shard splits to avoid timeouts.

During the logical replication part of the shard split process, the split
decoder filters out the wal records produced by the initial copy. If the
number of wal records is big, then the split decoder ends up processing
for a long time before sending out any wal records through pgoutput.
Hence the wal receiver may time out and restart repeatedly, causing the
catch-up logic in our split driver code to fail.

Notes:

1. If the wal_receiver_timeout is set to a very small number, e.g. 600ms,
it may time out before receiving the keepalives. My tests show that this
code works best when the `wal_receiver_timeout` is set to 1 minute, which
is the default value (see the sketch after these notes).

2. Once a logical replication worker times out, a new one gets launched.
The new logical replication worker sets the pg_stat_subscription columns
to initial values, e.g. the latest_end_lsn is set to 0. Our driver logic
in `WaitForGroupedLogicalRepTargetsToCatchUp` cannot handle the LSN value
going back. This is the main reason for it to get stuck in the infinite
loop.
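For reference, a sketch of keeping the receiver timeout at the default mentioned in note 1 (this is the standard PostgreSQL GUC):

```sql
-- Default is 1 minute; very small values (e.g. 600ms) may still time out
-- before a keepalive arrives:
ALTER SYSTEM SET wal_receiver_timeout = '1min';
SELECT pg_reload_conf();
```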
2023-10-09 22:33:08 +03:00
Nils Dijk 6d8725efb0
Fix leaking of memory and memory contexts in Foreign Constraint Graphs (#7236)
DESCRIPTION: Fix leaking of memory and memory contexts in Foreign
Constraint Graphs

Previously, every time we (re)created the Foreign Constraint
Relationship Graph, we created a new Memory Context while losing the
reference to the previous context. This old context could still have
leftover memory in it, causing a memory leak.

With this patch we statically have one memory context that we lazily
initialize the first time we create our foreign constraint relationship
graph. On every subsequent creation, beside destroying our previous
hashmap we also reset our memory context to remove any left over
references.
2023-10-09 13:05:51 +02:00
Onur Tirtir 858d99be33
Take improvement_threshold into account in citus_add_rebalance_strategy() (#7247)
DESCRIPTION: Makes sure to take improvement_threshold into account
in `citus_add_rebalance_strategy()`.

Fixes https://github.com/citusdata/citus/issues/7188.
2023-10-09 13:13:08 +03:00
Önder Kalacı 7d6c401dd3
Update technical readme (#7248)
Fix a wrong query, reported by @naisila
2023-10-06 13:37:37 +03:00
Önder Kalacı 0dca65c84d
Add missing image to Technical Readme (#7243)
DESCRIPTION: Adds missing image to the Technical Readme
2023-09-29 22:24:10 +02:00
Önder Kalacı 185ac5e01e
Citus Technical Readme (#7207)
This commit aims to add a comprehensive guide that covers all essential
aspects of Citus, including planning, execution, locking mechanisms,
shard moves, 2PC, and many other major components of Citus.

Co-authored-by: Marco Slot <marco.slot@gmail.com>
2023-09-29 16:50:52 +03:00
dependabot[bot] c323f49e83
Bump cryptography from 41.0.3 to 41.0.4 in /src/test/regress (#7231)
Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.3
to 41.0.4.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nils Dijk <nils@citusdata.com>
2023-09-27 15:36:58 +02:00
Onur Tirtir 27ac44eb2a
Fix mixed Citus upgrade tests (#7218)
When testing rolling Citus upgrades, coordinator should not be upgraded
until we upgrade all the workers.

---------

Co-authored-by: Jelte Fennema-Nio <github-tech@jeltef.nl>
2023-09-26 17:52:52 +03:00
Nils Dijk b87fbcbf79
Shard moves/isolate report LSNs in LSN format (#7227)
DESCRIPTION: Shard moves/isolate report LSNs in LSN format

While investigating an issue with our catchup mechanism on certain
postgres versions we noticed we print LSNs in the format of the native
long type. This is an uncommon representation for LSNs in postgres
logs.

This patch changes the output of our log message to go from the long
type representation to the native LSN type representation, making it
easier for postgres users to recognize and compare LSNs with other
related reports.

example of new output:
```
2023-09-25 17:28:47.544 CEST [11345] LOG:  The LSN of the target subscriptions on node localhost:9701 have increased from 0/0 to 0/E1ED20F8 at 2023-09-25 17:28:47.544165+02 where the source LSN is 1/415DCAD0
```
2023-09-26 13:47:50 +02:00
Gürkan İndibay 7fa109c977
Adds alter user missing features (#7204)
DESCRIPTION: Adds alter user rename propagation and enriches alter user
tests

---------

Co-authored-by: Jelte Fennema <github-tech@jeltef.nl>
2023-09-26 12:28:07 +03:00
Onur Tirtir 111b4c19bc
Make sure to disallow creating a replicated distributed table concurrently (#7219)
See explanation in https://github.com/citusdata/citus/issues/7216.
Fixes https://github.com/citusdata/citus/issues/7216.

DESCRIPTION: Makes sure to disallow creating a replicated distributed
table concurrently
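A hedged sketch of the case that is now rejected (table and column names are hypothetical):

```sql
-- With a replication factor above 1, concurrent distribution is now
-- disallowed and errors out instead of proceeding:
SET citus.shard_replication_factor TO 2;
SELECT create_distributed_table_concurrently('events', 'tenant_id');
```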
2023-09-25 11:14:35 +03:00
Nils Dijk 0f28a69f12
Use the $(DLSUFFIX) instead of hard coded extensions for cdc (#7221)
When cdc got added, the makefiles hardcoded the `.so` extension instead
of using the platform-specific `$(DLSUFFIX)` variable used by `pgxs.mk`.
Also don't remove installed cdc artifacts on `make clean`.
2023-09-22 16:24:18 +02:00
Jelte Fennema-Nio 71e556e090
Remove useless test output (#7209)
This was sometimes failing when running locally, due to some local shard
still existing. This fixes that. We normally silence all
`drop schema cascade` output like this anyway to avoid unnecessary
diffs when modifying a test later on.
2023-09-19 14:12:46 +02:00
Naisila Puka 4e46708789
Adds PostgreSQL 16.0 Support (#7201)
This commit concludes PG16.0 Support in Citus.

The main PG16 support work has been done for 16beta3
https://github.com/citusdata/citus/pull/6952
There was some extra work needed for 16rc1
https://github.com/citusdata/citus/pull/7173
And this PR introduces the extra work needed for 16.0 :)

`pgstat_fetch_stat_local_beentry` has been renamed to
`pgstat_get_local_beentry_by_index` in PG16.0

Relevant PG commit:
8dfa37b797843a83a5756ea3309055e8953e1a86

Sister PR
https://github.com/citusdata/the-process/pull/150
2023-09-15 12:23:04 +03:00
Gürkan İndibay 7c0b289761
Adds alter database set option (#7181)
DESCRIPTION: Adds support for ALTER DATABASE <db_name> SET .. statement
propagation
SET statements in Postgres have a common structure, which is already
being used in the ALTER FUNCTION statement.
In this PR, I added a util file, citus_setutils, and made it usable both
for alter database <db_name> set .. and alter function ... set ...
statements.
With this PR, the statements below will be propagated:
```sql
ALTER DATABASE name SET configuration_parameter { TO | = } { value | DEFAULT }
ALTER DATABASE name SET configuration_parameter FROM CURRENT
ALTER DATABASE name RESET configuration_parameter
ALTER DATABASE name RESET ALL
```
Additionally, there was a bug in processing float values in the common
code block. I fixed this one as well.

Previous
```C
case T_Float:
			{
				appendStringInfo(buf, " %s", strVal(value));
				break;
			}
```
Now
```C
case T_Float:
			{
				appendStringInfo(buf, " %s", nodeToString(value));
				break;
			}
```
2023-09-14 16:29:16 +03:00
aykut-bozkurt 26dc407f4a
bump citus and columnar into 12.2devel (#7200) 2023-09-14 12:03:09 +03:00
Gürkan İndibay e5e64b7454
Adds alter database propagation - with and refresh collation (#7172)
DESCRIPTION: Adds ALTER DATABASE WITH ... and REFRESH COLLATION VERSION
support

This PR adds support for propagating basic ALTER DATABASE statements.
The statements below are supported:

```sql
ALTER DATABASE <database_name> with IS_TEMPLATE <true/false>;
ALTER DATABASE <database_name> with CONNECTION LIMIT <integer_value>;
ALTER DATABASE <database_name> REFRESH COLLATION VERSION;
```

---------

Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
2023-09-12 14:09:15 +03:00
Naisila Puka 1da99f8423
PG16 - Don't propagate GRANT ROLE with INHERIT/SET option (#7190)
We currently don't support propagating these options in Citus
Relevant PG commits:
https://github.com/postgres/postgres/commit/e3ce2de
https://github.com/postgres/postgres/commit/3d14e17

Limitation:
We also need to take care of GRANT statements generated by dependencies
when attempting to distribute something else. Specifically, this part of
the code in `GenerateGrantRoleStmtsOfRole`:
```
grantRoleStmt->admin_opt = membership->admin_option;
```
In PG16, membership also has `inherit_option` and `set_option` which
need to properly be part of the `grantRoleStmt`. We can skip for now
since #7164 will take care of this soon, and also this is not an
expected use-case.
2023-09-12 12:47:37 +03:00
Naisila Puka c1dc378504
Fix WITH ADMIN FALSE propagation (#7191) 2023-09-11 15:58:24 +03:00
Onur Tirtir d628a4c21a
Add citus_schema_move() function (#7180)
Add citus_schema_move() that can be used to move tenant tables within a distributed
schema to another node. The function has two variations, as simple wrappers around
citus_move_shard_placement() and citus_move_shard_placement_with_nodeid() respectively.
They pick a shard that belongs to the given tenant schema and resolve the source node
that contains the shards under the given tenant schema. Hence their signatures are quite
similar to the underlying functions:

```sql
-- citus_schema_move(), using target node name and node port
CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move(
	schema_id regnamespace,
	target_node_name text,
	target_node_port integer,
	shard_transfer_mode citus.shard_transfer_mode default 'auto')
RETURNS void
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_schema_move$$;

-- citus_schema_move(), using target node id
CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move(
	schema_id regnamespace,
	target_node_id integer,
	shard_transfer_mode citus.shard_transfer_mode default 'auto')
RETURNS void
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_schema_move_with_nodeid$$;
```
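A possible usage sketch (schema name, node name, and node id are hypothetical):

```sql
-- Using the target node name and port:
SELECT citus_schema_move('tenant_42', 'worker-node-2', 5432);
-- Or using the target node id:
SELECT citus_schema_move('tenant_42', 3);
```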
2023-09-08 12:03:53 +03:00
Naisila Puka 8894c76ec0
PG16 - Add rules option to CREATE COLLATION (#7185)
Relevant PG commit:
https://github.com/postgres/postgres/commit/30a53b7
2023-09-07 13:50:47 +03:00
Naisila Puka 2df88042b3
Add tests with JSON_ARRAYAGG and JSON_OBJECTAGG aggregates (#7186)
Relevant PG commit:
7081ac46ace8c459966174400b53418683c9fe5c
2023-09-07 13:29:39 +03:00
Naisila Puka 7e5136f2de
Add tests with publications with schema and table of the same schema (#7184)
Relevant PG commit:
https://github.com/postgres/postgres/commit/13a185f

It was backpatched through PG15 so I added this test in publication.sql
instead of pg16.sql
2023-09-06 16:40:36 +03:00
Naisila Puka b2fc763bc3
PG16 - Add tests with random_normal (#7183)
Relevant PG commit:
https://github.com/postgres/postgres/commit/38d8176
2023-09-06 14:57:24 +03:00
Naisila Puka 5c658b4eb7
PG16 - Add citus_truncate_trigger for Citus foreign tables (#7170)
Since in PG16, truncate triggers are supported on foreign tables, we add
the citus_truncate_trigger to Citus foreign tables as well, such that the TRUNCATE
command is propagated to the table's single local shard as well.
Note that TRUNCATE command was working for foreign tables even before this
commit: see https://github.com/citusdata/citus/pull/7170#issuecomment-1706240593 for details

This commit also adds tests with user-enabled truncate triggers on Citus foreign tables:
both trigger on the shell table and on its single foreign local shard.

Relevant PG commit:
https://github.com/postgres/postgres/commit/3b00a94
2023-09-05 19:42:39 +03:00
zhjwpku 205b159606
get rid of {Push/Pop}OverrideSearchPath (#7145) 2023-09-05 17:40:22 +02:00
aykut-bozkurt 8eb3360017
Fixes visibility problems with dependency propagation (#7028)
**Problem:**
Previously we always used an outside superuser connection to overcome
permission issues for the current user while propagating dependencies.
That has mainly 2 problems:
1. Visibility issues during dependency propagation (the metadata connection
propagates some objects like a schema, and the outside transaction does
not see it and tries to create it again)
2. Security issues (it is preferable to use the current user's connection
instead of the extension superuser)

**Solution (high level):**
Now, we try to make a smarter decision on whether we should use an
outside superuser connection or the current user's metadata connection. We
prefer using the current user's connection if any of the objects that are
already propagated in the current transaction is a dependency for a
target object. We do that since we assume that if the current user has
permissions to create the dependency, then they can most probably
propagate the target as well.

Our assumption is expected to hold most of the time, but it can still be
wrong. In those cases, the transaction would fail and the user should set
the GUC `citus.create_object_propagation` to `deferred` to work around it.
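A minimal sketch of that workaround:

```sql
-- Falls back to deferred propagation when the smarter connection choice
-- described above leads to a failing transaction:
SET citus.create_object_propagation TO 'deferred';
```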

**Solution:**
1. We track all objects propagated in the current transaction (we can
handle subtransactions),
2. We propagate dependencies via the current user's metadata connection
if any dependency is created in the current transaction to address
issues listed above. Otherwise, we still use an outside superuser
connection.


DESCRIPTION: Fixes some object propagation errors seen with transaction
blocks.

Fixes https://github.com/citusdata/citus/issues/6614

---------

Co-authored-by: Nils Dijk <nils@citusdata.com>
2023-09-05 18:04:16 +03:00
Naisila Puka 9f067731c0
Adds PostgreSQL 16 RC1 support (#7173) 2023-09-05 14:32:41 +03:00
Emel Şimşek a849570f3f
Improve the performance of CitusHasBeenLoaded function for a database that does not do CREATE EXTENSION citus but loads citus.so. (#7123)
For a database that does not create the citus extension by running

`  CREATE EXTENSION citus;`

`CitusHasBeenLoaded` function ends up querying the `pg_extension` table
every time it is invoked. This is not an ideal situation for such a
database.

The idea in this PR is as follows:

### A new field in MetadataCache.
Add a new variable `extensionCreatedState` of the following type:

```
typedef enum ExtensionCreatedState
{
        UNKNOWN = 0,
        CREATED = 1,
        NOTCREATED = 2,
} ExtensionCreatedState;
```
When the MetadataCache is invalidated, `ExtensionCreatedState` will be
set to UNKNOWN.
     
### Invalidate MetadataCache when CREATE/DROP/ALTER EXTENSION citus
commands are run.

- Register a callback function, named
`InvalidateDistRelationCacheCallback`, for relcache invalidation during
the shared library initialization for `citus.so`. This callback function
is invoked in all the backends whenever the relcache is invalidated in
one of the backends. (This can be caused by many DDL operations.)

- In the cache invalidation callback, `InvalidateDistRelationCacheCallback`,
invalidate the `MetadataCache` by zeroing it out.
 
- In `CitusHasBeenLoaded`, perform the costly citus is loaded check only
if the `MetadataCache` is not valid.
 
### Downsides

Any relcache invalidation (caused by various DDL operations) will cause
the Citus MetadataCache to get invalidated. Most of the time it will be
unnecessary. But we rely on DDL operations on relations not being too
frequent.
2023-09-05 13:29:35 +03:00
Hanefi Onaldi c22547d221 Create a new colocation properly after breaking one
When breaking a colocation, we need to create a new colocation group
record in pg_dist_colocation for the relation. It is not sufficient to
have a new colocationid value in pg_dist_partition only.

This patch also fixes a bug when deleting a colocation group if no
tables are left in it. Previously we passed a relation id as a parameter
to DeleteColocationGroupIfNoTablesBelong function, where we should have
passed a colocation id.
2023-09-05 10:58:46 +03:00
Jelte Fennema bdf085eabb
Add some small improvements to python testing framework (#7159)
1. Adds an `sql_row` function, for when a query returns a single row
   with multiple columns.
2. Include a `notice_handler` for easier debugging
3. Retry dropping replication slots when they are "in use", this is
   often an ephemeral state and can cause flaky tests
2023-09-05 09:34:56 +02:00
Ivan Vyazmitinov e94bf93152
Fix #6548: 2PC recovery is extremely ineffective on a cluster with multiple DATABASEs (#7174) 2023-09-04 15:28:22 +02:00
Naisila Puka de9af078b0
PG16 - Add reindex database/system tests (#7167)
In PG16, the name in REINDEX DATABASE/SYSTEM is optional.
We already don't propagate these commands automatically.
Testing here with run_command_on_workers.
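
A sketch of the pattern the test exercises; in PG16 the database name can be omitted:

```sql
-- on the coordinator
REINDEX DATABASE;
-- and explicitly on the workers, since Citus does not propagate it
SELECT run_command_on_workers($$REINDEX DATABASE$$);
```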

Relevant PG commit:
https://github.com/postgres/postgres/commit/2cbc3c1
2023-09-04 11:31:57 +03:00
Naisila Puka cf71e80bfd
PG16 - Add tests for createdb with ICU_RULES option (#7161)
When we create a database, it already needs to be manually created on
the workers as well.
The new icu_rules option should work like the other options do.
Added a test for that.
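
A sketch of the statement under test (database name and rule string are made up); as noted above, the database must be created on the workers manually as well:

```sql
CREATE DATABASE icu_rules_db
    WITH TEMPLATE        = template0
         LOCALE_PROVIDER = 'icu'
         ICU_LOCALE      = 'en'
         ICU_RULES       = '&a < g';  -- sorts "g" right after "a"
```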

Relevant PG commit:
https://github.com/postgres/postgres/commit/30a53b7
2023-09-04 11:13:46 +03:00
zhjwpku 9fd4ef042f
avoid rebuilding MetadataCache for each placement insertion (#7163) 2023-09-04 09:57:25 +02:00
zhjwpku 5034f8eba5
polish the codebase by fixing dozens of typos (#7166) 2023-09-01 12:21:53 +02:00
Naisila Puka 05443a77ad
Adds test for COPY FROM failure in Citus foreign tables (#7160) 2023-09-01 12:20:07 +03:00
Gürkan İndibay b8bded6454
Adds citus_pause_node udf (#7089)
DESCRIPTION: Introduces the citus_pause_node UDF, which enables pausing a
node by node_id.

citus_pause_node takes a node_id parameter, fetches all the shards on
that node, and takes an AccessExclusiveLock on each of those shards. With
these locks, inserts are disabled until the citus_pause_node transaction
is closed.
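
A minimal usage sketch based on the description above (the node id is hypothetical; look it up in pg_dist_node first):

```sql
SELECT nodeid, nodename, nodeport FROM pg_dist_node;

BEGIN;
SELECT citus_pause_node(2);  -- AccessExclusiveLock on all shards of node 2
-- inserts into shards on node 2 now block
COMMIT;                      -- closing the transaction releases the locks
```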

---------

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
2023-09-01 11:39:30 +03:00
Gürkan İndibay 4a1a5491ce
Refactors grant statements (#7153)
DESCRIPTION: Refactors all grant statements to use common code blocks to
deparse
2023-09-01 09:49:46 +03:00
zhjwpku f03291a8c8
remove useless code block (#7158) 2023-08-29 17:15:22 +02:00
Naisila Puka a17fae36b9
Disable statistics collection (#7162)
Enabled by mistake in

ba40eb363c
2023-08-29 16:09:19 +03:00
Onur Tirtir a830862717 Not undistribute Citus local table when converting it to a reference table / single-shard table 2023-08-29 12:57:28 +03:00
Onur Tirtir 34e3119b48 Intersect shard placements in a table type agnostic way
If we're in the middle of a table type conversion (such as from Citus
local table to a reference table), the table might not have all the
placements that we expect from the table type. For this reason, we
should intersect the placements of tables at hand when creating
inter-shard ddl tasks.
2023-08-29 12:57:28 +03:00
Onur Tirtir 5bdf19f517 Use CopyShardForeignConstraintCommandList in WorkerCreateShardCommandList
What we do to collect foreign key constraint commands in
WorkerCreateShardCommandList is quite similar to what we do in
CopyShardForeignConstraintCommandList. Plus, the code that we used
in WorkerCreateShardCommandList before was not able to properly handle
foreign key constraints between Citus local tables --when creating a
reference table from the referencing one.

With a few slight modifications made to
CopyShardForeignConstraintCommandList, we can use the same logic in
WorkerCreateShardCommandList too.
2023-08-29 12:57:28 +03:00
zhjwpku d97f786296
PQputCopyData's return value 0 should be considered fail (#7152) 2023-08-29 11:19:18 +02:00
Onur Tirtir d5d1684c45
Use correct errorCode for the errors thrown during recovery (#7146) 2023-08-28 11:03:38 +03:00
Naisila Puka afab879de3
PG16 - Add COPY FROM default tests (#7143)
Already supported in Citus, adding the same tests as in PG
Relevant PG commit:
https://github.com/postgres/postgres/commit/9f8377f
2023-08-24 15:52:09 +03:00
Naisila Puka 70c8aba967
PG16 - Add tests for CREATE/ALTER TABLE .. STORAGE (#7140)
Relevant PG commits:
https://github.com/postgres/postgres/commit/784cedd
https://github.com/postgres/postgres/commit/b9424d0
2023-08-24 15:26:40 +03:00
Gürkan İndibay 8d3a06c1c7
Adds grant/revoke privileges on database propagation (#7109)
DESCRIPTION: Adds grant/revoke propagation support for database
privileges

Following the implementation of support for granting and revoking
database privileges, certain tests that issued grants for worker nodes
experienced failures. Those are fixed in this PR as well.
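
A sketch of statements that this propagates (database and role names are made up):

```sql
GRANT CREATE, CONNECT ON DATABASE my_db TO app_role;
REVOKE CREATE ON DATABASE my_db FROM app_role;
```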
2023-08-24 14:43:19 +03:00
Naisila Puka b8c493f2c4
PG16 - Add GENERIC_PLAN option to EXPLAIN (#7141) 2023-08-23 20:15:54 +03:00
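For reference, a sketch of the PG16 option; GENERIC_PLAN accepts parameter placeholders without binding values (table name is made up):

```sql
EXPLAIN (GENERIC_PLAN) SELECT * FROM dist_table WHERE id = $1;
```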
Naisila Puka c73ef405f5
PG16 - IS JSON predicate and SYSTEM_USER tests (#7137)
Support the IS JSON predicate
Relevant PG commit:
https://github.com/postgres/postgres/commit/6ee30209

SYSTEM_USER
Relevant PG commit:
https://github.com/postgres/postgres/commit/0823d061
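
A quick sketch of the two features being tested:

```sql
SELECT js, js IS JSON, js IS JSON OBJECT, js IS JSON ARRAY
FROM (VALUES ('{"a": 1}'), ('[1, 2]'), ('not json')) t(js);

SELECT SYSTEM_USER;  -- auth_method:identity, e.g. scram-sha-256:alice
```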
2023-08-23 14:13:56 +03:00
Marco Slot ba55fd67d7
Rename planner_readme.md to README.md (#7139) 2023-08-23 13:47:18 +03:00
Naisila Puka 36b51d617c
PG16 - Throw meaningful error for stats without a name on Citus tables (#7136)
Relevant PG commit:
624aa2a13b
624aa2a13bd02dd584bb0995c883b5b93b2152df
2023-08-23 10:25:01 +03:00
Gürkan İndibay 371f094b68
Removes pg_send_cancellation (#7135)
DESCRIPTION: Removes pg_send_cancellation and all references
2023-08-21 17:29:44 +03:00
zhjwpku ba2a0aec16
fix some obvious typos and reduce usage of magic numbers (#7130)
fix some obvious typos and reduce usage of magic numbers

Signed-off-by: Zhao Junwang <zhjwpku@gmail.com>
2023-08-18 14:50:20 +00:00
Naisila Puka 682dca1f12
Adds PG16Beta3 support (#6952)
DESCRIPTION: Adds PG16Beta3 support

This is the final commit that adds
PG16 compatibility with Citus's current features.

You can use Citus community with PG16Beta3. This commit:

- Enables PG16 in the configure script.
- Adds PG16 tests to CI using test images that have 16beta3
- Skips wal2json cdc test since wal2json package is not available for PG16 yet
- Fixes an isolation test

Several PG16 Compatibility commits have been merged before this final one.
All these subtasks are done https://github.com/citusdata/citus/issues/7017
See the list below:

1 - 42d956888d
Resolve compilation issues
2 - 0d503dd5ac
Ruleutils and successful CREATE EXTENSION
3 - 907d72e60d
Some test outputs
4 - 7c6b4ce103
Outer join checks, subscription password, crash fixes
5 - 6056cb2c29
get_relation_info hook to avoid crash from adjusted partitioning
6 - b36c431abb
Rework PlannedStmt and Query's Permission Info
7 - ee3153fe50
More test output fixes
8 - 2c50b5f7ff
varnullingrels additions
9 - b2291374b4
More test output fixes
10- a2315fdc67
New options to vacuum and analyze
11- 9fa72545e2
Fix AM dependency and grant's admin option
12- 2d6cf8e79a
One more outer join check

Stay tuned for PG16 new features in Citus :)
2023-08-17 21:02:59 +03:00
Naisila Puka 2d6cf8e79a
PG16 compatibility - one more outer join check (#7126)
PG16 compatibility - part 12

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
part 8 2c50b5f7ff
part 9 b2291374b4
part 10 a2315fdc67
part 11 9fa72545e2

This commit is in the series of PG16 compatibility commits.
We already took care of the majority of necessary outer join checks
in part 4 7c6b4ce103
However, in RelationInfoContainsOnlyRecurringTuples,
we need to add one more check of whether we are dealing
with an outer join RTE, using the IsRelOptOuterJoin function.
This prevents an outer join crash in the sqlancer_failures.sql test.

We expect one more PG16 compatibility commit for Citus's current
features: regression tests sanity.
2023-08-17 19:07:18 +03:00
zhjwpku b10320be6f
fix wrong type conversion (#7116)
partitionMethod and replicationModel are both of type char; it seems
meaningless to convert them to type Oid implicitly.
2023-08-17 13:53:43 +02:00
Naisila Puka a5ce601c07
Bump PG14 and PG15 versions for CI tests (#7111)
Postgres got minor updates on Aug 10; this commit starts using the
images with the latest versions for our tests, namely 14.9 and 15.4.

Depends on https://github.com/citusdata/the-process/pull/147

For CI images, we needed to regenerate Pipfile.lock, mainly because of an issue
with pyyaml version: https://github.com/yaml/pyyaml/issues/601

We also needed to remove a failing test in subquery_local_tables.sql.
Relevant PG commit:
b0e390e6d1
b0e390e6d1d68b92e9983840941f8f6d9e083fe0
Issue: https://github.com/citusdata/citus/issues/7119
For joins where consider_join_pushdown is false, we cannot get the
information that we used to get, which prevents distributed planning.
The team has already contacted PG committers about this.
Until then, we remove the test from the schedule.
2023-08-17 11:53:19 +03:00
Naisila Puka 9fa72545e2
PG16 compatibility - fix AM dependency and grant's admin option (#7113)
PG16 compatibility - part 11

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
part 8 2c50b5f7ff
part 9 b2291374b4
part 10 a2315fdc67

This commit is in the series of PG16 compatibility commits. It fixes
AM dependency and grant's admin option:

- Fix with admin option in grants
grantstmt->admin_opt no longer exists in PG16;
instead, grantstmt has a list of options, one of which is the admin option.
Relevant PG commit:
e3ce2de09d
e3ce2de09d814f8770b2e3b3c152b7671bcdb83f

- Fix pg_depend entry to AMs after ALTER TABLE .. SET ACCESS METHOD 
Relevant PG commit:
97d8910104
97d89101045fac8cb36f4ef6c08526ea0841a596


More PG16 compatibility commits are coming soon:
We are very close to merging "PG16Beta3 Support - Regression tests sanity"
2023-08-17 11:22:34 +03:00
Naisila Puka 71c475af52
Fix GetUndistributableDependency (#7124)
This is a leftover task from merging enterprise to community.
Roles are distributed in community now, the comment is stale and the
check is redundant.
2023-08-17 10:57:22 +03:00
Naisila Puka a2315fdc67
PG16 compatibility - new options to vacuum and analyze (#7114)
PG16 compatibility - part 10

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb 
part 7 ee3153fe50
part 8 2c50b5f7ff
part 9 b2291374b4

This commit is in the series of PG16 compatibility commits. It:

- Adds buffer_usage_limit to vacuum and analyze
- Adds process_main, skip_database_stats, only_database_stats to vacuum

Important Note: adding these options is actually required for check-vanilla tests to succeed.
However, in concept, this PR belongs to "PG16 new features",
rather than "PG16 regression tests sanity"

Relevant PG commits:
1cbbee0338
1cbbee03385763b066ae3961fc61f2cd01a0d0d7
4211fbd841
4211fbd8413b26e0abedbe4338aa7cda2cd469b4
a46a7011b2
a46a7011b27188af526047a111969f257aaf4db8

More PG16 compatibility commits are coming soon ...
2023-08-16 16:18:28 +03:00
Naisila Puka b982f2dee6
Changes PROCESS_TOAST default value to true (#7122)
Process toast should be true by default, like in PG.
2023-08-16 14:40:24 +03:00
Naisila Puka b2291374b4
PG16 compatibility - more test output fixes (#7112)
PG16 compatibility - part 9

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50
part 8 2c50b5f7ff

This commit is in the series of PG16 compatibility commits. It makes some changes
to our tests in order to be compatible with the following in PG16:

- Fix multi_subquery_in_where_reference_clause test
Somehow PG got rid of the outer join
(e.g., EXPLAIN doesn't show outer joins),
hence we can push down the subquery.
Changed the test to use users_reference_table.

- Fix unqualified column names for views in PG16 
Relevant PG commit:
47bb9db759
47bb9db75996232ea71fc1e1888ffb0e70579b54

- Fix global_cancel test 
Error wording and detail changed
Relevant PG commit:
2631ebab7b
2631ebab7b18bdc079fd86107c47d6104a6b3c6e

- Fix local_table_join_test with lateral subquery 
Possible relevant PG commit:
ae89129aa3
ae89129aa3555c263b8c3ccc4c0f1ef7e46201aa
Removing the WHERE clause makes the limit count error appear again;
with the WHERE clause the query unexpectedly works.

- Fix test outputs 
Relevant PG commits:
-- 1349d2790b
-- f4c7c410ee
For multi_explain and multi_complex_count_distinct there were too many
places touched, so I just added alternative test outputs.
For the other tests I modified the problematic parts.

More PG16 compatibility commits are coming soon ...
2023-08-15 13:49:25 +03:00
Naisila Puka 2c50b5f7ff
PG16 compatibility - varnullingrels additions (#7107)
PG16 compatibility - part 8

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb
part 7 ee3153fe50

This commit is in the series of PG16 compatibility commits. PG16 introduced a new field,
varnullingrels, in Var, the node that represents our partkey in pg_dist_partition.
This commit does the necessary changes in Citus to support this.
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d

More PG16 compatibility commits are coming soon ...
2023-08-15 13:07:55 +03:00
Naisila Puka ee3153fe50
PG16 compatibility - more test output fixes (#7108)
PG16 compatibility - part 7

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29
part 6 b36c431abb

This commit is in the series of PG16 compatibility commits. It makes some changes
to our tests in order to be compatible with the following in PG16:

- PG16 removed logic for converting a table to a view 
Relevant PG commit:
b23cd185fd
b23cd185fd5410e5204683933f848d4583e34b35

- Fix changed error message in certificate verification 
Relevant PG commit:
8eda731465
8eda7314652703a2ae30d6c4a69c378f6813a7f2

- Fix backend type order in tests 
Relevant PG commit:
0c679464a8
0c679464a837079acc75ff1d45eaa83f79e05690

- Reduce log level to omit extra NOTICE in create collation in PG16 
Relevant PG commit:
a14e75eb0b
a14e75eb0b6a73821e0d66c0d407372ec8376105
That commit made the LOCALE parameter apply regardless of the
provider used, and it prints the following notice:
NOTICE:  using standard form "und-u-ks-level2" for ICU locale "@colStrength=secondary"
We reduce the log level to avoid output changes between PG versions.

- Fix columnar_memory test 
TopMemoryContext now has more children contexts
Possible relevant PG commit:
9d3ebba729
9d3ebba729ebaf5882a92f0f5f662a3312037605
memusage is now around 8.5 MB, whereas it was less than 8 MB before.
To avoid differences between PG versions, I changed the test to compare
to less than 9 MB. It still reflects the improvement from 28 MB very
well.

- Alternative test output for GRANTOR values in pg_auth_members 
grantor changed in PG16
Relevant PG commit:
ce6b672e44
ce6b672e4455820a0348214be0da1a024c3f619f

- Remove redundant grouping columns from our tests 
Relevant PG commit:
8d83a5d0a2
8d83a5d0a2673174dc478e707de1f502935391a5

- Fix tests with different order in Filters 
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d

More PG16 compatibility commits are coming soon ...
2023-08-09 18:04:32 +03:00
Naisila Puka b36c431abb
PG16 compatibility - Rework PlannedStmt and Query's Permission Info (#7098)
PG16 compatibility - Part 6

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103
part 5 6056cb2c29

This commit is in the series of PG16 compatibility commits.
It handles the Permission Info changes in PG16. See below:

The main issue lies in the following entries of PlannedStmt: {
   rtable
   permInfos
}

Each rtable has an int perminfoindex, and its actual permission info is
obtained through the following:
permInfos[perminfoindex]
We had crashes because perminfoindexes were not updated in the finalized
planned statement after the distributed planner hook.
So, basically, everywhere we set a query's or planned statement's rtable
entry, we need to set the rteperminfos/permInfos accordingly.

Relevant PG commits:
a61b1f7482
a61b1f74823c9c4f79c95226a461f1e7a367764b
b803b7d132
b803b7d132e3505ab77c29acf91f3d1caa298f95

More PG16 compatibility commits are coming soon ...
2023-08-09 15:23:00 +03:00
Naisila Puka 6056cb2c29
PG16 compatibility - get_relation_info hook to avoid crash from adjusted partitioning (#7099)
PG16 compatibility - Part 5

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d
part 4 7c6b4ce103

This commit is in the series of PG16 compatibility commits. Find the explanation below:

If we allow adjusting partitioning, we get a crash when accessing
amcostestimate of partitioned indexes, because amcostestimate is NULL
for them. The following PG commit is the culprit:
3c569049b7
3c569049b7b502bb4952483d19ce622ff0af5fd6
Previously, partitioned indexes would just be ignored.
Now, they are added to the list. However, get_relation_info expects the
tables which have partitioned indexes to have the inh flag set properly.
AdjustPartitioningForDistributedPlanning plays with that flag, hence we
don't get the desired behaviour.
The hook simply removes all partitioned indexes from the list.

More PG16 compatibility commits are coming soon ...
2023-08-08 15:51:21 +03:00
Naisila Puka 7c6b4ce103
PG16 compatibility - outer join checks, subscription password, crash fixes (#7097)
PG16 compatibility - Part 4

Check out part 1 42d956888d
part 2 0d503dd5ac
part 3 907d72e60d

This commit is in the series of PG16 compatibility commits.
It adds some outer join checks to the planner,
the new password_required option to the subscription,
and a crash fix related to PGIOAlignedBlock, see below for more details:

- Fix PGIOAlignedBlock Assert crash in PG16 
Relevant PG commit:
faeedbcefd
faeedbcefd40bfdf314e048c425b6d9208896d90

- Pass planner info as argument to make_simple_restrictinfo 
Pre-PG16, passing plannerInfo to make_simple_restrictinfo
was only needed for placeholder Vars, which is not the case
in this part of the codebase because we are building the
expression from shard intervals, which don't have placeholder
Vars.
However, PG16 counts baserels appearing in clause_relids
and deletes the rels mentioned in plannerinfo->outer_join_rels.
Hence we directly access plannerInfo;
we would crash if we left it as NULL.
For reference
2489d76c49 (diff-e045c41eda9686451a7993e91518e40056b3739365e39eb1b70ae438dc1f7c76R207)
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d

- Add outer join checks, root->simple_rel_array

- fix rebalancer to include password_required option 
Relevant PG commit:
c3afe8cf5a
c3afe8cf5a1e465bd71e48e4bc717f5bfdc7a7d6

More PG16 compatibility commits are coming soon ...
2023-08-04 14:51:28 +03:00
Naisila Puka 907d72e60d
PG16 compatibility - some test outputs (#7100)
PG16 compatibility - Part 3

Check out part 1 42d956888d
and part 2 0d503dd5ac

This commit is in the series of PG compatibility. It makes some changes
to our tests in order to be compatible with the following in PG16:

Use debug_parallel_query in PG16+, force_parallel_mode otherwise 
Relevant PG commit
5352ca22e0
5352ca22e0012d48055453ca9992a9515d811291

HINT changed to DETAIL in PG16 
Relevant PG commit:
56d0ed3b75
56d0ed3b756b2e3799a7bbc0ac89bc7657ca2c33

Fix removed read-only server setting lc_collate 
Relevant PG commit:
b0f6c43716
b0f6c437160db640d4ea3e49398ebc3ba39d1982

Fix unsupported join alias expression in sqlancer_failures 
Relevant PG commit:
2489d76c49
2489d76c4906f4461a364ca8ad7e0751ead8aa0d

More PG16 compatibility commits are coming soon ...
2023-08-04 13:03:15 +03:00
Önder Kalacı 4ae3982d14
Add single-shard router Merge command support (#7088)
Similar to https://github.com/citusdata/citus/pull/7077.

As PG 16+ has changed the join restriction information for certain outer
joins, MERGE is also impacted, given that it also relies on an underlying
outer join.

See #7077 for the details.
2023-08-04 08:16:29 +03:00
Naisila Puka 0d503dd5ac
PG16 compatibility: ruleutils and successful CREATE EXTENSION (#7087)
PG16 compatibility - Part 2

Part 1 provided successful compilation against pg16beta2.
42d956888d

This PR provides ruleutils changes with pg16beta2 and successful CREATE EXTENSION command.
Note that more changes are needed in order to have successful regression tests.
More commits are coming soon ...

For any_value changes, I referred to this commit
8ef94dc1f5
where we did something similar for PG14 support.
2023-08-02 16:04:51 +03:00
Önder Kalacı 960a5f6104
Improve failure handling of distributed execution (#7090)
Prior to this commit, the code would skip processing the
errors happened for local commands.

Prior to https://github.com/citusdata/citus/pull/5379, it might
make sense to allow the execution continue. But, as of today,
if a modification fails on any placement, we can safely fail
the execution.

The first commit show the problem in action. The second commit
includes the fix and the test fixes.
2023-08-01 16:47:59 +03:00
Onur Tirtir dd6ea1ebd5
Makes sure to handle NULL constraints for ADD COLUMN commands (#7093)
DESCRIPTION: Fixes a bug that causes an unexpected error when adding a
column with a NULL constraint

Fixes https://github.com/citusdata/citus/issues/7092.
2023-08-01 11:07:47 +03:00
Önder Kalacı cb5eb73048
Add support for router INSERT .. SELECT commands (#7077)
Traditionally our planner works in the following order:
   router -> pushdown -> repartition -> pull to coordinator

However, for INSERT .. SELECT commands, we did not support "router".

In practice, that is not a big issue, because pushdown planning can
handle router case as well.

However, with PG 16, certain outer joins are converted to JOIN without
any conditions (e.g., JOIN .. ON (true)) and the filters are pushed down
to the tables.

When the filters are pushed down to the tables, the router planner can
detect them. The pushdown planner, however, relies on JOIN conditions.

An example query:
```
INSERT INTO agg_events (user_id)
        SELECT raw_events_first.user_id
        FROM raw_events_first LEFT JOIN raw_events_second
        	ON raw_events_first.user_id = raw_events_second.user_id
        WHERE raw_events_first.user_id = 10;
```

As a side effect of this change, we can now also relax certain
limitations that the "pushdown" planner imposes, but the "router" planner
does not. So, with this PR, we also allow those.

Closes https://github.com/citusdata/citus/pull/6772
DESCRIPTION: Prevents unnecessarily pulling the data into coordinator
for some INSERT .. SELECT queries that target a single-shard group
2023-07-28 15:07:20 +03:00
Teja Mupparti 846cbc3a39 In the MERGE join clause, there is a datatype mismatch between the target's distribution column
and the expression originating from the source. If the types are different, Citus uses
different hash functions for the two column types, which might lead to incorrect
repartitioning of the result data.
2023-07-27 16:06:00 -07:00
Nils Dijk 186804c119
fix flakiness of shard_rebalancer operations test (#7083)
Fixes flakiness where the order of shards was dependent on the physical
layout in the heap. Failed here
https://app.circleci.com/pipelines/github/citusdata/citus/33844/workflows/1651f8f5-6e6a-457e-9d35-34b8788ea6d1/jobs/1189836


```diff
--- /home/circleci/project/src/test/regress/expected/shard_rebalancer.out.modified	2023-07-24 12:51:27.126284675 +0000
+++ /home/circleci/project/src/test/regress/results/shard_rebalancer.out.modified	2023-07-24 12:51:27.170285079 +0000
@@ -2571,24 +2571,24 @@
 CREATE TABLE test_with_all_shards_excluded(a int PRIMARY KEY);
 SELECT create_distributed_table('test_with_all_shards_excluded', 'a', colocate_with:='none', shard_count:=4);
  create_distributed_table 
 --------------------------
  
 (1 row)
 
 SELECT shardid FROM pg_dist_shard;
  shardid 
 ---------
-  433504
   433505
   433506
   433507
+  433504
 (4 rows)
 
 SELECT rebalance_table_shards('test_with_all_shards_excluded', excluded_shard_list:='{102073, 102074, 102075, 102076}');
  rebalance_table_shards 
 ------------------------
  
 (1 row)
 
 DROP TABLE test_with_all_shards_excluded;
 SET citus.shard_count TO 2;
```
2023-07-27 16:24:35 +02:00
zhjwpku 6a00517312
[typo] fix typo in comments (#7073)
%s/pg_dist_local_node_group/pg_dist_local_group/g

Signed-off-by: Zhao Junwang <zhjwpku@gmail.com>
2023-07-25 16:43:55 +03:00
Önder Kalacı 862dae823e
Expand EnableNonColocatedRouterQueryPushdown to cover shard colocation (e.g., shard index) (#7076)
Previously, we only checked whether the relations are colocated, but we
ignored the shard indexes. That caused certain queries to still be
accidentally router-planned. We should enforce colocation checks for both
the shard index and the table colocation id to make the check restrictive
enough.

For example, the following query should not be router-planned, and after
this patch, it won't be:
```SQL
SELECT
   user_id
 FROM
   ((SELECT user_id FROM raw_events_first WHERE user_id = 15) EXCEPT
    (SELECT user_id FROM raw_events_second where user_id = 17)) as foo;
```

DESCRIPTION: Enforce shard level colocation with
citus.enable_non_colocated_router_query_pushdown
2023-07-25 16:20:13 +03:00
ahmet gedemenli 3f11139b5c Do not move a shard to a node that it already exists on 2023-07-25 13:38:33 +03:00
ahmet gedemenli c968dc9c27 Do not rebalance if replication factor is greater than the node count 2023-07-25 13:38:33 +03:00
Naisila Puka 42d956888d
PG16 compatibility: Resolve compilation issues (#7005)
This PR provides successful compilation against PG16Beta2. It does some
necessary refactoring to prepare for full support of version 16, in
https://github.com/citusdata/citus/pull/6952 .

Change RelFileNode to RelFileNumber or RelFileLocator 
Relevant PG commit
b0a55e43299c4ea2a9a8c757f9c26352407d0ccc

new header for varatt.h 
Relevant PG commit:
d952373a987bad331c0e499463159dd142ced1ef

drop support for Abs, use fabs 
Relevant PG commit
357cfefb09115292cfb98d504199e6df8201c957

tuplesort PGcommit: d37aa3d35832afde94e100c4d2a9618b3eb76472 
Relevant PG commit:
d37aa3d35832afde94e100c4d2a9618b3eb76472

Fix vacuum in columnar 
Relevant PG commit:
4ce3afb82ecfbf64d4f6247e725004e1da30f47c
older one:
b6074846cebc33d752f1d9a66e5a9932f21ad177

Add alloc_flags to pg_clean_ascii 
Relevant PG commit:
45b1a67a0fcb3f1588df596431871de4c93cb76f

Merge GetNumConfigOptions() into get_guc_variables() 
Relevant PG commit:
3057465acfbea2f3dd7a914a1478064022c6eecd

Minor PG refactor PG_FUNCNAME_MACRO __func__ 
Relevant PG commit
320f92b744b44f961e5d56f5f21de003e8027a7f

Pass NULL context to stringToQualifiedNameList, typeStringToTypeName 
The pre-PG16 error behaviour for the following
stringToQualifiedNameList & typeStringToTypeName
was ereport(ERROR, ...)
Now with PG16 we have this context input. We preserve the same behaviour
by passing a NULL context, because of the following:
(copy paste comment from PG16)
If "context" isn't an ErrorSaveContext node, this behaves as
errstart(ERROR, domain), and the errsave() macro ends up acting
exactly like ereport(ERROR, ...).
Relevant PG commit
858e776c84f48841e7e16fba7b690b76e54f3675

Use RangeVarCallbackMaintainsTable instead of RangeVarCallbackOwnsTable 
Relevant PG commit:
60684dd834a222fefedd49b19d1f0a6189c1632e

FIX THIS: Not implemented grant-level control of role inheritance 
see PG commit
e3ce2de09d814f8770b2e3b3c152b7671bcdb83f

Make Scan node abstract 
PG commit:
8c73c11a0d39049de2c1f400d8765a0eb21f5228

Change in Var representations, get_relids_in_jointree 
PG commit
2489d76c4906f4461a364ca8ad7e0751ead8aa0d

Deadlock detection changes because SHM_QUEUE is removed 
Relevant PG Commit:
d137cb52cb7fd44a3f24f3c750fbf7924a4e9532

TU_UpdateIndexes 
Relevant PG commit
19d8e2308bc51ec4ab993ce90077342c915dd116

Use object_ownercheck and object_aclcheck functions 
Relevant PG commits:
afbfc02983f86c4d71825efa6befd547fe81a926
c727f511bd7bf3c58063737bcf7a8f331346f253

Rework Permission Info for successful compilation 
Relevant PG commits:
postgres/postgres@a61b1f7
postgres/postgres@b803b7d
---------

Co-authored-by: onderkalaci <onderkalaci@gmail.com>
2023-07-21 14:32:37 +03:00
Naisila Puka a282953274
Fix ScanKeyInit RegProcedure and Datum arguments (#7072)
Index scans in PG16 return empty sets because of extra compatibility
enforcement for `ScanKeyInit` arguments.
Could be one of the relevant PG commits:
c8b2ef05f4
This PR fixes all incompatible `RegProcedure` and `Datum` arguments in
all `ScanKeyInit` functions used throughout the codebase.
Helpful for https://github.com/citusdata/citus/pull/6952
2023-07-21 14:11:10 +03:00
Teja Mupparti 87dc88f837 Isolate schema sharding/MERGE tests into a new file, and
use the new GUC parameter
2023-07-19 12:23:45 -07:00
Halil Ozan Akgül c99a93ffa7
Move SQL file changes for citus_shard_sizes fixes into the new 11.3-2 version (#7050)
This PR moves the `citus_shard_sizes` changes from #7003 and #7018 into
a new Citus version, 11.3-2.
2023-07-14 17:19:54 +03:00
aykut-bozkurt 609a5465ea
Bump Citus version into 12.1devel (#7061) 2023-07-14 13:12:30 +03:00
Gürkan İndibay 0f0b60c29c
Fix format attribute and IsLocalReplicationOriginSessionActive errors (#7055)
This PR fixes the following:

- in oraclelinux-7 `Make` step
```
/usr/bin/ld: utils/replication_origin_session_utils.o: relocation R_X86_64_PC32 against undefined symbol 
`IsLocalReplicationOriginSessionActive' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
```
`IsLocalReplicationOriginSessionActive` function has improper inline
declaration, fixed that
- in centos-7 `Make` step
```
utils/background_jobs.c: In function 'StartCitusBackgroundTaskExecutor':
utils/background_jobs.c:1746:6: warning: function might be possible candidate for 'gnu_printf' format attribute
[-Wsuggest-attribute=format]
      database, user, jobId, taskId);
      ^
```
should use `pg_attribute_printf(3,4)` instead of
`pg_attribute_printf(3,0)` since the number of arguments varies for
`SafeSnprintf(char *str, rsize_t count, const char *fmt, ...)`

---------

Co-authored-by: naisila <nicypp@gmail.com>
2023-07-13 17:41:57 +03:00
Onur Tirtir f3cdb6d1bf Deparse ALTER TABLE commands if ADD COLUMN is the only subcommand
And stabilize multi_alter_table_statements.sql.
2023-07-12 18:17:47 +03:00
Onur Tirtir 6365f47b57 Properly handle index storage options for ADD CONSTRAINT / COLUMN 2023-07-11 17:42:43 +03:00
Onur Tirtir ae142e1764 Properly handle IF NOT EXISTS for ADD COLUMN 2023-07-11 17:42:43 +03:00
Onur Tirtir d4789a2c3a Stabilize test helper sql files
multi_test_helpers is run in parallel with others, so we need to
stabilize the other test helpers too, to make multi_test_helpers runnable
multiple times.
2023-07-06 10:47:41 +03:00
Onur Tirtir 001437bdfe Refactor AppendAlterTableCmdAddConstraint to reuse it for ADD COLUMN too 2023-07-06 10:47:41 +03:00
Onur Tirtir 56f1daa800 Refactor the code that extends constraint/index names on shards into a func 2023-07-06 10:47:41 +03:00
Onur Tirtir ba1ea9b5bd Refactor the code that prepares constraint objects in an alter table stmt into a func 2023-07-06 10:47:41 +03:00
Halil Ozan Akgül 613cced1ae
Use citus_shard_sizes in citus_tables (#7018)
Fixes #7019 

This PR updates the citus_tables view to use the citus_shard_sizes
function instead of citus_total_relation_size, to improve performance.
2023-07-05 11:40:34 +03:00
aykut-bozkurt 719d92c8b9
mat view should not be converted to tenant table (#7043)
We allow materialized views to exist in a distributed schema, but they
should not be converted to tenant tables since they cannot be
distributed.

Fixes https://github.com/citusdata/citus/issues/7041
2023-07-04 17:28:03 +03:00
Ahmet Gedemenli 5051be86ff
Skip distributed schema insertion into pg_dist_schema, if already exists (#7044)
Inserting into `pg_dist_schema` causes unexpected duplicate key errors
for distributed schemas that already exist. With this commit we skip the
insertion if the schema already exists in `pg_dist_schema`.

The error:
```sql
SET citus.enable_schema_based_sharding TO ON;
CREATE SCHEMA sc2;
CREATE SCHEMA IF NOT EXISTS sc2;
NOTICE:  schema "sc2" already exists, skipping
ERROR:  duplicate key value violates unique constraint "pg_dist_schema_pkey"
DETAIL:  Key (schemaid)=(17294) already exists.
```

fixes: #7042
2023-07-04 15:19:07 +03:00
Gokhan Gulbiz e0d3476526
Add locking mechanism for tenant monitoring probabilistic approach (#7026)
This PR 
* Addresses a concurrency issue in the probabilistic approach of tenant
monitoring by acquiring a shared lock for tenant existence checks.
* Changes `citus.stat_tenants_sample_rate_for_new_tenants` type to
double
* Renames `citus.stat_tenants_sample_rate_for_new_tenants` to
`citus.stat_tenants_untracked_sample_rate`
2023-07-03 13:08:03 +03:00
Jelte Fennema ac24e11986
Change default rebalance strategy to by_disk_size (#7033)
DESCRIPTION: Change default rebalance strategy to by_disk_size

When introducing rebalancing by disk size we didn't make it the default
initially. The main reason was, because we expected some problems with
it. We have indeed had some problems/bugs with it over the years, and
have fixed all of them. By now we're quite confident in its stability,
and that it pretty much always gives better results than by_shard_count.

So this PR makes by_disk_size the new default. We don't change the
default when some other strategy than by_shard_count is the current
default. This is in case someone defined their own rebalance strategy
and marked this as the default themselves.

Note: It explicitly does nothing during a downgrade, because there's no
way of knowing if the rebalance strategy before the upgrade was
by_disk_size or by_shard_count. And even in previous versions,
by_disk_size has been considered superior for quite some time.
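
A sketch of how to check and override the default, assuming the stock strategies:

```sql
-- see which strategy is now the default
SELECT name, default_strategy FROM pg_dist_rebalance_strategy;

-- uses the default strategy (by_disk_size after this change)
SELECT rebalance_table_shards();

-- or opt back into the old behavior explicitly
SELECT rebalance_table_shards(rebalance_strategy := 'by_shard_count');
```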
2023-07-03 11:08:24 +02:00
Jelte Fennema fd1427de2c
Change by_disk_size rebalance strategy to have a base size (#7035)
One problem with rebalancing by disk size is that shards in newly
created colocation groups are considered extremely small. This can
easily result in bad balances if there are some other colocation groups
that do have some data. One extremely bad example of this is:
1. You have 2 workers
2. Both contain about 100GB of data, but there's a 70MB difference.
3. You create 100 new distributed schemas with a few empty tables in
   them
4. You run the rebalancer
5. Now all new distributed schemas are placed on the node that had
   70MB less.
6. You start loading some data in these shards and quickly the balance
   is completely off

To address this edge case, this PR changes the by_disk_size rebalance
strategy to add a base size of 100MB to the actual size of each
shard group. This can still result in a bad balance when shard groups
are empty, but it solves some of the worst cases.
2023-06-27 16:37:09 +02:00
Halil Ozan Akgül 03a4769c3a
Fix Reference Table Check for CDC (#7025)
Previously the reference table check only looked at `partition method =
'n'`. This PR adds `replication model = 't'` to that check.
2023-06-23 16:37:35 +03:00
Teja Mupparti 387b5f80f9 Fixes bug #6785 2023-06-22 10:44:45 -07:00
Ahmet Gedemenli 99edb2675f
Improve error/hint messages related to schema-based sharding (#7027)
Improve error/hint messages related to schema-based sharding
2023-06-22 18:10:12 +03:00
Ahmet Gedemenli 44e3c3b9c6
Improve error message for CREATE SCHEMA .. CREATE TABLE (#7024)
Improve error message for CREATE SCHEMA .. CREATE TABLE when
enable_schema_based_sharding is enabled.
2023-06-21 15:24:09 +03:00
aykut-bozkurt 565c5260fd
Properly handle error at owner check (#6984)
We did not properly handle the error in the ownership check method, which
caused `max stack depth for errors` as in
https://github.com/citusdata/citus/issues/6980.

**Fix:**
In case of an error, we should roll back the subtransaction and log the
message at level `LOG_SERVER_ONLY`.

Note: We hide these logs from the client to prevent PG vanilla test
failures due to Citus logs, which differ from the actual Postgres logs.
(For context: https://github.com/citusdata/citus/pull/6130)

I also needed to fix a flaky test: `multi_schema_support`

DESCRIPTION: Fixes a bug related to non-existent objects in DDL
commands.

Fixes https://github.com/citusdata/citus/issues/6980
2023-06-21 14:50:01 +03:00
Naisila Puka 69af3e8509
Drop PG13 Support Phase 2 - Remove PG13 specific paths/tests (#7007)
This commit is the second and last phase of dropping PG13 support.

It consists of the following:

- Removes all PG_VERSION_13 & PG_VERSION_14 from codepaths
- Removes pg_version_compat entries and columnar_version_compat entries
specific for PG13
- Removes alternative pg13 test outputs 
- Removes PG13 normalize lines and fix the test outputs based on that

It is a continuation of 5bf163a27d
2023-06-21 14:18:23 +03:00
aykut-bozkurt 1bb667ce6e
Fix create schema authorization bug (#7015)
Fixes a bug related to `CREATE SCHEMA AUTHORIZATION <rolename>` for single shard
tables. We now properly fetch the schema name from the role specification
when the schema name is not given.
2023-06-20 22:05:17 +03:00
aykut-bozkurt f667f14029
Rewind tuple store to fix scrollable with hold cursor fetches (#7014)
We need to rewind the tuplestorestate's tuple index to get correct
results when fetching from scrollable WITH HOLD cursors.

`PersistHoldablePortal` is responsible for persisting the
tuplestorestate inside a WITH HOLD cursor before committing a
transaction.

It rewinds the cursor like below (`ExecutorRewind` calls `rescan`):
```c
if (portal->cursorOptions & CURSOR_OPT_SCROLL)
{
  ExecutorRewind(queryDesc);
}
```

At the end, it properly adjusts the tuple index for the holdStore in the portal.
```c
if (portal->cursorOptions & CURSOR_OPT_SCROLL)
{
         if (!tuplestore_skiptuples(portal->holdStore,
	                                         portal->portalPos,
	                                         true))
	    elog(ERROR, "unexpected end of tuple stream");
}
```

DESCRIPTION: Fixes incorrect results on fetching scrollable with hold
cursors.

Fixes https://github.com/citusdata/citus/issues/7010
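
A sketch of the scenario that now yields correct results (table name is made up):

```sql
BEGIN;
DECLARE c SCROLL CURSOR WITH HOLD FOR
    SELECT id FROM dist_table ORDER BY id;
FETCH 2 FROM c;
COMMIT;                   -- the holdStore is persisted here
FETCH BACKWARD 1 FROM c;  -- previously could return wrong rows
CLOSE c;
```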
2023-06-19 23:00:18 +03:00
Teja Mupparti 58da8771aa This pull request introduces support for non-routable MERGE commands in the following scenarios:
1) For distributed tables that are not colocated.
2) When joining on a non-distribution column for colocated tables.
3) When merging into a distributed table using reference or citus-local tables as the data source.

This is accomplished primarily through the implementation of the following two strategies.

Repartition: Plan the source query independently,
execute the results into intermediate files, and repartition the files to
co-locate them with the merge-target table. Subsequently, compile a final
merge query on the target table using the intermediate results as the data
source.

Pull-to-coordinator: Execute the plan that requires evaluation at the coordinator,
run the query on the coordinator, and redistribute the resulting rows to ensure
colocation with the target shards. Direct the MERGE SQL operation to the worker
nodes' target shards, using the intermediate files colocated with the data as the
data source.
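
A sketch of a MERGE these strategies can now handle, assuming target_events and source_events are distributed but not colocated (names are made up):

```sql
MERGE INTO target_events t
USING source_events s ON t.event_id = s.event_id
WHEN MATCHED THEN
    UPDATE SET payload = s.payload
WHEN NOT MATCHED THEN
    INSERT (event_id, payload) VALUES (s.event_id, s.payload);
```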
2023-06-19 12:23:40 -07:00
Xin Li c10cb50aa9
Support custom cast from / to timestamptz in time partition management UDFs (#6923)
This is to implement custom casts of the table partition column
type from / to `timestamptz` in time partition management UDFs, as
proposed in ticket #6454.

The general idea is that, for a time partition column with a type other
than `date`, `timestamp`, or `timestamptz`, users can provide a custom
bidirectional cast between the column type and `timestamptz`; the UDFs
will then be able to create and drop time partitions for such tables.
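
A minimal sketch under assumptions: a table partitioned on a bigint epoch column, with user-defined casts in both directions (all names below are hypothetical):

```sql
CREATE FUNCTION epoch_to_tstz(bigint) RETURNS timestamptz
    LANGUAGE sql IMMUTABLE AS $$ SELECT to_timestamp($1) $$;
CREATE FUNCTION tstz_to_epoch(timestamptz) RETURNS bigint
    LANGUAGE sql IMMUTABLE AS $$ SELECT extract(epoch FROM $1)::bigint $$;
CREATE CAST (bigint AS timestamptz) WITH FUNCTION epoch_to_tstz(bigint);
CREATE CAST (timestamptz AS bigint) WITH FUNCTION tstz_to_epoch(timestamptz);

-- the time partition UDFs can now manage partitions for such a table
SELECT create_time_partitions('events_by_epoch', INTERVAL '1 day',
                              now() + INTERVAL '7 days');
```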

Fixes #6454

---------

Signed-off-by: Xin Li <xin@swirldslabs.com>
Co-authored-by: Marco Slot <marco.slot@microsoft.com>
Co-authored-by: Ahmet Gedemenli <afgedemenli@gmail.com>
2023-06-19 17:49:05 +03:00
Halil Ozan Akgül d71ad4b65a
Add Publication Tests for Tenant Schema Tables (#7011)
This PR adds schema based sharding tests to publication.sql file
2023-06-19 12:39:41 +03:00
aykut-bozkurt fba5c8dd30
ALTER TABLE <tblname> SET SCHEMA <schemaname> for single shard tables (#7004)
Adds support for altering the schema of single shard tables. We do that
in 2 steps:
1. Undistribute the tenant table at the `preprocess` step,
2. Distribute into the new schema, if it is a distributed schema, after
DDLs are propagated.

DESCRIPTION: Adds support for altering a table's schema to/from
distributed schemas.
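
A sketch of the now-supported operations (schema and table names are made up):

```sql
SET citus.enable_schema_based_sharding TO ON;
CREATE SCHEMA tenant1;  -- a distributed (tenant) schema

CREATE TABLE public.orders (id bigint PRIMARY KEY);
ALTER TABLE public.orders SET SCHEMA tenant1;  -- becomes a single shard tenant table
ALTER TABLE tenant1.orders SET SCHEMA public;  -- back to a regular table
```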
2023-06-19 10:21:13 +03:00
Nils Dijk ce2ba1d07e
Optimize QueryPushdownSqlTaskList on memory and cpu (#6945)
While going over this piece of code (a long time ago) it bothered me
that we keep a bool array of size shardcount to iterate only over shards
present in the list of non-pruned shards, especially since we keep the
min/max of the set shards to optimize iteration.

Postgres has the bitmapset datastructure which a) takes significantly
less space, b) has iterator functions to only iterate over set bits, c)
can efficiently skip long sequences of unset bits and d) stops quickly
once the last set bit has been reached.

I have been contemplating whether it is worth keeping the minShardOffset
because of readability and the efficient skipping of unset bits.
However, I have decided to keep it -- although less readable -- as there
are known use cases where 100k+ shards are pruned to single-digit shard
counts. If these ended up at the end of `shardcount`, a hot loop of zero
checks on the first iteration _could_ cause a theoretical performance
regression.

All in all, this code is using less memory in all cases where it
matters, and less cpu in most cases, while using more idiomatic
datastructures for the task at hand.
2023-06-16 16:06:22 +02:00
Marco Slot 3adc1575d9
Fix DROP CONSTRAINT in command string with other commands (#7012)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2023-06-16 15:54:37 +02:00
Onur Tirtir 12a093b456
Allow using generated identity column based on int/smallint when creating a distributed table (#7008)
Allow using a generated identity column based on int/smallint when
creating a distributed table, so that applications that rely on
those data types don't break.

Inserting into / modifying such columns from workers is not allowed,
but it's better than not allowing such columns altogether.
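
A sketch under made-up names; the identity column may now be int or smallint:

```sql
CREATE TABLE items (
    key bigint,
    id  int GENERATED BY DEFAULT AS IDENTITY
);
SELECT create_distributed_table('items', 'key');

INSERT INTO items (key) VALUES (1);  -- fine via the coordinator
-- inserting into / modifying "id" from workers is not allowed
```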
2023-06-16 14:34:23 +03:00
Halil Ozan Akgül 04f6868ed2
Add citus_schemas view (#6979)
DESCRIPTION: Adds citus_schemas view

The citus_schemas view will be created in the public schema if it
exists; if not, the view will be created in pg_catalog.

Need to:
- [x] Add tests
- [x] Fix tests
2023-06-16 14:21:58 +03:00
Naisila Puka 5bf163a27d
Remove PG13 from CI and Configure (#7002)
DESCRIPTION: Drops PG13 Support

This commit is the first phase of dropping PG13 support.

It consists of the following:

- Removes pg13 from CI tests
Among other things, Citus upgrade tests should now use PG14.
The earliest Citus version supporting PG14 is 10.2.
We also pick version 11.3 for the upgrade_pg_dist_cleanup tests.
Therefore, we run the citus upgrade tests with versions 10.2 and 11.3.

- Removes pg13 from configure script

- Remove upgrade_columnar_metapage upgrade tests 
We populate first_row_number column of columnar.stripe table
during citus 10.1-10.2 upgrade. Given that we start from citus 10.2.0,
which is the oldest version supporting PG14, we don't have that
upgrade path anymore. Hence we remove these tests.

- Removes upgrade_pg_dist_object_test and upgrade_partition_constraints tests
These upgrade tests require the citus old version to be less than 10.0.
Given that we drop support for PG13, we run upgrade tests with PG14,
which starts with 10.2.
So we remove these upgrade tests.

- Documents that upgrade_post_11 should upgrade from version less than 11 
In this way we make sure we run
citus_finalize_upgrade_to_citus11 script

- Adds needed alternative output for upgrade_citus_finish_citus_upgrade 
Given that we use 11.3 as the citus old version as well,
we add this alternative output because pg_catalog.citus_finish_citus_upgrade()
makes sense if last_upgrade_major_version < 11. See below for reference:
pg_catalog.citus_finish_citus_upgrade():
...
	IF last_upgrade_major_version < 11 THEN
		PERFORM citus_finalize_upgrade_to_citus11();
		performed_upgrade := true;
	END IF;

	IF NOT performed_upgrade THEN
		RAISE NOTICE 'already at the latest distributed
		schema version (%)', last_upgrade_version_string;
		RETURN;
	END IF;
...

And that's it :)

The second phase of dropping PG13 support will consist in removing
all the PG13 specific compilation paths/tests in the Citus repo.
Will be done soon.
2023-06-15 14:54:06 +03:00
Ahmet Gedemenli 002a88ae7f
Error for single shard table creation if replication factor > 1 (#7006)
Error for single shard table creation if replication factor > 1
2023-06-15 13:13:45 +03:00
Emel Şimşek 4f793abc4a
Turn on GUC_REPORT flag for search_path to enable reporting back the parameter value upon change. (#6983)
DESCRIPTION: Turns on the GUC_REPORT flag for search_path. This results
in Postgres reporting the parameter status back in addition to the
Command Complete packet.

In response to the following command,

> SET search_path TO client1;

postgres sends back the following packets (shown in pseudo form):

C (Command Complete) SET + **S (Parameter Status) search_path =
client1**
2023-06-14 17:35:52 +03:00
Naisila Puka 3cc7a4aa42
Fix pg14-pg15 upgrade_distributed_triggers test (#6981)
This test is only relevant for pg14-15 upgrade.
However, the check on `upgrade_distributed_triggers_after` didn't take
into consideration the case when we are doing a pg15-16 upgrade. Hence, I
added one more condition to the test: the existence of the
`upgrade_distributed_triggers` schema, which can only be created in pg14.
2023-06-14 15:32:38 +03:00
Onur Tirtir dbdf04e8ba
Rename pg_dist tenant_schema to pg_dist_schema (#7001) 2023-06-14 12:12:15 +03:00
Naisila Puka ba40eb363c
Fix some gucs' initial and boot values, and flag combinations (#6957)
PG16beta1 added some sanity checks for GUCS, find the Relevant PG
commits below:

1- Add check on initial and boot values when loading GUCs

a73952b795
2- Extend check_GUC_init() with checks on flag combinations when loading
GUCs

009f8d1714

I fixed our currently problematic GUCS, we can merge this directly into
main as these make sense for any PG version.

There was a particular NodeConninfo issue:
Previously we would rely on the fact that NodeConninfo's initial value
is an empty string. However, with PG16 enforcing the same initial and boot
values, we can't use an empty initial value for NodeConninfo anymore.
Therefore we add a new flag to indicate whether we are at the boot check.
2023-06-14 11:55:52 +03:00
Ahmet Gedemenli 7b0bc62173
Support CREATE TABLE .. AS SELECT .. commands for tenant tables (#6998)
Support CREATE TABLE .. AS SELECT .. commands for tenant tables
2023-06-13 17:54:09 +03:00
Halil Ozan Akgül 772d194357
Changes citus_shard_sizes view's Shard Name Column to Shard Id (#7003)
The citus_shard_sizes view had a shard name column that we used to
extract the shard id. This PR changes the column to shard id so we don't
do unnecessary string operations.
2023-06-13 16:36:35 +03:00
Gokhan Gulbiz e0ccd155ab
Make citus_stat_tenants work with schema-based tenants. (#6936)
DESCRIPTION: Enabling citus_stat_tenants to support schema-based
tenants.

This pull request modifies the existing logic to enable tenant
monitoring with schema-based tenants. The changes made are as follows:

- If a query has a partitionKeyValue (which serves as a tenant
key/identifier for distributed tables), Citus annotates the query with
both the partitionKeyValue and colocationId. This allows for accurate
tracking of the query.
- If a query does not have a partitionKeyValue, but its colocationId
belongs to a distributed schema, Citus annotates the query with only the
colocationId. The tenant monitor can then easily look up the schema to
determine if it's a distributed schema and make a decision on whether to
track the query.

---------

Co-authored-by: Jelte Fennema <jelte.fennema@microsoft.com>
2023-06-13 14:11:45 +03:00
aykut-bozkurt 5acbd735ca
Move 2 functions to correct files (#7000)
Followup item from
https://github.com/citusdata/citus/pull/6933#discussion_r1217896933
2023-06-13 11:43:48 +03:00
aykut-bozkurt 213d363bc3
Add citus_schema_distribute/undistribute udfs to convert a schema into a tenant schema / back to a regular schema (#6933)
* Currently we do not allow any Citus tables other than Citus local
tables inside a regular schema before executing
`citus_schema_distribute`.
* `citus_schema_undistribute` expects only single shard distributed
tables inside a tenant schema.

DESCRIPTION: Adds the udf `citus_schema_distribute` to convert a regular
schema into a tenant schema.
DESCRIPTION: Adds the udf `citus_schema_undistribute` to convert a
tenant schema back to a regular schema.
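
A minimal usage sketch (schema and table names are made up):

```sql
CREATE SCHEMA acme;
CREATE TABLE acme.users (id bigint PRIMARY KEY);

SELECT citus_schema_distribute('acme');    -- acme becomes a tenant schema
SELECT citus_schema_undistribute('acme');  -- and back to a regular schema
```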

---------

Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
2023-06-12 18:41:31 +03:00
Gokhan Gulbiz 2c509b712a
Tenant monitoring performance improvements (#6868)
- [x] Use spinlock instead of lwlock per tenant
[b437aa9](b437aa9e52)
- [x] Use hashtable to store tenant stats
[ccd464b](ccd464ba04)
- [x] Introduce a new GUC for specifying the sampling rate of new tenant
entries in the tenant monitor.
[a8d3805](a8d3805bd6)

Below are the pgbench metrics with select-only workloads from my local
machine. Here is the
[script](https://gist.github.com/gokhangulbiz/7a2308470597dc06734ff7c08f87c656)
I used for benchmarking.

| | Connection Count | Initial Implementation (TPS) | On/Off Diff | Final Implementation Run#1 (TPS) | On/Off Diff | Final Implementation Run#2 (TPS) | On/Off Diff | Final Implementation Run#3 (TPS) | On/Off Diff | Avg On/Off Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| On | 32 | 37488.69839 | -17% | 42859.94402 | -5% | 43379.63121 | -2% | 42636.2264 | -7% | -5% |
| Off | 32 | 43909.83121 | | 45139.63151 | | 44188.77425 | | 45451.9548 | | |
| On | 300 | 30463.03538 | -15% | 33265.19957 | -7% | 34685.87233 | -2% | 34682.5214 | -1% | -3% |
| Off | 300 | 35105.73594 | | 35637.45423 | | 35331.33447 | | 35113.3214 | | |
2023-06-11 12:17:31 +03:00
Ahmet Gedemenli 2f13b37ce4
Fix flaky multi_schema_support (#6991)
Drop a leftover table, delete some unnecessary commands, and add some
ORDER BY clauses to avoid flakiness in `multi_schema_support`.
2023-06-09 17:03:58 +03:00
Naisila Puka 50e6c50534
Remove flaky rebalance plan from test (#6990)
Looks like sometimes shards are a slightly different size than we
expect, 16k vs 8k, resulting in a different rebalance plan.
2023-06-09 15:59:30 +03:00
Ahmet Gedemenli e6ac9f2a68
Propagate ALTER SCHEMA .. OWNER TO .. (#6987)
Propagate `ALTER SCHEMA .. OWNER TO ..` commands to workers
2023-06-09 15:32:18 +03:00
Halil Ozan Akgül 3acadd7321
Citus Clock tests with Single Shard Tables (#6938)
This PR tests Citus clock with single shard tables.
2023-06-09 15:06:46 +03:00
Naisila Puka 2ba3bffe1e
Random warning fixes (#6974)
The Citus build with PG16 fails because of the following warnings:
 - using char* instead of Datum
 - using pointer instead of oid
 - candidate function for format attribute
 - remove old definition from PG11 compatibility 62bf571ced

This commit fixes the above.
2023-06-09 14:36:43 +03:00
Emel Şimşek 8b2024b730
When Creating a FOREIGN KEY without a name, schema qualify referenced table name in deparser. (#6986)
DESCRIPTION: Fixes a bug which causes an error when creating a FOREIGN
KEY constraint without a name if the referenced table is schema
qualified.

In deparsing the `ALTER TABLE s1.t1 ADD FOREIGN KEY (key) REFERENCES
s2.t2;` command back from its cooked form, we should schema-qualify
the REFERENCED table.

Fixes #6982.
2023-06-09 14:13:13 +03:00
Onur Tirtir fa8870217d
Enable logical planner for single-shard tables (#6950)
* Enable using logical planner for single-shard tables

* Improve non-colocated table error in physical planner

* Favor distributed tables over reference tables when choosing the anchor shard
2023-06-08 10:57:23 +03:00
Halil Ozan Akgül b569d53a0c
Single shard misc udfs (#6956)
This PR tests:
- shards_colocated
- citus_shard_cost_by_disk_size
- citus_update_shard_statistics
- citus_update_table_statistics
2023-06-07 13:30:50 +03:00
Emel Şimşek 6369645db4
Restore Test Coverage for Pushing Down Subqueries. (#6976)
When we add the coordinator to the metadata, reference tables get
replicated to the coordinator. As a result we lose some test coverage,
since some queries start to run locally instead of getting pushed down.

This PR adds new test cases involving distributed tables instead of
reference tables for covering distributed execution in related cases.
2023-06-07 12:14:34 +03:00
Ahmet Gedemenli 8d8968ae63
Disable ALTER TABLE .. SET SCHEMA for tenant tables (#6973)
Disables `ALTER TABLE .. SET SCHEMA` for tenant tables.
Disables `ALTER TABLE .. SET SCHEMA` for tenant schemas.
2023-06-07 11:02:53 +03:00
Halil Ozan Akgül 3f7bc0cbf5
Single Shard Partition Column UDFs (#6964)
This PR fixes and tests:
- debug_equality_expression
- partition_column_id
2023-06-06 17:55:40 +03:00
Halil Ozan Akgül 7e486345f1
Fix citus_table_type column in citus_tables and citus_shards views for single shard tables (#6971)
The `citus_table_type` column of `citus_tables` and `citus_shards` will
show "schema" for tenant schema tables and "distributed" for single shard
tables that are not in a tenant schema.
2023-06-06 16:20:11 +03:00
Naisila Puka c2f117c559
Citus Revise tree-walk APIs to include context (#6975)
Without revising, there are warnings in the PG16 build.

Relevant PG commit

1c27d16e6e
1c27d16e6e5c1f463bbe1e9ece88dda811235165
2023-06-06 14:17:51 +03:00
Teja Mupparti f6a516dab5 Refactor repartitioning code into generic format 2023-06-05 09:06:05 -07:00
Naisila Puka 48f068d08e
Remove AssertArg and AssertState (#6970)
PG16 removed them. They were already identical to Assert. We can merge
this directly to the main branch.

Relevant PG commit:

b1099eca8f
b1099eca8f38ff5cfaf0901bb91cb6a22f909bc6

Co-authored-by: onderkalaci <onderkalaci@gmail.com>
2023-06-05 13:25:21 +03:00
Emel Şimşek 3fda2c3254
Change test files in multi and multi-1 schedules to accommodate coordinator in the metadata. (#6939)
Changes test files in multi and multi-1 schedules such that they
accommodate the coordinator in metadata.

Changes fall into the following buckets:

1. When the coordinator is in the metadata, reference table shards are
present on the coordinator too.
This changes test outputs checking the table size, shard numbers, etc.
for reference tables.

2. When the coordinator is in the metadata, postgres tables are converted
to citus local tables whenever a foreign key relationship to them is
created. This changes some test cases which test that it should not be
possible to create foreign keys to postgres tables.

3. Remove lines that add/remove coordinator for testing purposes.
2023-06-05 10:37:48 +03:00
ahmet gedemenli 2bd6ff0e93 Use schema name in the error msg 2023-06-02 15:25:14 +03:00
ahmet gedemenli fccfee08b6 Style 2023-06-02 14:48:07 +03:00
ahmet gedemenli f68ea20009 Disable alter_distributed_table for tenant tables 2023-06-02 14:48:07 +03:00
ahmet gedemenli 4b67e398b1 Disable undistribute_table for tenant tables 2023-06-02 14:48:07 +03:00
ahmet gedemenli f4b2494d0c Disable update_distributed_table_colocation for tenant tables 2023-06-02 14:48:07 +03:00
Halil Ozan Akgül 3e183746b7
Single Shard Misc UDFs 2 (#6963)
Creating a second PR to make reviewing easier.
This PR tests:
- replicate_reference_tables
- fix_partition_shard_index_names
- isolate_tenant_to_new_shard
- replicate_table_shards
2023-06-02 13:46:14 +03:00
Halil Ozan Akgül ac7f732be2
Add Single Shard Table Tests for Dependency UDFs (#6960)
This PR tests:
- citus_get_all_dependencies_for_object
- citus_get_dependencies_for_object
- is_citus_depended_object
2023-06-02 11:57:53 +03:00
Teja Mupparti ff2062e8c3 Rename insert-select redistribute code base to generic purpose 2023-06-01 09:43:43 -07:00
Halil Ozan Akgül 9961d39d97
Adds Single Shard Table Tests for Foreign Key UDFs (#6959)
This PR adds tests for:
- get_referencing_relation_id_list
- get_referenced_relation_id_list
- get_foreign_key_connected_relations
2023-06-01 12:56:06 +03:00
ahmet gedemenli 8ace5a7af5 Use citus_drain_node with single shard tables 2023-05-31 14:01:52 +03:00
ahmet gedemenli ee42af7ad2 Add test for rebalancer with single shard tables 2023-05-31 11:48:49 +03:00
Teja Mupparti f9dbe7784b This commit adds a safety-net to the issue seen in #6785. The fix for the underlying issue will be in the PR#6943 2023-05-30 10:53:05 -07:00
Halil Ozan Akgül d99a5e2f62
Single Shard Table Tests for Shard Lock UDFs (#6944)
This PR adds single shard table tests for shard lock UDFs,
`shard_lock_metadata`, `shard_lock_resources`
2023-05-30 12:23:41 +03:00
Halil Ozan Akgül 5b54700b93
Single Shard Table Tests for Time Partitions (#6941)
This PR adds tests for time partitions UDFs and view with single shard
tables.
2023-05-29 14:18:56 +03:00
Halil Ozan Akgül 9d9b3817c1
Single Shard Table Columnar UDFs Tests (#6937)
Adds columnar UDF tests for single shard tables.
2023-05-29 13:53:00 +03:00
Halil Ozan Akgül 321fcfcdb5
Add Support for Single Shard Tables in update_distributed_table_colocation (#6924)
Adds Support for Single Shard Tables in
`update_distributed_table_colocation`.

This PR relaxes the checks that previously required hash distributed
tables so that they now also accept single shard distributed tables (a
sketch follows the commit message below).
2023-05-29 11:47:50 +03:00
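A hedged sketch of the newly allowed case; table names are hypothetical:

```sql
CREATE TABLE t1 (a int);
CREATE TABLE t2 (a int);
SELECT create_distributed_table('t1', null);  -- single shard table
SELECT create_distributed_table('t2', null, colocate_with => 'none');

-- Previously this required hash distributed tables; it now also
-- accepts single shard distributed tables.
SELECT update_distributed_table_colocation('t2', colocate_with => 't1');
```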
Ahmet Gedemenli 1ca80813f6
Citus UDFs support for single shard tables (#6916)
Verify Citus UDFs work well with single shard tables

SUPPORTED
* citus_table_size
* citus_total_relation_size
* citus_relation_size
* citus_shard_sizes
* truncate_local_data_after_distributing_table
* create_distributed_function // test function colocated with a single
shard table
* undistribute_table
* alter_table_set_access_method

UNSUPPORTED - error out for single shard tables
* master_create_empty_shard
* create_distributed_table_concurrently
* create_distributed_table
* create_reference_table
* citus_add_local_table_to_metadata
* citus_split_shard_by_split_points
* alter_distributed_table
2023-05-26 17:30:05 +03:00
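A minimal sketch of both paths from the lists above, assuming a hypothetical table:

```sql
CREATE TABLE orders (id bigint, payload jsonb);
SELECT create_distributed_table('orders', null);  -- single shard table

-- supported
SELECT citus_table_size('orders');

-- unsupported: errors out for single shard tables
SELECT master_create_empty_shard('orders');

-- supported: converts the table back to a regular Postgres table
SELECT undistribute_table('orders');
```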
Onur Tirtir 246b054a7d
Add support for schema-based-sharding via a GUC (#6866)
DESCRIPTION: Adds citus.enable_schema_based_sharding GUC that allows
sharding the database based on schemas when enabled.

* Refactor the logic that automatically creates Citus managed tables 

* Refactor CreateSingleShardTable() to allow specifying colocation id
instead

* Add support for schema-based-sharding via a GUC

### What this PR is about:
Add **citus.enable_schema_based_sharding GUC** to enable schema-based
sharding. Each schema created while this GUC is ON will be considered
as a tenant schema. Later on, regardless of whether the GUC is ON or
OFF, any table created in a tenant schema will be converted to a
single shard distributed table (without a shard key). All the tenant
tables that belong to a particular schema will be co-located with each
other and will have a shard count of 1.

We introduce a new metadata table --pg_dist_tenant_schema-- to do the
bookkeeping for tenant schemas:
```sql
psql> \d pg_dist_tenant_schema
          Table "pg_catalog.pg_dist_tenant_schema"
┌───────────────┬─────────┬───────────┬──────────┬─────────┐
│    Column     │  Type   │ Collation │ Nullable │ Default │
├───────────────┼─────────┼───────────┼──────────┼─────────┤
│ schemaid      │ oid     │           │ not null │         │
│ colocationid  │ integer │           │ not null │         │
└───────────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "pg_dist_tenant_schema_pkey" PRIMARY KEY, btree (schemaid)
    "pg_dist_tenant_schema_unique_colocationid_index" UNIQUE, btree (colocationid)

psql> table pg_dist_tenant_schema;
┌───────────┬───────────────┐
│ schemaid  │ colocationid  │
├───────────┼───────────────┤
│     41963 │            91 │
│     41962 │            90 │
└───────────┴───────────────┘
(2 rows)
```

Colocation id column of pg_dist_tenant_schema can never be NULL even
for the tenant schemas that don't have a tenant table yet. This is
because we assign colocation ids to tenant schemas as soon as they
are created. That way, we can keep associating tenant schemas with
particular colocation groups even if all the tenant tables of a tenant
schema are dropped and recreated later on.

When a tenant schema is dropped, we delete the corresponding row from
pg_dist_tenant_schema. In that case, we delete the corresponding
colocation group from pg_dist_colocation as well.

### Future work for 12.0 release:
We're building schema-based sharding on top of the infrastructure that
adds support for creating distributed tables without a shard key
(https://github.com/citusdata/citus/pull/6867).
However, not all the operations that can be done on distributed tables
without a shard key necessarily make sense (in the same way) in the
context of schema-based sharding. For example, we need to think about
what happens if a user attempts to alter the schema of a tenant table. We
will tackle such scenarios in a future PR.

We will also add a new UDF --citus.schema_tenant_set() or such-- to
allow users to use an existing schema as a tenant schema, and another
one --citus.schema_tenant_unset() or such-- to stop using a schema as
a tenant schema in future PRs.
2023-05-26 10:49:58 +03:00
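A hedged sketch of the workflow the GUC enables; the schema name is hypothetical:

```sql
SET citus.enable_schema_based_sharding TO on;

CREATE SCHEMA tenant_1;                   -- registered in pg_dist_tenant_schema
CREATE TABLE tenant_1.users (id bigint);  -- becomes a single shard distributed table

-- all tables in tenant_1 share a single colocation group
SELECT colocationid
FROM pg_dist_tenant_schema
WHERE schemaid = 'tenant_1'::regnamespace;
```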
Halil Ozan Akgül 2c7beee562
Fix citus.tenant_stats_limit test by setting it to 2 (#6899)
citus.tenant_stats_limit was set to 2 when we were adding tests for it.
Then we changed it to 10, making the tests incorrect.
This PR fixes that without breaking other tests.
2023-05-23 17:44:07 +03:00
Jelte Fennema 350a0f6417
Support running Citus upgrade tests with run_test.py (#6832)
Citus upgrade tests require some additional logic to run, because we
have a before and after schedule and we need to swap the Citus
version in-between. This adds that logic to `run_test.py`.

In passing this makes running upgrade tests locally multiple times
faster by caching tarballs.
2023-05-23 14:38:54 +02:00
Emel Şimşek 02f815ce1f
Disable local execution when Explain Analyze is requested for a query. (#6892)
DESCRIPTION: Fixes a crash when explain analyze is requested for a query
that is normally locally executed.

When explain analyze is requested for a query, a task with two queries
is created. Those two queries are
    
1. Wrapped Query --> `SELECT ... FROM
worker_save_query_explain_analyze(<query>, <explain analyze options>)`
2. Fetch Query --> `SELECT explain_analyze_output, execution_duration
FROM worker_last_saved_explain_analyze();`

When the query is locally executed, a task with multiple queries causes
a crash in production. See the Assert at
57455dc64d/src/backend/distributed/executor/tuple_destination.c#:~:text=Assert(task%2D%3EqueryCount%20%3D%3D%201)%3B

This becomes a critical issue when auto_explain extension is used. When
auto_explain extension is enabled, explain analyze is automatically
requested for every query.

One possible solution could be not to create two queries for a locally
executed query. The fetch part may not have to be a query since the
values are available in local variables.

Until we enable local execution for explain analyze, it is best to
disable local execution.

Fixes #6777.
2023-05-23 14:33:22 +03:00
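A hypothetical reproduction sketch; `dist_table` stands in for any distributed table whose query would be executed locally:

```sql
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;
SET auto_explain.log_analyze = on;

-- A router query that Citus would execute locally; before this fix the
-- two-query task (wrapped query + fetch query) hit the queryCount Assert.
SELECT * FROM dist_table WHERE key = 1;
```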
Emel Şimşek f9a5be59b9
Run replicate_reference_tables background task as superuser. (#6930)
DESCRIPTION: Fixes a bug in the background shard rebalancer where the
replicate reference tables task fails if the current user is not a
superuser (a sketch of the scenario follows below).

This change is to be backported to earlier releases. We should fix the
permissions for replicate_reference_tables on the main branch such that
it can be run by non-superuser roles.

Fixes #6925.
Fixes #6926.
2023-05-18 23:46:32 +03:00
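A sketch of the failing scenario, assuming a hypothetical non-superuser role and a cluster whose reference tables need replicating:

```sql
SET ROLE regular_user;

-- Scheduling a rebalance as a non-superuser used to fail at the background
-- "replicate reference tables" task; that task now runs as superuser.
SELECT citus_rebalance_start();
SELECT citus_rebalance_wait();
```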
Hanefi Onaldi 6a83290d91
Add ORDER BY clauses to some flaky tests (#6931)
I observed a flaky test output
[here](https://app.circleci.com/pipelines/github/citusdata/citus/32692/workflows/32464a22-7fd6-440a-9ff7-cfa62f9ff58a/jobs/1126144)
and added `ORDER BY` clauses to similar queries in the failing test
file.

```diff
 SELECT pg_identify_object_as_address(classid, objid, objsubid) from pg_catalog.pg_dist_object where objid IN('viewsc.prop_view3'::regclass::oid, 'viewsc.prop_view4'::regclass::oid);
   pg_identify_object_as_address  
 ---------------------------------
- (view,"{viewsc,prop_view3}",{})
  (view,"{viewsc,prop_view4}",{})
+ (view,"{viewsc,prop_view3}",{})
 (2 rows)
```
2023-05-18 12:45:39 +03:00
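One way to make the row order deterministic, based on the query quoted above (ordering by the whole composite output):

```sql
SELECT pg_identify_object_as_address(classid, objid, objsubid)
FROM pg_catalog.pg_dist_object
WHERE objid IN ('viewsc.prop_view3'::regclass::oid,
                'viewsc.prop_view4'::regclass::oid)
ORDER BY 1;
```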
Onur Tirtir 8ff9dde4b3
Prevent pushing down INSERT .. SELECT queries that we shouldn't (and allow some more) (#6752)
Previously, the INSERT .. SELECT planner was pushing down some queries that should not be pushed down due to incorrect colocation checks. It only checked whether one of the tables in the SELECT part was colocated with the target table. Now, we check colocation between all tables in the SELECT part and the target table (a hypothetical example follows the commit message below).

Another problem with the INSERT .. SELECT planner was that some queries that are valid to push down were not pushed down due to unnecessarily restrictive checks on features that are actually supported, e.g. the UNION check. As a solution, we reused the pushdown planner checks for the INSERT .. SELECT planner.


DESCRIPTION: Fixes a bug that causes incorrectly pushing down some
INSERT .. SELECT queries that we shouldn't
DESCRIPTION: Prevents unnecessarily pulling the data into coordinator
for some INSERT .. SELECT queries
DESCRIPTION: Drops support for pushing down INSERT .. SELECT with append
table as target

Fixes #6749.
Fixes #1428.
Fixes #6920.

---------

Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
2023-05-17 15:05:08 +03:00
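A hypothetical example of the corrected colocation check; none of these table names come from the PR:

```sql
-- Pushdown now requires the target and *all* distributed tables in the
-- SELECT part to be colocated, not just one of them.
INSERT INTO target_dist (key, val)
SELECT s1.key, s2.val
FROM source_dist_1 s1
JOIN source_dist_2 s2 USING (key);
```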
Onur Tirtir 56d217b108
Mark objects as distributed even when pg_dist_node is empty (#6900)
We mark objects as distributed in Citus metadata only if we need to
propagate the command that creates them to worker nodes. For this
reason, we were not doing this for objects created while pg_dist_node
is empty.

One implication of doing so is that we defer the schema propagation to
the time when the user creates the first distributed table in the
schema. However, this doesn't help with schema-based sharding (#6866)
because we want to sync pg_dist_tenant_schema to the worker nodes even
for empty schemas (a sketch follows the commit message below).

* Support test dependencies for isolation tests without a schedule

* Comment out a test due to a known issue (#6901)

* Also, reduce the verbosity for some log messages and make some
   tests compatible with run_test.py.
2023-05-16 11:45:42 +03:00
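A hedged sketch: with this change, a propagated object created before any worker is added is still recorded in pg_dist_object (the schema name is hypothetical):

```sql
CREATE SCHEMA app;  -- pg_dist_node may still be empty here

SELECT pg_identify_object_as_address(classid, objid, objsubid)
FROM pg_catalog.pg_dist_object
WHERE objid = 'app'::regnamespace;
```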
Onur Tirtir e7abde7e81
Prevent downgrades when there is a single-shard table in the cluster (#6908)
Also add a few tests for Citus/PG upgrade/downgrade scenarios.
2023-05-16 09:44:28 +02:00
Onur Tirtir 893ed416f1
Disable citus.enable_non_colocated_router_query_pushdown by default (#6909)
Fixes #6779.

DESCRIPTION: Disables citus.enable_non_colocated_router_query_pushdown
GUC by default to ensure generating a consistent distributed plan for
the queries that reference non-colocated distributed tables

We already have tests for the cases where this GUC is disabled,
so I'm not adding any more tests in this PR.

Also make multi_insert_select_window idempotent.

Related to: #6793
2023-05-15 12:07:50 +03:00
Jelte Fennema 07b8cd2634
Forward to existing emit_log_hook in our log hook (#6877)
DESCRIPTION: Forward to existing emit_log_hook in our log hook

This makes us work better with other extensions installed in Postgres.
Without this change we would overwrite their emit_log_hook, causing it
to never be called.

Fixes #6874
2023-05-09 16:55:56 +02:00
Ivan Kush e3c6b8a10e
Fix flaky columnar_permissions test (#6913)
Since the output isn't ordered by attr_num, the row order may vary and
the regression test may fail.
This MR adds attr_num to the ORDER BY clause; the corrected query
follows the diff below.


```diff
--- /build/contrib/citus/src/test/regress/expected/columnar_permissions.out.modified    2023-05-05 11:13:44.926085432 +0000
+++ /build/contrib/citus/src/test/regress/results/columnar_permissions.out.modified 2023-05-05 11:13:44.934085414 +0000
@@ -124,24 +124,24 @@
   from columnar.chunk
   where relation in ('no_access'::regclass, 'columnar_permissions'::regclass)
   order by relation, stripe_num;
        relation       | stripe_num | attr_num | chunk_group_num | value_count
 ----------------------+------------+----------+-----------------+-------------
  no_access            |          1 |        1 |               0 |           1
  no_access            |          2 |        1 |               0 |           1
  no_access            |          3 |        1 |               0 |           1
  columnar_permissions |          1 |        1 |               0 |           1
  columnar_permissions |          1 |        2 |               0 |           1
- columnar_permissions |          2 |        1 |               0 |           1
  columnar_permissions |          2 |        2 |               0 |           1
- columnar_permissions |          3 |        1 |               0 |           1
+ columnar_permissions |          2 |        1 |               0 |           1
  columnar_permissions |          3 |        2 |               0 |           1
+ columnar_permissions |          3 |        1 |               0 |           1
  columnar_permissions |          4 |        1 |               0 |           1
  columnar_permissions |          4 |        2 |               0 |           1
 (11 rows)
```

Co-authored-by: Ivan Kush <ivan.kush@tantorlabs.ru>
2023-05-09 12:42:37 +02:00
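The corrected test query, reconstructed from the diff above (the select list is inferred from the output header):

```sql
SELECT relation, stripe_num, attr_num, chunk_group_num, value_count
FROM columnar.chunk
WHERE relation IN ('no_access'::regclass, 'columnar_permissions'::regclass)
ORDER BY relation, stripe_num, attr_num;
```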
Hanefi Onaldi 06e6f8e428
Normalize columnar version in tests (#6917)
When we bump the columnar version, some tests fail because of the output
change. Instead of changing those lines every time, I think it is better
to normalize the version in the tests.
2023-05-08 16:10:55 +03:00
Naisila Puka 905fd46410
Fixes flakiness in background_rebalance_parallel test (#6910)
Fixes the following flaky outputs by decreasing the citus_task_wait loop
interval and changing the order of the wait commands (sketched after the
quoted outputs below).

https://app.circleci.com/pipelines/github/citusdata/citus/32102/workflows/19958297-6c7e-49ef-9bc2-8efe8aacb96f/jobs/1089589

``` diff
SELECT job_id, task_id, status, nodes_involved
 FROM pg_dist_background_task WHERE job_id in (:job_id) ORDER BY task_id;
  job_id | task_id |  status  | nodes_involved 
 --------+---------+----------+----------------
   17779 |    1013 | done     | {50,56}
   17779 |    1014 | running  | {50,57}
-  17779 |    1015 | running  | {50,56}
-  17779 |    1016 | blocked  | {50,57}
+  17779 |    1015 | done     | {50,56}
+  17779 |    1016 | running  | {50,57}
   17779 |    1017 | runnable | {50,56}
   17779 |    1018 | blocked  | {50,57}
   17779 |    1019 | runnable | {50,56}
   17779 |    1020 | blocked  | {50,57}
 (8 rows)
```

https://github.com/citusdata/citus/pull/6893#issuecomment-1525661408
```diff
SELECT job_id, task_id, status, nodes_involved
 FROM pg_dist_background_task WHERE job_id in (:job_id) ORDER BY task_id;
  job_id | task_id |  status  | nodes_involved 
 --------+---------+----------+----------------
   17779 |    1013 | done     | {50,56}
-  17779 |    1014 | running  | {50,57}
+  17779 |    1014 | runnable | {50,57}
   17779 |    1015 | running  | {50,56}
   17779 |    1016 | blocked  | {50,57}
   17779 |    1017 | runnable | {50,56}
   17779 |    1018 | blocked  | {50,57}
   17779 |    1019 | runnable | {50,56}
   17779 |    1020 | blocked  | {50,57}
 (8 rows)
```
2023-05-05 16:47:01 +03:00
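A hedged sketch of a deterministic wait sequence, reusing the task ids and the status query from the quoted outputs:

```sql
SELECT citus_task_wait(1013, desired_status => 'done');
SELECT citus_task_wait(1014, desired_status => 'running');

SELECT job_id, task_id, status, nodes_involved
FROM pg_dist_background_task
WHERE job_id IN (:job_id)
ORDER BY task_id;
```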
Hanefi Onaldi 3217e3f181
Fix flaky background rebalance parallel test (#6893)
A test in background_rebalance_parallel.sql was failing intermittently
where the order of tasks in the output was not deterministic. This
commit fixes the test by removing id columns for the background tasks in
the output.

A sample failing diff before this patch is below:

```diff
 SELECT D.task_id,
        (SELECT T.command FROM pg_dist_background_task T WHERE T.task_id = D.task_id),
        D.depends_on,
        (SELECT T.command FROM pg_dist_background_task T WHERE T.task_id = D.depends_on)
 FROM pg_dist_background_task_depend D  WHERE job_id in (:job_id) ORDER BY D.task_id, D.depends_on ASC;
  task_id |                               command                               | depends_on |                               command
 ---------+---------------------------------------------------------------------+------------+---------------------------------------------------------------------
-    1014 | SELECT pg_catalog.citus_move_shard_placement(85674026,50,57,'auto') |       1013 | SELECT pg_catalog.citus_move_shard_placement(85674025,50,56,'auto')
-    1016 | SELECT pg_catalog.citus_move_shard_placement(85674032,50,57,'auto') |       1015 | SELECT pg_catalog.citus_move_shard_placement(85674031,50,56,'auto')
-    1018 | SELECT pg_catalog.citus_move_shard_placement(85674038,50,57,'auto') |       1017 | SELECT pg_catalog.citus_move_shard_placement(85674037,50,56,'auto')
-    1020 | SELECT pg_catalog.citus_move_shard_placement(85674044,50,57,'auto') |       1019 | SELECT pg_catalog.citus_move_shard_placement(85674043,50,56,'auto')
+    1014 | SELECT pg_catalog.citus_move_shard_placement(85674038,50,57,'auto') |       1013 | SELECT pg_catalog.citus_move_shard_placement(85674037,50,56,'auto')
+    1016 | SELECT pg_catalog.citus_move_shard_placement(85674044,50,57,'auto') |       1015 | SELECT pg_catalog.citus_move_shard_placement(85674043,50,56,'auto')
+    1018 | SELECT pg_catalog.citus_move_shard_placement(85674026,50,57,'auto') |       1017 | SELECT pg_catalog.citus_move_shard_placement(85674025,50,56,'auto')
+    1020 | SELECT pg_catalog.citus_move_shard_placement(85674032,50,57,'auto') |       1019 | SELECT pg_catalog.citus_move_shard_placement(85674031,50,56,'auto')
 (4 rows)
```

Notice that the dependent and depended-on tasks have the same commands
across runs, but the task ids assigned to them differ.
2023-05-05 12:07:46 +03:00
Teja Mupparti b58665773b Move all pre-15-defined routines to the bottom of the file 2023-05-04 10:07:08 -07:00
Naisila Puka 072ae44742
Adjusts query's CoerceViaIO & RelabelType nodes that are improper for deparsing (#6391)
Adjusts query's CoerceViaIO & RelabelType nodes that are
improper for deparsing

The standard planner converts some `::text` casts to `::cstring` and
here we convert back because `cstring` is a pseudotype and it cannot be
cast to most types. This problem occurs in CoerceViaIO nodes.
There was another problem with RelabelType nodes fixed in the following
PR:
https://github.com/citusdata/citus/pull/4580
We undo the changes in that PR, and fix both CoerceViaIO and RelabelType
nodes in the planning phase (not in the deparsing phase in ruleutils)

Fixes https://github.com/citusdata/citus/issues/5646
Fixes https://github.com/citusdata/citus/issues/5033
Fixes https://github.com/citusdata/citus/issues/6061
2023-05-04 16:46:02 +03:00
Ahmet Gedemenli 4321286005 Disable master_create_empty_shard udf for single shard tables (#6902) 2023-05-03 17:02:43 +03:00
Onur Tirtir db2514ef78 Call null-shard-key tables as single-shard distributed tables in code 2023-05-03 17:02:43 +03:00
Onur Tirtir 39b7711527 Add support for more pushable / non-pushable insert .. select queries with null-shard-key tables (#6823)
* Add support for dist insert select by selecting from a reference
table.
  
  This was the only pushable insert .. select case that
  #6773 didn't cover.

* For the cases where we insert into a Citus table but the INSERT ..
  SELECT query cannot be pushed down, allow pull-to-coordinator when
  possible.

  Remove the checks that we had at the very beginning of
  CreateInsertSelectPlanInternal so that we can try insert .. select via
  pull-to-coordinator for the cases where we cannot push down the insert
  .. select query. What we support via pull-to-coordinator is still
  limited due to the lack of logical planner support for SELECT queries,
  but this commit at least allows using pull-to-coordinator for the cases
  where the select query can be planned via the router planner, without
  limiting ourselves to restrictive top-level checks.

  Also introduce some additional restrictions into
  CreateDistributedInsertSelectPlan for the checks it was missing for
  null-shard-key tables. Indeed, it would make more sense to have those
  checks for distributed tables in general, via separate PRs against the
  main branch. See https://github.com/citusdata/citus/pull/6817.

* Add support for inserting into a Postgres table.
2023-05-03 16:24:20 +03:00
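A combined sketch of the three cases above, with hypothetical table names (`null_key_table` is a distributed table without a shard key, `ref_table` a reference table, `plain_pg_table` a regular Postgres table):

```sql
-- pushable: selecting from a reference table
INSERT INTO null_key_table SELECT * FROM ref_table;

-- pull-to-coordinator: the SELECT is router-plannable but not pushable
INSERT INTO null_key_table
SELECT i, i FROM generate_series(1, 10) i;

-- inserting into a Postgres table
INSERT INTO plain_pg_table SELECT * FROM null_key_table;
```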
Onur Tirtir 85745b46d5 Add initial sql support for distributed tables that don't have a shard key (#6773/#6822)
Enable router planner and a limited version of INSERT .. SELECT planner
for the queries that reference colocated null shard key tables.

* SELECT / UPDATE / DELETE / MERGE is supported as long as it's a router
query.
* INSERT .. SELECT is supported as long as it only references colocated
  null shard key tables.

Note that this is not only limited to distributed INSERT .. SELECT but
also covers a limited set of query types that require
pull-to-coordinator, e.g., due to a LIMIT clause, generate_series()
etc. ...
(Ideally distributed INSERT .. SELECT could handle such queries too,
e.g., when we're only referencing tables that don't have a shard key,
but today this is not the case. See
https://github.com/citusdata/citus/pull/6773#discussion_r1140130562.)
2023-05-03 16:24:20 +03:00
Onur Tirtir ac0ffc9839 Add a config for arbitrary config tests where all the tables are null-shard-key tables (#6783/#6788) 2023-05-03 16:18:27 +03:00
Ahmet Gedemenli cdf54ff4b1 Add DDL support for null-shard-key tables (#6778/#6784/#6787/#6859)
Add tests for ddl coverage:
* indexes
* partitioned tables + indexes with long names
* triggers
* foreign keys
* statistics
* grant & revoke statements
* truncate & vacuum
* create/test/drop view that depends on a dist table with no shard key
* policy & rls test

* alter table add/drop/alter_type column (using sequences/different data
  types/identity columns)
* alter table add constraint (not null, check, exclusion constraint)
* alter table add column with a default value / set default / drop
  default
* alter table set option (autovacuum)

* indexes / constraints without names
* multiple subcommands

Adds support for
* Creating new partitions after distributing (with null key) the parent
table
* Attaching partitions to a distributed table with null distribution key
(and automatically distribute the new partition with null key as well)
* Detaching partitions from it
2023-05-03 16:18:27 +03:00
Onur Tirtir fa467e05e7 Add support for creating distributed tables with a null shard key (#6745)
With this PR, we allow creating distributed tables without specifying a
shard key via create_distributed_table(). Here are the important
details about those tables:
* Specifying `shard_count` is not allowed because it is assumed to be 1.
* We mostly call such tables "null shard-key" tables in code / comments.
* To avoid a breaking layout change in create_distributed_table(),
  instead of throwing an error, it informs the user that the
  `distribution_type` param is ignored unless it's explicitly set to
  NULL or 'h'.
* `colocate_with` param allows colocating such null shard-key tables
  with each other.
* We define this table type, i.e., NULL_SHARD_KEY_TABLE, as a subclass
of
  DISTRIBUTED_TABLE because we mostly want to treat them as distributed
  tables in terms of SQL / DDL / operation support.
* Metadata for such tables look like:
  - distribution method => DISTRIBUTE_BY_NONE
  - replication model => REPLICATION_MODEL_STREAMING
- colocation id => **!=** INVALID_COLOCATION_ID (distinguishes from
Citus local tables)
* We assign colocation groups for such tables to different nodes in a
  round-robin fashion based on the modulo of "colocation id".

Note that this PR doesn't care about DDL (except CREATE TABLE) / SQL /
operation (i.e., Citus UDFs) support for such tables but adds a
preliminary
API.
2023-05-03 16:18:27 +03:00
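A minimal sketch of the API described above, assuming hypothetical tables:

```sql
CREATE TABLE events   (tenant_id int, payload jsonb);
CREATE TABLE settings (tenant_id int, config  jsonb);

-- passing a NULL distribution column creates a distributed table
-- without a shard key; shard_count is implicitly 1
SELECT create_distributed_table('events', null);
SELECT create_distributed_table('settings', null, colocate_with => 'events');
```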
aykut-bozkurt 2d005ac777
Query Generator Seed (#6883)
- Give a seed number as an argument to the query generator to reproduce
a previous run.
- Expose the difference between results, if any, as an artifact on CI.
2023-05-03 15:54:11 +03:00
Teja Mupparti e444dd4f3f MERGE: Support reference table as source with local table as target 2023-05-02 11:37:29 -07:00
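A hedged sketch of the newly supported shape; table names are hypothetical:

```sql
-- ref_source is a reference table, local_target a plain local table
MERGE INTO local_target t
USING ref_source s ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET val = s.val
WHEN NOT MATCHED THEN
  INSERT (id, val) VALUES (s.id, s.val);
```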
Hanefi Onaldi efd41e8ea5
Bump columnar to 11.3 (#6898)
When working on changelog, Marco suggested in
https://github.com/citusdata/citus/pull/6856#pullrequestreview-1386601215
that we should bump columnar version to 11.3 as well.

This PR aims to contain all the necessary changes to allow upgrades to
and downgrades from 11.3.0 for columnar. Note that updating citus
extension version does not affect columnar as the two extension versions
are not really coupled.

The same changes will also be applied to the release branch in
https://github.com/citusdata/citus/pull/6897
2023-05-02 11:58:32 +03:00
Ahmet Gedemenli 59ccf364df
Ignore nodes not allowed for shards, when planning rebalance steps (#6887)
We handle colocation groups whose shard group count is less than the
worker node count using a method different from the usual rebalancer.
See #6739.
When deciding whether to use this method, we should've ignored the
nodes that are marked `shouldhaveshards = false`. This PR excludes
those nodes when making the decision.

Adds a test such that:
 coordinator: []
 worker 1: [1_1, 1_2]
 worker 2: [2_1, 2_2]
(rebalance)
 coordinator: []
 worker 1: [1_1, 2_1]
 worker 2: [1_2, 2_2]

If we take the coordinator into account, the rebalancer considers the
first state as balanced and does nothing (because shard_count <
worker_count).
But with this PR, we ignore the coordinator because it's
shouldhaveshards = false, so the rebalancer distributes each colocation
group to both workers. A sketch of this setup follows below.

Also fixes an unrelated flaky test in the same file.
2023-05-01 12:21:08 +02:00
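A sketch of the setup from the test description above; host and port are hypothetical:

```sql
-- the coordinator is in the metadata but must not hold shards
SELECT citus_set_node_property('localhost', 57636, 'shouldhaveshards', false);

-- the rebalancer now ignores that node when deciding how to spread small
-- colocation groups, so each group ends up split across both workers
SELECT rebalance_table_shards();
```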
aykut-bozkurt 8cb69cfd13
break sequence dependency during table creation (#6889)
We need to break the sequence dependency for a table while creating it
during non-transactional metadata sync, to ensure the table creation is
idempotent.

**Problem:**
When we send `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)
FROM pg_dist_partition` to workers during the non-transactional sync,
table might not be in `pg_dist_partition` at worker, and sequence
dependency is not broken at the worker.

**Solution:** 
We break sequence dependency via `SELECT
pg_catalog.worker_drop_sequence_dependency(logicalrelid::regclass::text)`
for each table while creating it at the workers. It is safe to send
since the udf is a no-op when there is no sequence dependency.

DESCRIPTION: Fixes a bug related to sequence idempotency at
non-transactional sync.

Fixes https://github.com/citusdata/citus/issues/6888.
2023-04-28 15:09:09 +03:00
aykut-bozkurt a7fa1db696
fix flaky test regex (#6890)
There was a bug in a regex. We sometimes caught the wrong line when the
test name was also included in a comment.
Example: we caught the wrong line below, because multi_metadata_sync
appears in the comment before the test line.

```
# ----------
# multi_metadata_sync tests the propagation of mx-related metadata changes to metadata workers
# multi_unsupported_worker_operations tests that unsupported operations error out on metadata workers
# ----------
test: multi_metadata_sync
```

Solution: make the regex rule more restrictive.
2023-04-27 13:14:40 +03:00
Jelte Fennema a5f4fece13
Fix running PG upgrade tests with run_test.py (#6829)
In #6814 we started using the Python test runner for upgrade tests in
run_test.py, instead of the Perl-based one. This had a problem though:
not all tests in minimal_schedule can be run with the Python runner.
This adds a separate minimal schedule for the pg_upgrade tests which
doesn't include the tests that break with the Python runner.

This PR also fixes various other issues that came up while testing
the upgrade tests.
2023-04-24 15:54:32 +02:00
aykut-bozkurt a6a7271e63
Query generator test tool (#6686)
- The query generator creates queries allowed by the grammar documented at `query_generator/query_gen.py` (currently joins only).
- This PR adds a CI test that uses the query generator to compare the results of generated queries executed on Citus tables and on local (undistributed) tables. It fails if there is an unexpected error in the results. The error can be related to Citus, the query generator, or even Postgres.
- The tool is configured by the file `query_generator/config/config.yaml`, which limits the table count in generated queries and sets many table-related parameters (e.g. row count).
- The run time of the CI task can be configured from the config file. By default, we run 250 queries with a maximum of 40 tables inside each query.
2023-04-23 20:28:26 +03:00
aykut-bozkurt 08e2820c67
skip restriction clause if it contains placeholdervar (#6857)
A `PlaceHolderVar` is not relevant when processing a restriction
clause; otherwise, `pull_var_clause_default` would throw an error. PG
creates the restriction on the physical `Var` that the `PlaceHolderVar`
points to anyway, so it is safe to skip this restriction.

DESCRIPTION: Fixes a bug related to a WHERE clause list that contains a
PlaceHolderVar.

Fixes https://github.com/citusdata/citus/issues/6758
2023-04-17 18:14:01 +03:00
Emel Şimşek 2675a68218
Make coordinator always in metadata by default in regression tests. (#6847)
DESCRIPTION: Changes the regression test setups adding the coordinator
to metadata by default.

When creating a Citus cluster, the coordinator can be added to the
metadata explicitly by running the `citus_set_coordinator_host`
function. Adding the coordinator to the metadata allows creating Citus
managed local tables. Other Citus functionality is expected to be
unaffected.

This change adds the coordinator to metadata by default when creating
test clusters in regression tests.

There are 3 ways to run commands in a sql file (or a schedule which is a
sequence of sql files) with Citus regression tests. Below is how this PR
adds the coordinator to metadata for each.

1. `make <schedule_name>`
Changed the sql files (sql/multi_cluster_management.sql and
sql/minimal_cluster_management.sql) which sets up the test clusters such
that they call `citus_set_coordinator_host`. This ensures any following
tests will have the coordinator in metadata by default.
 
2. `citus_tests/run_test.py <sql_file_name>`
Changed the Python code that sets up the cluster to always call
`citus_set_coordinator_host`.
For the upgrade tests, a version check is included to make sure the
`citus_set_coordinator_host` function is available for a given version.

3. `make check-arbitrary-configs`
Changed the Python code that sets up the cluster to always call
`citus_set_coordinator_host`.

#6864 will be used to track the remaining work which is to change the
tests where coordinator is added/removed as a node.
2023-04-17 14:14:37 +03:00
Gokhan Gulbiz 8782ea1582
Ensure partitionKeyValue and colocationId are set for proper tenant stats gathering (#6834)
This PR updates the tenant stats implementation to set partitionKeyValue
and colocationId in ExecuteLocalTaskListExtended, in addition to
LocallyExecuteTaskPlan. This ensures that tenant stats can be properly
gathered regardless of the code path taken. The changes were initially
made while testing stored procedure calls for tenant stats.
2023-04-17 09:35:26 +03:00
Onur Tirtir f87a2d02b0
Move the common logic related to creating a Citus table down to CreateCitusTable (#6836)
.. rather than having it in user facing functions. That way, we
can use the same logic for creating Citus tables from other places
too.

This would be useful for creating tenant tables via a simple function
call in the utility hook, for schema-based sharding purposes.
2023-04-14 16:13:39 +03:00
aykut-bozkurt 3286ec59e9
fix 3 flaky tests in failure schedule (#6846)
Fixed 3 flaky tests in the failure tests that caused flakiness in other
tests due to node and group sequence ids changing during node addition
and removal.
2023-04-13 13:13:28 +03:00
Halil Ozan Akgül 9ba70696f7
Add CPU usage to citus_stat_tenants (#6844)
This PR adds CPU usage to the `citus_stat_tenants` monitor.
CPU usage is tracked in periods, similar to query counts.
2023-04-12 16:23:00 +03:00
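A hedged query sketch; the column names are assumptions based on the view's period-based counters, not verified against the PR:

```sql
SELECT tenant_attribute,
       query_count_in_this_period,
       cpu_usage_in_this_period
FROM citus_stat_tenants
ORDER BY cpu_usage_in_this_period DESC;
```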
Emel Şimşek e7a25d82c9
When creating a HTAB we need to use HASH_COMPARE flag in order to set a user defined comparison function. (#6845)
DESCRIPTION: Fixes memory errors, caught by valgrind, of type
"conditional jump or move depends on uninitialized value"

When running Citus tests under Postgres with valgrind, the test cases
calling into `NonBlockingShardSplit` function produce valgrind errors of
type "conditional jump or move depends on uninitialized value".

The issue is caused by creating an HTAB in the wrong way. The
HASH_COMPARE flag should have been used when creating an HTAB with a
user-defined comparison function. In the absence of the HASH_COMPARE
flag, the HTAB falls back to the built-in string comparison function.
However, valgrind discovers that the match function is not assigned to
the user-defined function as intended.

Fixes #6835
2023-04-11 21:24:33 +03:00
Halil Ozan Akgül 8b50e95dc8
Fix citus_stat_tenants period updating bug (#6833)
Fixes the bug that causes updating the citus_stat_tenants periods
incorrectly.

`TimestampDifferenceExceeds` expects the difference in milliseconds but
it was given in microseconds; this is fixed.
`tenantStats->lastQueryTime` was also updated during monitoring; now
it's updated only when there are tenant queries.
2023-04-11 17:40:07 +03:00
aykut-bozkurt a20f7e1a55
fixes update propagation bug when `citus_set_coordinator_host` is called more than once (#6837)
DESCRIPTION: Fixes update propagation bug when
`citus_set_coordinator_host` is called more than once.

Fixes https://github.com/citusdata/citus/issues/6731.
2023-04-11 11:27:16 +03:00
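A sketch of the scenario the fix covers; hostnames are hypothetical:

```sql
SELECT citus_set_coordinator_host('coord-old.internal', 5432);
-- the second call must propagate the updated coordinator row to the workers
SELECT citus_set_coordinator_host('coord-new.internal', 5432);
```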