Commit Graph

2003 Commits (10603ed5d45ad435b3d3122146f67b7de95b4d06)

Author SHA1 Message Date
Hanefi Onaldi 5cfcc63308
Add warning messages for cluster commands on partitioned tables (#6306)
PG15 introduces `CLUSTER` commands for partitioned tables. Like a
`CLUSTER` command with no supplied table names, these commands cannot
be run inside transaction blocks and therefore cannot easily be
propagated in a distributed transaction block, so we raise warnings.

Relevant PG commit: cfdd03f45e6afc632fbe70519250ec19167d6765
2022-09-13 00:05:58 +03:00
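A minimal sketch of the behavior described above, with hypothetical table and index names:

```sql
-- On PG15, clustering a distributed partitioned table makes Citus raise a
-- warning rather than propagate the command to the shards on the workers.
CLUSTER dist_partitioned_table USING dist_partitioned_table_pkey;
```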
Hanefi Onaldi 164f2fa0a6
PG15: Add support for NULLS NOT DISTINCT (#6308)
Relevant PG commit: 94aa7cc5f707712f592885995a28e018c7c80488
2022-09-12 23:47:37 +03:00
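A short sketch of the PG15 syntax this adds support for; the table and column names are illustrative:

```sql
-- With NULLS NOT DISTINCT, NULLs compare as equal for uniqueness, so at
-- most one row with val IS NULL is allowed per key.
CREATE TABLE dist_table (
    key int NOT NULL,
    val int,
    UNIQUE NULLS NOT DISTINCT (key, val)
);
SELECT create_distributed_table('dist_table', 'key');
```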
Marco Slot b79111527e
Avoid blocking writes in create_distributed_table_concurrently (#6324)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-12 12:09:37 -07:00
Nils Dijk cda3686d86
Feature: run rebalancer in the background (#6215)
DESCRIPTION: Add a rebalancer that uses background tasks for its
execution

Based on the background jobs and tasks introduced in #6296 we implement
a new rebalancer on top of the primitives of background execution. This
allows the user to initiate a rebalance and let Citus execute the long
running steps in the background until completion.

Users can invoke the new background rebalancer with `SELECT
citus_rebalance_start();`. It will output information on its job id and
how to track progress, and it also returns the job id for automation
purposes. If you simply want to wait until the rebalance is done you can
use `SELECT citus_rebalance_wait();`

A running rebalance can be canceled/stopped with `SELECT
citus_rebalance_stop();`.
2022-09-12 20:46:53 +03:00
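A usage sketch of the UDFs named in the message above:

```sql
SELECT citus_rebalance_start();  -- returns the job id and prints how to track progress
SELECT citus_rebalance_wait();   -- blocks until the background rebalance completes
SELECT citus_rebalance_stop();   -- cancels a running rebalance
```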
Marco Slot 48f7d6c279
Show local managed tables in citus_tables and hide tables owned by extensions (#6321)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-12 17:49:17 +03:00
Marco Slot b036e44aa4
Fix bug preventing isolate_tenant_to_new_shard with text column (#6320)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-12 16:29:57 +02:00
naisila 47bea76c6c Revert "Support JSON_TABLE on PG 15 (#6241)"
This reverts commit 1f4fe35512.
2022-09-12 15:20:17 +03:00
Onder Kalaci 36f8c48560 Add tests for allowing SET NULL/DEFAULT for subset of columns
PG 15 added support for that (d6f96ed94e73052f99a2e545ed17a8b2fdc1fb8a).

We add support as well, but since we already do not support ON DELETE
SET NULL/DEFAULT on the distribution column, in essence we add support
for reference tables and Citus local tables.
2022-09-12 13:56:09 +03:00
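A sketch of the PG15 syntax in question, using a hypothetical reference table (the action remains unsupported on a distribution column):

```sql
CREATE TABLE customers (id bigint, region text, PRIMARY KEY (id, region));
SELECT create_reference_table('customers');

CREATE TABLE orders (
    order_id bigint PRIMARY KEY,
    customer_id bigint,
    customer_region text,
    -- PG15: SET NULL may name a subset of the referencing columns
    FOREIGN KEY (customer_id, customer_region)
        REFERENCES customers (id, region)
        ON DELETE SET NULL (customer_region)
);
```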
Marco Slot 2e943a64a0
Make shard moves more idempotent (#6313)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-09 18:21:36 +02:00
Jelte Fennema a2d86214b2
Share more replication code between moves and splits (#6310)
The logical replication catchup part for shard splits and shard moves is
very similar. This abstracts most of that similarity away into a single
function. This also improves the logic for non-blocking shard splits a
bit, by using faster foreign key creation. It also parallelizes index creation,
which shard moves were already doing but shard splits were not.
2022-09-09 16:45:38 +02:00
Marco Slot ba2fe3e3c4
Remove do_repair option from citus_copy_shard_placement (#6299)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-09 15:44:30 +02:00
Nils Dijk 00a94c7f13
Implement infrastructure to run sql jobs in the background (#6296)
DESCRIPTION: Add infrastructure to run long running management operations in background

This infrastructure introduces the primitives of jobs and tasks.
A task consists of a SQL statement and an owner. Tasks belong to a
Job and can depend on other tasks from the same job.

When there are either runnable or running tasks we would like to
make sure a background task queue monitor process is running. A task
could be in running state while there is actually no monitor present
due to a database restart or failover. Once the monitor starts it
will reset any running task to its runnable state.

To make sure only one background task queue monitor is ever running
at once it will acquire an advisory lock that self-conflicts.

Once a task is done, the monitor finds all tasks depending on it.
After checking that a depending task has no other unmet dependencies,
it transitions that task from blocked to runnable state so it can be
picked up on a subsequent task start.

Currently only one task can be running at a time. This can be
improved upon in later releases without changes to the higher level
API.

The initial goal for these background tasks is to allow a rebalance
to run in the background. This will be implemented in a subsequent PR.
2022-09-09 16:11:19 +03:00
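A hedged monitoring sketch; it assumes the jobs and tasks land in catalog tables named pg_dist_background_job and pg_dist_background_task with the columns shown, none of which are spelled out in the message above:

```sql
-- Inspect the state of the tasks belonging to one background job
-- (job id and column names are assumptions).
SELECT task_id, status, command
FROM pg_dist_background_task
WHERE job_id = 42;
```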
Jelte Fennema 76137e967f
Create all foreign keys quickly at the end of a shard move (#6148)
Previously we would create foreign keys to reference tables in an extra
fast way at the end of a shard move. This uses that same logic to also
do it for foreign keys between distributed tables.

Fixes #6141
2022-09-09 09:58:33 +02:00
Ahmet Gedemenli eadc88a800
Introduce GUC citus.skip_constraint_validation (#6281)
Introduces a new GUC named citus.skip_constraint_validation, which skips constraint validation when set to on.
In several places where we previously hacked around the foreign key validation phase, we now use this GUC.
2022-09-08 18:13:18 +03:00
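A hedged usage sketch for the new GUC; the tables and constraint are illustrative:

```sql
-- Skip the validation scan when the data is already known to satisfy
-- the constraint (illustrative usage, not from the commit itself).
SET citus.skip_constraint_validation TO on;
ALTER TABLE orders
    ADD CONSTRAINT orders_customer_fkey
    FOREIGN KEY (customer_id) REFERENCES customers (id);
RESET citus.skip_constraint_validation;
```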
Hanefi Onaldi a557a196aa
Add tests for numeric with scale greater than precision 2022-09-07 13:12:04 +03:00
Hanefi Onaldi 4db113496f
Add tests for new COPY features in PG15 2022-09-07 13:12:04 +03:00
Hanefi Onaldi 3e4e42253f
Add tests for new regexp sql functions 2022-09-07 13:12:04 +03:00
Nitish Upreti d7404a9446
'Deferred Drop' and robust 'Shard Cleanup' for Splits. (#6258)
DESCRIPTION:
This PR adds support for 'Deferred Drop' and robust 'Shard Cleanup' for Splits.

Common Infrastructure
This PR introduces new common infrastructure so that any operation that wants robust cleanup of resources can register with the cleaner and have the resources cleaned up appropriately based on a specified policy. 'Shard Split' is the first consumer of this new infrastructure.
Note: We only support adding 'shards' as resources to be cleaned up right now, but the framework will be extended to support other resources in the future.

Deferred Drop for Split
Deferred Drop support ensures that shards undergoing a split are not dropped inline as part of the operation, but dropped later when no active read queries are running on the shard. This helps with:

Avoiding potential deadlock scenarios that can cause a long-running split operation to roll back.
Avoiding the split operation blocking writes and then getting blocked (due to running queries on the shard) when trying to drop shards.
Deferred drop is the new default behavior going forward.
Shard Cleaner Extension
Shard Cleaner is a background task responsible for deferred drops in case of 'Move' operations.
The cleaner has been extended to ensure robust cleanup of shards (dummy shards and split children) in case of a failure, based on the new infrastructure mentioned above. The cleaner also handles deferred drops for 'Splits'.

TESTING:
New test 'citus_split_shard_by_split_points_deferred_drop' to test deferred drop support.
New test 'failure_split_cleanup' to test shard cleanup with failures in different stages.
Updated 'isolation_blocking_shard_split' and 'isolation_non_blocking_shard_split' for deferred drop.
Added non-deferred drop versions of existing tests: 'citus_split_shard_no_deferred_drop' and 'citus_non_blocking_splits_no_deferred_drop'
2022-09-06 12:11:20 -07:00
Gokhan Gulbiz ac96370ddf
Use IsMultiStatementTransaction for SELECT .. FOR UPDATE queries (#6288)
* Use IsMultiStatementTransaction instead of IsTransaction for row-locking operations.

* Add regression test for SELECT..FOR UPDATE statement
2022-09-06 16:38:41 +02:00
Emel Şimşek 6f06ff78cc
Throw an error if there is a RangeTblEntry that is not assigned an RTE identity. (#6295)
* Fix issue #6109: Segfault (or assertion failure) is possible when using a SQL function

* DESCRIPTION: Disallows the usage of SQL functions referencing a distributed table and prevents a segfault.
Using a SQL function may result in a segmentation fault in some cases.
This change fixes the issue by throwing an error message when a SQL function cannot be handled.

Fixes #6109.

Co-authored-by: Emel Simsek <emel.simsek@microsoft.com>
2022-09-06 15:46:41 +02:00
Hanefi Onaldi 85b19c851a
Disallow distributing by numeric with negative scale
PG15 allows numeric scale to be negative or greater than precision. This
causes issues and we may end up routing queries to a wrong shard due to
differing hash results after rounding.

Formerly, when specifying NUMERIC(precision, scale), the scale had to be
in the range [0, precision], which was per SQL spec. PG15 extends the
range of allowed scales to [-1000, 1000].

A negative scale implies rounding before the decimal point. For
example, a column might be declared with a scale of -3 to round values
to the nearest thousand. Note that the display scale remains
non-negative, so in this case the display scale will be zero, and all
digits before the decimal point will be displayed.

Relevant PG commit: 085f931f52494e1f304e35571924efa6fcdc2b44
2022-09-06 12:40:56 +03:00
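A quick illustration of the PG15 rounding semantics that motivate the restriction:

```sql
-- Scale -3 rounds to the nearest thousand before storing; distributing a
-- table by a column of this type is what this commit disallows.
SELECT 12345::numeric(3, -3);  -- returns 12000
```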
Naisila Puka d7f41cacbe
Prohibit renaming child trigger on distributed partition pre PG15 (#6290)
Pre PG15, renaming child triggers on partitions is allowed. When
creating a trigger on a distributed partitioned table, the triggers
on the shards of the partitions have the same name as the triggers
on the corresponding shards of the parent table. Therefore, they
don't have the shard id of the partition appended to their name.
Hence, when trying to rename a child trigger on a partition of a
distributed table, we can't correctly find the triggers on the
shards of the partition in order to rename them, since we append a
different shard id to the name of the trigger. Since we can't find
the trigger, we get a misleading error saying the trigger does not
exist.

In this commit we prohibit renaming child triggers on distributed
partitions altogether.
2022-09-06 12:19:25 +03:00
Naisila Puka fd9b3f4ae9
Add tests to make sure distributed clone trigger rename fails in PG15 (#6291)
Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866
2022-09-06 11:04:14 +03:00
Marco Slot e6b1845931
Change split logic to avoid EnsureReferenceTablesExistOnAllNodesExtended (#6208)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-05 22:02:18 +02:00
Önder Kalacı bd13836648
Add citus.skip_advisory_lock_permission_checks (#6293) 2022-09-05 17:47:41 +02:00
Ahmet Gedemenli 7c8cc7fc61
Fix flakiness for view tests (#6284) 2022-09-02 10:12:07 +03:00
Marco Slot 432f399a5d
Allow citus_internal application_name with additional suffix (#6282)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-01 14:26:43 +02:00
Naisila Puka 9e2b96caa5
Add pg14->pg15 upgrade test for dist. triggers on part. tables (#6265)
Pre PG15, renaming the parent triggers on partitioned tables doesn't
recurse to renaming the child triggers on the partitions as well.
In PG15, renaming triggers on partitioned tables
recurses to renaming the triggers on the partitions as well.

Add an upgrade test to make sure we are not breaking anything
with distributed triggers on distributed partitioned tables.

Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866
2022-09-01 12:32:44 +03:00
Naisila Puka 98dcbeb304
Specifies that our CustomScan providers support projections (#6244)
Before, this was the default mode for CustomScan providers.
Now, the default is to assume that they can't project.
This causes performance penalties due to adding unnecessary
Result nodes.

Hence we use the newly added flag CUSTOMPATH_SUPPORT_PROJECTION
to restore the previous behavior.

In the PG15 support branch we created explain functions to ignore
the new Result nodes, so we undo that in this commit.

Relevant PG commit:
955b3e0f9269639fb916cee3dea37aee50b82df0
2022-08-31 10:48:01 +03:00
Jelte Fennema 24e695ca27
Fix flakyness in multi_utilities (#6272)
Sometimes in CI our multi_utilities test fails like this:
```diff
 VACUUM (INDEX_CLEANUP ON, PARALLEL 1) local_vacuum_table;
 SELECT CASE WHEN s BETWEEN 20000000 AND 25000000 THEN 22500000 ELSE s END size
 FROM pg_total_relation_size('local_vacuum_table') s ;
    size
 ----------
- 22500000
+ 39518208
 (1 row)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26641/workflows/5caea99c-9f58-4baa-839a-805aea714628/jobs/762870

Apparently VACUUM is not as reliable in cleaning up as we thought. This
increases the range of allowed values. It's important to note that the
range is still completely outside of the allowed range of the initial
size, so we know for sure that some data was cleaned up.
2022-08-30 14:32:34 -07:00
Jelte Fennema f22a47981a
Fix flakyness in adaptive_executor (#6275)
Sometimes in CI our adaptive_executor test would fail randomly with the
following error:

```diff
 SELECT sum(result::bigint) FROM run_command_on_workers($$
   SELECT count(*) FROM pg_stat_activity
   WHERE pid <> pg_backend_pid() AND query LIKE '%8010090%'
 $$);
  sum
 -----
-   4
+   2
 (1 row)

 END;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26665/workflows/40665680-0044-4852-8fe4-5fd628f9fb47/jobs/764371

This means that the low slow start interval did not have any effect on
the number of connections being opened. I could see two possibilities
for this to happen:
1. CI was slow and was actually already starting the second connection. I
   tried to solve this by doubling the time a query to the worker takes.
2. The shards were queried in the opposite order from what we expect.
   This would mean that the first query to the worker completes quickly
   because there's no sleep, since the shard doesn't contain any rows. I
   tried to solve this option by adding a row to each shard.

After trying to reproduce the random failure in CI it turned out that I
needed both of these fixes to resolve the random failure.
2022-08-30 23:23:30 +02:00
Jelte Fennema 8354853dec
Fix flakyness in citus_split_shard_columnar_partitioned (#6273)
On CI our citus_split_shard_columnar_partitioned test would sometimes
randomly fail like this:
```diff
  8970008 | colocated_dist_table                   | -2147483648   | 2147483647    | localhost |    57637
  8970009 | colocated_partitioned_table            | -2147483648   | 2147483647    | localhost |    57637
  8970010 | colocated_partitioned_table_2020_01_01 | -2147483648   | 2147483647    | localhost |    57637
- 8970011 | reference_table                        |               |               | localhost |    57637
  8970011 | reference_table                        |               |               | localhost |    57638
+ 8970011 | reference_table                        |               |               | localhost |    57637
 (13 rows)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26651/workflows/f695b4fb-ad81-46ff-b97e-0100e5d167ea/jobs/763517

This is a harmless diff due to a missing column in the ORDER BY list.
This fixes that by adding the nodeport as a tiebreaker.
2022-08-30 19:54:50 +03:00
Marco Slot 6bb31c5d75
Add non-blocking variant of create_distributed_table (#6087)
Added create_distributed_table_concurrently, a nonblocking variant of create_distributed_table.

It is based on the split API, which takes advantage of logical replication to support nonblocking split operations.

Co-authored-by: Marco Slot <marco.slot@gmail.com>
Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
2022-08-30 15:35:40 +03:00
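A usage sketch of the new UDF; the table and column names are illustrative:

```sql
SELECT create_distributed_table_concurrently('events', 'device_id');
```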
Önder Kalacı 33af407ac8
Add missing orderbys (#6271) 2022-08-30 12:49:15 +03:00
Jelte Fennema 895a484b39
Hopefully fix flakyness in drop_partitioned_table (#6270)
Sometimes in CI our drop_partitioned_table test would fail with the
following error:

```diff
 NOTICE:  issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
 NOTICE:  issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
 NOTICE:  issuing DROP TABLE IF EXISTS drop_partitioned_table.child1_727001 CASCADE
-NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
-NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
+NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
+NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
 ROLLBACK;
 NOTICE:  issuing ROLLBACK
 NOTICE:  issuing ROLLBACK
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26631/workflows/31536032-e1ba-493b-b12a-f40757f3a7d6/jobs/762170

For some reason the colocationid of the distributed partitioned table
would be one less than we expected. Why this happens I'm not sure, but
it seems fairly harmless that it does.

In an attempt to work around this flakyness I now reset the colocation
id sequence right before creating the table in question. This is good
practice in general, because it allows us to run the test successfully
using `check-minimal` and it also allows us to rerun it multiple times.
2022-08-30 12:21:16 +03:00
Naisila Puka 13fe89f018
Fixes flakyness in columnar_permissions test (#6266)
`columnar_permissions.sql` test is flaky due to missing `ORDER BY` clauses.
Added the remaining `ORDER BY` clauses for consistency in the test.

```diff
   where relation in ('no_access'::regclass, 'columnar_permissions'::regclass);
        relation       | chunk_group_row_limit | stripe_row_limit | compression | compression_level 
 ----------------------+-----------------------+------------------+-------------+-------------------
- no_access            |                 10000 |           150000 | zstd        |                 3
  columnar_permissions |                 10000 |             2222 | none        |                 3
+ no_access            |                 10000 |           150000 | zstd        |                 3
 (2 rows)
```

Source: https://app.circleci.com/pipelines/github/citusdata/citus/26610/workflows/79f03ef9-7674-4567-a087-02536c9ddf04/jobs/760942
2022-08-29 14:33:26 +02:00
Ahmet Gedemenli 0855a9d1d4
Use SUM for calculating non partitioned table sizes (#6222)
We currently do a `pg_relation_total_size('t1') + pg_relation_total_size('t2') + ..` on shard lists, especially when rebalancing the shards. In some cases this expression gets huge. With this PR, we basically use a single SUM over all table sizes, instead of using thousands of pluses.
2022-08-26 18:02:14 +03:00
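A hedged sketch of the idea with hypothetical shard names; the query Citus actually builds may differ:

```sql
-- Before: pg_total_relation_size('t1_102008') + pg_total_relation_size('t1_102009') + ...
-- After: a single aggregate over the shard list.
SELECT SUM(pg_total_relation_size(shard_name::regclass))
FROM (VALUES ('t1_102008'), ('t1_102009')) AS shards(shard_name);
```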
Sameer Awasekar 4df8eca77f
Add worker_split_shard_release_dsm udf to release dynamic shared memory (#6248)
The code introduces the worker_split_shard_release_dsm UDF to release the dynamic shared memory segment allocated during the non-blocking split workflow.
2022-08-26 18:27:32 +05:30
Jelte Fennema 77dd49fcf8
Fix flakyness in failure_online_move_shard_placement (#6250)
Sometimes in CI failure_online_move_shard_placement fails with the
following error:
```diff
 SELECT citus.mitmproxy('conn.onQuery(query="^ALTER SUBSCRIPTION .* ENABLE").cancel(' || :pid || ')');
  mitmproxy
 -----------

 (1 row)

 SELECT master_move_shard_placement(101, 'localhost', :worker_1_port, 'localhost', :worker_2_proxy_port);
-ERROR:  canceling statement due to user request
+ERROR:  tuple concurrently updated
+CONTEXT:  while executing command on localhost:9060
 -- failure on polling subscription state
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26441/workflows/dd6e3475-6121-47b3-aea3-4ac92be114f4/jobs/751476/steps

This error is not completely harmless, because based on the logs it means
that our cleanup logic failed, which in turn means that replication
slots are left around:
```
2022-08-24 16:01:29.247 UTC [1219] ERROR:  XX000: tuple concurrently updated
2022-08-24 16:01:29.247 UTC [1219] LOCATION:  simple_heap_update, heapam.c:4179
2022-08-24 16:01:29.247 UTC [1219] STATEMENT:  ALTER SUBSCRIPTION citus_shard_move_subscription_10 DISABLE
```

However, we have other mechanisms to clean up any leftovers in case of a
failed cleanup. So it's not that big of a problem.

The reason we run into this error is arguably because of a Postgres bug,
so I created a patch for Postgres that fixes this.

While we wait for this (or a similar) patch to be merged, this PR
disables the flaky test. There's still a test that tests in case of a
connection "kill" instead of a "cancel", so I don't think we lose very
important coverage by disabling this test. When trying to reproduce this
I only hit this issue in the cancel case, so I don't think there's a
need to disable the kill case for now.
2022-08-26 12:49:45 +02:00
Jelte Fennema 2a0c0b3ba6
Fix flakyness in failure_connection_establishment (#6251)
In CI sometimes failure_connection_establishment would fail with the
following error:
```diff
 -- cancel all connections to this node
 SELECT citus.mitmproxy('conn.onAuthenticationOk().cancel(' || pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 SELECT * FROM citus_check_cluster_node_health();
```

The reason for this is that the mitm command that was used is very
broad and doesn't actually do what the comment says. What happens is
that if any connection is made, the current backend is cancelled, which
is not always the same as the backend that made the connection. My
assessment is that the maintenance daemon likely makes a connection to
the node while we are executing the mitmproxy command. The mitmproxy
command goes through, and then triggers a cancel of itself due to the
connection made by the maintenance daemon.

This PR simply removes this test, since it doesn't seem to test what it
intended to test anyway. There's also still the "kill" version of this
test, which does do the intended thing. So I don't think we lose
important coverage by removing this test.
2022-08-26 10:01:36 +00:00
Jelte Fennema 18015ca501
Fix flakyness in multi_transaction_recovery (#6249)
Sometimes in CI multi_transaction_recovery would fail with the following
error:
```diff
 SET LOCAL citus.defer_drop_after_shard_move TO OFF;
 SELECT citus_move_shard_placement((SELECT * FROM selected_shard), 'localhost', :worker_1_port, 'localhost', :worker_2_port, shard_transfer_mode := 'block_writes');
- citus_move_shard_placement
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  could not find placement matching "localhost:57637"
+HINT:  Confirm the placement still exists and try again.
 COMMIT;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26510/workflows/8269ea93-d9b4-4376-ae0e-8332a5c15fc6/jobs/755548

The reason for this was that when choosing `selected_shard` we didn't
ensure that it was actually located on the node that we were moving it
from. Instead we simply picked the first shard for the table that was
returned by the query.

To fix this issue this PR adds a filter to only choose shards that are
located on the intended node.
2022-08-26 11:48:55 +02:00
Jelte Fennema b5cd1676f9
Fix flakyness in multi_utilities (#6245)
In CI multi_utilities would sometimes fail randomly with this error:

```diff
 VACUUM (INDEX_CLEANUP ON, PARALLEL 1) local_vacuum_table;
 SELECT pg_size_pretty( pg_total_relation_size('local_vacuum_table') );
  pg_size_pretty
 ----------------
- 21 MB
+ 22 MB
 (1 row)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26459/workflows/da47d9b6-f70b-49fe-806f-5ebf75bf0b11/jobs/752482

This is a harmless change in output where the relation size after
vacuuming was slightly more than we expected. This changes the size
checks for the local_vacuum_table to allow a wider range of values.

It uses the same trick as #6216 to show the actual value when it's
outside this valid range, which is useful if this test ever starts
failing again.
2022-08-25 22:50:47 +02:00
Jelte Fennema 00485d45a6
Make multi_utilities not leak tables (#6246)
When trying to fix #6245 I realized that multi_utilities was leaking
some tables that it created during the test. This fixes that by
creating all these tables in a schema that's dedicated for this test.
2022-08-25 19:33:03 +03:00
Jelte Fennema 1688bcda33
Fix errors in base_schedule (#6247)
When running `make check-base` locally it would fail with two different
errors.

The first one was this:
```diff
 SELECT create_distributed_table('pg_class', 'relname');
-ERROR:  cannot create a citus table from a catalog table
+ERROR:  deadlock detected
+DETAIL:  Process 28950 waits for ExclusiveLock on relation 16551 of database 16384; blocked by process 28951.
+Process 28951 waits for RowExclusiveLock on relation 1259 of database 16384; blocked by process 28950.
+HINT:  See server log for query details.
 SELECT create_reference_table('pg_class');
```

This happened because multi_behavioral_analytics_create_table and
multi_create_table were being run in parallel. Running them separately
resolved this issue.

The second one was this:
```diff
 CREATE OR REPLACE FUNCTION wait_until_metadata_sync(timeout INTEGER DEFAULT 15000)
     RETURNS void
     LANGUAGE C STRICT
     AS 'citus';
+ERROR:  duplicate key value violates unique constraint "pg_proc_proname_args_nsp_index"
+DETAIL:  Key (proname, proargtypes, pronamespace)=(wait_until_metadata_sync, 23, 2200) already exists.
 -- Add some helper functions for sending commands to mitmproxy
```
This was because failure_test_helpers and multi_test_helpers were
trying to create the same function at the exact same time. The easy fix
here is to simply not create this function in the failure_test_helpers
file. This is fine, because any test schedule that runs
failure_test_helpers also runs multi_test_helpers.
2022-08-25 18:06:41 +02:00
Önder Kalacı 3ed6fea1cf
Prevent Merge command on distributed tables [PG 15] (#6238) 2022-08-25 13:27:08 +03:00
Marco Slot 9bf3c3dd5c
Add an allow_unsafe_constraints flag for constraints without distribution column (#6237)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-25 11:37:50 +03:00
Gokhan Gulbiz 69d2fcf5c0
Use the same colocation group for child and parent rels when altering a distributed table (#6225)
* Alter_distributed_table colocateWith:none bug fix for partitioned tables.

* Regression tests added for alter_distributed_table colocateWith:none for partitioned tables

* Update query comparison to be more accurate
2022-08-25 11:23:59 +03:00
Naisila Puka 1f4fe35512
Support JSON_TABLE on PG 15 (#6241)
Postgres supports the JSON_TABLE feature in PG15.

We treat JSON_TABLE the same as correlated functions (e.g., recurring tuples).
In the end, for multi-shard JSON_TABLE commands, we apply the same
restrictions as for reference tables (e.g., they cannot be in the outer part
of an outer join, etc.)

Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
2022-08-24 19:11:18 +03:00
Naisila Puka 35b4ddc355
Pg15 support (#6085)
* Adjust configure script to allow PG15

* Adds copy of ruleutils_14.c as ruleutils_15.c

* Uses get_namespace_name_or_temp in ruleutils_15.c

Relevant PG commit:
48c5c9068211e0a04fd9553c8714b2821ed3ad17

* Clean up code using "(expr) ? true : false" in ruleutils_15.c

Relevant PG commit:
fd0625c7a9c679c0c1e896014b8f49a489c3a245

* Change varno from Index (unsigned int) to int in ruleutils_15.c

Relevant PG commit:
e3ec3c00d85bd2844ffddee83df2bd67c4f8297f

* Adds find_recursive_union to ruleutils_15.c

Relevant PG commit:
3f50b82639637c9908afa2087de7588450aa866b

* Fix display of SQL-std func's args in INSERT/SELECT in ruleutils_15.c

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fix ruleutils_15.c's dumping of whole-row Vars in more contexts

Relevant PG commit:
43c2175121c829c8591fc5117b725f1f22bfb670

* Fix assorted missing logic for GroupingFunc nodes in ruleutils_15.c

Relevant PG commit:
2591ee8ec44d8cbc8e1226550337a64c684746e4

* Adds grammar support for SQL/JSON clauses in ruleutils_15.c

Relevant PG commit:
f79b803dcc98d707450e158db3638dc67ff8380b

* Adds SQL/JSON constructors to ruleutils_15.c

Relevant PG commits:
f4fb45d15c59d7add2e1b81a9d477d0119a9691a
cc7401d5ca498a84d9b47fd2e01cebd8e830e558

* Adds support for MERGE in ruleutils_15.c

Relevant PG commit:
7103ebb7aae8ab8076b7e85f335ceb8fe799097c

* Add IS JSON predicate to ruleutils_15.c

Relevant PG commit:
33a377608fc29cdd1f6b63be561eab0aee5c81f0

* Add SQL/JSON query functions to ruleutils_15.c

Relevant PG commit:
1a36bc9dba8eae90963a586d37b6457b32b2fed4

* Adds three different SQL/JSON values to ruleutils_15.c

Relevant PG commits:
606948b058dc16bce494270eea577011a602810e
49082c2cc3d8167cca70cfe697afb064710828ca

* Adds JSON table functions in ruleutils_15.c

Relevant PG commit:
4e34747c88a03ede6e9d731727815e37273d4bc9

* Add PLAN function for JSON table in ruleutils_15.c

Relevant PG commit:
fadb48b00e02ccfd152baa80942de30205ab3c4f

* Remove extra blank lines before block-closing braces ruleutils_15.c

Relevant PG commit:
24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031

* set_deparse_plan: Reuse variable to appease Coverity ruleutils_15.c

Relevant PG commit:
e70813fbc4aaca35ec012d5a426706bd54e4acab

* Mechanical code beautification ruleutils_15.c

Relevant PG commit:
23e7b38bfe396f919fdb66057174d29e17086418

* Rename value_type to item_type in ruleutils_15.c

Relevant PG commit:
3ab9a63cb638a1fd99475668e2da9c237495aeda

* Show 'AS "?column?"' explicitly when it's important in ruleutils_15.c

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Fix ruleutils_15.c issues with dropped cols in funcs-returning-composite

Relevant PG commit:
c1d1e8469c77ce6b8e5310955580b4a3eee7fe96

* Change comment regarding functions returning composite in ruleutils_15.c

Relevant PG commit:
c2fa113ddb1117b1f03e91960f65d5d7d8a90270

* Replace int nodes with bool nodes where needed

In PG15, Boolean nodes are added. Pre PG15, internal Boolean values
in Create Role commands were represented by Integer nodes. This
commit replaces int nodes logic with bool nodes logic where needed.
Mostly there are CREATE ROLE logic changes.

Relevant PG commit:
941460fcf731a32e6a90691508d5cfa3d1f8eeaf

* Handle new option colliculocale in CREATE COLLATION logic

In PG15, there is an added option to use ICU as global locale provider.
pg_collation has three locale-related fields: collcollate and collctype,
which are libc-related fields, and a new one colliculocale, which is the
ICU-related field. Only the libc-related fields or the ICU-related field
is set, never both.

Relevant PG commits:
f2553d43060edb210b36c63187d52a632448e1d2
54637508f87bd5f07fb9406bac6b08240283be3b

* Add PG15 tests to CI using test images that have 15beta2 (#6093)

* Change warning message in pg_signal_backend()

Relevant PG commit:
7fa945b857cc1b2964799411f1633468826861ff

* Revert "Add missing ifdef for PG 15"

This reverts commit c7b51025ab.

* Fixes tests for ALTER TRIGGER RENAME consistency for part. tables

Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866

* Prevent creating child triggers on partitions when adding new node

Pre PG15, tgisinternal is true for a "child" trigger on a partition
cloned from the trigger on the parent.
In PG15, tgisinternal is false in that case. However, we don't want to
create this trigger on the partition since it will create a conflict
when we try to attach the partition to the parent table:
ERROR: trigger "..." for relation "{partition_name}" already exists

Relevant PG commit:
f4566345cf40b068368cb5617e61318da60676ec

* Fix tests for generated columns dependency changes

In PG15, for GENERATED columns, all dependencies of the generation
expression are recorded as NORMAL dependencies of the column itself.
This requires CASCADE to drop generated columns with the original column.
Pre PG15, dependencies were recorded as AUTO, with which
generated columns are silently dropped with the original column.

Relevant PG commit:
cb02fcb4c95bae08adaca1202c2081cfc81a28b5

* Explicitly cast catalog "char" column to text before concatenation

Relevant PG commit:
07eee5a0dc642d26f44d65c4e6263304208e8583

* Remove 'AS "?column?"' from test outputs

There were some instances in the following test outputs,
in planning debug outputs, where AS "?column?" is added.
We add a normalization rule to remove it, as it is not
important.

cte_inline.out
recursive_relation_planning_restriction_pushdown.out

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Use pg_backup_stop(PG15) instead of pg_stop_backup(PG<15)

Add an alternative test output because of the change in the
backup modes of Postgres. Specifically here, there is a renaming
issue: pg_stop_backup PRE PG15 vs pg_backup_stop PG15+
The alternative output can be deleted when we drop support for PG14

Relevant PG commit:
39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a

* Adds citus.mitmfifo GUC

Previously we were setting this configuration parameter
on the fly for the failure tests schedule.
However, PG15 doesn't allow that anymore: reserved prefixes
like "citus" cannot be used to set non-existent GUCs.

Relevant PG commit:
88103567cb8fa5be46dc9fac3e3b8774951a2be7

* Handles EXPLAIN output diffs in PG15 - Extra result lines

To handle extra "Result" lines in explain outputs, we add explain
method to multi_test_helpers.sql file
- plan_without_result_lines() is added for cases where we want the
whole explain output with only "Result" lines removed

* Handles EXPLAIN output diffs in PG15, Hash Agg/Join leverage

To handle differences in usage of GroupAggregate vs HashAggregate
or Merge Join vs Hash join in cases where this detail doesn't
seem to matter, we use coordinator_plan().
- coordinator_plan() is updated to remove "Result" lines

There are some cases where we have subplans so we add a new
function that prints all Task Count lines as well
- coordinator_plan_with_subplans()

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.

* Handles EXPLAIN output diffs in PG15: enable_group_by_reordering

Relevant PG commit
db0d67db2401eb6238ccc04c6407a4fd4f985832

* Normalizes Memory Usage, Buckets, Batches for PG15 explain diffs

We create a new function in multi_test_helpers, which is similar
to the explain_merge function in PG15. This explain helper function
normalizes Memory Usage, Buckets and Batches, and we use it in the
tests which give a different output for PG15.

* Bump test images to 15beta3 (#6172)

* Omit namespace in post-copy errmsg

Relevant PG commit:
069d33d0c5a021601245e44df77a0423ddd69359

* Handles EXPLAIN output diffs in PG15: extra arrows&result lines

To handle extra "->" arrows resulting from extra Result lines
in explain outputs, we add the following explain method to
multi_test_helpers.sql file

- plan_without_arrows() is added for cases where we want the
whole explain output without arrows and without Result lines

* Alters public schema's owner to pg_database_owner in PG15

In PG15, the public schema is owned by the pg_database_owner role.
In multi_extension, we drop and recreate the public schema,
hence its owner becomes the default user in our tests, postgres.
Change that to pg_database_owner for PG15 consistency.

This results in alternative test output for public schema grants
in the following test:

grant_on_schema_propagation.sql

Relevant PG commit: b073c3ccd06e4cb845e121387a43faa8c68a7b62

* Add alternative test outputs for change in Insert Select display

citus_local_tables_queries.sql
coordinator_shouldhaveshards.sql
cte_inline.sql
insert_select_repartition.sql
intermediate_result_pruning.sql
local_shard_execution.sql
local_shard_execution_replicated.sql
multi_deparse_shard_query.sql
multi_insert_select.sql
multi_insert_select_conflict.sql
multi_mx_insert_select_repartition.sql
mx_coordinator_shouldhaveshards.sql
single_node.sql

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fixes columnar tap tests for PG15

In PG15, Perl test modules have been moved to a new namespace.
Also, postgres node new() and get_new_node() methods have been
unified to one method: new()

We create separate tap tests for PG13/14 and PG15+
and update the Makefiles accordingly.

Relevant PG commits:
201a76183e2056c2217129e12d68c25ec9c559c8
b3b4d8e68ae83f432f43f035c7eb481ef93e1583

* Handles EXPLAIN output diffs in PG15: HashAgg Leverage,alt. output

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.
2022-08-24 17:59:17 +02:00
Naisila Puka ddbd10d2e7
Rename server version checks in tests (#6239) 2022-08-24 16:31:52 +03:00
Jelte Fennema 5c0205ce10
Fix flakyness in multi_replicate_reference_table (#6235)
In CI multi_replicate_reference_table would sometimes fail like this:

```diff
 -- detects correctly that referecence table doesn't have replica identity
 SELECT replicate_reference_tables();
-ERROR:  cannot use logical replication to transfer shards of the relation initially_not_replicated_reference_table since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
+ERROR:  cannot use logical replication to transfer shards of the relation ref_table since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
 DETAIL:  UPDATE and DELETE commands on the shard will error out during logical replication unless there is a REPLICA IDENTITY or PRIMARY KEY.
 HINT:  If you wish to continue without a replica identity set the shard_transfer_mode to 'force_logical' or 'block_writes'.
```

Because `CitusTableTypeIdList` returns tables in heap order, it's
a bit random which one is first in the list. And the test contained
multiple tables that didn't have a primary key or replica identity, so
the error could be for either one of these tables.
This PR makes the test output consistent by changing one of the tables
to have a primary key.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26387/workflows/fc3196e7-ddf2-4000-a70b-5ac71c836321/jobs/748940
2022-08-24 13:34:10 +03:00
aykut-bozkurt 041f88d7bf
Revert "Revert "Creates new colocation for colocate_with:='none' too"" (#6227)
This reverts commit d171a736ab.
2022-08-24 10:54:04 +03:00
Marco Slot bad8196da3
Verify that we can replicate reference tables using rebalancer (#6232)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-24 00:34:21 +02:00
Jelte Fennema e0ada050aa
Enable binary logical replication for shard moves (#6017)
Using binary encoding can save a lot of CPU cycles, both on the sender
and on the receiver. Since the walsender and walreceiver processes are
single threaded, this can matter a lot for the throughput if they are
bottlenecked on CPU.

This feature is only available in PG14, not PG13. It should be safe to 
always enable because it's only used for types that support binary 
encoding according to the PG docs:
> Even when this option is enabled, only data types that have binary 
> send and receive functions will be transferred in binary.

But in case it causes problems, it can still be disabled by setting
`citus.enable_binary_protocol` to `false`.
2022-08-23 16:38:00 +02:00
Jelte Fennema cc7e93a56a
Fix flakyness in failure_connection_establishment (#6226)
In CI our failure_connection_establishment sometimes failed randomly
with the following error:
```diff
 -- verify a connection attempt was made to the intercepted node, this would have cause the
 -- connection to have been delayed and thus caused a timeout
 SELECT * FROM citus.dump_network_traffic() WHERE conn=0;
  conn | source | message
 ------+--------+---------
-    0 | coordinator | [initial message]
-(1 row)
+(0 rows)

 SELECT citus.mitmproxy('conn.allow()');
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26318/workflows/d3354024-9a67-4b01-9416-5cf79aec6bd8/jobs/745558

The way I fixed this was by removing the dump_network_traffic call. This
might sound simple, but doing this while letting the test continue to
serve its intended purpose required quite a few more changes.

This dump_network_traffic call was there because we didn't want to show
warnings in the queries above, because the exact warnings were not
reliable. The main reason this error was not reliable was because we
were using round-robin task assignment. We did the same query twice, so
that it would hit the node with the intercepted connection in one of
those connections. Instead of doing that I'm now using the
"first-replica" policy and do the queries only once. This works, because
the first placements by placementid for each of the used tables are on
the second node, so first-replica will cause the first connection to go
there.

This solved most of the flakyness, but when confirming that the
flakyness was fixed I found some additional errors:

```diff
 -- show that INSERT failed
 SELECT citus.mitmproxy('conn.allow()');
  mitmproxy
 -----------

 (1 row)

 SELECT count(*) FROM single_replicatated WHERE key = 100;
- count
----------------------------------------------------------------------
-     0
-(1 row)
-
+ERROR:  could not establish any connections to the node localhost:9060 after 400 ms
 RESET client_min_messages;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26321/workflows/fd5f4622-400c-465e-8d82-83f5f55a87ec/jobs/745666


I addressed this with a combination of two things:
1. Only change citus.node_connection_timeout for the queries that we
   want to test timeout behaviour for. When those queries are done I
   reset the value to the default again.
2. Change our mitm framework to only delay the initial connection packet
   instead of all packets. I think sometimes a follow-on packet of a previous 
   connection attempt was causing the next connection attempt to be delayed
   even if `conn.allow()` was already called. For our tests we only care about
   connection timeouts, so there's no reason to delay any other packets than
   the initial connection packet.

Then there was some last flakyness in the exact error that was given:

```diff
 -- tests for connectivity checks
 SELECT name FROM r1 WHERE id = 2;
 WARNING:  could not establish any connections to the node localhost:9060 after 900 ms
+WARNING:  connection to the remote node localhost:9060 failed with the following error:
  name
 ------
  bar
 (1 row)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26338/workflows/9610941c-4d01-4f62-84dc-b91abc56c252/jobs/746467

I don't have a good explanation for this slight change in error message, but
given that it is missing the actual error message I expected this to be related
to some small difference in timing: e.g. the server responding to the connection
attempt right after the coordinator determined that the connection timed out.
To solve this last flakyness I increased the connection timeouts and made the
difference between the timeout and the delay a bit bigger. With these tweaks
I wasn't able to reproduce this error on CI anymore.

Finally, I made most of the same changes to failure_failover_to_local_execution,
since it was using the `conn.delay()` mitm method too. The only change that
I left out was the timing increase, since it might not be strictly necessary and
increases time it takes to run the test. If this test ever becomes flaky the first
thing we should try is increase its timeout.
2022-08-23 15:04:20 +03:00
Jelte Fennema 506c16efdf
Fix flakyness in failure_single_select (#6223)
The failure_single_select test would sometimes fail with an error that's
similar to this:
```diff
 -- cancel after first SELECT; txn should fail and nothing should be marked as invalid
 SELECT citus.mitmproxy('conn.onQuery(query="^SELECT").cancel(' ||  pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 BEGIN;
```

This error looked very similar to the one from #6217 and indeed the cause
turned out to be similar. Because we were canceling all SELECT queries, we
would actually sometimes cancel our mitmproxy SELECT queries themselves.

This puts some additional restrictions on the queries that we cancel;
most importantly, the query should contain the name of the table that
we're selecting from.

I was able to reproduce the original issue locally pretty reliably. With
the changes in this PR it didn't happen again.

In passing this also changes one other failure test that was cancelling
all selects and puts similar additional restrictions on those
cancellations. 

Example of failed test in CI: https://app.circleci.com/pipelines/github/citusdata/citus/26305/workflows/4d942b91-f83c-453c-8d9a-ae22d608e756/jobs/745071
2022-08-22 20:06:33 +02:00
Hanefi Onaldi e33ba7da9e
Decrease min messages for normalization 2022-08-22 17:16:52 +03:00
Jelte Fennema e2a24b921e
Fix flakyness in failure_create_distributed_table_non_empty (#6217)
The failure_create_distributed_table_non_empty test would sometimes fail
like this:
```diff
 -- in the first test, cancel the first connection we sent from the coordinator
 SELECT citus.mitmproxy('conn.cancel(' ||  pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 SELECT create_distributed_table('test_table', 'id');
```

Because the cancel command had no filter it would actually sometimes
cancel the mitmproxy cancel command itself. This PR addresses that by
filtering on CREATE TABLE, which is one of the commands that
create_distributed_table will send to the workers.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26252/workflows/1b7e5464-cca4-4ec1-99b3-48ddf25c29fa/jobs/742829
2022-08-20 01:23:25 +03:00
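A sketch of the filtered cancel described above, mirroring the citus.mitmproxy call style used elsewhere in these tests:

```sql
-- Only cancel when a worker runs CREATE TABLE, so the mitmproxy helper
-- query itself can never match the filter.
SELECT citus.mitmproxy('conn.onQuery(query="CREATE TABLE").cancel(' || pg_backend_pid() || ')');
```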
Jelte Fennema 4ce17f015b
Fix flakyness in columnar_memory test (#6216)
Sometimes in CI the columnar_memory test was using slightly more memory
than expected.
```diff
 SELECT CASE WHEN 1.0 * TopMemoryContext / :top_post BETWEEN 0.98 AND 1.02 THEN 1 ELSE 1.0 * TopMemoryContext / :top_post END AS top_growth
 FROM columnar_test_helpers.columnar_store_memory_stats();
--[ RECORD 1 ]-
-top_growth | 1
+-[ RECORD 1 ]------------------
+top_growth | 1.0206132116232119

 -- before this change, max mem usage while executing inserts was 28MB and
```

This PR changes the expectation to be slightly higher, such that this
random increase in memory usage doesn't cause a flaky test.

Failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26256/workflows/c0870f66-3346-4f8d-a1d3-36dfd7c98289/jobs/743028
2022-08-19 23:46:28 +02:00
Jelte Fennema de475feb69
Actually connect to the right database in logical_replication test (#6211)
In the logical_replication test we test that the cleanup logic at the
start of a shard move works as expected. To do so we create a
subscription and publication slot manually. This changes the test to
make that subscription actually connect to the database that the
publication is in.

Useful for #5987 #6085
2022-08-20 00:09:50 +03:00
Naisila Puka 9cfadd7965
Deletes unnecessary test outputs pt2 (#6214) 2022-08-19 18:21:13 +03:00
Jelte Fennema e6a1a86db0
Improve debugability for columnar_memory flakyness (#6203)
Sometimes the columnar_memory test fails in CI with the following error:
```diff
 SELECT 1.0 * TopMemoryContext / :top_post BETWEEN 0.98 AND 1.02 AS top_growth_ok
 FROM columnar_test_helpers.columnar_store_memory_stats();
 -[ RECORD 1 ]-+--
-top_growth_ok | t
+top_growth_ok | f

 -- before this change, max mem usage while executing inserts was 28MB and
```

This is almost certainly a harmless failure that simply requires bumping
the margin a little bit. However, it's impossible to say with the
current output. I was unable to reproduce this on-demand on my local
machine or even in CI. So this changes the test to include the actual
value difference in the size of TopMemoryContext when it's outside the
expected range. Then next time it fails we at least have some
information about why.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/25966/workflows/d472a57b-419a-4f33-b8bc-2e174a98d4d6/jobs/730576
2022-08-19 15:41:16 +02:00
Jelte Fennema fe1668e43f
Fix flakyness in multi_utilities (#6204)
Sometimes this multi_utilities test would fail with the following error:

```diff
SET citus.log_remote_commands TO ON;
 -- should propagate to all workers because no table is specified
 ANALYZE;
 NOTICE:  issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(0, 3461, '2022-08-19 01:56:06.35816-07');
 DETAIL:  on server postgres@localhost:57637 connectionId: 1
 NOTICE:  issuing BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;SELECT assign_distributed_transaction_id(0, 3461, '2022-08-19 01:56:06.35816-07');
 DETAIL:  on server postgres@localhost:57638 connectionId: 2
 NOTICE:  issuing SET citus.enable_ddl_propagation TO 'off'
 DETAIL:  on server postgres@localhost:57637 connectionId: 1
-NOTICE:  issuing SET citus.enable_ddl_propagation TO 'off'
-DETAIL:  on server postgres@localhost:xxxxx connectionId: xxxxxxx
 NOTICE:  issuing ANALYZE
 DETAIL:  on server postgres@localhost:57637 connectionId: 1
+NOTICE:  issuing SET citus.enable_ddl_propagation TO 'off'
+DETAIL:  on server postgres@localhost:57638 connectionId: 2
 NOTICE:  issuing ANALYZE
 DETAIL:  on server postgres@localhost:57638 connectionId: 2
```

This is simply a harmless change in output due to some timing
differences. This PR makes the test output consistent by only logging
the remote ANALYZE commands, not the SET commands.
2022-08-19 12:38:55 +02:00
Jelte Fennema 8ce12eb51f
Fix flakyness in failure_insert_select_repartition (#6202)
This fixes our most commonly randomly failing failure test. The failing
diff is as follows:

```diff
SELECT citus.mitmproxy('conn.onQuery(query="fetch_intermediate_results").kill()');
  mitmproxy
 -----------

 (1 row)

 INSERT INTO target_table SELECT * FROM source_table;
-ERROR:  connection to the remote node localhost:xxxxx failed with the following error: connection not open
+ERROR:  could not open file "base/pgsql_job_cache/10_0_40/repartitioned_results_20770193413_from_4213590_to_1.data": No such file or directory
+CONTEXT:  while executing command on localhost:9060
+while executing command on localhost:57637
 SELECT * FROM target_table ORDER BY a;
```

As far as I can tell this is caused by a race condition: After killing
fetch_intermediate_results on worker 9060, the previously created data
file gets cleaned up. The fetch_intermediate_results call that's sent
to worker 57637 will soon be cancelled and rolled back because of the
failure on the other connection. But if that fetch_intermediate_results
call is able to connect to 9060 before it is cancelled, it won't find
the file it's looking for there anymore. So while it's not the error we
expect, it does indicate that we succeeded.

To avoid this issue instead of killing the fetch_intermediate_results
call directly, we kill the COPY command that it uses to do the fetch.
This results in stable output as can be seen here, where 227 runs of
failure_insert_select_repartition succeeded:
https://app.circleci.com/pipelines/github/citusdata/citus/26168/workflows/9c64a3b6-f46c-4725-9fb4-8f6a2d00a023/jobs/739389

To be clear, this changes the test to affect the opposite
fetch_intermediate_results call: it kills the fetch_intermediate_results
call of worker 57637, instead of killing the fetch_intermediate_results
call on worker 9060.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26147/workflows/780e95ea-264a-4c9f-ad2e-cf11449a795e/jobs/738467
2022-08-19 09:11:07 +00:00
Naisila Puka 5a9fdc221b
Add explicit alias to avoid debug output diff in pg15 (#6183) 2022-08-19 11:39:18 +03:00
Jelte Fennema d16b458e2a
Remove the flaky rollback_to_savepoint test (#6190)
This removes a flaky test that I introduced in #3868 after I fixed the
issue described in #3622. This test sometimes fails randomly in CI.
The way it fails indicates that there might be some bug: A connection
breaks after rolling back to a savepoint.

I tried reproducing this issue locally, but I wasn't able to. I don't
understand what causes the failure.

Things that I tried were:

1. Running the test with:
   ```sql
   SET citus.force_max_query_parallelization = true;
   ```
2. Running the test with:
   ```sql
   SET citus.max_adaptive_executor_pool_size = 1;
   ```
3. Running the test in parallel with the same tests that it is run in
   parallel with in multi_schedule.

None of these allowed me to reproduce the issue locally.

So I think it's time to give up on fixing this test and simply remove
it. The regression that this test protects against seems very unlikely
to reappear, since in #3868 I also added a big comment about the need
for the newly added `UnclaimConnection` call. So, I think the need for
the test is quite small, and removing it will make our CI less flaky.

In case the cause of the bug ever gets found, I tracked the bug in #6189

Example of a failing CI run:
https://app.circleci.com/pipelines/github/citusdata/citus/26098/workflows/f84741d9-13b1-4ae7-9155-c21ed3466951/jobs/736424

For reference the unexpected diff is this (so both warnings and an error):
```diff
 INSERT INTO t SELECT i FROM generate_series(1, 100) i;
+WARNING:  connection to the remote node localhost:57638 failed with the following error: 
+WARNING:  
+CONTEXT:  while executing command on localhost:57638
+ERROR:  connection to the remote node localhost:57638 failed with the following error: 
 ROLLBACK;
```

This test is also mentioned as the most failing regression test in #5975
2022-08-18 15:14:16 +03:00
Onder Kalaci 9ec8e627c1 Support Sequences owned by columns before distributing tables
There are 3 different ways that a sequence can interact
with tables. (1) and (2) are already supported. This commit adds
support for (3).

     (1) column DEFAULT nextval('seq'):

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
         ^                                     |
         |------------------ sequence <--------|

    (2) serial columns: Bigserial/small serial etc:

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
                                 ^             |
                                 |             |
                             sequence <--------|

   (3) Sequence OWNED BY table.column: Added support for
       this type of resolution in this commit.

       The dependency is almost like the following, and
       ExpandCitusSupportedTypes() is NOT responsible for finding
       the dependency.

        schema <--- table <--- column
                                 ^
                                 |
                             sequence
2022-08-18 10:29:40 +02:00
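A minimal sketch of case (3), with illustrative names:

```sql
CREATE TABLE items (id bigint, counter bigint);
CREATE SEQUENCE items_counter_seq OWNED BY items.counter;
-- With this commit, the OWNED BY dependency above is resolved as well.
SELECT create_distributed_table('items', 'id');
```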
Ying Xu 91473635db
[Columnar] Check for existence of Citus before creating Citus_Columnar (#6178)
* Added a check to see if Citus has already been loaded before creating citus_columnar

* Added tests
2022-08-17 15:12:42 -07:00
Ahmet Gedemenli 0631e1998b
Fix upgrade paths for #6100 (#6176)
* Fix upgrade paths for #6100

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
2022-08-17 18:56:53 +03:00
Naisila Puka 20a0e0ed39
Grant create on public to some users where necessary (for PG15) (#6180) 2022-08-17 17:35:10 +03:00
aykut-bozkurt 52efe08642
default mode for shard splitting is set to auto. (#6179) 2022-08-17 12:18:47 +03:00
aykut-bozkurt be06d65721
Nonblocking tenant isolation is supported by using split api. (#6167) 2022-08-17 11:13:07 +03:00
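A hedged sketch; it assumes tenant isolation goes through isolate_tenant_to_new_shard and that the nonblocking, split-based path is requested via a shard_transfer_mode argument (table and tenant are hypothetical):

```sql
SELECT isolate_tenant_to_new_shard('orders', 'tenant-42',
                                   shard_transfer_mode => 'force_logical');
```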
Jelte Fennema 78a5013e24
Support changing CPU priorities for backends and shard moves (#6126)
**Intro**
This adds support to Citus to change the CPU priority values of
backends. This is created with two main usecases in mind:

1. Users might want to run the logical replication part of the shard moves
   or shard splits at a higher speed than it would run at by default.
   This might cause some small loss of DB performance for their regular
   queries, but this is often worth it. During high load it's very possible
   that the logical replication WAL sender is not able to keep up with the
   WAL that is generated. This is especially a big problem when the
   machine is close to running out of disk while doing a rebalance.
2. Users might have certain long running queries that they want to
   deprioritize so that they don't impact their regular workload too much.

**Be very careful!!!**
Using CPU priorities to control scheduling can be helpful in some cases
to control which processes are getting more CPU time than others. 
However, due to an issue called "[priority inversion][1]" it's possible that
using CPU priorities together with the many locks that are used within
Postgres causes the exact opposite behavior of what you intended. This
is why this PR only allows the PG superuser to change the CPU priority 
of its own processes. Currently it's not recommended to set `citus.cpu_priority`
directly. Currently the only recommended interface for users is the setting 
called `citus.cpu_priority_for_logical_replication_senders`. This setting
controls CPU priority for a very limited set of processes (the logical 
replication senders). So the dangers of priority inversion are also limited
when using it for this use case.

**Background**
Before reading the rest it's important to understand some basic
background regarding process CPU priorities, because they are a bit
counter-intuitive. A lower priority value means that the process will
be scheduled more and whatever it's doing will thus complete faster. The
default priority for processes is 0. Valid values are from -20 to 19
inclusive. On Linux a larger difference between values of two processes
will result in a bigger difference in percentage of scheduling.

**Handling the usecases**
Usecase 1 can be achieved by setting `citus.cpu_priority_for_logical_replication_senders`
to the priority value that you want it to have. It's necessary to set
this both on the workers and the coordinator. Example:
```
citus.cpu_priority_for_logical_replication_senders = -10
```

With this PR, usecase 2 can be achieved by running the following as
superuser. Note that this is currently only possible as superuser
due to the dangers mentioned in the "Be very careful!!!" section.
And although this is possible, it's **NOT** recommended:
```sql
ALTER USER background_job_user SET citus.cpu_priority = 5;
```

**OS configuration**
To actually make these settings work well it's important to run Postgres
with a more permissive value for the 'nice' resource limit than
Linux sets by default. By default Linux will not allow a process to
set its priority lower than it currently is, even if it was lower when
the process originally started. This capability is necessary to reset
the CPU priority to its original value after a transaction finishes.
Depending on how you run Postgres this needs to be done in one of two
ways:

If you use systemd to start Postgres all you have to do is add a line
like this to the systemd service file:
```conf
LimitNice=+0 # the + is important, otherwise it's interpreted incorrectly as 20
```

If that's not the case you'll have to configure `/etc/security/limits.conf` 
like so, assuming that you are running Postgres as the `postgres` OS user:
```
postgres            soft    nice            0
postgres            hard    nice            0
```
Finally you'd have to add the following line to `/etc/pam.d/common-session`:
```
session required pam_limits.so
```

These settings would allow changing the priority back after setting it
to a higher value.

However, to actually allow you to set priorities even lower than the
default priority value you would need to change the values in the 
config to something lower than 0. So for example:
```conf
LimitNice=-10
```

or

```
postgres            soft    nice            -10
postgres            hard    nice            -10
```

If you use WSL2 you'll likely have to do one more thing. You have to
open a new shell, because PAM is only used during login, and
WSL2 doesn't actually log you in. You can force a login like this:
```
sudo su $USER --shell /bin/bash
```
Source: https://stackoverflow.com/a/68322992/2570866

[1]: https://en.wikipedia.org/wiki/Priority_inversion
2022-08-16 13:07:17 +03:00
Jelte Fennema 43c2a1e88b
Share more code between splits and moves (#6152)
When introducing non-blocking shard split functionality it was based
heavily on the non-blocking shard moves. However, the differences in
usage were slightly too big to reuse the existing functions
easily. So, most logical replication code was simply copied to dedicated
shard split functions and modified for that purpose.

This PR tries to create a more generic logical replication
infrastructure that can be used by both shard splits and shard moves.
There's probably more code sharing possible in the future, but I believe
this is at least a good start and addresses the lowest hanging fruit.

This also adds a CreateSimpleHash function that makes creating the
most common type of hashmap simple.
2022-08-15 20:21:51 +03:00
yxu2162 e1322ec905 Change for PG15 test because hash_mem_multiplier's default was changed to 2, instead of 1 as in PG13/14 2022-08-11 09:49:56 -07:00
Önder Kalacı 73fcbdf12c
Merge branch 'main' into add_missing_schema 2022-08-11 11:28:41 +02:00
aykut-bozkurt 898801504e
sysid should be parsed as int. (#6150) 2022-08-11 10:44:46 +03:00
Hanefi Onaldi 294400b2eb
Fix typos in tests that fail on PG15 2022-08-10 22:45:28 +03:00
Onder Kalaci 00ce7235cb Set missing search_path in the tests
On PG 15, the public schema requires an explicit GRANT, so let's avoid the conflict

helpful for #6085
2022-08-10 18:04:10 +02:00
Onder Kalaci 44947d5634 This is not supported in PG15
so fix it earlier
2022-08-10 17:44:03 +02:00
naisila ea209bd11d Rename remaining regclass to relation in columnar.options 2022-08-10 15:38:53 +02:00
aykut-bozkurt cc694b6bcf
We consider a stat object invalid if it is not owned by the current user (#6130) 2022-08-09 20:59:30 +03:00
Hanefi Onaldi 6ef96ac560
Use client side \copy when accessing test files 2022-08-09 15:00:42 +03:00
Hanefi Onaldi 9f52fa7610
Remove dynamic translation of regression test scripts, step 2.
This commit is inspired by a commit
dc9c3b0ff21465fa89d71eecf5e6cc956d647eca from PostgreSQL 15 that shares
the same header.

I also removed some gitignore rules so that I can add some files to git
worktree. We used to ignore the generated files, which are no longer
generated after this commit.

--------------------

Below is the commit message from PostgreSQL 15 commit
dc9c3b0ff21465fa89d71eecf5e6cc956d647eca :

"git mv" all the input/*.source and output/*.source files into
the corresponding sql/ and expected/ directories.  Then remove
the pg_regress and Makefile infrastructure associated with
dynamic translation.

Discussion: https://postgr.es/m/1655733.1639871614@sss.pgh.pa.us
2022-08-09 14:15:52 +03:00
Jelte Fennema 8017693b2f
Allow specifying the shard_transfer_mode when replicating reference tables (#6070)
When using `citus.replicate_reference_tables_on_activate = off`,
reference tables need to be replicated later. This can be done using the
`replicate_reference_tables()` UDF. However, this function only allowed
blocking replication. This changes the function to default to logical
replication instead, and allows choosing any of our existing shard
transfer modes.
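
For instance, a minimal sketch, assuming the new parameter is named `shard_transfer_mode` like in the other transfer UDFs:

```sql
SELECT replicate_reference_tables(shard_transfer_mode := 'force_logical');
```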
2022-08-09 13:21:31 +03:00
Marco Slot 3b57ff2867 Fix crash in citus_copy_shard_placement 2022-08-09 09:31:05 +02:00
naisila 796d90d293 Explain w/out costs in ch_bench to avoid PG15 output diff 2022-08-09 07:53:27 +03:00
Naisila Puka bcbba99c96
Clean up large_table_shard_count guc leftovers (#6144) 2022-08-09 06:31:57 +03:00
Naisila Puka 3806f6f6a9
Add ORDER BY in pg_locks to avoid output order diffs (#6145) 2022-08-09 06:02:07 +03:00
Naisila Puka ce944c3c0f
Remove bogus guc citus.compression (#6142) 2022-08-09 05:21:32 +03:00
Jelte Fennema dd548ee3c7
Use faster custom copy logic for non-blocking shard moves (#6119)
DESCRIPTION: Use faster custom copy logic for non-blocking shard moves

Non-blocking shard moves consist of two main phases:
1. Initial data copy
2. Catchup phase

This changes the first of these phases significantly. Previously we used the
copy logic provided by postgres subscriptions. This meant we didn't have
to implement it ourselves, but it came with the downside of little control.
When implementing shard splits we needed more control to even make it
work, so we implemented our own logic for copying data between nodes.

This PR starts using that logic for non-blocking shard moves. Doing so
has four main advantages:
1. It uses COPY in binary format when possible, which is cheaper to encode 
    and decode. Furthermore it very often results in less data that needs to 
    be sent over the network.
2. It allows us to create the primary key (or other replica identity) after doing
    the initial data copy. This should give some speed up over the total run,
    because creating an index in bulk is much faster than incrementally building it.
3. It doesn't require a replication slot per parallel copy. Increasing the maximum
    number of replication slots uses resources in postgres, even if they are not used.
    So reducing the number of replication slots that shard moves need is nice.
4. Logical replication table_sync workers are slow to start up, so if lots of shards
    need to be copied that can make it quite slow. This can happen easily when
    combining Postgres partitioning with Citus.
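
For reference, a non-blocking move that goes through this copy logic could look like the following sketch (shard id and hostnames are hypothetical):

```sql
SELECT citus_move_shard_placement(
    102008,           -- hypothetical shard id
    'worker-1', 5432, -- source node
    'worker-2', 5432, -- target node
    shard_transfer_mode := 'force_logical');
```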
2022-08-08 17:09:43 +02:00
Marco Slot 6aee8f35a6 Fix tenant isolation failure tests 2022-08-08 13:33:23 +02:00
Marco Slot 044dd26e40 Reimplement tenant isolation on top of block shard split 2022-08-08 13:33:23 +02:00
Naisila Puka 3401b31c13
Deletes unnecessary test outputs (#6140) 2022-08-08 11:19:14 +03:00
Naisila Puka 9eedf6dcf8
Reduce log level to avoid alternative output for PG15 (#6139) 2022-08-07 16:07:58 +03:00
aykut-bozkurt 4992533e33
support grant statement propagation for aggregates (#6132) 2022-08-05 14:47:33 +03:00
Ahmet Gedemenli 8b68b0b5bb
Fix pg upgrade script for foreign tables (#6100)
Fixes unexpected error for foreign tables when upgrading pg
2022-08-05 13:35:17 +03:00
Sameer Awasekar e236711eea Introduce Non-Blocking Shard Split Workflow 2022-08-04 16:32:38 +02:00
Naisila Puka a1c630a16e
Reduce shard_count to reduce drain_node execution time (#6128)
master_drain_node in the distributed_triggers.sql test file takes too
long to execute; its runtime is directly dependent on the shard count.
Hence I reduced the shard count from 32 to 4 (the default in tests),
since this doesn't affect the validity of the tests.
2022-08-04 15:34:13 +03:00
aykut-bozkurt 3ddc089651
Stop distributing views with no distributed dependency if the GUC DistributeLocalViews is set to false. (#6083) 2022-08-04 12:34:40 +03:00
Jelte Fennema 8866d9ac32
Reduce setup time of check-minimal and check-minimal-mx (#6117)
This change reduces the setup time of our minimal schedules in two ways:
1. Don't run `multi_cluster_management`, but instead run a much smaller
   sql file with almost the same results. `multi_cluster_management`
   adds and removes lots of nodes and tests all kinds of failure
   scenarios. This is not needed for the minimal schedules. The only
   reason we were using it there was to get a working cluster of the
   layout that the tests expected. The new `minimal_cluster_management`
   test achieves this with much less work, going from ~2s to ~0.5s.
2. Parallelize a bit more of the helper tests.
2022-08-02 17:58:59 +03:00
Naisila Puka 28e22c4abf
Reduce log level to avoid alternative output for PG15 (#6118)
We are reducing the log level here to avoid alternative test output
in PG15 because of the change in the display of SQL-standard
function's arguments in INSERT/SELECT in PG15.
The log level changes can be reverted when we drop support for PG14
Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759
2022-08-02 11:56:28 +03:00
Jelte Fennema abffa6c3b9
Use shard split copy code for blocking shard moves (#6098)
The new shard copy code that was created for shard splits has some
advantages over the old shard copy code. The old code was using 
worker_append_table_to_shard, which wrote to disk twice. And it also 
didn't use binary copy when that was possible. Both of these issues
were fixed in the new copy code. This PR starts using this new copy
logic also for shard moves, not just for shard splits.

On my local machine I created a single shard table like this.
```sql
set citus.shard_count = 1;
create table t(id bigint, a bigint);
select create_distributed_table('t', 'id');

INSERT into t(id, a) SELECT i, i from generate_series(1, 100000000) i;
```

I then turned `fsync` off to make sure I wasn't bottlenecked by disk. 
Finally I moved this shard between nodes with `citus_move_shard_placement`
with `block_writes`.

Before this PR a move took ~127s, after this PR it took only ~38s. So for this 
small test this resulted in spending ~70% less time.

And I also tried the same test for a table that contained large strings:
```sql
set citus.shard_count = 1;
create table t(id bigint, a bigint, content text);
select create_distributed_table('t', 'id');

INSERT into t(id, a, content) SELECT i, i, 'aunethautnehoautnheaotnuhetnohueoutnehotnuhetncouhaeohuaeochgrhgd.athbetndairgexdbuhaobulrhdbaetoausnetohuracehousncaoehuesousnaceohuenacouhancoexdaseohusnaetobuetnoduhasneouhaceohusnaoetcuhmsnaetohuacoeuhebtokteaoshetouhsanetouhaoug.lcuahesonuthaseauhcoerhuaoecuh.lg;rcydabsnetabuesabhenth' from generate_series(1, 20000000) i;
```
2022-08-01 20:10:36 +03:00
Naisila Puka 5060d0ab17
Remove leftover PG version_above_11 checks from tests (#6112) 2022-08-01 15:38:19 +03:00
Onder Kalaci bdaeb40b51 Add missing relation access record for local utility command
While testing 5670dffd33, I realized
that we have a missing RecordNonDistTableAccessesForTask() for
local utility commands.

Although we don't have to record the relation access for local-
only cases, we really want the scale-out behaviour to be the same
as on a single node in all aspects. We wouldn't want any complex
transaction to work on a single machine, but not on a multi-node
cluster. Hence, we apply the same restrictions.

For example, the following errors on a distributed cluster, and
after this commit it errors locally as well:

```SQL
CREATE TABLE ref(a int primary key);
INSERT INTO ref VALUES (1);

CREATE TABLE dist(a int REFERENCES ref(a));
SELECT create_reference_table('ref');
SELECT create_distributed_table('dist', 'a');

BEGIN;
		SELECT * FROM dist;
		TRUNCATE ref CASCADE;

ERROR:  cannot execute DDL on table "ref" because there was a parallel SELECT access to distributed table "dist" in the same transaction
HINT:  Try re-running the transaction with "SET LOCAL citus.multi_shard_modify_mode TO 'sequential';"

COMMIT;
```

We also add a comprehensive test suite and run the same tests locally.
2022-07-29 11:36:33 +02:00
Marco Slot cff013a057 Fix issues with insert..select casts and column ordering 2022-07-28 13:23:57 +02:00
Onder Kalaci b41c3fd30d Add tests 2022-07-28 11:27:59 +02:00
Ying Xu fdf090758b
Bugfix for IN clause to be considered during planner phase in Columnar (#6030)
Reported bug #5803 shows that we are currently not sending the IN clause to our planner for columnar. This PR fixes it by checking for ScalarArrayOpExpr in ExtractPushdownClause so that we do not skip it. Also added a test case for this new addition.
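
A hypothetical repro along the lines of the added test case:

```sql
CREATE TABLE events (event_id int, payload text) USING columnar;
INSERT INTO events SELECT i % 100, 'x' FROM generate_series(1, 100000) i;
-- the IN predicate below is a ScalarArrayOpExpr; with this fix it is
-- pushed down and used for chunk filtering instead of being skipped
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*) FROM events WHERE event_id IN (1, 5, 9);
```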
2022-07-27 11:06:49 -07:00
Ahmet Gedemenli 2b2a529653
Error out for views with circular dependencies (#6051)
Adds error check for views with circular dependencies
2022-07-27 17:57:45 +03:00
Naisila Puka 1259d83511
Smallfix in CreateCollationDDL logic (#6089) 2022-07-27 14:33:31 +03:00
aykut-bozkurt 67ac3da2b0
Added citus_depended_objects UDF and HideCitusDependentObjects GUC to hide Citus-dependent objects from pg meta queries (#6055)
Use the RecurseObjectDependencies API to find whether an object is Citus-dependent

Make vanilla tests runnable to see if the citus_depended function is working correctly
2022-07-25 16:43:34 +03:00
Marco Slot 5fabf94e39 Allow WITH HOLD cursors with parameters 2022-07-21 12:00:59 +02:00
Hanefi Onaldi eb3e5ee227 Introduce citus_locks view
citus_locks combines the pg_locks views from all nodes and adds
global_pid, nodeid, and relation_name. The columns of citus_locks don't
change based on the Postgres version; however, pg_locks's columns do.
Postgres 14 added one more column to pg_locks (waitstart timestamptz).
citus_locks has the most expansive column set, including the newly added
column. If citus_locks is queried in a Postgres version where pg_locks
doesn't have some columns, the values for those columns in citus_locks
will be NULL
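
For example, a sketch of a cluster-wide lock inspection query (mode and granted come from pg_locks):

```sql
SELECT global_pid, nodeid, relation_name, mode, granted
FROM citus_locks
WHERE NOT granted;
```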
2022-07-21 03:06:57 +03:00
Nitish Upreti 3d569cc49a
Shard Split support for Columnar and Partitioned Table (#6067)
DESCRIPTION:
This PR extends support for partitioned and columnar tables in the blocking 'citus_split_shard_by_split_points' workflow.
Columnar Support: No special handling required; just remove the checks that fail the split for columnar tables, and add test coverage.
Partitioned Table Support:

Skip copying the parent table as it is empty; the partitions contain the data and are treated as co-located shards that will be copied separately.
Attach partitions to the parent on the destination after inserting the new shard metadata and before creating foreign key constraints.
MISC:
Fix bug #4949 where blocking shard moves fail if there is a foreign key between partitioned distributed tables (from child to parent).

TEST:
Added new test 'citus_split_shards_columnar_partitioned' for splitting 'partitioned' and 'columnar + partitioned' table.
Added new test 'shard_move_constraints_blocking' to add coverage for shard move bug fix.
Updated test 'citus_split_shard_by_split_points_negative' to allow columnar and partitioned table.
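
A blocking split of this kind could be invoked roughly as follows (shard id, split point, and node ids are hypothetical):

```sql
SELECT citus_split_shard_by_split_points(
    102008,         -- shard of a partitioned or columnar table
    ARRAY['50000'], -- split points are text
    ARRAY[1, 2],    -- target node ids
    'block_writes');
```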
2022-07-20 12:24:50 -07:00
Naisila Puka 7d6410c838
Drop postgres 12 support (#6040)
* Remove if conditions with PG_VERSION_NUM < 13

* Remove server_above_twelve(&eleven) checks from tests

* Fix tests

* Remove pg12 and pg11 alternative test output files

* Remove pg12 specific normalization rules

* Some more if conditions in the code

* Change RemoteCollationIdExpression and some pg12/pg13 comments

* Remove some more normalization rules
2022-07-20 17:49:36 +03:00
Nitish Upreti 5b3537cdff
Shard Split for Citus (#6029)
* Blocking split setup

* Add missing type

* Missing API from Metadata Sync

* Shard Split e2e code

* Worker Split Copy DestReceiver skeleton

* Basic destreceiver code

* worker_split_copy UDF

* UDF calling

* Split points are text

* Isolate Tenant and Split Shard Unification

* Fixing executor and misc

* Reindent code

* Fixing UDF definitions

* Hello World Local Copy works

* Remote copy hello world works

* Local and Remote binary test

* Fixing text local copy and adding tests

* Hello World shard split works

* Negative tests

* Blocking Split workflow works

* Refactor

* Bug fix

* Reindent

* Cleaning up and adding comments

* Basic test for shard split workflow

* ReIndent

* Circle CI integration

* Removing include causing circle-ci build failure

* Remove SplitCopyDestReceiver and use PartitionedResultDestReceiver

* Add support for citus.enable_binary_protocol

* Reindent

* Fix build break

* Update Test

* Cleanup on catch

* Addressing open comments

* Update downgrade script and quote schema/table in COPY statement

* Fix metadata sync issue. Update regression test

* Isolation test and bug fix

* Add Isolation test, fix foreign constraint deadlock issue

* Misc code review comments

* Test name needing to be quoted

* Refactor code from review comments

* Explaining shardGroupSplitIntervalListList

* Fix upgrade & downgrade

* Fix broken test

* Test fix Round 2

* Fixing bug and modifying test appropriately

* Fully qualify copy udf name. Run Reindent

* Address PR comments

* Fix null handling when creating AuxiliaryStructures

* Ensure local copy is triggered in tests

* Limit max shards that can be created with split

* Test failure fix

* Remove split_mode and use shard_transfer_mode instead'

* Fix test failure

* Fix test failure

* Fixing permission issue when splitting non-superuser owned tables

* Fix test expected output

* Remove extra space

* Fix test

* attempt to fix test

* Addressing Marco's PR comment

* Only clean shards created by workflow

* Remove from merge

* Update test
2022-07-18 02:54:15 -07:00
ywj 1675519f93
Support citus_columnar as separate extension (#5911)
* Support upgrade and downgrade and separate columnar as citus_columnar extension

Co-authored-by: Yanwen Jin <yanwjin@microsoft.com>
Co-authored-by: Jeff Davis <jeff@j-davis.com>
2022-07-13 21:08:29 -07:00
Onder Kalaci 6cd7319f12 Add more generic read-replica tests 2022-07-13 14:58:30 +02:00
Onder Kalaci 3c343d4563 Add regression tests for LOCK command citus.use_secondary_nodes=always mode 2022-07-13 14:27:11 +02:00
aykutbozkurt d53a7760b0 * Support the weird ALTER INDEX/TABLE ... RENAME syntax variants,
* acquire the correct lock level when the weird syntax is used
2022-07-04 21:27:47 +03:00
aykutbozkurt ba62c0a148 auto is a valid option for vacuum index_cleanup. 2022-07-04 19:27:55 +03:00
Ahmet Gedemenli c8e1e243b8
Fix matviews for citus_add_local_table_to_metadata (#6023) 2022-07-04 17:00:07 +03:00
Hanefi Onaldi f60809a6c1
Fix downgrade scripts from 11.0-2 to 11.0-1 (#6039) 2022-06-29 22:43:50 +03:00
Onder Kalaci bab4c0a8c3 Fixes a bug that prevents upgrades when there are no worker nodes 2022-06-28 15:54:49 +02:00
Onder Kalaci bd3a070369 Fixes a bug that prevents upgrades when there are COMPRESSION and DEFAULT columns 2022-06-28 13:36:00 +02:00
aykutbozkurt 8194dc4c62 * Added isolation tests for vacuum,
* Added more regression tests for more vacuum options,
* Fixed deadlock for unqualified vacuum when there is only 1 worker,
* Supported lock_skipped for vacuum.
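
A sketch of the newly covered statements, assuming lock_skipped maps to VACUUM's SKIP_LOCKED option and a distributed table named dist_table:

```sql
VACUUM (SKIP_LOCKED) dist_table;
VACUUM;  -- unqualified; no longer deadlocks with a single worker
```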
2022-06-23 15:33:14 +03:00
aykutbozkurt 1d6c81245c Fix a bug causing a column mismatch in shard tasks when column names are specified for citus tables in vacuum and analyze commands 2022-06-23 15:33:14 +03:00
Aykut Bozkurt 6986f53835 propagate unqualified vacuum and analyze to all worker nodes 2022-06-23 15:33:14 +03:00
Ahmet Gedemenli 1ee3e8b7f4
Fix creating stats bug when CREATE TABLE LIKE (#6006) 2022-06-16 12:43:47 +03:00
Jelte Fennema 184c7c0bce
Make enterprise features open source (#6008)
This PR makes all of the features open source that were previously only
available in Citus Enterprise.

Features that this adds:
1. Non blocking shard moves/shard rebalancer
   (`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
   the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
   authentication options for different users, e.g. you can store
   passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
   in between coordinator and workers
10. Tracking distributed query execution times using
   citus_stat_statements (`citus.stat_statements_max`,
   `citus.stat_statements_purge_interval`,
   `citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
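
As an example of item 12, a sketch with hypothetical file paths:

```sql
ALTER SYSTEM SET citus.node_conninfo =
    'sslmode=require sslcert=/etc/citus/node.crt sslkey=/etc/citus/node.key';
SELECT pg_reload_conf();
```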
2022-06-16 00:23:46 -07:00
Burak Velioglu e244e9ffb6
Fix dropping temporary view without specifying the explicit schema name (#6003) 2022-06-15 16:41:12 +02:00
Marco Slot ee34e1ed9d Fix bug in unqualified, non-existing DROP DOMAIN IF EXISTS 2022-06-15 13:59:08 +02:00
Ahmet Gedemenli 268d3fa3a6
Fix materialized view intermediate result filename (#5982) 2022-06-14 15:07:08 +03:00
Onder Kalaci af22a30b48 Use citus_finish_citus_upgrade() in the tests
We already have tests relying on citus_finalize_upgrade_to_citus11().
Now, adjust those to rely on citus_finish_citus_upgrade() and
always call citus_finish_citus_upgrade().
2022-06-13 13:15:15 +02:00
Halil Ozan Akgul b255706189 Fixes the bug where undistribute can drop Citus extension 2022-05-31 16:23:28 +03:00
Onder Kalaci 89c1ccb7a5 Show that no metadata is sent when disabled 2022-05-30 13:41:06 +02:00
Ahmet Gedemenli 26d927178c
Propagate dependent views upon distribution (#5950) 2022-05-26 14:23:45 +03:00
Burak Velioglu 1d7dda991f Create views and materialized views with the right schema and owner while
altering the distributed table.

To be able to alter view's owner without enforcing sequential mode.
Alter view process functions have been udpated to use metadata
connection.
2022-05-24 15:27:30 +03:00
Onder Kalaci dd02e1755f Parallelize metadata syncing on node activate
It is often useful to be able to sync the metadata in parallel
across nodes.

Also citus_finalize_upgrade_to_citus11() uses
start_metadata_sync_to_primary_nodes() after this commit.

Note that this commit does not parallelize all pieces of node
activation or metadata syncing. Instead, it tries to parallelize
potentially large parts of the metadata, namely the objects and
distributed tables (in general, Citus tables).

In the future, it would be nice to sync the reference tables
in parallel across nodes.

Create ~720 distributed tables / ~23450 shards
```SQL
-- declaratively partitioned table
CREATE TABLE github_events_looooooooooooooong_name (
  event_id bigint,
  event_type text,
  event_public boolean,
  repo_id bigint,
  payload jsonb,
  repo jsonb,
  actor jsonb,
  org jsonb,
  created_at timestamp
) PARTITION BY RANGE (created_at);

SELECT create_time_partitions(
  table_name         := 'github_events_looooooooooooooong_name',
  partition_interval := '1 day',
  end_at             := now() + '24 months'
);

CREATE INDEX ON github_events_looooooooooooooong_name USING btree (event_id, event_type, event_public, repo_id);
SELECT create_distributed_table('github_events_looooooooooooooong_name', 'repo_id');

SET client_min_messages TO ERROR;

```

across 1 node: almost the same, as expected
```SQL

SELECT start_metadata_sync_to_primary_nodes();
Time: 15664.418 ms (00:15.664)

select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 14284.069 ms (00:14.284)
```

across 7 nodes: ~3.2x improvement
```SQL

SELECT start_metadata_sync_to_primary_nodes();
┌──────────────────────────────────────┐
│ start_metadata_sync_to_primary_nodes │
├──────────────────────────────────────┤
│ t                                    │
└──────────────────────────────────────┘
(1 row)

Time: 25711.192 ms (00:25.711)

-- across 7 nodes
select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 82126.075 ms (01:22.126)
```
2022-05-23 09:15:48 +02:00
jeff-davis a2f5b068e6
Columnar: tighten security and improve visibility. (#5922)
Move internal storage details to a separate schema with no public
access to limit the possibility for information leakage.

Create views with public access that show storage details for those
columnar tables where the user has ownership privileges. Include
mapping between relation ID and storage ID for easier interpretation.
2022-05-20 15:30:31 -07:00
Ying Xu a1151c2395
Clear metadatacache during abort for create extension (#5907)
* Bug fix for bug #5876. Memset MetadataCacheSystem every time there is an abort

* Created an ObjectAccessHook that saves the transaction level at which citus was created and clears the metadata cache if that transaction level is rolled back. Added additional tests to make sure the metadata cache is cleared
2022-05-20 13:47:58 -07:00
Marco Slot 09ec366ff5 Improve nested execution checks and add GUC to disable 2022-05-20 18:55:43 +02:00
Marco Slot e683993449 Fix prepared statement bug when switching from local to remote execution 2022-05-20 18:55:43 +02:00
jeff-davis a9f8a60007
Columnar: support relation options with ALTER TABLE. (#5935)
Columnar: support relation options with ALTER TABLE.

Use ALTER TABLE ... SET/RESET to specify relation options rather than
alter_columnar_table_set() and alter_columnar_table_reset().

Not only is this more ergonomic, but it also allows better integration
because it can be treated like DDL on a regular table. For instance,
citus can use its own ProcessUtility_hook to distribute the new
settings to the shards.

DESCRIPTION: Columnar: support relation options with ALTER TABLE.
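
For instance, a sketch using columnar's reloptions (table name hypothetical):

```sql
ALTER TABLE my_columnar_table SET
    (columnar.compression = zstd, columnar.stripe_row_limit = 100000);
ALTER TABLE my_columnar_table RESET (columnar.compression);
```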
2022-05-20 08:35:00 -07:00
Marco Slot ad5214b50c Allow distributed execution from run_command_on_* functions 2022-05-20 15:26:47 +02:00
gledis69 4731630741 Add distributing lock command support 2022-05-20 12:28:07 +03:00
Marco Slot 79d7e860e6 Add a run_command_on_coordinator function 2022-05-19 10:26:09 +02:00
Marco Slot fa9cee409c Fix downgrade scripts and add new downgrade tests 2022-05-19 10:26:09 +02:00
Onder Kalaci 127450466e Do not warn unnecessarily when a node is removed
In the past (pre-11), we allowed removing worker nodes
that had active placements for replicated distributed
tables, without even checking whether there were any other
replicas of the same placement.

However, with #5469, we prevent disabling nodes via a hard
error when the node holds the last active placement of a shard, as we
do for reference tables. Note that otherwise, we'd allow
users to lose data.

As of today, the NOTICE is completely irrelevant.
2022-05-18 17:23:38 +02:00
Onder Kalaci db998b3d66 Adds "sync" option to citus_disable_node() UDF
Before this commit, we had:
```SQL
SELECT citus_disable_node(nodename, nodeport, force boolean DEFAULT false)
```

Here, we allow forcing the disabling of the first worker node with
`force:=true`. However, it entails the risk of losing data /
diverging placement data etc.

With `force` flag, we control disabling the first worker node,
and with `async` flag we control whether the changes are done
via bg worker or immediately.

```SQL
SELECT citus_disable_node(nodename, nodeport, force boolean DEFAULT false, sync boolean DEFAULT false)
```

Where we can achieve all the following:

| Mode  | Data loss possibility | Can run in 2PC | Handle multiple node failures | Immediately effective |
| --- |--- |--- |--- |--- |
| force:false, sync: false  | false   | true  | true  | false |
| force:false, sync: true   | false  | false | false | true |
| force:true, sync: false   | true   | true  | true   | false |
| force:true, sync: true    | false  | false | false  | true |
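
For example, to disable a node and have the change take effect immediately (hostname is hypothetical):

```sql
SELECT citus_disable_node('10.0.0.4', 5432, sync := true);
```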
2022-05-18 17:21:12 +02:00