Commit Graph

1896 Commits (36f8c48560a5ebef5a21d0907e8707f63130caf7)

Author SHA1 Message Date
Onder Kalaci 36f8c48560 Add tests for allowing SET NULL/DEFAULT for subseet of columns
PG 15 added support for that (d6f96ed94e73052f99a2e545ed17a8b2fdc1fb8a).

We also add support, but we already do not support ON DELETE SET NULL/DEFAULT
for distribution column. So, in essence, we add support for reference tables
and Citus local tables.
2022-09-12 13:56:09 +03:00
Marco Slot 2e943a64a0
Make shard moves more idempotent (#6313)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-09 18:21:36 +02:00
Jelte Fennema a2d86214b2
Share more replication code between moves and splits (#6310)
The logical replication catchup part for shard splits and shard moves is
very similar. This abstracts most of that similarity away into a single
function. This also improves the logic for non blocking shard splits a
bit, by using faster foreign key creation. It also parallelizes index creation
which shard moves were already doing, but shard splits did not.
2022-09-09 16:45:38 +02:00
Marco Slot ba2fe3e3c4
Remove do_repair option from citus_copy_shard_placement (#6299)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-09 15:44:30 +02:00
Nils Dijk 00a94c7f13
Implement infrastructure to run sql jobs in the background (#6296)
DESCRIPTION: Add infrastructure to run long running management operations in background

This infrastructure introduces the primitives of jobs and tasks.
A task consists of a sql statement and an owner. Tasks belong to a
Job and can depend on other tasks from the same job.

When there are either runnable or running tasks we would like to
make sure a bacgrkound task queue monitor process is running. A Task
could be in running state while there is actually no monitor present
due to a database restart or failover. Once the monitor starts it
will reset any running task to its runnable state.

To make sure only one background task queue monitor is ever running
at once it will acquire an advisory lock that self conflicts.

Once a task is done it will find all tasks depending on this task.
After checking that the task doesn't have unmet dependencies it will
transition the task from blocked to runnable state for the task to
be picked up on a subsequent task start.

Currently only one task can be running at a time. This can be
improved upon in later releases without changes to the higher level
API.

The initial goal for this background tasks is to allow a rebalance
to run in the background. This will be implemented in a subsequent PR.
2022-09-09 16:11:19 +03:00
Jelte Fennema 76137e967f
Create all foreign keys quickly at the end of a shard move (#6148)
Previously we would create foreign keys to reference table in an extra
fast way at the end of a shard move. This uses that same logic to also
do it for foreign keys between distributed tables.

Fixes #6141
2022-09-09 09:58:33 +02:00
Ahmet Gedemenli eadc88a800
Introduce GUC citus.skip_constraint_validation (#6281)
Introduces a new GUC named citus.skip_constraint_validation, which basically skips constraint validation when set to on.
For some several places that we hack to skip the foreign key validation phase, now we use this GUC.
2022-09-08 18:13:18 +03:00
Hanefi Onaldi a557a196aa
Add tests for numeric with scale greater than precision 2022-09-07 13:12:04 +03:00
Hanefi Onaldi 4db113496f
Add tests for new COPY features in PG15 2022-09-07 13:12:04 +03:00
Hanefi Onaldi 3e4e42253f
Add tests for new regexp sql functions 2022-09-07 13:12:04 +03:00
Nitish Upreti d7404a9446
'Deferred Drop' and robust 'Shard Cleanup' for Splits. (#6258)
DESCRIPTION:
This PR adds support for 'Deferred Drop' and robust 'Shard Cleanup' for Splits.

Common Infrastructure
This PR introduces new common infrastructure so as any operation that wants robust cleanup of resources can register with the cleaner and have the resources cleaned appropriately based on a specified policy. 'Shard Split' is the first consumer using this new infrastructure.
Note : We only support adding 'shards' as resources to be cleaned-up right now but the framework will be extended to support other resources in future.

Deferred Drop for Split
Deferred Drop Support ensures that shards undergoing split are not dropped inline as part of operation but dropped later when no active read queries are running on shard. This helps with :

Avoids any potential deadlock scenarios that can cause long running Split operation to rollback.
Avoids Split operation blocking writes and then getting blocked (due to running queries on the shard) when trying to drop shards.
Deferred drop is the new default behavior going forward.
Shard Cleaner Extension
Shard Cleaner is a background task responsible for deferred drops in case of 'Move' operations.
The cleaner has been extended to ensure robust cleanup of shards (dummy shards and split children) in case of a failure based on the new infrastructure mentioned above. The cleaner also handles deferred drop for 'Splits'.

TESTING:
New test ''citus_split_shard_by_split_points_deferred_drop' to test deferred drop support.
New test 'failure_split_cleanup' to test shard cleanup with failures in different stages.
Update 'isolation_blocking_shard_split and isolation_non_blocking_shard_split' for deferred drop.
Added non-deferred drop version of existing tests : 'citus_split_shard_no_deferred_drop' and 'citus_non_blocking_splits_no_deferred_drop'
2022-09-06 12:11:20 -07:00
Gokhan Gulbiz ac96370ddf
Use IsMultiStatementTransaction for SELECT .. FOR UPDATE queries (#6288)
* Use IsMultiStatementTransaction instead of IsTransaction for row-locking operations.

* Add regression test for SELECT..FOR UPDATE statement
2022-09-06 16:38:41 +02:00
Emel Şimşek 6f06ff78cc
Throw an error if there is a RangeTblEntry that is not assigned an RTE identity. (#6295)
* Fix issue : 6109 Segfault or (assertion failure) is possible when using a SQL function

* DESCRIPTION: Ensures disallowing the usage of SQL functions referencing to a distributed table and prevents a segfault.
Using a SQL function may result in segmentation fault in some cases.
This change fixes the issue by throwing an error message when a SQL function cannot be handled.

Fixes #6109.

* DESCRIPTION: Ensures disallowing the usage of SQL functions referencing to a distributed table and prevents a segfault.
Using a SQL function may result in segmentation fault in some cases. This change fixes the issue by throwing an error message when a SQL function cannot be handled.

Fixes #6109.

Co-authored-by: Emel Simsek <emel.simsek@microsoft.com>
2022-09-06 15:46:41 +02:00
Hanefi Onaldi 85b19c851a
Disallow distributing by numeric with negative scale
PG15 allows numeric scale to be negative or greater than precision. This
causes issues and we may end up routing queries to a wrong shard due to
differing hash results after rounding.

Formerly, when specifying NUMERIC(precision, scale), the scale had to be
in the range [0, precision], which was per SQL spec. PG15 extends the
range of allowed scales to [-1000, 1000].

A negative scale implies rounding before the decimal point. For
example, a column might be declared with a scale of -3 to round values
to the nearest thousand. Note that the display scale remains
non-negative, so in this case the display scale will be zero, and all
digits before the decimal point will be displayed.

Relevant PG commit: 085f931f52494e1f304e35571924efa6fcdc2b44
2022-09-06 12:40:56 +03:00
Naisila Puka d7f41cacbe
Prohibit renaming child trigger on distributed partition pre PG15 (#6290)
Pre PG15, renaming child triggers on partitions is allowed. When
creating a trigger in a distributed parent partitioned table, the
triggers on the shards of the partitions have the same name with
the triggers on the corresponding parent shards of the parent
table. Therefore, they don't have the same appended shard id as
the shard id of the partition. Hence, when trying to rename a
child trigger on a partition of a distributed table, we can't
correctly find the triggers on the shards of the partition in
order to rename them since we append a different shard id to the
name of the trigger. Since we can't find the trigger we get a
misleading error of inexistent trigger.

In this commit we prohibit renaming child triggers on distributed
partitions altogether.
2022-09-06 12:19:25 +03:00
Naisila Puka fd9b3f4ae9
Add tests to make sure distributed clone trigger rename fails in PG15 (#6291)
Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866
2022-09-06 11:04:14 +03:00
Marco Slot e6b1845931
Change split logic to avoid EnsureReferenceTablesExistOnAllNodesExtended (#6208)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-05 22:02:18 +02:00
Önder Kalacı bd13836648
Add citus.skip_advisory_lock_permission_checks (#6293) 2022-09-05 17:47:41 +02:00
Ahmet Gedemenli 7c8cc7fc61
Fix flakiness for view tests (#6284) 2022-09-02 10:12:07 +03:00
Marco Slot 432f399a5d
Allow citus_internal application_name with additional suffix (#6282)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-01 14:26:43 +02:00
Naisila Puka 9e2b96caa5
Add pg14->pg15 upgrade test for dist. triggers on part. tables (#6265)
PRE PG15, Renaming the parent triggers on partitioned tables doesn't
recurse to renaming the child triggers on the partitions as well.
In PG15, Renaming triggers on partitioned tables
recurses to renaming the triggers on the partitions as well.

Add an upgrade test to make sure we are not breaking anything
with distributed triggers on distributed partitioned tables.

Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866
2022-09-01 12:32:44 +03:00
Naisila Puka 98dcbeb304
Specifies that our CustomScan providers support projections (#6244)
Before, this was the default mode for CustomScan providers.
Now, the default is to assume that they can't project.
This causes performance penalties due to adding unnecessary
Result nodes.

Hence we use the newly added flag, CUSTOMPATH_SUPPORT_PROJECTION
to get it back to how it was.

In PG15 support branch we created explain functions to ignore
the new Result nodes, so we undo that in this commit.

Relevant PG commit:
955b3e0f9269639fb916cee3dea37aee50b82df0
2022-08-31 10:48:01 +03:00
Jelte Fennema 24e695ca27
Fix flakyness in multi_utilities (#6272)
Sometimes in CI our multi_utilities test fails like this:
```diff
 VACUUM (INDEX_CLEANUP ON, PARALLEL 1) local_vacuum_table;
 SELECT CASE WHEN s BETWEEN 20000000 AND 25000000 THEN 22500000 ELSE s END size
 FROM pg_total_relation_size('local_vacuum_table') s ;
    size
 ----------
- 22500000
+ 39518208
 (1 row)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26641/workflows/5caea99c-9f58-4baa-839a-805aea714628/jobs/762870

Apparently VACUUM is not as reliable in cleaning up as we thought. This
increases the range of allowed values. Important to note is that the
range is still completely outside of the allowed range of the initial
size. So we know for sure that some data was cleaned up.
2022-08-30 14:32:34 -07:00
Jelte Fennema f22a47981a
Fix flakyness in adaptive_executor (#6275)
Sometimes in CI our adaptive_executor test would fail randomly with the
following error:

```diff
 SELECT sum(result::bigint) FROM run_command_on_workers($$
   SELECT count(*) FROM pg_stat_activity
   WHERE pid <> pg_backend_pid() AND query LIKE '%8010090%'
 $$);
  sum
 -----
-   4
+   2
 (1 row)

 END;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26665/workflows/40665680-0044-4852-8fe4-5fd628f9fb47/jobs/764371

This means that the low slow start interval did not have any effect on
the number of connections being opened. I could see two possibilities
for this to happen:
1. CI was slow and actually doing the start of the second connection. I
   tried to solve this by doubling the time a query to the worker takes.
2. The second option is that the shards were queried in the oposite
   order than we expect. This would mean that the first query to the
   worker completes quickly because there's no, sleep because it doesn't
   contain any rows. I tried to solve this option by adding a row to
   each shard.

After trying to reproduce the random failure in CI it turned out that I
needed both of these fixes to resolve the random failure.
2022-08-30 23:23:30 +02:00
Jelte Fennema 8354853dec
Fix flakyness in citus_split_shard_columnar_partitioned (#6273)
On CI our citus_split_shard_columnar_partitioned test would sometimes
randomly fail like this:
```diff
  8970008 | colocated_dist_table                   | -2147483648   | 2147483647    | localhost |    57637
  8970009 | colocated_partitioned_table            | -2147483648   | 2147483647    | localhost |    57637
  8970010 | colocated_partitioned_table_2020_01_01 | -2147483648   | 2147483647    | localhost |    57637
- 8970011 | reference_table                        |               |               | localhost |    57637
  8970011 | reference_table                        |               |               | localhost |    57638
+ 8970011 | reference_table                        |               |               | localhost |    57637
 (13 rows)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26651/workflows/f695b4fb-ad81-46ff-b97e-0100e5d167ea/jobs/763517

This is a harmless diff due to a missing column in the order by list.
This fixes that by adding the nodeport as a tiebreaker.
2022-08-30 19:54:50 +03:00
Marco Slot 6bb31c5d75
Add non-blocking variant of create_distributed_table (#6087)
Added create_distributed_table_concurrently which is nonblocking variant of create_distributed_table.

It bases on the split API which takes advantage of logical replication to support nonblocking split operations.

Co-authored-by: Marco Slot <marco.slot@gmail.com>
Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
2022-08-30 15:35:40 +03:00
Önder Kalacı 33af407ac8
Add missing orderbys (#6271) 2022-08-30 12:49:15 +03:00
Jelte Fennema 895a484b39
Hopefully fix flakyeness in drop_partitioned_table (#6270)
Sometimes in CI our drop_partitioned_talbe test would fail with the
following error:

```diff
 NOTICE:  issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
 NOTICE:  issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
 NOTICE:  issuing DROP TABLE IF EXISTS drop_partitioned_table.child1_727001 CASCADE
-NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
-NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
+NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
+NOTICE:  issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
 ROLLBACK;
 NOTICE:  issuing ROLLBACK
 NOTICE:  issuing ROLLBACK
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26631/workflows/31536032-e1ba-493b-b12a-f40757f3a7d6/jobs/762170

For some reason the colocationid of the distributed partitioned table
would be one less than we expected. Why this happens I'm not sure, but
it seems fairly harmless that it does.

In an attempt to work around this flakyness I now reset the colocation
id sequence right before creating the table in question. This is good
practice in general, because it allows us to run the test successfully
using `check-minimal` and it also allows us to rerun it multiple times.
2022-08-30 12:21:16 +03:00
Naisila Puka 13fe89f018
Fixes flakyness in columnar_permissions test (#6266)
`columnar_permissions.sql` test is flaky due to a missing `ORDER BY` clauses.
Added the other `ORDER BY` clauses for consistency in the test.

```diff
   where relation in ('no_access'::regclass, 'columnar_permissions'::regclass);
        relation       | chunk_group_row_limit | stripe_row_limit | compression | compression_level 
 ----------------------+-----------------------+------------------+-------------+-------------------
- no_access            |                 10000 |           150000 | zstd        |                 3
  columnar_permissions |                 10000 |             2222 | none        |                 3
+ no_access            |                 10000 |           150000 | zstd        |                 3
 (2 rows)
```

Source: https://app.circleci.com/pipelines/github/citusdata/citus/26610/workflows/79f03ef9-7674-4567-a087-02536c9ddf04/jobs/760942
2022-08-29 14:33:26 +02:00
Ahmet Gedemenli 0855a9d1d4
Use SUM for calculating non partitioned table sizes (#6222)
We currently do a `pg_relation_total_size('t1') + pg_relation_total_size('t2') + ..` on shard lists, especially when rebalancing the shards. This in some cases goes huge. With this PR, we basically use a SUM for all table sizes, instead of using thousands of pluses.
2022-08-26 18:02:14 +03:00
Sameer Awasekar 4df8eca77f
Add worker_split_shard_release_dsm udf to release dynamic shared memory (#6248)
The code introduces worker_split_shard_release_dsm udf to release the dynamic shared memory segment allocated during non-blocking split workflow.
2022-08-26 18:27:32 +05:30
Jelte Fennema 77dd49fcf8
Fix flakyness in failure_online_move_shard_placement (#6250)
Sometimes in CI failure_online_move_shard_placement fails with the
following error:
```diff
 SELECT citus.mitmproxy('conn.onQuery(query="^ALTER SUBSCRIPTION .* ENABLE").cancel(' || :pid || ')');
  mitmproxy
 -----------

 (1 row)

 SELECT master_move_shard_placement(101, 'localhost', :worker_1_port, 'localhost', :worker_2_proxy_port);
-ERROR:  canceling statement due to user request
+ERROR:  tuple concurrently updated
+CONTEXT:  while executing command on localhost:9060
 -- failure on polling subscription state
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26441/workflows/dd6e3475-6121-47b3-aea3-4ac92be114f4/jobs/751476/steps

This error is not completely harmless, because based on the logs it mean
that our cleanup logic failed, which in turn means that replication
slots are left around:
```
2022-08-24 16:01:29.247 UTC [1219] ERROR:  XX000: tuple concurrently updated
2022-08-24 16:01:29.247 UTC [1219] LOCATION:  simple_heap_update, heapam.c:4179
2022-08-24 16:01:29.247 UTC [1219] STATEMENT:  ALTER SUBSCRIPTION citus_shard_move_subscription_10 DISABLE
```

However, we have other mechanisms to clean up any leftovers in case of a
failed cleanup. So it's not that big of a problem.

The reason we run into this error is arguably because of a Postgres bug,
so I created a patch for Postgres that fixes this.

While we wait for this (or a similar) patch to be merged, this PR
disables the flaky test. There's still a test that tests in case of a
connection "kill" instead of a "cancel", so I don't think we lose very
important coverage by disabling this test. When trying to reproduce this
I only hit this issue in the cancel case, so I don't think there's a
need to disable the kill case for now.
2022-08-26 12:49:45 +02:00
Jelte Fennema 2a0c0b3ba6
Fix flakyness in failure_connection_establishment (#6251)
In CI sometimes failure_connection_establishment would fail with the
following error:
```diff
 -- cancel all connections to this node
 SELECT citus.mitmproxy('conn.onAuthenticationOk().cancel(' || pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 SELECT * FROM citus_check_cluster_node_health();
```

The reason for this is that the mitm command that was used is very
broad and doesn't actually do what the comment says. What happens is
that if any connection is made, the current backend is cancelled, which
is not the always the same as the backend that made the connection. My
assessment is that likely the maintenance daemon makes a connection to
the node while we are executing the mitmproxy command. The mitmproxy
command goes through, and then triggers a cancel of itself due to the
connection made by the maintenance daemon.

This PR simply removes this test, since it doesn't seem to test what it
intended to test anyway. There's also still the "kill" version of this
test, which does do the intended thing. So I don't think we lose
important coverage by removing this test.
2022-08-26 10:01:36 +00:00
Jelte Fennema 18015ca501
Fix flakyness in multi_transaction_recovery (#6249)
Sometimes in CI multi_transaction_recovery would fail with the following
error:
```diff
 SET LOCAL citus.defer_drop_after_shard_move TO OFF;
 SELECT citus_move_shard_placement((SELECT * FROM selected_shard), 'localhost', :worker_1_port, 'localhost', :worker_2_port, shard_transfer_mode := 'block_writes');
- citus_move_shard_placement
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  could not find placement matching "localhost:57637"
+HINT:  Confirm the placement still exists and try again.
 COMMIT;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26510/workflows/8269ea93-d9b4-4376-ae0e-8332a5c15fc6/jobs/755548

The reason for this was that when choosing `selected_shard` we didn't
ensure that it was actually located on the node that we were moving it
from. Instead we simply picked the first shard for the table that was
returned by the query.

To fix this issue this PR adds a filter to only choose shards that are
located on the intended node.
2022-08-26 11:48:55 +02:00
Jelte Fennema b5cd1676f9
Fix flakyness in multi_utilities (#6245)
In CI multi_utilities would sometimes fail randomly with this error:

```diff
 VACUUM (INDEX_CLEANUP ON, PARALLEL 1) local_vacuum_table;
 SELECT pg_size_pretty( pg_total_relation_size('local_vacuum_table') );
  pg_size_pretty
 ----------------
- 21 MB
+ 22 MB
 (1 row)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26459/workflows/da47d9b6-f70b-49fe-806f-5ebf75bf0b11/jobs/752482

This is a harmless change in output where the relation size after
vacuuming was slightly more than we expected. This changes the size
checks for the local_vacuum_table to allow a wider range of values.

It uses the same trick as #6216 to show the actual value when it's
outside this valid range, which is useful if this test ever starts
failing again.
2022-08-25 22:50:47 +02:00
Jelte Fennema 00485d45a6
Make multi_utilities not leak tables (#6246)
When trying to fix #6245 I realized that multi_utilities was leaking
some tables that it created during the test. This fixes that by
creating all these tables in a schema that's dedicated for this test.
2022-08-25 19:33:03 +03:00
Jelte Fennema 1688bcda33
Fix errors in base_schedule (#6247)
When running `make check-base` locally it would fail with two different
errors.

The first one was this:
```diff
 SELECT create_distributed_table('pg_class', 'relname');
-ERROR:  cannot create a citus table from a catalog table
+ERROR:  deadlock detected
+DETAIL:  Process 28950 waits for ExclusiveLock on relation 16551 of database 16384; blocked by process 28951.
+Process 28951 waits for RowExclusiveLock on relation 1259 of database 16384; blocked by process 28950.
+HINT:  See server log for query details.
 SELECT create_reference_table('pg_class');
```

This happened because multi_behavioral_analytics_create_table and
multi_create_table were being run in parallel. Running them separately
resolved this issue.

The second one was this:
```diff
 CREATE OR REPLACE FUNCTION wait_until_metadata_sync(timeout INTEGER DEFAULT 15000)
     RETURNS void
     LANGUAGE C STRICT
     AS 'citus';
+ERROR:  duplicate key value violates unique constraint "pg_proc_proname_args_nsp_index"
+DETAIL:  Key (proname, proargtypes, pronamespace)=(wait_until_metadata_sync, 23, 2200) already exists.
 -- Add some helper functions for sending commands to mitmproxy
```
Which was because failure_test_helpers and multi_test_helpers were
trying to create the same function at the exact same time. The easy fix
here is to simply not create this function in the failure_test_helpers
file. This is fine, because any test schedule that runs
failure_test_helpers also runs multi_test_helpers.
2022-08-25 18:06:41 +02:00
Önder Kalacı 3ed6fea1cf
Prevent Merge command on distributed tables [PG 15] (#6238) 2022-08-25 13:27:08 +03:00
Marco Slot 9bf3c3dd5c
Add an allow_unsafe_constraints flag for constraints without distribution column (#6237)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-25 11:37:50 +03:00
Gokhan Gulbiz 69d2fcf5c0
Use the same colocation group for child and parent rels when altering a distributed table (#6225)
* Alter_distributed_table colocateWith:none bug fix for partitioned tables.

* Regression tests added for alter_distributed_table colocateWith:none for partitioned tables

* Update query comparision to be more accurate
2022-08-25 11:23:59 +03:00
Naisila Puka 1f4fe35512
Support JSON_TABLE on PG 15 (#6241)
Postgres supports JSON_TABLE feature on PG 15.

We treat JSON_TABLE the same as correlated functions (e.g., recurring tuples).
In the end, for multi-shard JSON_TABLE commands, we apply the same
restrictions as reference tables (e.g., cannot be in the outer part of
an outer join etc.)

Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
2022-08-24 19:11:18 +03:00
Naisila Puka 35b4ddc355
Pg15 support (#6085)
* Adjust configure script to allow PG15

* Adds copy of ruleutils_14.c as ruleutils_15.c

* Uses get_namespace_name_or_temp in ruleutils_15.c

Relevant PG commit:
48c5c9068211e0a04fd9553c8714b2821ed3ad17

* Clean up code using "(expr) ? true : false" in ruleutils_15.c

Relevant PG commit:
fd0625c7a9c679c0c1e896014b8f49a489c3a245

* Change varno from Index (unsigned int) to int in ruleutils_15.c

Relevant PG commit:
e3ec3c00d85bd2844ffddee83df2bd67c4f8297f

* Adds find_recursive_union to ruleutils_15.c

Relevant PG commit:
3f50b82639637c9908afa2087de7588450aa866b

* Fix display of SQL-std func's args in INSERT/SELECT in ruleutils_15.c

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fix ruleutils_15.c's dumping of whole-row Vars in more contexts

Relevant PG commit:
43c2175121c829c8591fc5117b725f1f22bfb670

* Fix assorted missing logic for GroupingFunc nodes in ruleutils_15.c

Relevant PG commit:
2591ee8ec44d8cbc8e1226550337a64c684746e4

* Adds grammar support for SQL/JSON clauses in ruleutils_15.c

Relevant PG commit:
f79b803dcc98d707450e158db3638dc67ff8380b

* Adds SQL/JSON constructors to ruleutils_15.c

Relevant PG commits:
f4fb45d15c59d7add2e1b81a9d477d0119a9691a
cc7401d5ca498a84d9b47fd2e01cebd8e830e558

* Adds support for MERGE in ruleutils_15.c

Relevant PG commit:
7103ebb7aae8ab8076b7e85f335ceb8fe799097c

* Add IS JSON predicate to ruleutils_15.c

Relevant PG commit:
33a377608fc29cdd1f6b63be561eab0aee5c81f0

* Add SQL/JSON query functions to ruleutils_15.c

Relevant PG commit:
1a36bc9dba8eae90963a586d37b6457b32b2fed4

* Adds three different SQL/JSON values to ruleutils_15.c

Relevant PG commits:
606948b058dc16bce494270eea577011a602810e
49082c2cc3d8167cca70cfe697afb064710828ca

* Adds JSON table functions in ruleutils_15.c

Relevant PG commit:
4e34747c88a03ede6e9d731727815e37273d4bc9

* Add PLAN function for JSON table in ruleutils_15.c

Relevant PG commit:
fadb48b00e02ccfd152baa80942de30205ab3c4f

* Remove extra blank lines before block-closing braces ruleutils_15.c

Relevant PG commit:
24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031

* set_deparse_plan: Reuse variable to appease Coverity ruleutils_15.c

Relevant PG commit:
e70813fbc4aaca35ec012d5a426706bd54e4acab

* Mechanical code beautification ruleutils_15.c

Relevant PG commit:
23e7b38bfe396f919fdb66057174d29e17086418

* Rename value_type to item_type in ruleutils_15.c

Relevant PG commit:
3ab9a63cb638a1fd99475668e2da9c237495aeda

* Show 'AS "?column?"' explicitly when it's important in ruleutils_15.c

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Fix ruleutils_15.c issues with dropped cols in funcs-returning-composite

Relevant PG commit:
c1d1e8469c77ce6b8e5310955580b4a3eee7fe96

* Change comment regarding functions returning composite in ruleutils_15.c

Relevant PG commit:
c2fa113ddb1117b1f03e91960f65d5d7d8a90270

* Replace int nodes with bool nodes where needed

In PG15, Boolean nodes are added. Pre PG15, internal Boolean values
in Create Role commands were represented by Integer nodes. This
commit replaces int nodes logic with bool nodes logic where needed.
Mostly there are CREATE ROLE logic changes.

Relevant PG commit:
941460fcf731a32e6a90691508d5cfa3d1f8eeaf

* Handle new option colliculocale in CREATE COLLATION logic

In PG15, there is an added option to use ICU as global locale provider.
pg_collation has three locale-related fields: collcollate and collctype,
which are libc-related fields, and a new one colliculocale, which is the
ICU-related field. Only the libc-related fields or the ICU-related field
is set, never both.

Relevant PG commits:
f2553d43060edb210b36c63187d52a632448e1d2
54637508f87bd5f07fb9406bac6b08240283be3b

* Add PG15 tests to CI using test images that have 15beta2 (#6093)

* Change warning message in pg_signal_backend()

Relevant PG commit:
7fa945b857cc1b2964799411f1633468826861ff

* Revert "Add missing ifdef for PG 15"

This reverts commit c7b51025ab.

* Fixes tests for ALTER TRIGGER RENAME consistency for part. tables

Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866

* Prevent creating child triggers on partitions when adding new node

Pre PG15, tgisinternal is true for a "child" trigger on a partition
cloned from the trigger on the parent.
In PG15, tgisinternal is false in that case. However, we don't want to
create this trigger on the partition since it will create a conflict
when we try to attach the partition to the parent table:
ERROR: trigger "..." for relation "{partition_name}" already exists

Relevant PG commit:
f4566345cf40b068368cb5617e61318da60676ec

* Fix tests for generated columns dependency changes

In PG15, For GENERATED columns, all dependencies of the generation
expression are recorded as NORMAL dependencies of the column itself.
This requires CASCADE to drop generated cols with the original col.
PRE PG15, dependencies were recorded as AUTO, with which
generated columns are silently dropped with the original column.

Relevant PG commit:
cb02fcb4c95bae08adaca1202c2081cfc81a28b5

* Explicitly cast catalog "char" column to text before concatenation

Relevant PG commit:
07eee5a0dc642d26f44d65c4e6263304208e8583

* Remove 'AS "?column?"' from test outputs

There were some instances in the following tst outputs
in planning debug outputs where AS "?column?" is added.
We add a normalization rule to remove it as it is not
important.

cte_inline.out
recursive_relation_planning_restriction_pushdown.out

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Use pg_backup_stop(PG15) instead of pg_stop_backup(PG<15)

Add an alternative test output because of the change in the
backup modes of Postgres. Specifically here, there is a renaming
issue: pg_stop_backup PRE PG15 vs pg_backup_stop PG15+
The alternative output can be deleted when we drop support for PG14

Relevant PG commit:
39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a

* Adds citus.mitmfifo GUC

Previously we setting this configuration parameter
in the fly for failure tests schedule.
However, PG15 doesn't allow that anymore: reserved prefixes
like "citus" cannot be used to set non-existing GUCs.

Relevant PG commit:
88103567cb8fa5be46dc9fac3e3b8774951a2be7

* Handles EXPLAIN output diffs in PG15 - Extra result lines

To handle extra "Result" lines in explain outputs, we add explain
method to multi_test_helpers.sql file
- plan_without_result_lines() is added for cases where we want the
whole explain output with only "Result" lines removed

* Handles EXPLAIN output diffs in PG15, Hash Agg/Join leverage

To handle differences in usage of GroupAggregate vs HashAggregate
or Merge Join vs Hash join in cases where this detail doesn't
seem to matter, we use coordinator_plan().
- coordinator_plan() is updated to remove "Result" lines

There are some cases where we have subplans so we add a new
function that prints all Task Count lines as well
- coordinator_plan_with_subplans()

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.

* Handles EXPLAIN output diffs in PG15: enable_group_by_reordering

Relevant PG commit
db0d67db2401eb6238ccc04c6407a4fd4f985832

* Normalizes Memory Usage, Buckets, Batches for PG15 explain diffs

We create a new function in multi_test_helpers, which is similar
to explain_merge function in PG15. This explain helper function
normalies Memory Usage, Buckets and Batches, and we use it in the
tests which give a different output for PG15.

* Bump test images to 15beta3 (#6172)

* Omit namespace in post-copy errmsg

Relevant PG commit:
069d33d0c5a021601245e44df77a0423ddd69359

* Handles EXPLAIN output diffs in PG15: extra arrows&result lines

To handle extra "->" arrows resulting from extra Result lines
in explain outputs, we add the following explain method to
multi_test_helpers.sql file

- plan_without_arrows() is added for cases where we want the
whole explain output without arrows and without Result lines

* Alters public schema's owner to pg_database_owner in PG15

In PG15, public schema is owned by pg_database_owner role.
In multi_extension, we drop and recreate the ppublic schema,
hence its owner become the default user in our tests, postgres.
Change that to pg_database_owner for PG15 consistency.

This results in alternative test output for public schema grants
in the following test:

grant_on_schema_propagation.sql

Relevant PG commit: b073c3ccd06e4cb845e121387a43faa8c68a7b62

* Add alternative test outputs for change in Insert Select display

citus_local_tables_queries.sql
coordinator_shouldhaveshards.sql
cte_inline.sql
insert_select_repartition.sql
intermediate_result_pruning.sql
local_shard_execution.sql
local_shard_execution_replicated.sql
multi_deparse_shard_query.sql
multi_insert_select.sql
multi_insert_select_conflict.sql
multi_mx_insert_select_repartition.sql
mx_coordinator_shouldhaveshards.sql
single_node.sql

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fixes columnar tap tests for PG15

In PG15, Perl test modules have been moved to a new namespace.
Also, postgres node new() and get_new_node() methods have been
unified to one method: new()

We create separate tap tests for PG13/14 and PG15+
and update the Makefiles accordingly.

Relevant PG commits:
201a76183e2056c2217129e12d68c25ec9c559c8
b3b4d8e68ae83f432f43f035c7eb481ef93e1583

* Handles EXPLAIN output diffs in PG15: HashAgg Leverage,alt. output

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.
2022-08-24 17:59:17 +02:00
Naisila Puka ddbd10d2e7
Rename server version checks in tests (#6239) 2022-08-24 16:31:52 +03:00
Jelte Fennema 5c0205ce10
Fix flakyness in multi_replicate_reference_table (#6235)
In CI multi_replicate_reference_table would sometimes fail like this:

```diff
 -- detects correctly that referecence table doesn't have replica identity
 SELECT replicate_reference_tables();
-ERROR:  cannot use logical replication to transfer shards of the relation initially_not_replicated_reference_table since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
+ERROR:  cannot use logical replication to transfer shards of the relation ref_table since it doesn't have a REPLICA IDENTITY or PRIMARY KEY
 DETAIL:  UPDATE and DELETE commands on the shard will error out during logical replication unless there is a REPLICA IDENTITY or PRIMARY KEY.
 HINT:  If you wish to continue without a replica identity set the shard_transfer_mode to 'force_logical' or 'block_writes'.
```

Because `CitusTableTypeIdList` returns tables in heap order so it's
a bit random which one is first in the list. And the test contained
multiple tables that didn't have a primary key or replica identity. So
it made sense that the error could be for either one of these tables.
This PR makes the test output consistent by changing one of the tables
to have a primary key.

Example of failing test: https://app.circleci.com/pipelines/github/citusdata/citus/26387/workflows/fc3196e7-ddf2-4000-a70b-5ac71c836321/jobs/748940
2022-08-24 13:34:10 +03:00
aykut-bozkurt 041f88d7bf
Revert "Revert "Creates new colocation for colocate_with:='none' too"" (#6227)
This reverts commit d171a736ab.
2022-08-24 10:54:04 +03:00
Marco Slot bad8196da3
Verify that we can replicate reference tables using rebalancer (#6232)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-24 00:34:21 +02:00
Jelte Fennema e0ada050aa
Enable binary logical replication for shard moves (#6017)
Using binary encoding can save a lot of CPU cycles, both on the sender
and on the receiver. Since the walsender and walreceiver processes are
single threaded, this can matter a lot for the throughput if they are
bottlenecked on CPU.

This feature is only available in PG14, not PG13. It should be safe to 
always enable because it's only used for types that support binary 
encoding according to the PG docs:
> Even when this option is enabled, only data types that have binary 
> send and receive functions will be transferred in binary.

But in case it causes problems, it can still be disabled by setting
`citus.enable_binary_protocol` to `false`.
2022-08-23 16:38:00 +02:00
Jelte Fennema cc7e93a56a
Fix flakyness in failure_connection_establishment (#6226)
In CI our failure_connection_establishment sometimes failed randomly
with the following error:
```diff
 -- verify a connection attempt was made to the intercepted node, this would have cause the
 -- connection to have been delayed and thus caused a timeout
 SELECT * FROM citus.dump_network_traffic() WHERE conn=0;
  conn | source | message
 ------+--------+---------
-    0 | coordinator | [initial message]
-(1 row)
+(0 rows)

 SELECT citus.mitmproxy('conn.allow()');
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26318/workflows/d3354024-9a67-4b01-9416-5cf79aec6bd8/jobs/745558

The way I fixed this was by removing the dump_network_traffic call. This
might sound simple, but doing this while continuing to let the test
serve its intended purpose required quite some more changes.

This dump_network_traffic call was there because we didn't want to show
warnings in the queries above, because the exact warnings were not
reliable. The main reason this error was not reliable was because we
were using round-robin task assignment. We did the same query twice, so
that it would hit the node with the intercepted connection in one of
those connections. Instead of doing that I'm now using the
"first-replica" policy and do the queries only once. This works, because
the first placements by placementid for each of the used tables are on
the second node, so first-replica will cause the first connection to go
there.

This solved most of the flakyness, but when confirming that the
flakyness was fixed I found some additional errors:

```diff
 -- show that INSERT failed
 SELECT citus.mitmproxy('conn.allow()');
  mitmproxy
 -----------

 (1 row)

 SELECT count(*) FROM single_replicatated WHERE key = 100;
- count
----------------------------------------------------------------------
-     0
-(1 row)
-
+ERROR:  could not establish any connections to the node localhost:9060 after 400 ms
 RESET client_min_messages;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26321/workflows/fd5f4622-400c-465e-8d82-83f5f55a87ec/jobs/745666


I addressed this with a combination of two things:
1. Only change citus.node_connection_timeout for the queries that we
   want to test timeout behaviour for. When those queries are done I
   reset the value to the default again.
2. Change our mitm framework to only delay the initial connection packet
   instead of all packets. I think sometimes a follow on packet of a previous 
   connection attempt was causing the next connection attempt to be delayed
   even if `conn.allow()` was already called. For our tests we only care about
   connection timeouts, so there's no reason to delay any other packets than
   the initial connection packet.

Then there was some last flakyness in the exact error that was given:

```diff
 -- tests for connectivity checks
 SELECT name FROM r1 WHERE id = 2;
 WARNING:  could not establish any connections to the node localhost:9060 after 900 ms
+WARNING:  connection to the remote node localhost:9060 failed with the following error:
  name
 ------
  bar
 (1 row)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26338/workflows/9610941c-4d01-4f62-84dc-b91abc56c252/jobs/746467

I don't have a good explaination for this slight change in error message, but
given that it is missing the actual error message I expected this to be related
to some small difference in timing: e.g. the server responding to the connection
attempt right after the coordinator determined that the connection timed out.
To solve this last  flakyness I increased the connection timeouts and made the 
difference between the timeout and the delay a bit bigger. With these tweaks 
I wasn't able to reproduce this error on CI anymore.

Finally, I made most of the same changes to failure_failover_to_local_execution,
since it was using the `conn.delay()` mitm method too. The only change that
I left out was the timing increase, since it might not be strictly necessary and
increases time it takes to run the test. If this test ever becomes flaky the first
thing we should try is increase its timeout.
2022-08-23 15:04:20 +03:00
Jelte Fennema 506c16efdf
Fix flakyness in failure_single_select (#6223)
The failure_single_select test would sometimes fail with an error that's
similar to this:
```diff
 -- cancel after first SELECT; txn should fail and nothing should be marked as invalid
 SELECT citus.mitmproxy('conn.onQuery(query="^SELECT").cancel(' ||  pg_backend_pid() || ')');
- mitmproxy
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  canceling statement due to user request
+CONTEXT:  COPY mitmproxy_result, line 1: ""
+SQL statement "COPY mitmproxy_result FROM '/home/circleci/project/src/test/regress/tmp_check/mitmproxy.fifo'"
+PL/pgSQL function citus.mitmproxy(text) line 11 at EXECUTE
 BEGIN;
```

This error looked very to the one from #6217 and indeed the cause turned
out to be similar. Because we were canceling all SELECT queries, we
would actually sometimes cancel our mitmproxy SELECT queries itself.

This puts some additional restrictions on the queries that we cancel,
most importantly it should contain the name of the table that we're
selecting from.

I was able to reproduce the original issue locally pretty reliably. With
the changes in this PR it didn't happen again.

In passing this also changes one other failure test that was cancelling
all selects and puts similar additional restrictions on those
cancellations. 

Example of failed test in CI: https://app.circleci.com/pipelines/github/citusdata/citus/26305/workflows/4d942b91-f83c-453c-8d9a-ae22d608e756/jobs/745071
2022-08-22 20:06:33 +02:00
Hanefi Onaldi e33ba7da9e
Decrease min messages for normalization 2022-08-22 17:16:52 +03:00