* Fix issue : 6109 Segfault or (assertion failure) is possible when using a SQL function
* DESCRIPTION: Ensures disallowing the usage of SQL functions referencing to a distributed table and prevents a segfault.
Using a SQL function may result in segmentation fault in some cases.
This change fixes the issue by throwing an error message when a SQL function cannot be handled.
Fixes#6109.
* DESCRIPTION: Ensures disallowing the usage of SQL functions referencing to a distributed table and prevents a segfault.
Using a SQL function may result in segmentation fault in some cases. This change fixes the issue by throwing an error message when a SQL function cannot be handled.
Fixes#6109.
Co-authored-by: Emel Simsek <emel.simsek@microsoft.com>
PG15 allows numeric scale to be negative or greater than precision. This
causes issues and we may end up routing queries to a wrong shard due to
differing hash results after rounding.
Formerly, when specifying NUMERIC(precision, scale), the scale had to be
in the range [0, precision], which was per SQL spec. PG15 extends the
range of allowed scales to [-1000, 1000].
A negative scale implies rounding before the decimal point. For
example, a column might be declared with a scale of -3 to round values
to the nearest thousand. Note that the display scale remains
non-negative, so in this case the display scale will be zero, and all
digits before the decimal point will be displayed.
Relevant PG commit: 085f931f52494e1f304e35571924efa6fcdc2b44
Pre PG15, renaming child triggers on partitions is allowed. When
creating a trigger in a distributed parent partitioned table, the
triggers on the shards of the partitions have the same name with
the triggers on the corresponding parent shards of the parent
table. Therefore, they don't have the same appended shard id as
the shard id of the partition. Hence, when trying to rename a
child trigger on a partition of a distributed table, we can't
correctly find the triggers on the shards of the partition in
order to rename them since we append a different shard id to the
name of the trigger. Since we can't find the trigger we get a
misleading error of inexistent trigger.
In this commit we prohibit renaming child triggers on distributed
partitions altogether.
Sometimes in CI our isolation_citus_dist_activity test fails randomly
like this:
```diff
step s2-view-dist:
SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;
query |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state |wait_event_type|wait_event|usename |datname
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------
INSERT INTO test_table VALUES (100, 100);
|localhost | 57636|idle in transaction|Client |ClientRead|postgres|regression
-(1 row)
+
+ SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+ FROM (
+ SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+ pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+ ) AS csa_from_one_node;
+ |localhost | 57636|active | | |postgres|regression
+(2 rows)
step s3-view-worker:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26692/workflows/3406e4b4-b686-4667-bec6-8253ee0809b1/jobs/765119
I intended to fix this with #6263, but the fix turned out to be
insufficient. This PR tries to address the issue by setting
distributedCommandOriginator correctly in more situations. However, even
with this change it's still possible to reproduce the flaky test in CI.
In any case this should fix at least some instances of this issue.
In passing this changes the isolation_citus_dist_activity test to allow
running it multiple times in a row.
PRE PG15, Renaming the parent triggers on partitioned tables doesn't
recurse to renaming the child triggers on the partitions as well.
In PG15, Renaming triggers on partitioned tables
recurses to renaming the triggers on the partitions as well.
Add an upgrade test to make sure we are not breaking anything
with distributed triggers on distributed partitioned tables.
Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866
pg_dist_node and pg_dist_colocation have a primary key index, not a replica identity index.
Citus catalog tables are created in public schema, which has replica identity index by default
as primary key index. Later the citus catalog tables are moved to pg_catalog schema.
During pg_upgrade, all tables are recreated, and given that pg_dist_colocation is found in
pg_catalog schema, it is recreated in that schema, and when it is recreated it doesn't
have a replica identity index, because catalog tables have no replica identity.
Further action:
Do we even need to acquire this lock on the primary key index?
Postgres doesn't acquire such locks on indexes before deleting catalog tuples.
Also, catalog tuples don't have replica identities by definition.
We have lots of flaky tests in CI and most of these random failures are
very hard/impossible to reproduce locally. This adds a job definition to
CI that allows adding a temporary job to rerun the same test in CI a lot
of times. This will very often reproduce the random failures. If you
then try to change the test or code to fix the random failure, you can
confirm that it's indeed fixed by using this job.
A future improvement to this job would be to run it (or a variant of it)
automatically for every newly added test, and maybe even changed tests.
This is not implemented in this PR.
An example of this job running can be found here:
https://app.circleci.com/pipelines/github/citusdata/citus/26682/workflows/a2638385-35bc-443c-badc-7713a8101313
In commit 31faa88a4e I removed some features of the rebalance progress
monitor. I did this because the plan was to remove the foreground shard
rebalancer later in the PR that would add the background shard
rebalancer. So, I didn't want to spend time fixing something that we
would throw away anyway.
As it turns out we're not removing the foreground shard rebalancer after
all, so it made sens to fix the stuff that I broke. This PR does that.
For the most part this commit reverts the changes in commit 31faa88a4e.
It's not a full revert though, because it keeps the improved tests and
the changes to `citus_move_shard_placement`.
Before, this was the default mode for CustomScan providers.
Now, the default is to assume that they can't project.
This causes performance penalties due to adding unnecessary
Result nodes.
Hence we use the newly added flag, CUSTOMPATH_SUPPORT_PROJECTION
to get it back to how it was.
In PG15 support branch we created explain functions to ignore
the new Result nodes, so we undo that in this commit.
Relevant PG commit:
955b3e0f9269639fb916cee3dea37aee50b82df0
Sometimes in CI our multi_utilities test fails like this:
```diff
VACUUM (INDEX_CLEANUP ON, PARALLEL 1) local_vacuum_table;
SELECT CASE WHEN s BETWEEN 20000000 AND 25000000 THEN 22500000 ELSE s END size
FROM pg_total_relation_size('local_vacuum_table') s ;
size
----------
- 22500000
+ 39518208
(1 row)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26641/workflows/5caea99c-9f58-4baa-839a-805aea714628/jobs/762870
Apparently VACUUM is not as reliable in cleaning up as we thought. This
increases the range of allowed values. Important to note is that the
range is still completely outside of the allowed range of the initial
size. So we know for sure that some data was cleaned up.
Sometimes in CI our adaptive_executor test would fail randomly with the
following error:
```diff
SELECT sum(result::bigint) FROM run_command_on_workers($$
SELECT count(*) FROM pg_stat_activity
WHERE pid <> pg_backend_pid() AND query LIKE '%8010090%'
$$);
sum
-----
- 4
+ 2
(1 row)
END;
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26665/workflows/40665680-0044-4852-8fe4-5fd628f9fb47/jobs/764371
This means that the low slow start interval did not have any effect on
the number of connections being opened. I could see two possibilities
for this to happen:
1. CI was slow and actually doing the start of the second connection. I
tried to solve this by doubling the time a query to the worker takes.
2. The second option is that the shards were queried in the oposite
order than we expect. This would mean that the first query to the
worker completes quickly because there's no, sleep because it doesn't
contain any rows. I tried to solve this option by adding a row to
each shard.
After trying to reproduce the random failure in CI it turned out that I
needed both of these fixes to resolve the random failure.
On CI our citus_split_shard_columnar_partitioned test would sometimes
randomly fail like this:
```diff
8970008 | colocated_dist_table | -2147483648 | 2147483647 | localhost | 57637
8970009 | colocated_partitioned_table | -2147483648 | 2147483647 | localhost | 57637
8970010 | colocated_partitioned_table_2020_01_01 | -2147483648 | 2147483647 | localhost | 57637
- 8970011 | reference_table | | | localhost | 57637
8970011 | reference_table | | | localhost | 57638
+ 8970011 | reference_table | | | localhost | 57637
(13 rows)
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26651/workflows/f695b4fb-ad81-46ff-b97e-0100e5d167ea/jobs/763517
This is a harmless diff due to a missing column in the order by list.
This fixes that by adding the nodeport as a tiebreaker.
Added create_distributed_table_concurrently which is nonblocking variant of create_distributed_table.
It bases on the split API which takes advantage of logical replication to support nonblocking split operations.
Co-authored-by: Marco Slot <marco.slot@gmail.com>
Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
Sometimes in CI our isolation_citus_dist_activity test fails randomly
like this:
```diff
step s2-view-dist:
SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;
query |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state |wait_event_type|wait_event|usename |datname
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------
INSERT INTO test_table VALUES (100, 100);
|localhost | 57636|idle in transaction|Client |ClientRead|postgres|regression
-(1 row)
+
+ SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+ FROM (
+ SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+ pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+ ) AS csa_from_one_node;
+ |localhost | 57636|active | | |postgres|regression
+(2 rows)
step s3-view-worker:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26605/workflows/56d284d2-5bb3-4e64-a0ea-7b9b1626e7cd/jobs/760633
The reason for this is that citus_dist_stat_activity sometimes shows the
query that it uses itself to get the data from pg_stat_activity. This is
actually a bug, because it's a worker query and thus shouldn't show up
there. To try and solve this bug, we remove two small opportunities for a
race condition. These race conditions could happen when the backenddata
was marked as active, but the distributedCommandOriginator was not set
correctly yet/anymore. There was an opportunity for this to happen both
during connection start and shutdown.
Sometimes in CI our drop_partitioned_talbe test would fail with the
following error:
```diff
NOTICE: issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
NOTICE: issuing SELECT worker_drop_distributed_table('drop_partitioned_table.child1')
NOTICE: issuing DROP TABLE IF EXISTS drop_partitioned_table.child1_727001 CASCADE
-NOTICE: issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
-NOTICE: issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100047)
+NOTICE: issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
+NOTICE: issuing SELECT pg_catalog.citus_internal_delete_colocation_metadata(100046)
ROLLBACK;
NOTICE: issuing ROLLBACK
NOTICE: issuing ROLLBACK
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26631/workflows/31536032-e1ba-493b-b12a-f40757f3a7d6/jobs/762170
For some reason the colocationid of the distributed partitioned table
would be one less than we expected. Why this happens I'm not sure, but
it seems fairly harmless that it does.
In an attempt to work around this flakyness I now reset the colocation
id sequence right before creating the table in question. This is good
practice in general, because it allows us to run the test successfully
using `check-minimal` and it also allows us to rerun it multiple times.
Our python based tests didn't always copy the normalized files after the
regress run. I had the problem where running the following command would
result in non-normalized files in the expected directory after running
our PG upgrade tests locally:
```
cp src/test/regress/{results,expected}/upgrade_list_citus_objects.out
```
This PR fixes that by always running `copy_modified` even if the tests
fail. The same was already being done for our perl based tests at the
end of the `pg_regress_multi.pl` file.