In #6405 I added improved blocked process detection for isolation
tests. But when cleaning up unnecessary code I cleaned up a bit too
much. This PR actually includes the new function definition in our
migrations.
In CI multi_partitioning sometimes fails with this error:
```diff
SELECT citus_remove_node('localhost', :master_port);
- citus_remove_node
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR: tuple concurrently deleted
-- d) invalid tables for helper UDFs
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/27993/workflows/685e5b20-c923-43e5-8a0d-b932ef4c4914/jobs/839466
This PR avoids this concurrency issue by not running the
multi_partitioning test in parallel with other tests.
If an operation requires the coordinator to be in pg_dist_node and it
is not, then we automatically add the coordinator to pg_dist_node,
provided the user hasn't added any worker nodes yet.
However, if the user has already added some worker nodes, we throw
an error. With this commit, we improve the error thrown in that case.
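For illustration, once worker nodes exist the coordinator has to be added explicitly, roughly like this (a sketch; the hint in the improved error may phrase it differently):
```sql
-- Sketch: register the coordinator in pg_dist_node explicitly before running
-- operations that require it (host/port values are examples).
SELECT citus_set_coordinator_host('localhost', 5432);
```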
Closes #6423 based on the discussion there.
Sometimes our CI randomly fails on a test in a way similar to this:
```diff
step s2-drop:
DROP TABLE cancel_table;
-
+ <waiting ...>
+step s2-drop: <... completed>
starting permutation: s1-timeout s1-begin s1-sleep10000 s1-rollback s1-reset s1-drop
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/26524/workflows/5415b84f-13a3-482f-bef9-648314c79a67/jobs/756377
I tried to fix that already in #6252 by disabling the maintenance daemon
during isolation tests. But it seems that hasn't fixed all cases of
these errors. This is another attempt at fixing these issues that seems
to have better results.
What it does is start using the pInterestingPids parameter that
citus_isolation_test_session_is_blocked receives. With this change we
start filtering out block-edges that are not caused by any of those pids.
In passing, this change also makes it possible to run
`isolation_create_distributed_table_concurrently` with
`check-isolation-base`.
PG15 introduced a function called ReplicationSlotName that conflicts
with our function of the same name. I solved this issue by
renaming our function to ReplicationSlotNameForNodeAndOwner.
Relevant PG commit:
c3b5992b91
DESCRIPTION: Fix bug in global PID assignment for rebalancer
sub-connections
In CI our isolation_shard_rebalancer_progress test would sometimes fail
like this:
```diff
+isolationtester: canceling step s1-rebalance-c1-block-writes after 60 seconds
step s1-rebalance-c1-block-writes:
SELECT rebalance_table_shards('colocated1', shard_transfer_mode:='block_writes');
- <waiting ...>
+
+ERROR: canceling statement due to user request
step s7-get-progress:
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/27855/workflows/2a7e335a-f3e8-46ed-b6bd-6920d42f7214/jobs/831710
It turned out this was an actual bug in the way our assigning of global
PIDs interacts with the way we connect to ourselves as the shard
rebalancer. The first command the shard rebalancer sends is a SET
command to change the application_name to `citus_rebalancer`. If
`StartupCitusBackend` is called after this command is processed, then it
overwrites the global PID that was extracted from the previous
application_name. This PR makes sure that we don't do that, and continue to
use the original global PID. While it might seem that we only call
`StartupCitusBackend` once for each query backend, this isn't actually
the case. Whenever pg_dist_partition gets ANALYZEd by autovacuum
we indirectly call `StartupCitusBackend` again, because we invalidate
the cache then.
In passing this fixes two other things as well:
1. It sets `distributedCommandOriginator` correctly in
`AssignGlobalPID`, by using IsExternalClientBackend(). This doesn't
matter much anymore, since AssignGlobalPID effectively becomes a
no-op in this PR for any non-external client backends.
2. It passes the application_name to InitializeBackendData in
StartupCitusBackend, instead of INVALID_CITUS_INTERNAL_BACKEND_GPID
(which effectively got cast to NULL). In practice this doesn't
change the behaviour of the call, since the call is a no-op for every
backend except the maintenance daemon. And the behaviour of the call
is the same for NULL as for the application_name of the maintenance
daemon.
We decrease the verbosity level here to avoid flaky output like this:
https://app.circleci.com/pipelines/github/citusdata/citus/27936/workflows/dc63128a-1570-41a0-8722-08f3e3cfe301/jobs/836153
```diff
select alter_table_set_access_method('ref','heap');
NOTICE: creating a new table for alter_table_set_access_method.ref
NOTICE: moving the data of alter_table_set_access_method.ref
NOTICE: dropping the old alter_table_set_access_method.ref
NOTICE: drop cascades to 2 other objects
-DETAIL: drop cascades to materialized view m_ref
-drop cascades to view v_ref
+DETAIL: drop cascades to view v_ref
+drop cascades to materialized view m_ref
CONTEXT: SQL statement "DROP TABLE alter_table_set_access_method.ref CASCADE"
NOTICE: renaming the new table to alter_table_set_access_method.ref
alter_table_set_access_method
-------------------------------
(1 row)
```
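For reference, one way (a sketch only, not necessarily the exact mechanism the test uses) to make this output stable is to lower the message verbosity so the order-dependent NOTICE/DETAIL lines are not printed:
```sql
-- Sketch: hide the NOTICE/DETAIL lines whose ordering is not deterministic.
SET client_min_messages TO WARNING;
select alter_table_set_access_method('ref', 'heap');
RESET client_min_messages;
```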
DESCRIPTION: Raises memory limits in columnar from 256MB to 1GB for
reads and writes
This doesn't completely fix #5918, but it at least raises the
buffer limits that might otherwise cause an error when reading
from or writing into columnar storage. A much better approach
to fixing this is documented in #6420.
Replacing memcpy_s with memcpy is quite safe in those places,
since we already make sure to allocate enough memory
before writing into the related buffers.
When you run the vanilla tests in your local environment, some of the tests
try to find the path of regress.so, which is not in the default lib path. That
is why we need to specify the regress.so path via the dlpath option.
Example failure:
```
LOAD :'regresslib';
+ERROR: could not access file "/home/aykutbozkurt/.pgenv/pgsql-15beta4/lib/regress.so": No such file or directory
```
It is actually in
`~/.pgenv/src/postgresql-15beta4/src/test/regress/regress.so`, which is
found via `$regresslibdir`.
When bumping to RC2, we needed to update one test. The following is the
commit message for the change:
Remove references to optimization PG15 reverted
PG15 introduced an optimization on GROUP BY keys that is now reverted in
RC2.
Relevant PG Commit:
Revert "Optimize order of GROUP BY keys".
443df6e2db932a7cd6d85ddfb67e11a43345130d
Depends on: https://github.com/citusdata/the-process/pull/94
Fixes https://github.com/citusdata/citus/issues/6394.
DESCRIPTION: Fixes a bug that causes disabled triggers to be created on
shards as enabled
Since CREATE TRIGGER doesn't have syntax support to specify
whether the trigger should be enabled/disabled, the underlying
PG function (`pg_get_triggerdef()`) that we use to generate the
command to create the trigger is not enough. For this reason, we
append a second command to enable/disable the trigger right after
creating it.
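For illustration, the command list sent to a shard might roughly look like this (table, trigger, and shard names are hypothetical):
```sql
-- pg_get_triggerdef() only reproduces the CREATE TRIGGER part, so a second
-- command is appended to carry over the disabled state to the shard.
CREATE TRIGGER log_insert
    BEFORE INSERT ON sensors_102008
    FOR EACH ROW EXECUTE FUNCTION log_insert_func();
ALTER TABLE sensors_102008 DISABLE TRIGGER log_insert;
```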
We also don't retain explicit extension dependencies set via
`ALTER TRIGGER ... DEPENDS ON EXTENSION` commands, but apparently
the right fix for that is to throw an error as in
`PreprocessAlterTriggerDependsStmt()`; so I opened a separate PR,
#6399, to fix that.
During alter_distributed_table, we create a new table like the
original table but with the altered options.
To retrieve the name of the distribution column, we were using
the attribute syscache of the new table, since we already created
the new table as identical to the original table.
However, the attribute syscaches of these two tables are not
the same if the original table has dropped columns. The reason
is that dropped columns are all still present in the cache.
Hence, for example, the attnos would be different in the syscaches.
So, let's use the attribute syscache of the original table.
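A minimal sketch (with hypothetical table names) of why the two syscaches disagree:
```sql
-- The dropped column still occupies an attribute slot in the original table,
-- so the same column name ends up with different attnums in the two tables.
CREATE TABLE original (a int, b int, dist_col int);
ALTER TABLE original DROP COLUMN b;
CREATE TABLE recreated (a int, dist_col int);  -- what the rewrite effectively builds

SELECT attrelid::regclass, attname, attnum
FROM pg_attribute
WHERE attrelid IN ('original'::regclass, 'recreated'::regclass)
  AND attname = 'dist_col';
-- original : attnum = 3
-- recreated: attnum = 2
```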
DESCRIPTION: Fixes a bug that prevents retaining columnar table options after a table-rewrite
A fix for this issue: "Columnar: options ignored during ALTER TABLE
rewrite" (#5927).
The OID of the temporary table created during ALTER TABLE was not the
same as the original table's OID, so the columnar options were not being
applied during the rewrite.
The change is that I applied the original table's columnar options to
the new table so that it has the correct options during writes. I also
added a test.
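A sketch of the scenario (table name and option value are made up; the options view name can vary between Citus versions):
```sql
CREATE TABLE events (id bigint, payload text) USING columnar;
SELECT alter_columnar_table_set('events', compression => 'pglz');

-- Forces a table rewrite; the rewritten relation gets a new OID.
ALTER TABLE events ALTER COLUMN id TYPE int;

-- After the fix the original options are carried over to the rewritten table
-- (inspect them e.g. via the columnar.options view, if available).
SELECT * FROM columnar.options;
```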
DESCRIPTION: Adds source_lsn and target_lsn fields into
get_rebalance_progress
Adding two fields named `source_lsn` and `target_lsn` to the function
`get_rebalance_progress`.
Target LSN data is fetched in `GetShardStatistics` by expanding the
query sent to the workers (joining with pg_subscription_rel and
pg_stat_subscription), and is then put into the hashmap for each shard.
Source LSN data is fetched in `BuildWorkerShardStatisticsHash`, in
the loop that iterates over each node, by sending a pg_current_wal_lsn
query to each node, and is then put into the hashmap for each node.
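A rough sketch (not the exact queries) of where the two values come from:
```sql
-- On the target side: the last LSN reported for the shard's subscription,
-- roughly by joining the subscription catalogs (sketch only).
SELECT srrelid::regclass, latest_end_lsn
FROM pg_subscription_rel
JOIN pg_stat_subscription ON subid = srsubid;

-- On each source node: the current WAL position.
SELECT pg_current_wal_lsn();
```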
DESCRIPTION: Fixes a bug in `ALTER EXTENSION citus UPDATE`
We had a series of changes on columnar that made it impossible for a
Citus user to downgrade from 10.2-4 to 10.2-2. Since we test downgrades
to immediate previous versions, we did not capture this in our tests.
Here are the series of changes.
- `10.2-1` introduced a btree index named
`columnar.stripe_first_row_number_idx`
- `10.2-3` had a unique index with the same name. To accomplish that, we
dropped the btree index and created a unique index with the same name.
- `10.2-4` introduced `columnar_ensure_am_depends_catalog()`, which adds
pg_depend entries so that the columnar access method depends on objects
such as `stripe_first_row_number_idx`.
If a user upgrades to `>=10.2-4`, we create such a dependency record, and this
prevents downgrading to a version earlier than `10.2-3`,
since the downgrade file `columnar--10.2-3--10.2-2.sql` wants to drop
the unique index and create a btree index instead. However, this fails
with an error because the columnar access method depends on that index.
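To make the failure concrete, here is a rough sketch of what the old downgrade step effectively tries to do once that dependency record exists (the index column list is assumed from the index name, not copied from the actual migration):
```sql
-- With the pg_depend entry added by columnar_ensure_am_depends_catalog(),
-- this DROP now errors out because the columnar access method depends on it.
DROP INDEX columnar.stripe_first_row_number_idx;
CREATE INDEX stripe_first_row_number_idx
    ON columnar.stripe USING btree (storage_id, first_row_number);
```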
We do not usually like to update earlier migration versions, but there
is no other solution that I could think of.
## Notes to reviewer:
Consider reviewing the commits one by one.
- Commit #1 aims to improve downgrade scripts overall.
- Commit #2 documents the failure.
- Commit #3 fixes the problem by updating all the files that attempted to
drop the `stripe_first_row_number_idx` index.
Related: #6041
In CI our isolation_shard_rebalancer_progress test would sometimes
randomly fail like this:
```diff
table_name|shardid|shard_size|sourcename|sourceport|source_shard_size|targetname|targetport|target_shard_size|progress|operation_type
----------+-------+----------+----------+----------+-----------------+----------+----------+-----------------+--------+--------------
-colocated1|1500001| 49152|localhost | 57637| 49152|localhost | 57638| 73728| 1|move
-colocated2|1500005| 376832|localhost | 57637| 376832|localhost | 57638| 401408| 1|move
+colocated1|1500001| 49152|localhost | 57637| 49152|localhost | 57638| 81920| 1|move
+colocated2|1500005| 376832|localhost | 57637| 376832|localhost | 57638| 409600| 1|move
(2 rows)
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/27688/workflows/8c5ca443-5f21-4f21-b74f-0ca7bde69648/jobs/823648/parallel-runs/1
The shard sizes would be slightly larger or smaller than expected. This
PR fixes that by rounding the reported sizes to the nearest expected
shard size. To do so I used a trick described in this Stack Overflow answer:
https://stackoverflow.com/a/33147437/2570866
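Roughly, the idea is to round the reported sizes to a coarse bucket before printing them (a sketch; the exact expression in the test may differ):
```sql
-- Round the slightly varying sizes so that the isolation test output
-- becomes deterministic.
SELECT table_name, shardid,
       round(target_shard_size / 8192.0) * 8192 AS target_shard_size
FROM get_rebalance_progress();
```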
When investigating I ran into one more random failure:
```diff
-step s1-shard-move-c1-block-writes: <... completed>
+step s4-shard-move-sep-block-writes: <... completed>
citus_move_shard_placement
--------------------------
(1 row)
-step s4-shard-move-sep-block-writes: <... completed>
+step s1-shard-move-c1-block-writes: <... completed>
citus_move_shard_placement
--------------------------
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/27707/workflows/c3ff4fc7-5068-4096-ab9f-803c941ddac0/jobs/824622/parallel-runs/29?filterBy=FAILED
This random failure happens because the two parallel moves can complete
at the same time, so it's non-deterministic which one finishes first. To
make this deterministic I used the "marker" feature of the isolation
tester.
And finally I ran into a third random failure:
```diff
table_name|shardid|shard_size|sourcename|sourceport|source_shard_size|targetname|targetport|target_shard_size|progress|operation_type
----------+-------+----------+----------+----------+-----------------+----------+----------+-----------------+--------+--------------
-colocated1|1500001| 50000|localhost | 57637| 50000|localhost | 57638| 50000| 1|move
-colocated2|1500005| 400000|localhost | 57637| 400000|localhost | 57638| 400000| 1|move
+colocated1|1500001| 50000|localhost | 57637| 50000|localhost | 57638| 8000| 1|move
+colocated2|1500005| 400000|localhost | 57637| 400000|localhost | 57638| 8000| 1|move
colocated1|1500002| 200000|localhost | 57637| 200000|localhost | 57638| 0| 0|move
colocated2|1500006| 8000|localhost | 57637| 8000|localhost | 57638| 0| 0|move
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/27707/workflows/c3ff4fc7-5068-4096-ab9f-803c941ddac0/jobs/824622/parallel-runs/30?filterBy=FAILED
This happened in two of the tests only. For now I commented these tests
out. I have some ideas on how to fix these, but these ideas require more
impactful changes than I would like in this PR. One of these tests also had a
copy-paste error; in passing I fixed that in the commented-out line.
This test used to contain some utility commands that Citus did not
support. However, we have since added support for most of those
commands, and the test has become outdated.
We used to error out in the community edition when a user attempted to
use pooler options. Now that we have open sourced all enterprise
features, the test can be removed.
Sometimes our CI randomly fails on a test in a way similar to this:
```diff
step s2-drop:
DROP TABLE cancel_table;
-
+ <waiting ...>
+step s2-drop: <... completed>
starting permutation: s1-timeout s1-begin s1-sleep10000 s1-rollback s1-reset s1-drop
```
Source:
https://app.circleci.com/pipelines/github/citusdata/citus/26524/workflows/5415b84f-13a3-482f-bef9-648314c79a67/jobs/756377
Another example of a failure like this:
```diff
stop_session_level_connection_to_node
-------------------------------------
(1 row)
step s3-display:
SELECT * FROM ref_table ORDER BY id, value;
SELECT * FROM dist_table ORDER BY id, value;
-
+ <waiting ...>
+step s3-display: <... completed>
id|value
--+-----
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26551/workflows/91dca4b2-bb1c-4cae-b2ef-ce3f9c689ce5/jobs/757781
A step that shouldn't be blocked is detected as "waiting..." temporarily
and then gets unblocked automatically immediately after. I'm not
certain of the reason for this, but one explanation is that the
maintenance daemon is doing something that blocks the query. In the
shown case my hunch is that it could be the deferred shard deletion.
This PR disables all the features of the maintenance daemon during
isolation testing to try and prevent processes from randomly being
detected as blocking.
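For context, a sketch of the kinds of maintenance-daemon features that can be turned off via GUCs (these specific settings are illustrative; the exact set the PR changes may differ):
```sql
-- Disable maintenance-daemon activity while isolation tests run (sketch only).
ALTER SYSTEM SET citus.distributed_deadlock_detection_factor = -1;
ALTER SYSTEM SET citus.recover_2pc_interval = -1;
ALTER SYSTEM SET citus.defer_shard_delete_interval = -1;
SELECT pg_reload_conf();
```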
NOTE: I'm not certain that this will actually fix this issue. If the
issue persists even after this change, at least we know that it's not
the maintenance daemon that's blocking it.
For the sake of documentation, here is a failing diff:
```diff
step s2-view-dist:
SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;
query |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state |wait_event_type|wait_event|usename |datname
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------
ALTER TABLE test_table ADD COLUMN x INT;
|localhost | 57636|idle in transaction|Client |ClientRead|postgres|regression
-(1 row)
+
+ SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+ FROM (
+ SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+ pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+ ) AS csa_from_one_node;
+ |localhost | 57638|active | | |postgres|regression
+(2 rows)
```
This failure can be seen at [this CI
run](https://app.circleci.com/pipelines/github/citusdata/citus/27653/workflows/d769701c-8f6e-4f97-a412-16f7b9b288a6/jobs/821416)
Update the test images from PG15beta4 to PG15rc1.
There is a new commit in 15rc1 that improves message styles. We also
update the messages accordingly.
Relevant PG commit:
517484b5820e9e20057ff066b5df7d09cbb5f464
Depends on: https://github.com/citusdata/the-process/pull/93
PG15 now allows users to specify oids when creating databases. This
feature is a side effect of a bigger feature in pg_upgrade.
Relevant PG Commit:
pg_upgrade: Preserve database OIDs.
aa01051418f10afbdfa781b8dc109615ca785ff9
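For reference, a minimal example of the new syntax (the OID value is arbitrary and only for illustration):
```sql
-- PG15: the database OID can be specified explicitly (used by pg_upgrade).
CREATE DATABASE oid_test WITH OID = 16450;
```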
Depends on https://github.com/citusdata/the-process/pull/92
Closes: #6371
Updates test dependencies to not rely on a known vulnerable dependency
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
PG15 has suppressed some casts on constants when querying foreign
tables.
For example, we can use text to represent a type that's an enum on the
remote side.
A comparison on such a column will get shipped as "var = 'foo'::text".
But there's no enum = text operator on the remote side.
If we leave off the explicit cast, the comparison will work.
We test that we behave in the same way with a Citus foreign table.
Reminder: foreign tables cannot be distributed or reference tables;
they can only be Citus local tables.
Relevant PG commit:
f8abb0f5e1
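A hedged sketch of the tested scenario (server, table, and column names are made up):
```sql
-- A Citus local foreign table whose remote column is an enum, represented
-- locally as text.
CREATE FOREIGN TABLE foreign_events (event_type text)
    SERVER remote_server OPTIONS (table_name 'events');
SELECT citus_add_local_table_to_metadata('foreign_events');

SELECT * FROM foreign_events WHERE event_type = 'click';
-- pre-PG15: shipped as  event_type = 'click'::text  (no enum = text operator remotely)
-- PG15:     shipped as  event_type = 'click'        (resolves against the remote enum)
```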
PostgreSQL 15 had some changes to jsonpath to conform with ECMA-262
referenced by SQL standard. This commit adds tests to make sure Citus
also supports the same standards.
Relevant pg commit:
e26114c817b610424010cfbe91a743f591246ff1
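A trivial example of the kind of jsonpath filter expression such tests exercise (the actual test queries differ):
```sql
-- Returns the array elements matching the filter, i.e. 2 and 3.
SELECT jsonb_path_query('{"a": [1, 2, 3, 4]}', '$.a[*] ? (@ >= 2 && @ <= 3)');
```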
In Split, Logical replication logic and ShardCleaner we call
`SendCommandListToWorkerOutsideTransaction` and
`SendOptionalCommandListToWorkerOutsideTransaction` frequently. This
opens a new connection for each of those calls, even though we already
have a perfectly good connection lying around.
This PR adds two new APIs
`SendCommandListToWorkerOutsideTransactionWithConnection` and
`SendOptionalCommandListToWorkerOutsideTransactionWithConnection` that
allow sending a list of queries in a transaction over an existing
connection. We also update the callers (Split, ShardCleaner, Logical
Replication) to use these new APIs instead.
Co-authored-by: Nitish Upreti <niupre@microsoft.com>
Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
In Citus 11.1.0 we changed the order of doing the initial data copy and
the replica identity creation when doing a non-blocking shard move. This
was done to try and increase the speed with which shard moves could be
done. But after doing more extensive performance testing this change
turned out to have a negative impact on the speed of moves on the setups
that I tested.
Looking at the resource usage metrics of the VMs, the reason for this
seems to be that these shard moves were bottlenecked by disk bandwidth.
While creating replica identities in bulk after the initial copy reduces
CPU usage a bit, it does require an additional sequential scan of
the just-written data. So when a VM is bottlenecked on disk, it makes
sense to spend a little more CPU to avoid that additional scan,
especially since PKs are usually simple indexes that don't require lots
of CPU to update, as opposed to e.g. GiST indexes.
This reverts the order change to avoid a regression on shard move speed
in these cases.
For future releases we might consider re-evaluating our index creation
order for other indexes too, and create "simple" indexes before the
copy.
Given that we drop DEFAULT nextval('sequence') expressions from
shard relation columns, allowing `ON DELETE/UPDATE SET DEFAULT`
on such columns might cause inserting NULL values as a result
of a delete/update operation.
For this reason, we disallow ON DELETE/UPDATE SET DEFAULT actions
on columns that default to sequences.
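A minimal sketch of the now-disallowed case (table names are hypothetical, and the exact error text may differ):
```sql
CREATE TABLE items (item_id bigint PRIMARY KEY);
SELECT create_distributed_table('items', 'item_id');

CREATE TABLE events (item_id bigserial);
SELECT create_distributed_table('events', 'item_id');

-- On the shards the DEFAULT nextval('events_item_id_seq') expression is dropped,
-- so ON DELETE SET DEFAULT would end up writing NULL; Citus now rejects this.
ALTER TABLE events
    ADD FOREIGN KEY (item_id) REFERENCES items (item_id) ON DELETE SET DEFAULT;
-- ERROR (exact message may differ)
```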
DESCRIPTION: Disallows having ON DELETE/UPDATE SET DEFAULT actions on
columns that default to sequences
Fixes #6339.
As we did for GENERATED STORED columns in #4613, we should not drop
column default expressions that are not based on sequences from the
shard relation, since such expressions need to exist e.g. for foreign
key actions.
For the column default expressions that are based on sequences we cannot
do much, so we need to disallow having ON DELETE SET DEFAULT actions on
such columns in a separate PR, see #6339.
Fixes #6318.
DESCRIPTION: Fixes a bug that might cause inserting incorrect DEFAULT
values when applying foreign key actions
PG15 added support for security invoker views. Relevant PG commit:
7faa5fc84b
These views check the permissions for the underlying tables as the view
invoker user, not as the view definer user.
When the view has underlying distributed tables, the queries to the
shards are sent by opening connections with the current user, which is
the view invoker, no matter what the type of the view is. This means
that, for distributed views, they were always behaving like security
invoker views. Check the following issue for more details:
https://github.com/citusdata/citus/issues/6161
So, Citus doesn't fully support security definer views.
However Citus does fully support security invoker views. We add tests to
make sure we cover different cases.
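For reference, the PG15 syntax looks like this (view and table names here are made up):
```sql
-- A security invoker view: permissions on the underlying distributed table are
-- checked as the user running the query, which matches how Citus already opens
-- worker connections for distributed views.
CREATE VIEW public_events WITH (security_invoker = true) AS
    SELECT event_id, event_type FROM events_distributed;
```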
DESCRIPTION: Fixes dropping replication slots
As detected by a flaky test, Citus sometimes fails to drop replication
slots, possibly due to a race condition, at the end of a shard split.
With this PR, we retry dropping them for up to 20 seconds in case of an
`OBJECT_IN_USE` error.
fixes: #6326
Both tests cover both the pushdown and the pull-to-coordinator styles of
aggregate execution.
Relevant PG commits:
Add min() and max() aggregates for xid8
400fc6b6487ddf16aa82c9d76e5cfbe64d94f660
Add range_agg with multirange inputs
7ae1619bc5b1794938c7387a766b8cae34e38d8a
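Small sketches of the two aggregates exercised by the tests (table and column names are made up; the actual test queries differ):
```sql
-- min()/max() over xid8 values, e.g. a column populated via pg_current_xact_id().
SELECT min(xact_id), max(xact_id) FROM dist_table_with_xid8;

-- range_agg() now also accepts multirange inputs.
SELECT range_agg(r)
FROM (VALUES ('{[1,3)}'::int4multirange), ('{[5,8)}'::int4multirange)) AS t(r);
-- => {[1,3),[5,8)}
```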
Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
DESCRIPTION: Improve logging during shard split and resource cleanup
### DESCRIPTION
This PR makes logging improvements to Shard Split:
1. Updates confusing logging to fix #6312.
2. Adds new `ereport(LOG, ...)` calls to make debugging easier, as part of the telemetry review.