DESCRIPTION: Fix leaking of memory and memory contexts in Foreign
Constraint Graphs
Previously, every time we (re)created the Foreign Constraint
Relationship Graph, we created a new Memory Context while loosing a
reference to the previous context. This old context could still have
left over memory in there causing a memory leak.
With this patch we statically have one memory context that we lazily
initialize the first time we create our foreign constraint relationship
graph. On every subsequent creation, beside destroying our previous
hashmap we also reset our memory context to remove any left over
references.
DESCRIPTION: Adds ALTER DATABASE WITH ... and REFRESH COLLATION VERSION
support
This PR adds supports for basic ALTER DATABASE statements propagation
support. Below statements are supported:
ALTER DATABASE <database_name> with IS_TEMPLATE <true/false>;
ALTER DATABASE <database_name> with CONNECTION LIMIT <integer_value>;
ALTER DATABASE <database_name> REFRESH COLLATION VERSION;
---------
Co-authored-by: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
We currently don't support propagating these options in Citus
Relevant PG commits:
https://github.com/postgres/postgres/commit/e3ce2dehttps://github.com/postgres/postgres/commit/3d14e17
Limitation:
We also need to take care of generated GRANT statements by dependencies
in attempt to distribute something else. Specifically, this part of the
code in `GenerateGrantRoleStmtsOfRole`:
```
grantRoleStmt->admin_opt = membership->admin_option;
```
In PG16, membership also has `inherit_option` and `set_option` which
need to properly be part of the `grantRoleStmt`. We can skip for now
since #7164 will take care of this soon, and also this is not an
expected use-case.
Add citus_schema_move() that can be used to move tenant tables within a distributed
schema to another node. The function has two variations as simple wrappers around
citus_move_shard_placement() and citus_move_shard_placement_with_nodeid() respectively.
They pick a shard that belongs to the given tenant schema and resolve the source node
that contain the shards under given tenant schema. Hence their signatures are quite
similar to underlying functions:
```sql
-- citus_schema_move(), using target node name and node port
CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move(
schema_id regnamespace,
target_node_name text,
target_node_port integer,
shard_transfer_mode citus.shard_transfer_mode default 'auto')
RETURNS void
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_schema_move$$;
-- citus_schema_move(), using target node id
CREATE OR REPLACE FUNCTION pg_catalog.citus_schema_move(
schema_id regnamespace,
target_node_id integer,
shard_transfer_mode citus.shard_transfer_mode default 'auto')
RETURNS void
LANGUAGE C STRICT
AS 'MODULE_PATHNAME', $$citus_schema_move_with_nodeid$$;
```
Since in PG16, truncate triggers are supported on foreign tables, we add
the citus_truncate_trigger to Citus foreign tables as well, such that the TRUNCATE
command is propagated to the table's single local shard as well.
Note that TRUNCATE command was working for foreign tables even before this
commit: see https://github.com/citusdata/citus/pull/7170#issuecomment-1706240593 for details
This commit also adds tests with user-enabled truncate triggers on Citus foreign tables:
both trigger on the shell table and on its single foreign local shard.
Relevant PG commit:
https://github.com/postgres/postgres/commit/3b00a94
**Problem:**
Previously we always used an outside superuser connection to overcome
permission issues for the current user while propagating dependencies.
That has mainly 2 problems:
1. Visibility issues during dependency propagation, (metadata connection
propagates some objects like a schema, and outside transaction does not
see it and tries to create it again)
2. Security issues (it is preferrable to use current user's connection
instead of extension superuser)
**Solution (high level):**
Now, we try to make a smarter decision on whether should we use an
outside superuser connection or current user's metadata connection. We
prefer using current user's connection if any of the objects, which is
already propagated in the current transaction, is a dependency for a
target object. We do that since we assume if current user has
permissions to create the dependency, then it can most probably
propagate the target as well.
Our assumption is expected to hold most of the times but it can still be
wrong. In those cases, transaction would fail and user should set the
GUC `citus.create_object_propagation` to `deferred` to work around it.
**Solution:**
1. We track all objects propagated in the current transaction (we can
handle subtransactions),
2. We propagate dependencies via the current user's metadata connection
if any dependency is created in the current transaction to address
issues listed above. Otherwise, we still use an outside superuser
connection.
DESCRIPTION: Fixes some object propagation errors seen with transaction
blocks.
Fixes https://github.com/citusdata/citus/issues/6614
---------
Co-authored-by: Nils Dijk <nils@citusdata.com>
For a database that does not create the citus extension by running
` CREATE EXTENSION citus;`
`CitusHasBeenLoaded ` function ends up querying the `pg_extension` table
every time it is invoked. This is not an ideal situation for a such a
database.
The idea in this PR is as follows:
### A new field in MetadataCache.
Add a new variable `extensionCreatedState `of the following type:
```
typedef enum ExtensionCreatedState
{
UNKNOWN = 0,
CREATED = 1,
NOTCREATED = 2,
} ExtensionCreatedState;
```
When the MetadataCache is invalidated, `ExtensionCreatedState` will be
set to UNKNOWN.
### Invalidate MetadataCache when CREATE/DROP/ALTER EXTENSION citus
commands are run.
- Register a callback function, named
`InvalidateDistRelationCacheCallback`, for relcache invalidation during
the shared library initialization for `citus.so`. This callback function
is invoked in all the backends whenever the relcache is invalidated in
one of the backends. (This could be caused many DDLs operations).
- In the cache invalidation callback,`
InvalidateDistRelationCacheCallback`, invalidate `MetadataCache` zeroing
it out.
- In `CitusHasBeenLoaded`, perform the costly citus is loaded check only
if the `MetadataCache` is not valid.
### Downsides
Any relcache invalidation (caused by various DDL operations) will case
Citus MetadataCache to get invalidated. Most of the time it will be
unnecessary. But we rely on that DDL operations on relations will not be
too frequent.
When breaking a colocation, we need to create a new colocation group
record in pg_dist_colocation for the relation. It is not sufficient to
have a new colocationid value in pg_dist_partition only.
This patch also fixes a bug when deleting a colocation group if no
tables are left in it. Previously we passed a relation id as a parameter
to DeleteColocationGroupIfNoTablesBelong function, where we should have
passed a colocation id.
Fixes: #6928
When braking a colocation, we need to create a new colocation group
record in pg_dist_colocation for the relation. It is not sufficient to
have a new colocationid value in pg_dist_partition only.
This patch also fixes a bug when deleting a colocation group if no
tables are left in it. Previously we passed a relation id as a parameter
to DeleteColocationGroupIfNoTablesBelong function, where we should have
passed a colocation id.
1. Adds an `sql_row` function, for when a query returns a single row
with multiple columns.
2. Include a `notice_handler` for easier debugging
3. Retry dropping replication slots when they are "in use", this is
often an ephemeral state and can cause flaky tests
In PG16, REINDEX DATABASE/SYSTEM name is optional.
We already don't propagate these commands automatically.
Testing here with run_command_on_workers.
Relevant PG commit:
https://github.com/postgres/postgres/commit/2cbc3c1
When we create a database, it already needs to be manually created in
the workers as well.
This new icu_rules option should work as the other options as well.
Added a test for that.
Relevant PG commit:
https://github.com/postgres/postgres/commit/30a53b7
DESCRIPTION: Presenting citus_pause_node UDF enabling pausing by
node_id.
citus_pause_node takes a node_id parameter and fetches all the shards in
that node and puts AccessExclusiveLock on all the shards inside that
node. With this lock, insert is disabled, until citus_pause_node
transaction is closed.
---------
Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
Replaces https://github.com/citusdata/citus/pull/7120.
Closes https://github.com/citusdata/citus/issues/4692.
#7120 added the same functionality by implementing a transactional
--but scoped to Citus local tables-- version of TransferShards().
It was passing all the regression tests but didn't feel like an
intuitive approach.
This PR instead adds that functionality via the functions that we
use when creating a distributed table, namely, CreateShardsOnWorkers()
and CopyLocalDataIntoShards().
We insert entries into pg_dist_placement for the new shard placement(s)
and then call CreateShardsOnWorkers() to create those placement(s) on
workers.
Then we use CopyFromLocalTableIntoDistTable() to copy the data from
the local shard placement to the new shard placement(s).
CopyFromLocalTableIntoDistTable() is a new function that re-uses the
underlying logic of CopyLocalDataIntoShards() that allows copying
data from a local table into a distributed table. We tell
CopyLocalDataIntoShards() to read from local shard placement table
and to write the tuples into shard placement/s of the reference /
single-shard table. Before doing this, we temporarily delete metadata
record for the local placement to avoid from duplicating the data in
the local shard placement.
Finally, we drop the local shard placement if we were creating a
single-shard placement table and that effectively means moving the
local shard placement to the appropriate worker as we've already
created the new shard placement on the worker.
While the main motivation behind adding this functionality is to
avoid from the limitations when UndistributeTable() is called for
a Citus local table (during table conversion), this indeed optimizes
how we convert a Citus local table to a reference table /
single-shard table. This is because, the prior logic was causing
to use more disk space due to the duplication of the data during
UndistributeTable().
DESCRIPTION: Allow creating reference / distributed-schema tables from
local tables added to metadata and that use identity columns
- [x] Add tests.
- [x] Test django-tenants.
If we're in the middle of a table type conversion (such as from Citus
local table to a reference table), the table might not have all the
placements that we expect from the table type. For this reason, we
should intersect the placements of tables at hand when creating
inter-shard ddl tasks.
What we do to collect foreign key constraint commands in
WorkerCreateShardCommandList is quite similar to what we do in
CopyShardForeignConstraintCommandList. Plus, the code that we used
in WorkerCreateShardCommandList before was not able to properly handle
foreign key constraints between Citus local tables --when creating a
reference table from the referencing one.
With a few slight modifications made to
CopyShardForeignConstraintCommandList, we can use the same logic in
WorkerCreateShardCommandList too.
DESCRIPTION: Adds grant/revoke propagation support for database
privileges
Following the implementation of support for granting and revoking
database privileges, certain tests that issued grants for worker nodes
experienced failures. These ones are fixed in this PR as well.
DESCRIPTION: Removes ubuntu/bionic from packaging pipelines
Since pg16 beta is not available for ubuntu/bionic and ubuntu/bionic
support is EOL, I need to remove this os from pipeline
https://ubuntu.com/blog/ubuntu-18-04-eol-for-devices
Additionally, added concurrency support for GH Actions Packaging
pipeline