Commit Graph

5906 Commits (6a32061c08ae202a0d4d5fafbdaf9604697dd51b)

Author SHA1 Message Date
Onder Kalaci af22a30b48 Use citus_finish_citus_upgrade() in the tests
We already have tests relying on citus_finalize_upgrade_to_citus11().
Now, adjust those to rely on citus_finish_citus_upgrade() and
always call citus_finish_citus_upgrade().
2022-06-13 13:15:15 +02:00
Marco Slot 36c4ec6d53 Introduce a citus_finish_citus_upgrade() function 2022-06-13 13:15:15 +02:00
Marco Slot 3a32c28714
Merge pull request #5994 from citusdata/clairegiordano-slackbadge 2022-06-03 17:50:46 +02:00
Claire Giordano f0a19d48ad
Fix the broken link to the Slack badge 2022-06-02 17:30:56 -07:00
Halil Ozan Akgül e8a09b135d
Merge pull request #5974 from citusdata/fix_undistribute_table_drops_citus_bug
Fixes the bug where undistribute can drop Citus extension
2022-05-31 17:29:17 +03:00
Halil Ozan Akgul b255706189 Fixes the bug where undistribute can drop Citus extension 2022-05-31 16:23:28 +03:00
Önder Kalacı fc151a9b80
Merge pull request #5979 from citusdata/improve_metadata_sync_disabled
fix assertion failure & honor enable_metadata_sync in node operations
2022-05-30 17:01:07 +02:00
Onder Kalaci 89c1ccb7a5 Show that no metadata is sent when disabled 2022-05-30 13:41:06 +02:00
Onder Kalaci 7157152f6c Do not send metadata changes during add node if citus.enable_metadata_sync is set to false 2022-05-30 13:24:31 +02:00
Onder Kalaci 010a2a408e Avoid assertion failure on citus_add_node 2022-05-30 12:22:09 +02:00
Gledis Zeneli beef392f5a
Fix memory error with citus_add_node reported by valgrind test (#5967)
The error comes due to the datum jsonb in pg_dist_metadata_node.metadata being 0 in some scenarios. This is likely due to not copying the data when receiving a datum from a tuple and pg deciding to deallocate that memory when the table that the tuple was from is closed.
Also fix another place in the code that might have been susceptible to this issue.
I tested on both multi-vg and multi-1-vg and the test were successful.
2022-05-28 00:22:00 +03:00
Ahmet Gedemenli 26d927178c
Propagate dependent views upon distribution (#5950) 2022-05-26 14:23:45 +03:00
jeff-davis 74ce210f8b
Columnar: fix wraparound bug. (#5962)
columnar_vacuum_rel() now advances relfrozenxid.

Fixes #5958.
2022-05-25 07:50:48 -07:00
Burak Velioglu 86781ec85a
Merge pull request #5966 from citusdata/velioglu/fix_depending_view_creation
Fix schema and ownership of view while altering depending distributed table
2022-05-24 17:09:20 +03:00
Burak Velioglu 1d7dda991f Create view and materialized views with right schema and owner while
altering the distributed table.

To be able to alter view's owner without enforcing sequential mode.
Alter view process functions have been udpated to use metadata
connection.
2022-05-24 15:27:30 +03:00
Gledis Zeneli 27ddb4fc8e
Do not obtain AccessShareLock before actual lock (#5965)
Do not obtain AccessShareLock before acquiring the distributed locks.

Acquiring an AccessShareLock ensures that the relations which we are trying to get a distributed lock on will not be dropped in the time between when the LOCK command is issued and the LOCK commands are send to the worker. However, this also leads to distributed deadlocks in such scenarios:

```sql
-- for dist lock acquiring order coor, w1, w2

-- on w2
LOCK t1 IN ACCESS EXLUSIVE MODE;
-- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock

      -- concurrently on w1
      LOCK t1 IN ACCESS EXLUSIVE MODE;
      -- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock
      -- acquire dist lock on coor, w1, gets blocked on local AccessShareLock on w2

-- on w2 continuation of the execution above
-- starts to acquire dist locks and gets blocked on the coor by the lock acquired by w1

-- distributed deadlock

``` 

We opt for avoiding such deadlocks with the cost of the possibility of running into errors when the relations on which we are trying to acquire locks on get dropped.
2022-05-23 13:06:38 +03:00
Önder Kalacı 2a768176c4
Merge pull request #5960 from citusdata/parallelize_node_activate
Parallelize metadata syncing on node activate
2022-05-23 09:25:11 +02:00
Onder Kalaci dd02e1755f Parallelize metadata syncing on node activate
It is often useful to be able to sync the metadata in parallel
across nodes.

Also citus_finalize_upgrade_to_citus11() uses
start_metadata_sync_to_primary_nodes() after this commit.

Note that this commit does not parallelize all pieces of node
activation or metadata syncing. Instead, it tries to parallelize
potenially large parts of metadata, which is the objects and
distributed tables (in general Citus tables).

In the future, it would be nice to sync the reference tables
in parallel across nodes.

Create ~720 distributed tables / ~23450 shards
```SQL
-- declaratively partitioned table
CREATE TABLE github_events_looooooooooooooong_name (
  event_id bigint,
  event_type text,
  event_public boolean,
  repo_id bigint,
  payload jsonb,
  repo jsonb,
  actor jsonb,
  org jsonb,
  created_at timestamp
) PARTITION BY RANGE (created_at);

SELECT create_time_partitions(
  table_name         := 'github_events_looooooooooooooong_name',
  partition_interval := '1 day',
  end_at             := now() + '24 months'
);

CREATE INDEX ON github_events_looooooooooooooong_name USING btree (event_id, event_type, event_public, repo_id);
SELECT create_distributed_table('github_events_looooooooooooooong_name', 'repo_id');

SET client_min_messages TO ERROR;

```

across 1 node: almost same as expected
```SQL

SELECT start_metadata_sync_to_primary_nodes();
Time: 15664.418 ms (00:15.664)

select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 14284.069 ms (00:14.284)
```

across 7 nodes: ~3.5x improvement
```SQL

SELECT start_metadata_sync_to_primary_nodes();
┌──────────────────────────────────────┐
│ start_metadata_sync_to_primary_nodes │
├──────────────────────────────────────┤
│ t                                    │
└──────────────────────────────────────┘
(1 row)

Time: 25711.192 ms (00:25.711)

-- across 7 nodes
select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 82126.075 ms (01:22.126)
```
2022-05-23 09:15:48 +02:00
jeff-davis a2f5b068e6
Columnar: tighten security and improve visibility. (#5922)
Move internal storage details to a separate schema with no public
access to limit the possibility for information leakage.

Create views with public access that show storage details for those
columnar tables where the user has ownership privileges. Include
mapping between relation ID and storage ID for easier interpretation.
2022-05-20 15:30:31 -07:00
Hanefi Onaldi 52541c5802 Add normalization rules for flaky isolation tests
We remove `<waiting ...>` and `<... completed>` outputs for some CREATE
INDEX CONCURRENTLY commands since they can cause flakiness in some scenarios.

Postgres calls WaitForOlderSnapshots() and this can cause CREATE INDEX
CONCURRENTLY commands for shards to get blocked by each other for brief
periods of time. The extra waits can pop-up, or they can get completed
at different lines in the output files. To remedy that, we rename those
indexes to be captured by the new normalization rule.
2022-05-21 00:55:47 +03:00
Ying Xu a1151c2395
Clear metadatacache during abort for create extension (#5907)
* Bug fix for bug #5876. Memset MetadataCacheSystem every time there is an abort

* Created an ObjectAccessHook that saves the transactionlevel of when citus was created and will clear metadatacache if that transaction level is rolled back. Added additional tests to make sure metadatacache is cleared
2022-05-20 13:47:58 -07:00
Marco Slot c692bb10f7
Merge pull request #5961 from citusdata/marcocitus/cache-backend-type
Add caching for functions that check the backend type
2022-05-20 19:33:43 +02:00
Marco Slot 7abcfac61f Add caching for functions that check the backend type 2022-05-20 19:02:37 +02:00
Marco Slot 37dda19f31
Merge pull request #5924 from citusdata/marcocitus/nested-execution
Improve nested execution checks and add GUC to disable
2022-05-20 19:02:18 +02:00
Marco Slot 09ec366ff5 Improve nested execution checks and add GUC to disable 2022-05-20 18:55:43 +02:00
Marco Slot e683993449 Fix prepared statement bug when switching from local to remote execution 2022-05-20 18:55:43 +02:00
jeff-davis a9f8a60007
Columnar: support relation options with ALTER TABLE. (#5935)
Columnar: support relation options with ALTER TABLE.

Use ALTER TABLE ... SET/RESET to specify relation options rather than
alter_columnar_table_set() and alter_columnar_table_reset().

Not only is this more ergonomic, but it also allows better integration
because it can be treated like DDL on a regular table. For instance,
citus can use its own ProcessUtility_hook to distribute the new
settings to the shards.

DESCRIPTION: Columnar: support relation options with ALTER TABLE.
2022-05-20 08:35:00 -07:00
Marco Slot 16fa0dad85
Merge pull request #5959 from citusdata/marcocitus/run-command-on
Allow distributed execution from run_command_on_* functions
2022-05-20 15:35:44 +02:00
Marco Slot ad5214b50c Allow distributed execution from run_command_on_* functions 2022-05-20 15:26:47 +02:00
Gledis Zeneli b1d1df8214
Merge pull request #5938 from citusdata/propagate-lock
DESCRIPTION: 
* Lock statements will propagate locks to all the nodes in the metadata when a citus tables or a view is locked. Locks will also be propagated for all citus tables that appear in the definition of views that are locked (recursively applies to the views inside the locked view).
* TRUNCATE-ing a citus table from a worker node is no longer allowed if the coordinator is not added to the metadata. When TRUNCATE is called on a citus table that table needs to be locked in all the nodes before the TRUCATE is executed. If the coordinator is not in the metadata, the coordinator will not be aware of the lock on the table and can access the table's shard while a TRUNCATE is executing. Despite not being recommended, this behavior can be allowed by setting `SET citus.allow_unsafe_locks_from_workers TO 'on';`.
* `pg_dump` which calls LOCK TABLE is affected in the same way as TRUNCATE. An error is thrown when locking from worker when coordinator is not in the metadata.
2022-05-20 13:08:36 +03:00
gledis69 e5f645a204 Merge branch 'master' into propagate-lock 2022-05-20 12:29:51 +03:00
gledis69 4731630741 Add distributing lock command support 2022-05-20 12:28:07 +03:00
Önder Kalacı 431311732a
Adjust arbitrary configs metadata sync (#5956)
As of Citus 11, we already sync metadata by default.
It is useful to keep one schedule without metadata
sync.
2022-05-20 11:02:53 +03:00
Marco Slot 757ecba968
Merge pull request #5941 from citusdata/marcocitus/run_command_on_coordinator
Add a run_command_on_coordinator function
2022-05-19 10:32:35 +02:00
Marco Slot 79d7e860e6 Add a run_command_on_coordinator function 2022-05-19 10:26:09 +02:00
Marco Slot fa9cee409c Fix downgrade scripts and add new downgrade tests 2022-05-19 10:26:09 +02:00
Ahmet Gedemenli 4fe43c68bc
Merge pull request #5955 from citusdata/fix-rename-sequence-schema-qualifying
Fix schemaname qualify for rename seq stmts
2022-05-18 19:43:25 +03:00
Ahmet Gedemenli 48d5c9a1b5 Fix schemaname qualify for rename seq stmts 2022-05-18 19:04:22 +03:00
Önder Kalacı 66378d00da
Merge pull request #5912 from citusdata/relax_disable_node
Adds "synchronous" option to citus_disable_node() UDF
2022-05-18 17:34:41 +02:00
Onder Kalaci 0596062f96 Serialize reference table modifications with node changes & restore point
With Citus MX enabled, when a reference table is modified, it does
some operations on the first worker node(e.g., acquire locks).

If node metadata is locked (via add node or create restore point),
the changes to the reference tables should be blocked.
2022-05-18 17:23:38 +02:00
Onder Kalaci 127450466e Do not warn unncessarily when a node is removed
In the past (pre-11), we allowed removing worker nodes
that had active placements for replicated distributed
table, without even checking if there are any other
replicas of the same placement.

However, with #5469, we prevent disabling nodes via a hard
error when there is the last active placement of shard, as we
do for reference tables. Note that otherwise, we'd allow
users to lose data.

As of today, the NOTICE is completely irrelevant.
2022-05-18 17:23:38 +02:00
Onder Kalaci b4dbd84743 Prevent distributed queries while disabling first worker node
First worker node has a special meaning for modifications on the replicated tables

It is used to acquire a remote lock, such that the modifications are serialized.

With this commit, we make sure that we do not let any distributed query to see a
different 'first worker node' while first worker node is disabled.

Note that, maybe implicitly mentioned above, when first worker node is disabled,
the first worker node changes, that's why we have to handle the situation.
2022-05-18 17:21:12 +02:00
Onder Kalaci db998b3d66 Adds "sync" option to citus_disable_node() UDF
Before this commit, we had:
```SQL
SELECT citus_disable_node(nodename, nodeport, force boolean DEFAULT false)
```

Where, we allow forcing to disable first worker node with
`force:=true`. However, it entails the risk for losing
data / diverging placement data etc.

With `force` flag, we control disabling the first worker node,
and with `async` flag we control whether the changes are done
via bg worker or immediately.

```SQL
SELECT citus_disable_node(nodename, nodeport, force boolean DEFAULT false, sync boolean DEFAULT false)
```

Where we can achieve all the following:

| Mode  | Data loss possibility | Can run in 2PC | Handle multiple node failures | Immediately effective |
| --- |--- |--- |--- |--- |
| force:false, sync: false  | false   | true  | true  | false |
| force:false, sync: true   | false  | false | false | true |
| force:true, sync: false   | true   | true  | true   | false |
| force:true, sync: true    | false  | false | false  | true |
2022-05-18 17:21:12 +02:00
Önder Kalacı 69d007deec
Merge pull request #5940 from citusdata/index_name
Fixes a bug that prevents dropping/altering indexes
2022-05-18 17:20:00 +02:00
Onder Kalaci 2cc4053fc1 Fixes a bug that prevents dropping/altering indexes
There are two problems in this area. First, when there are expressions
on the index name, we should call `transformIndexExpression()` before
generating the index name. That is what Postgres does.

Second, because of 40c24bfef9
PG 13 and PG 14 generates different names for indexes with function calls even for local PG tables.
Assume we have:
```SQL
create table t(id int);
select create_distributed_table('t', 'id');
create index ON t (my_very_boring_function(id));
```

On PG 13, the name of the index is `t_expr_idx`
```SQL
\d t
Table "public.t"
┌────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │  Type   │ Collation │ Nullable │ Default │
├────────┼─────────┼───────────┼──────────┼─────────┤
│ id     │ integer │           │          │         │
└────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "t_expr_idx" btree (my_very_boring_function(id::bigint))
```

On PG 14, the name of the index is `t_my_very_boring_function_idx`
```SQL
\d t
 Table "public.t"
┌────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │  Type   │ Collation │ Nullable │ Default │
├────────┼─────────┼───────────┼──────────┼─────────┤
│ id     │ integer │           │          │         │
└────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "t_my_very_boring_function_idx" btree (my_very_boring_function(id::bigint))

```

The second issue is not very critical. The important part is that
we adjust regression tests to drop all the indexes, which ensures
the index names are sane on any version.
2022-05-18 16:35:17 +02:00
Nils Dijk e25a5d7837
Merge pull request #5931 from citusdata/refactor/dedupe-object-propagation
Refactor: reduce complexity and code duplication for Object Propagation
2022-05-18 16:30:31 +02:00
Nils Dijk b71a08955a
Refactor: reduce complexity and code duplication for Object Propagation
Over time we have added significantly improved the support for objects to be propagated by Citus as to make scaling out the database more seamless. It became evident that there was a lot of code duplication that got into the codebase to implement the propagation.

This PR tries to reduce the amount of repeated code that is at most only slightly different. To make things worse, most of the differences were actually oversights instead of correct.

This Patch introduces 3 reusable sets of pre/post processing steps for respectively
 - create
 - alter
 - drop

With the use of the common functionality we should have more coherent behaviour between different supported object by Citus.

Some steps either omit the Pre or Post processing step if they would not make sense to include.

All tests pass, only 1 test needed changing, foreign servers, as the dropping of foreign servers didn't implement support for dropping multiple foreign servers at once. Given the common approach correctly supports dropping of multiple objects, either distributed or not, the test that assumed it wouldn't work was now obsolete.
2022-05-18 15:58:28 +02:00
Önder Kalacı b04222155d
Merge pull request #5923 from citusdata/update_view
Mark existing views as distributed when upgrade to 11.0+
2022-05-18 15:50:29 +02:00
Onder Kalaci ee45e7bfbf Mark existing views as distributed when upgrade to 11.0+
We have a mechanism which ensures that newly distributed
objects are recorded during `alter extension citus update`.

However, the logic was lacking "view"s. With this commit, we make
sure that existing views are also marked as distributed during
upgrade.
2022-05-18 15:43:17 +02:00
Nils Dijk 14c6c799f2
suppress notices when more dependencies are found (#5954)
We are nearing the 100 objects being propagated in `master_copy_shard_placement` and with the extra supported objects this gets pushed over a 100 objects.

When a 100 objects are reached for propagation a notice will be shown to the user, informing them it might take a while to finish the operation.

During testing this is not important to see. Since the message contains the exact number of objects to be propagated the tests becomes very unstable when merging community into enterprsie.

This change makes that the test output stays stable.
2022-05-18 14:31:10 +03:00