Commit Graph

3380 Commits (3e924db90a255f0e2680775dd4c7e8b05de58fc2)

Author SHA1 Message Date
Onur Tirtir 0b81f68def
Use memcpy instead of memcpy_s to avoid pointless limits in columnar (#6419)
DESCRIPTION: Raises memory limits in columnar from 256MB to 1GB for
reads and writes

This doesn't completely fix #5918 but at least increases the
buffer limits that might cause throwing an error when reading
from or writing into into columnar storage. A way better approach
to fix this is documented in #6420.

Replacing memcpy_s with memcpy is quite safe in those places
since we anyway make sure to allocate enough amount of memory
before writing into related buffers.
2022-10-11 14:57:31 +03:00
Onur Tirtir 517b72a9d5
Fix use-after-free in GetAlterTriggerStateCommand() (#6413)
Fix use-after-free in GetAlterTriggerStateCommand() introduced in #6398.
2022-10-10 16:38:21 +03:00
Gokhan Gulbiz 1776bdf654
Limit citus_drain_node to drain the specified node only (#6361)
DESCRIPTION: Fixes citus_drain_node to drain the specified worker only.

Fixes #6267
2022-10-09 13:33:08 +03:00
Onur Tirtir 86e186f671
Retain trigger settings when re-creating the triggers (on shards) (#6398)
Fixes https://github.com/citusdata/citus/issues/6394.

DESCRIPTION: Fixes a bug that causes creating disabled-triggers on
shards as enabled

Since CREATE TRIGGER doesn't have syntax support to specify
whether the trigger should be enabled/disabled, the underlying
PG function (`pg_get_triggerdef()`) that we use to generate the
command to create the trigger is not enough. For this reason, we
append a second command to enable/disable trigger, right after
creating it.

We don't retain explicit extension dependencies set by using
`ALTER trigger DEPENDS ON EXTENSION` commands too, but apparently
right fix for that is to throw an error as in
`PreprocessAlterTriggerDependsStmt()`; so, opened a separate PR
to fix that #6399.
2022-10-06 14:51:07 +03:00
Naisila Puka 27e867afbc
Propagates column aliases (#6400)
Propagates column aliases in the shard-level commands
2022-10-06 12:27:31 +03:00
Naisila Puka b5cba3a3fe
Use original relation to retrieve column name because of syscache (#6387)
During alter_distributed_table, we create a new table like the
original table but with the altered options.

To retrieve the name of the distribution column, we were using
the attribute syscache of the new table, since we already created
the new table as identical to the original table.

However, the attribute syscaches of these two tables are not
the same if the original table has dropped columns. The reason
is that dropped columns are all still present in the cache.
Hence, for example, the attnos would be different in the syscaches.

So, let's use the attribute syscache of the original table.
2022-10-06 12:08:00 +03:00
Ying Xu f21cbe68f8
[Columnar] Bugfix for Columnar: options ignored during ALTER TABLE rewrite (#6337)
DESCRIPTION: Fixes a bug that prevents retaining columnar table options after a table-rewrite
A fix for this issue: Columnar: options ignored during ALTER TABLE
rewrite #5927
The OID for the temporary table created during ALTER TABLE was not the
same as the original table's OID so the columnar options were not being
applied during rewrite.

The change is that I applied the original table's columnar options to
the new table so that it has the correct options during write. I also
added a test.
2022-10-05 11:42:09 -07:00
Ahmet Gedemenli e36890ce55
Add source_lsn and target_lsn fields into get_rebalance_progress (#6364)
DESCRIPTION: Adds source_lsn and target_lsn fields into
get_rebalance_progress

Adding two fields named `source_lsn` and `target_lsn` to the function
`get_rebalance_progress`.
Target lsn data is fetched in `GetShardStatistics`, by expanding the
query sent to workers (joining with pg_subscription_rel and
pg_stat_subscription). Then put into the hashmap, for each shard.
Source lsn data is fetched in `BuildWorkerShardStatististicsHash`, in
the loop that iterate each node, by sending a pg_current_wal_lsn query
to each node. Then put into the hashmap, for each node.
2022-10-05 11:12:24 +03:00
Hanefi Onaldi 11a9a3771f
Ensure no dependencies to index before drop 2022-10-04 18:56:20 +03:00
Ahmet Gedemenli d0fa10a98c
Bump Citus to 11.2devel (#6385) 2022-09-30 14:47:42 +03:00
Jelte Fennema 24e06af6d2
Reuse connections for Splits and Logical Replication (#6314)
In Split, Logical replication logic and ShardCleaner we call
`SendCommandListToWorkerOutsideTransaction` and
`SendOptionalCommandListToWorkerOutsideTransaction` frequently. This
opens new connection for each of those calls, even though we already
have a perfectly good connection lying around.

This PR adds two new APIs
`SendCommandListToWorkerOutsideTransactionWithConnection` and
`SendOptionalCommandListToWorkerOutsideTransactionWithConnection` that
allow sending a list of queries in a transaction over an existing
connection. We also update the callers (Split, ShardCleaner, Logical
Replication) to use these new APIs instead.

Co-authored-by: Nitish Upreti <niupre@microsoft.com>
Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
2022-09-26 13:37:40 +02:00
Jelte Fennema d9a9a3263b
Revert replica identity creation order for shard moves (#6367)
In Citus 11.1.0 we changed the order of doing the initial data copy and
the replica identity creation when doing a non blocking shard move. This
was done to try and increase the speed with which shard moves could be
done. But after doing more extensive performance testing this change
turned out to have a negative impact on the speed of moves on the setups
that I tested.

Looking at the resource usage metrics of the VMs the reason for this
seems to be that these shard moves were bottlenecked by disk bandwidth.
While creating replica identities in bulk after the initial copy will
reduce CPU usage a bit, it does require an additional sequence scan of
the just written data. So when a VM is bottlenecked on disk, it makes
sense to spend a little bit more CPU to avoid an additional scan. Since
PKs are usually simple indexes that don't require lots of CPU to update,
as opposed to e.g. GiST indexes.

This reverts the order change to avoid a regression on shard move speed
in these cases.

For future releases we might consider re-evaluating our index creation
order for other indexes too, and create "simple" indexes before the
copy.
2022-09-23 14:55:25 +02:00
Onur Tirtir a868cc049a
Not allow ON DELETE/UPDATE SET DEFAULT actions on columns that default to sequences (#6340)
Given that we drop DEFAULT nextval('sequence') expressions from
shard relation columns, allowing `ON DELETE/UPDATE SET DEFAULT`
on such columns might cause inserting NULL values as a result
of a delete/update operation.

For this reason, we disallow ON DELETE/UPDATE SET DEFAULT actions
on columns that default to sequences.

DESCRIPTION: Disallows having ON DELETE/UPDATE SET DEFAULT actions on
columns that default to sequences

Fixes #6339.
2022-09-23 03:34:02 -07:00
Onur Tirtir de24a3eda5
Not drop default col exprs from shard when adding local table to metadata (#6323)
As we did for GENERATED STORED columns in #4613, we should not drop
column
default expressions that are not based on sequences from shard relation
since
such expressions need to exist e.g. for foreign key actions.

For the column default expressions that are based on sequences we cannot
do much, so we need to disallow having ON DELETE SET DEFAULT actions on
such columns in a separate PR, see #6339.

Fixes #6318.

DESCRIPTION: Fixes a bug that might cause inserting incorrect DEFAULT
values when applying foreign key actions
2022-09-23 03:05:08 -07:00
Ahmet Gedemenli bae4b47c2f
Fix dropping replication slot (#6359)
DESCRIPTION: Fixes dropping replication slots

As detected by a flaky test, Citus sometimes fails to drop replication
slots, possibly due to a race condition, at the end of a shard split.
With this PR, we retry to drop them in case of an `OBJECT_IN_USE` error,
consistently for 20 seconds.

fixes: #6326
2022-09-21 16:29:56 +03:00
Nitish Upreti e9508b2603
Shard Split : Add / Update logging (#6336)
DESCRIPTION: Improve logging during shard split and resource cleanup

### DESCRIPTION

This PR makes logging improvements to Shard Split : 

1. Update confusing logging to fix #6312
2. Added new `ereport(LOG` to make debugging easier as part of telemetry review.
2022-09-16 09:39:08 -07:00
Marco Slot 8544346a78
Allow create_distributed_table_concurrently on an empty node (#6353)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-16 10:55:02 +02:00
Onder Kalaci 766f340ce0 Prevent failures on partitioned distributed tables with statistics objects on PG 15
Comment from the code is clear on this:
/*
 * The statistics objects of the distributed table are not relevant
 * for the distributed planning, so we can override it.
 *
 * Normally, we should not need this. However, the combination of
 * Postgres commit 269b532aef55a579ae02a3e8e8df14101570dfd9 and
 * Citus function AdjustPartitioningForDistributedPlanning()
 * forces us to do this. The commit expects statistics objects
 * of partitions to have "inh" flag set properly. Whereas, the
 * function overrides "inh" flag. To avoid Postgres to throw error,
 * we override statlist such that Postgres does not try to process
 * any statistics objects during the standard_planner() on the
 * coordinator. In the end, we do not need the standard_planner()
 * on the coordinator to generate an optimized plan. We call
 * into standard_planner() for other purposes, such as generating the
 * relationRestrictionContext here.
 *
 * AdjustPartitioningForDistributedPlanning() is a hack that we use
 * to prevent Postgres' standard_planner() to expand all the partitions
 * for the distributed planning when a distributed partitioned table
 * is queried. It is required for both correctness and performance
 * reasons. Although we can eliminate the use of the function for
 * the correctness (e.g., make sure that rest of the planner can handle
 * partitions), it's performance implication is hard to avoid. Certain
 * planning logic of Citus (such as router or query pushdown) relies
 * heavily on the relationRestrictionList. If
 * AdjustPartitioningForDistributedPlanning() is removed, all the
 * partitions show up in the, causing high planning times for
 * such queries.
 */
2022-09-15 14:36:05 +03:00
aykut-bozkurt 739b91afa6
ensure we have more active nodes than replication factor. (#6341)
DESCRIPTION: Fixes floating exception during
create_distributed_table_concurrently.

Fixes #6332.
During create_distributed_table_concurrently, when there is no active
primary node, it fails with floating exception. We added similar check
with create_distributed_table. It will fail with proper message if
current active node is less than replication factor.
2022-09-14 18:20:50 +03:00
Marco Slot 4ab415c43a
Fix escaping in sequence dependency queries (#6345)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-14 17:43:24 +03:00
Sameer Awasekar 4851b4e8f2
Introduce code changes to fix Issue:6303 (#6328)
The PR introduces code changes to fix Issue
[6303](https://github.com/citusdata/citus/issues/6303)

`create_distributed_table_concurrently` following drop column, creates a
buggy situation in split decoder.
 * Consider the below scenario:
* Session1 : Drop column followed by
create_distributed_table_concurrently
 * Session2 : Concurrent insert workload

The child shards created by `create_distributed_table_concurrently` will
have less columns than the source shard because some column were
dropped. The incoming tuple from session2 will have more columns as the
writes happened on source shard. But now the tuple needs to be applied
on child shard. So we need to format existing tuple according to child
schema and skip dropped column values.
The PR fixes this by reformatting the tuple according the target child
schema.

Test:
1) isolation_create_distributed_concurrently_after_drop_column - Repros
the issue and tests on the same.
2022-09-14 19:56:32 +05:30
Marco Slot 7a92d873b6
Fix bugs in CheckIfRelationWithSameNameExists (#6343)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-14 15:42:46 +02:00
Nils Dijk da527951ca
Fix: rebalance stop non super user (#6334)
No need for description, fixing issue introduced with new feature for
11.1

Fixes #6333 

Due to Postgres' C api being o-indexed and postgres' attributes being
1-indexed, we were reading the wrong Datum as the Task owner when
cancelling. Here we add a test to show the error and fix the off-by-one
error.
2022-09-13 23:19:31 +02:00
Hanefi Onaldi f34467dcb3
Remove missing declaration warning (#6330)
When I built Citus on PG15beta4 locally, I get a warning message.

```
utils/background_jobs.c:902:5: warning: declaration does not declare anything
      [-Wmissing-declarations]
                                __attribute__((fallthrough));
                                ^
1 warning generated.
```

This is a hint to the compiler that we are deliberately falling through
in a switch-case block.
2022-09-13 13:48:51 +03:00
Jelte Fennema f13b140621
Show citus_copy_shard_placement progress in get_rebalance_progress (#6322)
DESCRIPTION: Show citus_copy_shard_placement progress in
get_rebalance_progress

When rebalancing to a new node that does not have reference tables yet
the rebalancer will first copy the reference tables to the nodes.
Depending on the size of the reference tables, this might take a long
time. However, there's no indication of what's happening at this stage
of the rebalance.

This PR improves this situation by also showing the progress of any
citus_copy_shard_placement calls when calling get_rebalance_progress.
2022-09-13 08:59:52 +00:00
Naisila Puka 76ff4ab188
Adds support for unlogged distributed sequences (#6292)
We can now do the following:
- Distribute sequence with logged/unlogged option
- ALTER TABLE my_sequence SET LOGGED/UNLOGGED
- ALTER SEQUENCE my_sequence SET LOGGED/UNLOGGED

Relevant PG commit
344d62fb9a
2022-09-13 10:53:39 +03:00
Hanefi Onaldi 5cfcc63308
Add warning messages for cluster commands on partitioned tables (#6306)
PG15 introduces `CLUSTER` commands for partitioned tables. Similar to a
`CLUSTER` command with no supplied table names, these commands also can
not be run inside transaction blocks and therefore can not be propagated
in a distributed transaction block with ease. Therefore we raise warnings.

Relevant PG commit: cfdd03f45e6afc632fbe70519250ec19167d6765
2022-09-13 00:05:58 +03:00
Hanefi Onaldi 164f2fa0a6
PG15: Add support for NULLS NOT DISTINCT (#6308)
Relevant PG commit: 94aa7cc5f707712f592885995a28e018c7c80488
2022-09-12 23:47:37 +03:00
Marco Slot b79111527e
Avoid blocking writes in create_distributed_table_concurrently (#6324)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-12 12:09:37 -07:00
Nils Dijk cda3686d86
Feature: run rebalancer in the background (#6215)
DESCRIPTION: Add a rebalancer that uses background tasks for its
execution

Based on the baclground jobs and tasks introduced in #6296 we implement
a new rebalancer on top of the primitives of background execution. This
allows the user to initiate a rebalance and let Citus execute the long
running steps in the background until completion.

Users can invoke the new background rebalancer with `SELECT
citus_rebalance_start();`. It will output information on its job id and
how to track progress. Also it returns its job id for automation
purposes. If you simply want to wait till the rebalance is done you can
use `SELECT citus_rebalance_wait();`

A running rebalance can be canelled/stopped with `SELECT
citus_rebalance_stop();`.
2022-09-12 20:46:53 +03:00
Marco Slot 48f7d6c279
Show local managed tables in citus_tables and hide tables owned by extensions (#6321)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-12 17:49:17 +03:00
Marco Slot b036e44aa4
Fix bug preventing isolate_tenant_to_new_shard with text column (#6320)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-12 16:29:57 +02:00
naisila 47bea76c6c Revert "Support JSON_TABLE on PG 15 (#6241)"
This reverts commit 1f4fe35512.
2022-09-12 15:20:17 +03:00
naisila 53ffbe440a Revert SQL/JSON features in ruleutils_15.c
Reverting the following commits:
977ddaae56
4a5cf06def
9ae19c181f
30447117e5
f9c43f4332
21dba4ed08
262932da3e

We have to manually make changes to this file.
Follow the relevant PG commit in ruleutils.c & make the exact same changes in ruleutils_15.c

Relevant PG commit:
96ef3237bf741c12390003e90a4d7115c0c854b7
2022-09-12 15:20:17 +03:00
Marco Slot 2e943a64a0
Make shard moves more idempotent (#6313)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-09 18:21:36 +02:00
Jelte Fennema a2d86214b2
Share more replication code between moves and splits (#6310)
The logical replication catchup part for shard splits and shard moves is
very similar. This abstracts most of that similarity away into a single
function. This also improves the logic for non blocking shard splits a
bit, by using faster foreign key creation. It also parallelizes index creation
which shard moves were already doing, but shard splits did not.
2022-09-09 16:45:38 +02:00
Marco Slot ba2fe3e3c4
Remove do_repair option from citus_copy_shard_placement (#6299)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-09 15:44:30 +02:00
Nils Dijk 00a94c7f13
Implement infrastructure to run sql jobs in the background (#6296)
DESCRIPTION: Add infrastructure to run long running management operations in background

This infrastructure introduces the primitives of jobs and tasks.
A task consists of a sql statement and an owner. Tasks belong to a
Job and can depend on other tasks from the same job.

When there are either runnable or running tasks we would like to
make sure a bacgrkound task queue monitor process is running. A Task
could be in running state while there is actually no monitor present
due to a database restart or failover. Once the monitor starts it
will reset any running task to its runnable state.

To make sure only one background task queue monitor is ever running
at once it will acquire an advisory lock that self conflicts.

Once a task is done it will find all tasks depending on this task.
After checking that the task doesn't have unmet dependencies it will
transition the task from blocked to runnable state for the task to
be picked up on a subsequent task start.

Currently only one task can be running at a time. This can be
improved upon in later releases without changes to the higher level
API.

The initial goal for this background tasks is to allow a rebalance
to run in the background. This will be implemented in a subsequent PR.
2022-09-09 16:11:19 +03:00
Jelte Fennema 76137e967f
Create all foreign keys quickly at the end of a shard move (#6148)
Previously we would create foreign keys to reference table in an extra
fast way at the end of a shard move. This uses that same logic to also
do it for foreign keys between distributed tables.

Fixes #6141
2022-09-09 09:58:33 +02:00
Nils Dijk cc0eeea4c5
remove redundant call to TerminateBackgroundWorker (#6307)
Remove redundant call to TerminateBackgroundWorker
Discussion: https://github.com/citusdata/citus/pull/6296#discussion_r965926695
2022-09-09 07:37:02 +02:00
Ahmet Gedemenli eadc88a800
Introduce GUC citus.skip_constraint_validation (#6281)
Introduces a new GUC named citus.skip_constraint_validation, which basically skips constraint validation when set to on.
For some several places that we hack to skip the foreign key validation phase, now we use this GUC.
2022-09-08 18:13:18 +03:00
Jelte Fennema e29db74a19
Don't override postgres C symbols with our own (#6300)
When introducing our overrides of pg_cancel_backend and
pg_terminate_backend we accidentally did that in such a way that we
cannot call the original pg_cancel_backend and pg_terminate_backend from
C anymore. This happened because we defined the exact same symbols in
our shared library as postgres does in its own binary.

This fixes that by using a different names for the C function than for
the SQL function.

Making this work in all upgrade and downgrade scenarios is not trivial
though, because we actually need to remove the C function definition.
Postgres errors in two different times when the symbol that a C function
wants to call is not defined in the library it expects it in:
1. When creating the SQL function definition
2. When calling the SQL function

Item 1 causes an issue when creating our extension for the first time.
We then go execute all the migrations that we have. So if the 11.0
migration contains a SQL function definition that still references the
pg_cancel_backend symbol, that migration will fail. This issue is solved
by actually changing the SQL definition in the old migration.

This is not enough to fix all issues though. Item 2 causes an issue
after an upgrade to 11.1, because it won't have the new definition of
the SQL function. This is solved by recreating the SQL functions in the
migration to 11.1. That way it gets the new definition.

Then finally there's the case of downgrades. To continue to make our
pg_cancel_backend SQL function work after downgrading, we will need to
make a patch release for 11.0 that includes the new citus_cancel_backend
symbol. This is done in a separate commit.
2022-09-07 11:27:05 +02:00
Nitish Upreti d7404a9446
'Deferred Drop' and robust 'Shard Cleanup' for Splits. (#6258)
DESCRIPTION:
This PR adds support for 'Deferred Drop' and robust 'Shard Cleanup' for Splits.

Common Infrastructure
This PR introduces new common infrastructure so as any operation that wants robust cleanup of resources can register with the cleaner and have the resources cleaned appropriately based on a specified policy. 'Shard Split' is the first consumer using this new infrastructure.
Note : We only support adding 'shards' as resources to be cleaned-up right now but the framework will be extended to support other resources in future.

Deferred Drop for Split
Deferred Drop Support ensures that shards undergoing split are not dropped inline as part of operation but dropped later when no active read queries are running on shard. This helps with :

Avoids any potential deadlock scenarios that can cause long running Split operation to rollback.
Avoids Split operation blocking writes and then getting blocked (due to running queries on the shard) when trying to drop shards.
Deferred drop is the new default behavior going forward.
Shard Cleaner Extension
Shard Cleaner is a background task responsible for deferred drops in case of 'Move' operations.
The cleaner has been extended to ensure robust cleanup of shards (dummy shards and split children) in case of a failure based on the new infrastructure mentioned above. The cleaner also handles deferred drop for 'Splits'.

TESTING:
New test ''citus_split_shard_by_split_points_deferred_drop' to test deferred drop support.
New test 'failure_split_cleanup' to test shard cleanup with failures in different stages.
Update 'isolation_blocking_shard_split and isolation_non_blocking_shard_split' for deferred drop.
Added non-deferred drop version of existing tests : 'citus_split_shard_no_deferred_drop' and 'citus_non_blocking_splits_no_deferred_drop'
2022-09-06 12:11:20 -07:00
Gokhan Gulbiz ac96370ddf
Use IsMultiStatementTransaction for SELECT .. FOR UPDATE queries (#6288)
* Use IsMultiStatementTransaction instead of IsTransaction for row-locking operations.

* Add regression test for SELECT..FOR UPDATE statement
2022-09-06 16:38:41 +02:00
Emel Şimşek 6f06ff78cc
Throw an error if there is a RangeTblEntry that is not assigned an RTE identity. (#6295)
* Fix issue : 6109 Segfault or (assertion failure) is possible when using a SQL function

* DESCRIPTION: Ensures disallowing the usage of SQL functions referencing to a distributed table and prevents a segfault.
Using a SQL function may result in segmentation fault in some cases.
This change fixes the issue by throwing an error message when a SQL function cannot be handled.

Fixes #6109.

* DESCRIPTION: Ensures disallowing the usage of SQL functions referencing to a distributed table and prevents a segfault.
Using a SQL function may result in segmentation fault in some cases. This change fixes the issue by throwing an error message when a SQL function cannot be handled.

Fixes #6109.

Co-authored-by: Emel Simsek <emel.simsek@microsoft.com>
2022-09-06 15:46:41 +02:00
aykut-bozkurt 69726648ab
verify shards if exists for insert, delete, update (#6280)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-06 15:29:14 +02:00
Hanefi Onaldi 85b19c851a
Disallow distributing by numeric with negative scale
PG15 allows numeric scale to be negative or greater than precision. This
causes issues and we may end up routing queries to a wrong shard due to
differing hash results after rounding.

Formerly, when specifying NUMERIC(precision, scale), the scale had to be
in the range [0, precision], which was per SQL spec. PG15 extends the
range of allowed scales to [-1000, 1000].

A negative scale implies rounding before the decimal point. For
example, a column might be declared with a scale of -3 to round values
to the nearest thousand. Note that the display scale remains
non-negative, so in this case the display scale will be zero, and all
digits before the decimal point will be displayed.

Relevant PG commit: 085f931f52494e1f304e35571924efa6fcdc2b44
2022-09-06 12:40:56 +03:00
Naisila Puka d7f41cacbe
Prohibit renaming child trigger on distributed partition pre PG15 (#6290)
Pre PG15, renaming child triggers on partitions is allowed. When
creating a trigger in a distributed parent partitioned table, the
triggers on the shards of the partitions have the same name with
the triggers on the corresponding parent shards of the parent
table. Therefore, they don't have the same appended shard id as
the shard id of the partition. Hence, when trying to rename a
child trigger on a partition of a distributed table, we can't
correctly find the triggers on the shards of the partition in
order to rename them since we append a different shard id to the
name of the trigger. Since we can't find the trigger we get a
misleading error of inexistent trigger.

In this commit we prohibit renaming child triggers on distributed
partitions altogether.
2022-09-06 12:19:25 +03:00
Marco Slot e6b1845931
Change split logic to avoid EnsureReferenceTablesExistOnAllNodesExtended (#6208)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-05 22:02:18 +02:00
Önder Kalacı bd13836648
Add citus.skip_advisory_lock_permission_checks (#6293) 2022-09-05 17:47:41 +02:00
Jelte Fennema 1c5b8588fe
Address race condition in InitializeBackendData (#6285)
Sometimes in CI our isolation_citus_dist_activity test fails randomly
like this:
```diff
 step s2-view-dist:
  SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;

 query                                                                                                                                                                                                                                                                                                                                                                 |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state              |wait_event_type|wait_event|usename |datname
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------

   INSERT INTO test_table VALUES (100, 100);
                                                                                                                                                                                                                                                                                                                          |localhost                |                    57636|idle in transaction|Client         |ClientRead|postgres|regression
-(1 row)
+
+                SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+                FROM (
+                    SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+                    pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+                ) AS csa_from_one_node;
+            |localhost                |                    57636|active             |               |          |postgres|regression
+(2 rows)

 step s3-view-worker:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26692/workflows/3406e4b4-b686-4667-bec6-8253ee0809b1/jobs/765119

I intended to fix this with #6263, but the fix turned out to be
insufficient. This PR tries to address the issue by setting
distributedCommandOriginator correctly in more situations. However, even
with this change it's still possible to reproduce the flaky test in CI.
In any case this should fix at least some instances of this issue.

In passing this changes the isolation_citus_dist_activity test to allow
running it multiple times in a row.
2022-09-02 14:23:47 +02:00
Marco Slot 432f399a5d
Allow citus_internal application_name with additional suffix (#6282)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-09-01 14:26:43 +02:00
Naisila Puka 317dda6af1
Use RelationGetPrimaryKeyIndex for citus catalog tables (#6262)
pg_dist_node and pg_dist_colocation have a primary key index, not a replica identity index.

Citus catalog tables are created in public schema, which has replica identity index by default 
as primary key index. Later the citus catalog tables are moved to pg_catalog schema.

During pg_upgrade, all tables are recreated, and given that pg_dist_colocation is found in
pg_catalog schema, it is recreated in that schema, and when it is recreated it doesn't
have a replica identity index, because catalog tables have no replica identity.

Further action:
Do we even need to acquire this lock on the primary key index?
Postgres doesn't acquire such locks on indexes before deleting catalog tuples.
Also, catalog tuples don't have replica identities by definition.
2022-09-01 11:56:31 +03:00
Jelte Fennema 8bb082e77d
Fix reporting of progress on waiting and moved shards (#6274)
In commit 31faa88a4e I removed some features of the rebalance progress
monitor. I did this because the plan was to remove the foreground shard
rebalancer later in the PR that would add the background shard
rebalancer. So, I didn't want to spend time fixing something that we
would throw away anyway.

As it turns out we're not removing the foreground shard rebalancer after
all, so it made sens to fix the stuff that I broke. This PR does that.
For the most part this commit reverts the changes in commit 31faa88a4e.
It's not a full revert though, because it keeps the improved tests and
the changes to `citus_move_shard_placement`.
2022-08-31 14:55:47 +03:00
Naisila Puka 98dcbeb304
Specifies that our CustomScan providers support projections (#6244)
Before, this was the default mode for CustomScan providers.
Now, the default is to assume that they can't project.
This causes performance penalties due to adding unnecessary
Result nodes.

Hence we use the newly added flag, CUSTOMPATH_SUPPORT_PROJECTION
to get it back to how it was.

In PG15 support branch we created explain functions to ignore
the new Result nodes, so we undo that in this commit.

Relevant PG commit:
955b3e0f9269639fb916cee3dea37aee50b82df0
2022-08-31 10:48:01 +03:00
Marco Slot 6bb31c5d75
Add non-blocking variant of create_distributed_table (#6087)
Added create_distributed_table_concurrently which is nonblocking variant of create_distributed_table.

It bases on the split API which takes advantage of logical replication to support nonblocking split operations.

Co-authored-by: Marco Slot <marco.slot@gmail.com>
Co-authored-by: aykutbozkurt <aykut.bozkurt1995@gmail.com>
2022-08-30 15:35:40 +03:00
Jelte Fennema d68654680b
Fix flakyness in isolation_citus_dist_activity (#6263)
Sometimes in CI our isolation_citus_dist_activity test fails randomly
like this:
```diff
 step s2-view-dist:
  SELECT query, citus_nodename_for_nodeid(citus_nodeid_for_gpid(global_pid)), citus_nodeport_for_nodeid(citus_nodeid_for_gpid(global_pid)), state, wait_event_type, wait_event, usename, datname FROM citus_dist_stat_activity WHERE query NOT ILIKE ALL(VALUES('%pg_prepared_xacts%'), ('%COMMIT%'), ('%BEGIN%'), ('%pg_catalog.pg_isolation_test_session_is_blocked%'), ('%citus_add_node%')) AND backend_type = 'client backend' ORDER BY query DESC;

 query                                                                                                                                                                                                                                                                                                                                                                 |citus_nodename_for_nodeid|citus_nodeport_for_nodeid|state              |wait_event_type|wait_event|usename |datname
 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------------------+-------------------+---------------+----------+--------+----------

   INSERT INTO test_table VALUES (100, 100);
                                                                                                                                                                                                                                                                                                                          |localhost                |                    57636|idle in transaction|Client         |ClientRead|postgres|regression
-(1 row)
+
+                SELECT coalesce(to_jsonb(array_agg(csa_from_one_node.*)), '[{}]'::JSONB)
+                FROM (
+                    SELECT global_pid, worker_query AS is_worker_query, pg_stat_activity.* FROM
+                    pg_stat_activity LEFT JOIN get_all_active_transactions() ON process_id = pid
+                ) AS csa_from_one_node;
+            |localhost                |                    57636|active             |               |          |postgres|regression
+(2 rows)

 step s3-view-worker:
```
Source: https://app.circleci.com/pipelines/github/citusdata/citus/26605/workflows/56d284d2-5bb3-4e64-a0ea-7b9b1626e7cd/jobs/760633

The reason for this is that citus_dist_stat_activity sometimes shows the
query that it uses itself to get the data from pg_stat_activity. This is
actually a bug, because it's a worker query and thus shouldn't show up
there. To try and solve this bug, we remove two small opportunities for a
race condition. These race conditions could happen when the backenddata
was marked as active, but the distributedCommandOriginator was not set
correctly yet/anymore. There was an opportunity for this to happen both 
during connection start and shutdown.
2022-08-30 12:57:37 +03:00
Ahmet Gedemenli 0855a9d1d4
Use SUM for calculating non partitioned table sizes (#6222)
We currently do a `pg_relation_total_size('t1') + pg_relation_total_size('t2') + ..` on shard lists, especially when rebalancing the shards. This in some cases goes huge. With this PR, we basically use a SUM for all table sizes, instead of using thousands of pluses.
2022-08-26 18:02:14 +03:00
Sameer Awasekar 4df8eca77f
Add worker_split_shard_release_dsm udf to release dynamic shared memory (#6248)
The code introduces worker_split_shard_release_dsm udf to release the dynamic shared memory segment allocated during non-blocking split workflow.
2022-08-26 18:27:32 +05:30
Önder Kalacı 3ed6fea1cf
Prevent Merge command on distributed tables [PG 15] (#6238) 2022-08-25 13:27:08 +03:00
Marco Slot 9bf3c3dd5c
Add an allow_unsafe_constraints flag for constraints without distribution column (#6237)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-25 11:37:50 +03:00
Gokhan Gulbiz 69d2fcf5c0
Use the same colocation group for child and parent rels when altering a distributed table (#6225)
* Alter_distributed_table colocateWith:none bug fix for partitioned tables.

* Regression tests added for alter_distributed_table colocateWith:none for partitioned tables

* Update query comparision to be more accurate
2022-08-25 11:23:59 +03:00
Marco Slot ac07d33a29
Remove unused reduceQuery from physical planning (#6221)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-24 17:24:27 +00:00
Naisila Puka 1f4fe35512
Support JSON_TABLE on PG 15 (#6241)
Postgres supports JSON_TABLE feature on PG 15.

We treat JSON_TABLE the same as correlated functions (e.g., recurring tuples).
In the end, for multi-shard JSON_TABLE commands, we apply the same
restrictions as reference tables (e.g., cannot be in the outer part of
an outer join etc.)

Co-authored-by: Onder Kalaci <onderkalaci@gmail.com>
2022-08-24 19:11:18 +03:00
Naisila Puka 35b4ddc355
Pg15 support (#6085)
* Adjust configure script to allow PG15

* Adds copy of ruleutils_14.c as ruleutils_15.c

* Uses get_namespace_name_or_temp in ruleutils_15.c

Relevant PG commit:
48c5c9068211e0a04fd9553c8714b2821ed3ad17

* Clean up code using "(expr) ? true : false" in ruleutils_15.c

Relevant PG commit:
fd0625c7a9c679c0c1e896014b8f49a489c3a245

* Change varno from Index (unsigned int) to int in ruleutils_15.c

Relevant PG commit:
e3ec3c00d85bd2844ffddee83df2bd67c4f8297f

* Adds find_recursive_union to ruleutils_15.c

Relevant PG commit:
3f50b82639637c9908afa2087de7588450aa866b

* Fix display of SQL-std func's args in INSERT/SELECT in ruleutils_15.c

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fix ruleutils_15.c's dumping of whole-row Vars in more contexts

Relevant PG commit:
43c2175121c829c8591fc5117b725f1f22bfb670

* Fix assorted missing logic for GroupingFunc nodes in ruleutils_15.c

Relevant PG commit:
2591ee8ec44d8cbc8e1226550337a64c684746e4

* Adds grammar support for SQL/JSON clauses in ruleutils_15.c

Relevant PG commit:
f79b803dcc98d707450e158db3638dc67ff8380b

* Adds SQL/JSON constructors to ruleutils_15.c

Relevant PG commits:
f4fb45d15c59d7add2e1b81a9d477d0119a9691a
cc7401d5ca498a84d9b47fd2e01cebd8e830e558

* Adds support for MERGE in ruleutils_15.c

Relevant PG commit:
7103ebb7aae8ab8076b7e85f335ceb8fe799097c

* Add IS JSON predicate to ruleutils_15.c

Relevant PG commit:
33a377608fc29cdd1f6b63be561eab0aee5c81f0

* Add SQL/JSON query functions to ruleutils_15.c

Relevant PG commit:
1a36bc9dba8eae90963a586d37b6457b32b2fed4

* Adds three different SQL/JSON values to ruleutils_15.c

Relevant PG commits:
606948b058dc16bce494270eea577011a602810e
49082c2cc3d8167cca70cfe697afb064710828ca

* Adds JSON table functions in ruleutils_15.c

Relevant PG commit:
4e34747c88a03ede6e9d731727815e37273d4bc9

* Add PLAN function for JSON table in ruleutils_15.c

Relevant PG commit:
fadb48b00e02ccfd152baa80942de30205ab3c4f

* Remove extra blank lines before block-closing braces ruleutils_15.c

Relevant PG commit:
24d2b2680a8d0e01b30ce8a41c4eb3b47aca5031

* set_deparse_plan: Reuse variable to appease Coverity ruleutils_15.c

Relevant PG commit:
e70813fbc4aaca35ec012d5a426706bd54e4acab

* Mechanical code beautification ruleutils_15.c

Relevant PG commit:
23e7b38bfe396f919fdb66057174d29e17086418

* Rename value_type to item_type in ruleutils_15.c

Relevant PG commit:
3ab9a63cb638a1fd99475668e2da9c237495aeda

* Show 'AS "?column?"' explicitly when it's important in ruleutils_15.c

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Fix ruleutils_15.c issues with dropped cols in funcs-returning-composite

Relevant PG commit:
c1d1e8469c77ce6b8e5310955580b4a3eee7fe96

* Change comment regarding functions returning composite in ruleutils_15.c

Relevant PG commit:
c2fa113ddb1117b1f03e91960f65d5d7d8a90270

* Replace int nodes with bool nodes where needed

In PG15, Boolean nodes are added. Pre PG15, internal Boolean values
in Create Role commands were represented by Integer nodes. This
commit replaces int nodes logic with bool nodes logic where needed.
Mostly there are CREATE ROLE logic changes.

Relevant PG commit:
941460fcf731a32e6a90691508d5cfa3d1f8eeaf

* Handle new option colliculocale in CREATE COLLATION logic

In PG15, there is an added option to use ICU as global locale provider.
pg_collation has three locale-related fields: collcollate and collctype,
which are libc-related fields, and a new one colliculocale, which is the
ICU-related field. Only the libc-related fields or the ICU-related field
is set, never both.

Relevant PG commits:
f2553d43060edb210b36c63187d52a632448e1d2
54637508f87bd5f07fb9406bac6b08240283be3b

* Add PG15 tests to CI using test images that have 15beta2 (#6093)

* Change warning message in pg_signal_backend()

Relevant PG commit:
7fa945b857cc1b2964799411f1633468826861ff

* Revert "Add missing ifdef for PG 15"

This reverts commit c7b51025ab.

* Fixes tests for ALTER TRIGGER RENAME consistency for part. tables

Relevant PG commit:
80ba4bb383538a2ee846fece6a7b8da9518b6866

* Prevent creating child triggers on partitions when adding new node

Pre PG15, tgisinternal is true for a "child" trigger on a partition
cloned from the trigger on the parent.
In PG15, tgisinternal is false in that case. However, we don't want to
create this trigger on the partition since it will create a conflict
when we try to attach the partition to the parent table:
ERROR: trigger "..." for relation "{partition_name}" already exists

Relevant PG commit:
f4566345cf40b068368cb5617e61318da60676ec

* Fix tests for generated columns dependency changes

In PG15, For GENERATED columns, all dependencies of the generation
expression are recorded as NORMAL dependencies of the column itself.
This requires CASCADE to drop generated cols with the original col.
PRE PG15, dependencies were recorded as AUTO, with which
generated columns are silently dropped with the original column.

Relevant PG commit:
cb02fcb4c95bae08adaca1202c2081cfc81a28b5

* Explicitly cast catalog "char" column to text before concatenation

Relevant PG commit:
07eee5a0dc642d26f44d65c4e6263304208e8583

* Remove 'AS "?column?"' from test outputs

There were some instances in the following tst outputs
in planning debug outputs where AS "?column?" is added.
We add a normalization rule to remove it as it is not
important.

cte_inline.out
recursive_relation_planning_restriction_pushdown.out

Relevant PG commit:
c7461fc25558832dd347a9c8150b0f1ed85e36e8

* Use pg_backup_stop(PG15) instead of pg_stop_backup(PG<15)

Add an alternative test output because of the change in the
backup modes of Postgres. Specifically here, there is a renaming
issue: pg_stop_backup PRE PG15 vs pg_backup_stop PG15+
The alternative output can be deleted when we drop support for PG14

Relevant PG commit:
39969e2a1e4d7f5a37f3ef37d53bbfe171e7d77a

* Adds citus.mitmfifo GUC

Previously we setting this configuration parameter
in the fly for failure tests schedule.
However, PG15 doesn't allow that anymore: reserved prefixes
like "citus" cannot be used to set non-existing GUCs.

Relevant PG commit:
88103567cb8fa5be46dc9fac3e3b8774951a2be7

* Handles EXPLAIN output diffs in PG15 - Extra result lines

To handle extra "Result" lines in explain outputs, we add explain
method to multi_test_helpers.sql file
- plan_without_result_lines() is added for cases where we want the
whole explain output with only "Result" lines removed

* Handles EXPLAIN output diffs in PG15, Hash Agg/Join leverage

To handle differences in usage of GroupAggregate vs HashAggregate
or Merge Join vs Hash join in cases where this detail doesn't
seem to matter, we use coordinator_plan().
- coordinator_plan() is updated to remove "Result" lines

There are some cases where we have subplans so we add a new
function that prints all Task Count lines as well
- coordinator_plan_with_subplans()

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.

* Handles EXPLAIN output diffs in PG15: enable_group_by_reordering

Relevant PG commit
db0d67db2401eb6238ccc04c6407a4fd4f985832

* Normalizes Memory Usage, Buckets, Batches for PG15 explain diffs

We create a new function in multi_test_helpers, which is similar
to explain_merge function in PG15. This explain helper function
normalies Memory Usage, Buckets and Batches, and we use it in the
tests which give a different output for PG15.

* Bump test images to 15beta3 (#6172)

* Omit namespace in post-copy errmsg

Relevant PG commit:
069d33d0c5a021601245e44df77a0423ddd69359

* Handles EXPLAIN output diffs in PG15: extra arrows&result lines

To handle extra "->" arrows resulting from extra Result lines
in explain outputs, we add the following explain method to
multi_test_helpers.sql file

- plan_without_arrows() is added for cases where we want the
whole explain output without arrows and without Result lines

* Alters public schema's owner to pg_database_owner in PG15

In PG15, public schema is owned by pg_database_owner role.
In multi_extension, we drop and recreate the ppublic schema,
hence its owner become the default user in our tests, postgres.
Change that to pg_database_owner for PG15 consistency.

This results in alternative test output for public schema grants
in the following test:

grant_on_schema_propagation.sql

Relevant PG commit: b073c3ccd06e4cb845e121387a43faa8c68a7b62

* Add alternative test outputs for change in Insert Select display

citus_local_tables_queries.sql
coordinator_shouldhaveshards.sql
cte_inline.sql
insert_select_repartition.sql
intermediate_result_pruning.sql
local_shard_execution.sql
local_shard_execution_replicated.sql
multi_deparse_shard_query.sql
multi_insert_select.sql
multi_insert_select_conflict.sql
multi_mx_insert_select_repartition.sql
mx_coordinator_shouldhaveshards.sql
single_node.sql

Relevant PG commit:
a8d8445a7b2f80f6d0bfe97b19f90bd2cbef8759

* Fixes columnar tap tests for PG15

In PG15, Perl test modules have been moved to a new namespace.
Also, postgres node new() and get_new_node() methods have been
unified to one method: new()

We create separate tap tests for PG13/14 and PG15+
and update the Makefiles accordingly.

Relevant PG commits:
201a76183e2056c2217129e12d68c25ec9c559c8
b3b4d8e68ae83f432f43f035c7eb481ef93e1583

* Handles EXPLAIN output diffs in PG15: HashAgg Leverage,alt. output

Still not sure of the relevant PG commit
Could be db0d67db2401eb6238ccc04c6407a4fd4f985832
but disabling enable_group_by_reordering didn't help.
2022-08-24 17:59:17 +02:00
aykut-bozkurt 041f88d7bf
Revert "Revert "Creates new colocation for colocate_with:='none' too"" (#6227)
This reverts commit d171a736ab.
2022-08-24 10:54:04 +03:00
Jelte Fennema e0ada050aa
Enable binary logical replication for shard moves (#6017)
Using binary encoding can save a lot of CPU cycles, both on the sender
and on the receiver. Since the walsender and walreceiver processes are
single threaded, this can matter a lot for the throughput if they are
bottlenecked on CPU.

This feature is only available in PG14, not PG13. It should be safe to 
always enable because it's only used for types that support binary 
encoding according to the PG docs:
> Even when this option is enabled, only data types that have binary 
> send and receive functions will be transferred in binary.

But in case it causes problems, it can still be disabled by setting
`citus.enable_binary_protocol` to `false`.
2022-08-23 16:38:00 +02:00
aykut-bozkurt 07cfba461a
ensuring reference tables on nodes should not create colocation entry. (#6224)
We create colocation entry in create_reference_table.
2022-08-23 16:17:59 +03:00
Marco Slot 639588bee0
Remove unused functions (#6220)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-22 11:53:25 +03:00
Jelte Fennema 3fadb98380
Fix compilation warning on PG13 + OpenSSL 3.0 (#6038)
This removes some warnings that are present when building on Ubuntu 22.04. 
It removes warnings on PG13 + OpenSSL 3.0. OpenSSL 3.0 has marked some 
functions that we use as deprecated, but we want to continue support OpenSSL
1.0.1 for the time being too. This indicates that to OpenSSL 3.0, so it doesn't 
show warnings.
2022-08-19 05:51:47 -07:00
Marco Slot 5160cafa82
Do not propagate GRANT ON SCHEMA from CREATE EXTENSION (#6175)
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-08-19 13:23:47 +03:00
Jelte Fennema 31faa88a4e
Track rebalance progress at the shard move level (#6187)
We're in the processes of totally changing the shard rebalancer
experience and infrastructure. Soon the shard rebalancer will include
retries, crash recovery and support for running in the background.

These improvements come at a cost though, the way the
get_rebalance_progress UDF currently works is very hard to replicate
with this new structure. This is mostly because the old behaviour
doesn't really make sense anymore with this new infrastructure. A new
and better way to track the progress will be included as part of the new
infrastructure.

This PR is in preparation of the new code rebalancer experience.
It changes the get_rebalance_progress UDF to only display the moves that
are in progress at the moment, not the ones that happened in the past or
that are planned in the future. Another option would have been to
completely remove the current get_rebalance_progress functionality and
point people to the new way of tracking progress. But old blogposts
still reference the old UDF and users might have some automation on top
of it. Showing the progress of the current moves is fairly simple to
achieve, even with the new infrastructure.

So this PR is a kind of compromise: It doesn't have complete feature
parity with the old get_rebalance_progress, but the most common use
cases will still work.

There's also an advantage of the change: You can now see progress of
shard moves that were triggered by calling citus_move_shard_placement
manually. Instead of only being able to see progress of moves that were
initiated using get_rebalance_table_shards.
2022-08-18 18:57:04 +02:00
Onder Kalaci 9ec8e627c1 Support Sequences owned by columns before distributing tables
There are 3 different ways that a sequence can be interacting
with tables. (1) and (2) are already supported. This commit adds
support for (3).

     (1) column DEFAULT nextval('seq'):

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
         ^                                     |
         |------------------ sequence <--------|

    (2) serial columns: Bigserial/small serial etc:

	The dependency is roughly like below,
	and ExpandCitusSupportedTypes() is responsible
	for finding the depending sequences.

        schema <--- table <--- column <---- default value
                                 ^             |
				 |             |
          		     sequence <--------|

   (3) Sequence OWNED BY table.column: Added support for
       this type of resolution in this commit.

       The dependency is almost like the following, and
       ExpandCitusSupportedTypes() is NOT responsible for finding
       the dependency.

        schema <--- table <--- column
                                 ^
				 |
          		     sequence
2022-08-18 10:29:40 +02:00
Naisila Puka 69ffdbf0e3
Uses object name in cannot distribute object error (#6186)
Object type ids have changed in PG15 because of at least two added
objects in the list: OBJECT_PARAMETER_ACL, OBJECT_PUBLICATION_NAMESPACE

To avoid different output between pg versions, let's use the object
name in the error, and put the object id in the error detail.

Relevant PG commits:
a0ffa885e478f5eeacc4e250e35ce25a4740c487
5a2832465fd8984d089e8c44c094e6900d987fcd
2022-08-18 11:05:17 +03:00
Ying Xu 91473635db
[Columnar] Check for existence of Citus before creating Citus_Columnar (#6178)
* Added a check to see if Citus has already been loaded before creating citus_columnar

* added tests
2022-08-17 15:12:42 -07:00
Nils Dijk a9d47a96f6
Fix reference table lock contention (#6173)
DESCRIPTION: Fix reference table lock contention

Dropping and creating reference tables unintentionally blocked on each other due to the use of an ExclusiveLock for both the Drop and conditionally copying existing reference tables to (new) nodes.

The patch does the following:
 - Lower lock lever for dropping (reference) tables to `ShareLock` so they don't self conflict
 - Treat reference tables and distributed tables equally and acquire the colocation lock when dropping any table that is in a colocation group
 - Perform the precondition check for copying reference tables twice, first time with a lower lock that doesn't conflict with anything. Could have been a NoLock, however, in preparation for dropping a colocation group, it is an `AccessShareLock`

During normal operation the first check will always pass and we don't have to escalate that lock. Making it that we won't be blocked on adding and remove reference tables. Only after a node addition the first `create_reference_table` will still need to acquire an `ExclusiveLock` on the colocation group to perform the copy.
2022-08-17 18:19:28 +02:00
Ahmet Gedemenli 0631e1998b
Fix upgrade paths for #6100 (#6176)
* Fix upgrade paths for #6100

Co-authored-by: Hanefi Onaldi <Hanefi.Onaldi@microsoft.com>
2022-08-17 18:56:53 +03:00
Jelte Fennema 3f6ce889eb
Use CreateSimpleHash (and variants) whenever possible (#6177)
This is a refactoring PR that starts using our new hash table creation
helper function. It adds a few more macros for ease of use, because C
doesn't have default arguments. It also adds a macro to check if a
struct contains automatic padding bytes. No struct that is hashed using
tag_hash should have automatic padding bytes, because those bytes are
undefined and thus using them to create a hash will result in undefined
behaviour (usually a random hash).
2022-08-17 13:01:59 +03:00
aykut-bozkurt 52efe08642
default mode for shard splitting is set to auto. (#6179) 2022-08-17 12:18:47 +03:00
aykut-bozkurt be06d65721
Nonblocking tenant isolation is supported by using split api. (#6167) 2022-08-17 11:13:07 +03:00
Jelte Fennema 78a5013e24
Support changing CPU priorities for backends and shard moves (#6126)
**Intro**
This adds support to Citus to change the CPU priority values of
backends. This is created with two main usecases in mind:

1. Users might want to run the logical replication part of the shard moves
   or shard splits at a higher speed than they would do by themselves. 
   This might cause some small loss of DB performance for their regular 
   queries, but this is often worth it. During high load it's very possible
   that the logical replication WAL sender is not able to keep up with the
   WAL that is generated. This is especially a big problem when the
   machine is close to running out of disk when doing a rebalance.
2. Users might have certain long running queries that they don't impact
   their regular workload too much.

**Be very careful!!!**
Using CPU priorities to control scheduling can be helpful in some cases
to control which processes are getting more CPU time than others. 
However, due to an issue called "[priority inversion][1]" it's possible that
using CPU priorities together with the many locks that are used within
Postgres cause the exact opposite behavior of what you intended. This
is why this PR only allows the PG superuser to change the CPU priority 
of its own processes. Currently it's not recommended to set `citus.cpu_priority`
directly. Currently the only recommended interface for users is the setting 
called `citus.cpu_priority_for_logical_replication_senders`. This setting
controls CPU priority for a very limited set of processes (the logical 
replication senders). So, the dangers of priority inversion are also limited
with when using it for this usecase.

**Background**
Before reading the rest it's important to understand some basic
background regarding process CPU priorities, because they are a bit
counter intuitive. A lower priority value, means that the process will
be scheduled more and whatever it's doing will thus complete faster. The
default priority for processes is 0. Valid values are from -20 to 19
inclusive. On Linux a larger difference between values of two processes
will result in a bigger difference in percentage of scheduling.

**Handling the usecases**
Usecase 1 can be achieved by setting `citus.cpu_priority_for_logical_replication_senders`
to the priority value that you want it to have. It's necessary to set
this both on the workers and the coordinator. Example:
```
citus.cpu_priority_for_logical_replication_senders = -10
```

Usecase 2 can with this PR be achieved by running the following as
superuser. Note that this is only possible as superuser currently 
due to the dangers mentioned in the "Be very carefull!!!" section. 
And although this is possible it's **NOT** recommended:
```sql
ALTER USER background_job_user SET citus.cpu_priority = 5;
```

**OS configuration**
To actually make these settings work well it's important to run Postgres
with more a more permissive value for the 'nice' resource limit than
Linux will do by default. By default Linux will not allow a process to
set its priority lower than it currently is, even if it was lower when
the process originally started. This capability is necessary to reset
the CPU priority to its original value after a transaction finishes.
Depending on how you run Postgres this needs to be done in one of two
ways:

If you use systemd to start Postgres all you have to do is add  a line
like this to the systemd service file:
```conf
LimitNice=+0 # the + is important, otherwise its interpreted incorrectly as 20
```

If that's not the case you'll have to configure `/etc/security/limits.conf` 
like so, assuming that you are running Postgres as the `postgres` OS user:
```
postgres            soft    nice            0
postgres            hard    nice            0
```
Finally you'd have add the following line to `/etc/pam.d/common-session`
```
session required pam_limits.so
```

These settings would allow to change the priority back after setting it
to a higher value.

However, to actually allow you to set priorities even lower than the
default priority value you would need to change the values in the 
config to something lower than 0. So for example:
```conf
LimitNice=-10
```

or

```
postgres            soft    nice            -10
postgres            hard    nice            -10
```

If you use WSL2 you'll likely have to do another thing. You have to 
open a new shell, because when PAM is only used during login, and 
WSL2 doesn't actually log you in. You can force a login like this:
```
sudo su $USER --shell /bin/bash
```
Source: https://stackoverflow.com/a/68322992/2570866

[1]: https://en.wikipedia.org/wiki/Priority_inversion
2022-08-16 13:07:17 +03:00
Jelte Fennema 1a01c896f0
Fix description of citus.distributed_deadlock_detection_factor (#5860)
The long description of the `citus.distributed_deadlock_detection_factor` 
setting was incorrectly stating that 1000 would disable it. Instead -1 
is the value that disables distributed deadlock detection.
2022-08-16 01:19:49 +03:00
Jelte Fennema 43c2a1e88b
Share more code between splits and moves (#6152)
When introducing non-blocking shard split functionality it was based
heavily on the non-blocking shard moves. However, differences between
usage was slightly to big to be able to reuse the existing functions
easily. So, most logical replication code was simply copied to dedicated
shard split functions and modified for that purpose.

This PR tries to create a more generic logical replication
infrastructure that can be used by both shard splits and shard moves.
There's probably more code sharing possible in the future, but I believe
this is at least a good start and addresses the lowest hanging fruit.

This also adds a CreateSimpleHash function that makes creating the
most common type of hashmap common.
2022-08-15 20:21:51 +03:00
Marco Slot 6c73576606 Fix HTAB memory leaks 2022-08-15 16:10:24 +02:00
Teja Mupparti e962113c63 Remove the GUC mention in the error message as this config is meant for advanced users 2022-08-11 09:43:14 -07:00
aykut-bozkurt 898801504e
sysid should be parsed as int. (#6150) 2022-08-11 10:44:46 +03:00
aykut-bozkurt 166272963a
log NOTICE createdb only if EnableUnsupportedFeatureMessages GUC is enabled. (#6151) 2022-08-09 21:21:22 +03:00
aykut-bozkurt cc694b6bcf
we consider stat object as invalid if it is not owned by current user (#6130) 2022-08-09 20:59:30 +03:00
Hanefi Onaldi a58523f1d8
Remove all references to .source files 2022-08-09 14:15:52 +03:00
Jelte Fennema 8017693b2f
Allow specifying the shard_transfer_mode when replicating reference tables (#6070)
When using `citus.replicate_reference_tables_on_activate = off`,
reference tables need to be replicated later. This can be done using the
`replicate_reference_tables()` UDF. However, this function only allowed
blocking replication. This changes the function to default to logical
replication instead, and allows choosing any of our existing shard
transfer modes.
2022-08-09 13:21:31 +03:00
Marco Slot 3b57ff2867 Fix crash in citus_copy_shard_placement 2022-08-09 09:31:05 +02:00
Jelte Fennema dd548ee3c7
Use faster custom copy logic for non-blocking shard moves (#6119)
DESCRIPTION: Use faster custom copy logic for non-blocking shard moves

Non-blocking shard moves consist of two main phases:
1. Initial data copy
2. Catchup phase

This changes the first of these phases significantly. Previously we used the
copy logic provided by postgres subscriptions. This meant we didn't have
to implement it ourselves, but it came with the downside of little control.
When implementing shard splits we needed more control to even make it
work, so we implemented our own logic for copying data between nodes.

This PR starts using that logic for non-blocking shard moves. Doing so
has four main advantages:
1. It uses COPY in binary format when possible, which is cheaper to encode 
    and decode. Furthermore it very often results in less data that needs to 
    be sent over the network.
2. It allows us to create the primary key (or other replica identity) after doing
    the initial data copy. This should give some speed up over the total run,
    because creating an index is bulk is much faster than incrementally building it.
3. It doesn't require a replication slot per parallel copy. Increasing the maximum
    number of replication slots uses resources in postgres, even if they are not used.
    So reducing the number of replication slots that shard moves need is nice.
4. Logical replication table_sync workers are slow to start up, so if lots of shards
    need to be copied that can make it quite slow. This can happen easily when
    combining Postgres partitioning with Citus.
2022-08-08 17:09:43 +02:00
Marco Slot ead9d28835 Avoid deadlocks on split failure by closing connections 2022-08-08 13:33:23 +02:00
Marco Slot 044dd26e40 Reimplement tenant isolation on top of block shard split 2022-08-08 13:33:23 +02:00
Teja Mupparti 430c201d03 get_current_transaction_id() UDF is not printing the timestamp of the current transaction on the coordinator even when non-null 2022-08-05 10:12:07 -07:00
aykut-bozkurt 4992533e33
support grant statement propagation for aggregates (#6132) 2022-08-05 14:47:33 +03:00
Ahmet Gedemenli 8b68b0b5bb
Fix pg upgrade script for foreign tables (#6100)
Fixes unexpected error for foreign tables when upgrading pg
2022-08-05 13:35:17 +03:00
Sameer Awasekar e236711eea Introduce Non-Blocking Shard Split Workflow 2022-08-04 16:32:38 +02:00
aykut-bozkurt b67abdd28c
we should not log error in preprocess if attached partition is missing. (#6131) 2022-08-04 15:49:14 +03:00
aykut-bozkurt 3ddc089651
stop distributing views with no distributed dependency if GUC DistributeLocalViews is set false. (#6083) 2022-08-04 12:34:40 +03:00
aykut-bozkurt 4ffe436bf9
we validate constraint as well if the statement is alter domain drop constraint (#6125) 2022-08-03 23:06:33 +03:00
aykut-bozkurt a662331668
qualify text dict and conf respect missingok (#6120) 2022-08-03 13:13:53 +03:00
aykutbozkurt 7387c7ed3d address method should take parameter isPostprocess 2022-08-02 21:00:23 +03:00
aykutbozkurt c98a68662a introduces operation type for dist ops 2022-08-02 20:42:32 +03:00
aykutbozkurt 57ce4cf8c4 use address method to decide if we should run preprocess and postprocess steps for a distributed object 2022-08-02 20:42:32 +03:00
Onder Kalaci c7b51025ab Add missing ifdef for PG 15 2022-08-02 09:46:53 +02:00
Jelte Fennema abffa6c3b9
Use shard split copy code for blocking shard moves (#6098)
The new shard copy code that was created for shard splits has some
advantages over the old shard copy code. The old code was using 
worker_append_table_to_shard, which wrote to disk twice. And it also 
didn't use binary copy when that was possible. Both of these issues
were fixed in the new copy code. This PR starts using this new copy
logic also for shard moves, not just for shard splits.

On my local machine I created a single shard table like this.
```sql
set citus.shard_count = 1;
create table t(id bigint, a bigint);
select create_distributed_table('t', 'id');

INSERT into t(id, a) SELECT i, i from generate_series(1, 100000000) i;
```

I then turned `fsync` off to make sure I wasn't bottlenecked by disk. 
Finally I moved this shard between nodes with `citus_move_shard_placement`
with `block_writes`.

Before this PR a move took ~127s, after this PR it took only ~38s. So for this 
small test this resulted in spending ~70% less time.

And I also tried the same test for a table that contained large strings:
```sql
set citus.shard_count = 1;
create table t(id bigint, a bigint, content text);
select create_distributed_table('t', 'id');

INSERT into t(id, a, content) SELECT i, i, 'aunethautnehoautnheaotnuhetnohueoutnehotnuhetncouhaeohuaeochgrhgd.athbetndairgexdbuhaobulrhdbaetoausnetohuracehousncaoehuesousnaceohuenacouhancoexdaseohusnaetobuetnoduhasneouhaceohusnaoetcuhmsnaetohuacoeuhebtokteaoshetouhsanetouhaoug.lcuahesonuthaseauhcoerhuaoecuh.lg;rcydabsnetabuesabhenth' from generate_series(1, 20000000) i;
```
2022-08-01 20:10:36 +03:00
Naisila Puka 85324f3acc
Clean up multi_shard_commit_protocol guc leftovers (#6110) 2022-08-01 15:22:02 +03:00
aykut-bozkurt f372e93d22
we supress notice log during looking up function oid to not break pg vanilla tests. (#6082) 2022-08-01 10:14:35 +03:00
Önder Kalacı cbdc2b3019
Merge branch 'main' into fix_relation_acess_2 2022-07-29 16:45:02 +02:00
Marco Slot 6d6e44166f Avoid catalog read via superuser() call in DecrementSharedConnectionCounter 2022-07-29 14:05:41 +02:00
Onder Kalaci bdaeb40b51 Add missing relation access record for local utility command
While testing 5670dffd33, I realized
that we have a missing RecordNonDistTableAccessesForTask() for
local utility commands.

Although we don't have to record the relation access for local
only cases, we really want to keep the behaviour for scale-out
be the same with single node on all aspects. We wouldn't want
any single node complex transaction to work on single machine,
but not on multi node cluster. Hence, we apply the same restrictions.

For example, on a distributed cluster, the following errors, and
after this commit this errors locally as well

```SQL
CREATE TABLE ref(a int primary key);
INSERT INTO ref VALUES (1);

CREATE TABLE dist(a int REFERENCES ref(a));
SELECT create_reference_table('ref');
SELECT create_distributed_table('dist', 'a');

BEGIN;
		SELECT * FROM dist;
		TRUNCATE ref CASCADE;

ERROR:  cannot execute DDL on table "ref" because there was a parallel SELECT access to distributed table "dist" in the same transaction
HINT:  Try re-running the transaction with "SET LOCAL citus.multi_shard_modify_mode TO 'sequential';"

COMMIT;
```

We also add the comprehensive test suite and run the same locally.
2022-07-29 11:36:33 +02:00
Onder Kalaci 149771792b Remove useless version compats
most likely leftover from earlier versions
2022-07-29 10:31:55 +02:00
Ying Xu 7c1a93b26b
Removed USE_PGXS snippet in Makefile that was blocking citus build when flag is set (#6101)
Code snippet in Makefile was blocking Citus build when USE_PGXS flag was set. This was included for port to FSPG but is not needed for Citus engine and can be safely removed.
2022-07-28 14:15:45 -07:00
aykut-bozkurt a218198e8f
reindex object address should return invalid addresses for unsepported object types in reindex stmt (#6096) 2022-07-28 15:31:49 +03:00
Marco Slot cff013a057 Fix issues with insert..select casts and column ordering 2022-07-28 13:23:57 +02:00
aykut-bozkurt 789d5b9ef9
null check for server in GetObjectAddressByServerName (#6095) 2022-07-28 13:13:28 +03:00
Onder Kalaci 0a5112964d Call relation access hash clean-up irrespective of remote transaction state
Mainly because local-only transactions should be cleaned up
2022-07-28 11:27:59 +02:00
Onder Kalaci d67cf907a2 Detach relation access tracking from connection management 2022-07-28 11:27:59 +02:00
Ying Xu fdf090758b
Bugfix for IN clause to be considered during planner phase in Columnar (#6030)
Reported bug #5803 shows that we are currently not sending the IN clause to our planner for columnar. This PR fixes it by checking for ScalarArrayOpExpr in ExtractPushdownClause so that we do not skip it. Also added a test case for this new addition.
2022-07-27 11:06:49 -07:00
Jelte Fennema 0f50bef696
Avoid possible information leakage about existing users (#6090) 2022-07-27 17:46:32 +02:00
Ahmet Gedemenli 2b2a529653
Error out for views with circular dependencies (#6051)
Adds error check for views with circular dependencies
2022-07-27 17:57:45 +03:00
aykut-bozkurt b08e5ec29d
added some missing object address callbacks (#6056) 2022-07-27 17:36:04 +03:00
Naisila Puka 1259d83511
Smallfix in CreateCollationDDL logic (#6089) 2022-07-27 14:33:31 +03:00
Onder Kalaci 5bc8a81aa7 Add colocation checks for shard splits 2022-07-27 10:01:19 +02:00
Onder Kalaci 12fa3aaf6b Concurrent shard move/copy and colocated table creation fix
It turns out that create_distributed_table
and citus_move/copy_shard_placement does not
work well concurrently.

To fix that, we need to acquire a lock, which
sounds like a good use of colocation lock.

However, the current usage of colocation lock is
limited to higher level UDFs like rebalance_table_shards
etc. Those usage of lock is still useful, but
we cannot acquire the same lock on citus_move_shard_placement
etc. because the coordinator connects to itself to acquire
the lock. Hence, the high level UDF blocks itself.

To fix that, we use one more colocation lock, with the placements
are the main objects to consider.
2022-07-27 10:01:19 +02:00
Onder Kalaci f076e81166 Do not cache all the metadata during fix_all_partition_shard_index_names 2022-07-27 09:49:08 +02:00
Onder Kalaci 26fdcb68f0 Optimize StringJoin() for when prefix-postfix is needed
Before this commit, we required multiple copies of the
same stringInfo if we needed to append/prepend data to
the stringInfo. Now, we optionally get prefix/postfix.

For large string operations, this can save up to %10
memory.
2022-07-27 09:49:08 +02:00
Onder Kalaci b8008999dc Reduce memory consumption while adjust partition index names
Previously, CreateFixPartitionShardIndexNames() created all
the relevant query strings for all the shards, and executed
the large query string. And, in terms of the memory consumption,
this huge command (and its ExprContext generated while running
the command) is the main bottleneck/

With this change, we are reducing the total amount of memory
usage to almost 1/shard_count.

On my local machine, a distributed partitioned table with 120 partitions,
each 32 shards, the total memory consumption reduced from ~3GB
to ~0.1GB. And, the total execution time increased from ~28 seconds
to ~30 seconds. This seems like a good trade-off.
2022-07-27 09:49:08 +02:00
aykut-bozkurt 5f27445b69
enable propagation warnings before postgres vanilla tests (#6081) 2022-07-27 10:34:41 +03:00
Onder Kalaci 6c65d29924 Check the PGPROC's validity properly
We used to only check whether the PID is valid
or not. However, Postgres does not necessarily
set the PID of the backend to 0 when it exists.

Instead, we need to be able to check it from procArray.
IsBackendPid() is what pg_stat_activity also relies
on for a similar purpose.
2022-07-26 17:44:44 +02:00
aykut-bozkurt 67ac3da2b0
added citus_depended_objects udf and HideCitusDependentObjects GUC to hide citus depended objects from pg meta queries (#6055)
use RecurseObjectDependencies api to find if an object is citus depended

make vanilla tests runnable to see if citus_depended function is working correctly
2022-07-25 16:43:34 +03:00
Marco Slot 5fabf94e39 Allow WITH HOLD cursors with parameters 2022-07-21 12:00:59 +02:00
Hanefi Onaldi eb3e5ee227 Introduce citus_locks view
citus_locks combines the pg_locks views from all nodes and adds
global_pid, nodeid, and relation_name. The columns of citus_locks don't
change based on the Postgres version, however the pg_locks's columns do.
Postgres 14 added one more column to pg_locks (waitstart timestamptz).
citus_locks has the most expansive column set, including the newly added
column. If citus_locks is queried in a Postgres version where pg_locks
doesn't have some columns, the values for those columns in citus_locks
will be NULL
2022-07-21 03:06:57 +03:00
Nitish Upreti 3d569cc49a
Shard Split support for Columnar and Partitioned Table (#6067)
DESCRIPTION:
This PR extends support for Partitioned and Columnar tables in blocking 'citus_split_shard_by_split_points' workflow.
Columnar Support : No special handling required. Just removing checks that fails split for columnar table and adding test coverage.
Partitioned Table Support :

Skip copying of parent table as they are empty, The partitions contain data and are treated as co-located shards that will be copied separately.
Attach partitions to parent on destination after inserting new shard metadata and before creating foreign key constraints.
MISC:
Fix Bug #4949 where Blocking shard moves fails if there is a foreign key between partitioned distributed tables (from child to parent).

TEST:
Added new test 'citus_split_shards_columnar_partitioned' for splitting 'partitioned' and 'columnar + partitioned' table.
Added new test 'shard_move_constraints_blocking' to add coverage for shard move bug fix.
Updated test 'citus_split_shard_by_split_points_negative' to allow columnar and partitioned table.
2022-07-20 12:24:50 -07:00
Naisila Puka 7d6410c838
Drop postgres 12 support (#6040)
* Remove if conditions with PG_VERSION_NUM < 13

* Remove server_above_twelve(&eleven) checks from tests

* Fix tests

* Remove pg12 and pg11 alternative test output files

* Remove pg12 specific normalization rules

* Some more if conditions in the code

* Change RemoteCollationIdExpression and some pg12/pg13 comments

* Remove some more normalization rules
2022-07-20 17:49:36 +03:00
aykutbozkurt 108ca875ad fix assertion bugs related to list length 2022-07-20 10:53:12 +03:00
aykutbozkurt ebb6d1c8c0 refactor code where GetObjectAddressFromParseTree is called because it returns list of addresses now 2022-07-19 18:13:12 +03:00
aykutbozkurt 9d232d7b00 change address method to return list of addresses 2022-07-19 18:13:11 +03:00
Önder Kalacı 90b1afe31e
Merge branch 'main' into baby_step_pg_15 2022-07-18 15:02:39 +02:00
Nitish Upreti 5b3537cdff
Shard Split for Citus (#6029)
* Blocking split setup

* Add missing type

* Missing API from Metadata Sync

* Shard Split e2e code

* Worker Split Copy DestReceiver skeleton

* Basic destreceiver code

* worker_split_copy UDF

* UDF calling

* Split points are text

* Isolate Tenant and Split Shard Unification

* Fixing executor and misc

* Reindent code

* Fixing UDF definitions

* Hello World Local Copy works

* Remote copy hello world works

* Local and Remote binary test

* Fixing text local copy and adding tests

* Hello World shard split works

* Negative tests

* Blocking Split workflow works

* Refactor

* Bug fix

* Reindent

* Cleaning up and adding comments

* Basic test for shard split workflow

* ReIndent

* Circle CI integration

* Removing include causing circle-ci build failure

* Remove SplitCopyDestReceiver and use PartitionedResultDestReceiver

* Add support for citus.enable_binary_protocol

* Reindent

* Fix build break

* Update Test

* Cleanup on catch

* Addressing open comments

* Update downgrade script and quote schema/table in COPY statement

* Fix metadata sync issue. Update regression test

* Isolation test and bug fix

* Add Isolation test, fix foreign constraint deadlock issue

* Misc code review comments

* Test name needing to be quoted

* Refactor code from review comments

* Explaining shardGroupSplitIntervalListList

* Fix upgrade & downgrade

* Fix broken test

* Test fix Round 2

* Fixing bug and modifying test appropriately

* Fully qualify copy udf name. Run Reindent

* Address PR comments

* Fix null handling when creating AuxiliaryStructures

* Ensure local copy is triggered in tests

* Limit max shards that can be created with split

* Test failure fix

* Remove split_mode and use shard_transfer_mode instead'

* Fix test failure

* Fix test failure

* Fixing permission issue when splitting non-superuser owned tables

* Fix test expected output

* Remove extra space

* Fix test

* attempt to fix test

* Addressing Marco's PR comment

* Only clean shards created by workflow

* Remove from merge

* Update test
2022-07-18 02:54:15 -07:00
Onder Kalaci 3eaef027e2 Remove unused code
Probably left over from removing old repartitioning code
2022-07-15 10:28:46 +02:00
Onder Kalaci 483a3a5875 PG 15 Compat: Resolve compile issues + shmem requests
Similar to #5897, one more step for running Citus with PG 15.

This PR at least make Citus run with PG 15. I have not tried running the tests with PG 15.

Shmem changes are based on 4f2400cb3f

Compile breaks are mostly due to #6008
2022-07-15 10:11:39 +02:00
ywj 1675519f93
Support citus_columnar as separate extension (#5911)
* Support upgrade and downgrade and separate columnar as citus_columnar extension

Co-authored-by: Yanwen Jin <yanwjin@microsoft.com>
Co-authored-by: Jeff Davis <jeff@j-davis.com>
2022-07-13 21:08:29 -07:00
Onder Kalaci b2e9a5baf1 Make sure citus_is_coordinator works on read replicas 2022-07-13 14:11:18 +02:00
Onder Kalaci 8ab696f7e2 LOCK COMMAND does not require primaries at the start 2022-07-13 14:08:49 +02:00
aykutbozkurt da089d72c5 we should check if relation is valid after fetching a relation 2022-07-06 16:35:01 +03:00
Halil Ozan Akgul 1490acbbe9 Removes incorrect parameter from get_all_active_transactions 2022-07-06 11:35:46 +03:00
aykutbozkurt d53a7760b0 * alter index/table rename weird syntax supported,
* correct the wrong level of lock if the weird syntax is used
2022-07-04 21:27:47 +03:00
aykutbozkurt ba62c0a148 auto is a valid option for vacuum index_cleanup. 2022-07-04 19:27:55 +03:00
Ahmet Gedemenli c8e1e243b8
Fix matviews for citus_add_local_table_to_metadata (#6023) 2022-07-04 17:00:07 +03:00
Hanefi Onaldi f60809a6c1
Fix downgrade scripts from 11.0-2 to 11.0-1 (#6039) 2022-06-29 22:43:50 +03:00
Onder Kalaci bab4c0a8c3 Fixes a bug that prevents upgrades when there are no worker nodes 2022-06-28 15:54:49 +02:00
Onder Kalaci bd3a070369 Fixes a bug that prevents upgrades when there COMPRESSION and DEFAULT columns 2022-06-28 13:36:00 +02:00
aykutbozkurt 8194dc4c62 * Added isolation tests for vacuum,
* Added more regression tests for more vacuum options,
* Fixed deadlock for unqualified vacuum when there is only 1 worker,
* Supported lock_skipped for vacuum.
2022-06-23 15:33:14 +03:00
aykutbozkurt 1d6c81245c fix bug, which is column mismatch of shard tasks when specifying column names for citus tables in vacuum and analyze commands 2022-06-23 15:33:14 +03:00
Aykut Bozkurt 6986f53835 propagate unqualified vacuum and analyze to all worker nodes 2022-06-23 15:33:14 +03:00
Ahmet Gedemenli 1ee3e8b7f4
Fix creating stats bug when CREATE TABLE LIKE (#6006) 2022-06-16 12:43:47 +03:00
Jelte Fennema 184c7c0bce
Make enterprise features open source (#6008)
This PR makes all of the features open source that were previously only
available in Citus Enterprise.

Features that this adds:
1. Non blocking shard moves/shard rebalancer
   (`citus.logical_replication_timeout`)
2. Propagation of CREATE/DROP/ALTER ROLE statements
3. Propagation of GRANT statements
4. Propagation of CLUSTER statements
5. Propagation of ALTER DATABASE ... OWNER TO ...
6. Optimization for COPY when loading JSON to avoid double parsing of
   the JSON object (`citus.skip_jsonb_validation_in_copy`)
7. Support for row level security
8. Support for `pg_dist_authinfo`, which allows storing different
   authentication options for different users, e.g. you can store
   passwords or certificates here.
9. Support for `pg_dist_poolinfo`, which allows using connection poolers
   in between coordinator and workers
10. Tracking distributed query execution times using
   citus_stat_statements (`citus.stat_statements_max`,
   `citus.stat_statements_purge_interval`,
   `citus.stat_statements_track`). This is disabled by default.
11. Blocking tenant_isolation
12. Support for `sslkey` and `sslcert` in `citus.node_conninfo`
2022-06-16 00:23:46 -07:00
Burak Velioglu e244e9ffb6
Fix dropping temporary view without specifying the explicit schema name (#6003) 2022-06-15 16:41:12 +02:00
Marco Slot ee34e1ed9d Fix bug in unqualified, non-existing DROP DOMAIN IF EXISTS 2022-06-15 13:59:08 +02:00
Ahmet Gedemenli 268d3fa3a6
Fix materialized view intermediate result filename (#5982) 2022-06-14 15:07:08 +03:00
Marco Slot 36c4ec6d53 Introduce a citus_finish_citus_upgrade() function 2022-06-13 13:15:15 +02:00
Halil Ozan Akgul b255706189 Fixes the bug where undistribute can drop Citus extension 2022-05-31 16:23:28 +03:00
Onder Kalaci 7157152f6c Do not send metadata changes during add node if citus.enable_metadata_sync is set to false 2022-05-30 13:24:31 +02:00
Onder Kalaci 010a2a408e Avoid assertion failure on citus_add_node 2022-05-30 12:22:09 +02:00
Gledis Zeneli beef392f5a
Fix memory error with citus_add_node reported by valgrind test (#5967)
The error comes due to the datum jsonb in pg_dist_metadata_node.metadata being 0 in some scenarios. This is likely due to not copying the data when receiving a datum from a tuple and pg deciding to deallocate that memory when the table that the tuple was from is closed.
Also fix another place in the code that might have been susceptible to this issue.
I tested on both multi-vg and multi-1-vg and the test were successful.
2022-05-28 00:22:00 +03:00
Ahmet Gedemenli 26d927178c
Propagate dependent views upon distribution (#5950) 2022-05-26 14:23:45 +03:00
jeff-davis 74ce210f8b
Columnar: fix wraparound bug. (#5962)
columnar_vacuum_rel() now advances relfrozenxid.

Fixes #5958.
2022-05-25 07:50:48 -07:00
Burak Velioglu 1d7dda991f Create view and materialized views with right schema and owner while
altering the distributed table.

To be able to alter view's owner without enforcing sequential mode.
Alter view process functions have been udpated to use metadata
connection.
2022-05-24 15:27:30 +03:00
Gledis Zeneli 27ddb4fc8e
Do not obtain AccessShareLock before actual lock (#5965)
Do not obtain AccessShareLock before acquiring the distributed locks.

Acquiring an AccessShareLock ensures that the relations which we are trying to get a distributed lock on will not be dropped in the time between when the LOCK command is issued and the LOCK commands are send to the worker. However, this also leads to distributed deadlocks in such scenarios:

```sql
-- for dist lock acquiring order coor, w1, w2

-- on w2
LOCK t1 IN ACCESS EXLUSIVE MODE;
-- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock

      -- concurrently on w1
      LOCK t1 IN ACCESS EXLUSIVE MODE;
      -- acquire AccessShareLock locally on t1 to ensure it is not dropped while we get ready to distribute the lock
      -- acquire dist lock on coor, w1, gets blocked on local AccessShareLock on w2

-- on w2 continuation of the execution above
-- starts to acquire dist locks and gets blocked on the coor by the lock acquired by w1

-- distributed deadlock

``` 

We opt for avoiding such deadlocks with the cost of the possibility of running into errors when the relations on which we are trying to acquire locks on get dropped.
2022-05-23 13:06:38 +03:00
Onder Kalaci dd02e1755f Parallelize metadata syncing on node activate
It is often useful to be able to sync the metadata in parallel
across nodes.

Also citus_finalize_upgrade_to_citus11() uses
start_metadata_sync_to_primary_nodes() after this commit.

Note that this commit does not parallelize all pieces of node
activation or metadata syncing. Instead, it tries to parallelize
potenially large parts of metadata, which is the objects and
distributed tables (in general Citus tables).

In the future, it would be nice to sync the reference tables
in parallel across nodes.

Create ~720 distributed tables / ~23450 shards
```SQL
-- declaratively partitioned table
CREATE TABLE github_events_looooooooooooooong_name (
  event_id bigint,
  event_type text,
  event_public boolean,
  repo_id bigint,
  payload jsonb,
  repo jsonb,
  actor jsonb,
  org jsonb,
  created_at timestamp
) PARTITION BY RANGE (created_at);

SELECT create_time_partitions(
  table_name         := 'github_events_looooooooooooooong_name',
  partition_interval := '1 day',
  end_at             := now() + '24 months'
);

CREATE INDEX ON github_events_looooooooooooooong_name USING btree (event_id, event_type, event_public, repo_id);
SELECT create_distributed_table('github_events_looooooooooooooong_name', 'repo_id');

SET client_min_messages TO ERROR;

```

across 1 node: almost same as expected
```SQL

SELECT start_metadata_sync_to_primary_nodes();
Time: 15664.418 ms (00:15.664)

select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 14284.069 ms (00:14.284)
```

across 7 nodes: ~3.5x improvement
```SQL

SELECT start_metadata_sync_to_primary_nodes();
┌──────────────────────────────────────┐
│ start_metadata_sync_to_primary_nodes │
├──────────────────────────────────────┤
│ t                                    │
└──────────────────────────────────────┘
(1 row)

Time: 25711.192 ms (00:25.711)

-- across 7 nodes
select start_metadata_sync_to_node(nodename,nodeport) from pg_dist_node;
Time: 82126.075 ms (01:22.126)
```
2022-05-23 09:15:48 +02:00
jeff-davis a2f5b068e6
Columnar: tighten security and improve visibility. (#5922)
Move internal storage details to a separate schema with no public
access to limit the possibility for information leakage.

Create views with public access that show storage details for those
columnar tables where the user has ownership privileges. Include
mapping between relation ID and storage ID for easier interpretation.
2022-05-20 15:30:31 -07:00
Ying Xu a1151c2395
Clear metadatacache during abort for create extension (#5907)
* Bug fix for bug #5876. Memset MetadataCacheSystem every time there is an abort

* Created an ObjectAccessHook that saves the transactionlevel of when citus was created and will clear metadatacache if that transaction level is rolled back. Added additional tests to make sure metadatacache is cleared
2022-05-20 13:47:58 -07:00
Marco Slot 7abcfac61f Add caching for functions that check the backend type 2022-05-20 19:02:37 +02:00
Marco Slot 09ec366ff5 Improve nested execution checks and add GUC to disable 2022-05-20 18:55:43 +02:00
Marco Slot e683993449 Fix prepared statement bug when switching from local to remote execution 2022-05-20 18:55:43 +02:00
jeff-davis a9f8a60007
Columnar: support relation options with ALTER TABLE. (#5935)
Columnar: support relation options with ALTER TABLE.

Use ALTER TABLE ... SET/RESET to specify relation options rather than
alter_columnar_table_set() and alter_columnar_table_reset().

Not only is this more ergonomic, but it also allows better integration
because it can be treated like DDL on a regular table. For instance,
citus can use its own ProcessUtility_hook to distribute the new
settings to the shards.

DESCRIPTION: Columnar: support relation options with ALTER TABLE.
2022-05-20 08:35:00 -07:00
Marco Slot ad5214b50c Allow distributed execution from run_command_on_* functions 2022-05-20 15:26:47 +02:00
gledis69 4731630741 Add distributing lock command support 2022-05-20 12:28:07 +03:00
Marco Slot 79d7e860e6 Add a run_command_on_coordinator function 2022-05-19 10:26:09 +02:00
Marco Slot fa9cee409c Fix downgrade scripts and add new downgrade tests 2022-05-19 10:26:09 +02:00
Ahmet Gedemenli 48d5c9a1b5 Fix schemaname qualify for rename seq stmts 2022-05-18 19:04:22 +03:00
Onder Kalaci 127450466e Do not warn unncessarily when a node is removed
In the past (pre-11), we allowed removing worker nodes
that had active placements for replicated distributed
table, without even checking if there are any other
replicas of the same placement.

However, with #5469, we prevent disabling nodes via a hard
error when there is the last active placement of shard, as we
do for reference tables. Note that otherwise, we'd allow
users to lose data.

As of today, the NOTICE is completely irrelevant.
2022-05-18 17:23:38 +02:00
Onder Kalaci b4dbd84743 Prevent distributed queries while disabling first worker node
First worker node has a special meaning for modifications on the replicated tables

It is used to acquire a remote lock, such that the modifications are serialized.

With this commit, we make sure that we do not let any distributed query to see a
different 'first worker node' while first worker node is disabled.

Note that, maybe implicitly mentioned above, when first worker node is disabled,
the first worker node changes, that's why we have to handle the situation.
2022-05-18 17:21:12 +02:00
Onder Kalaci db998b3d66 Adds "sync" option to citus_disable_node() UDF
Before this commit, we had:
```SQL
SELECT citus_disable_node(nodename, nodeport, force boolean DEFAULT false)
```

Where, we allow forcing to disable first worker node with
`force:=true`. However, it entails the risk for losing
data / diverging placement data etc.

With `force` flag, we control disabling the first worker node,
and with `async` flag we control whether the changes are done
via bg worker or immediately.

```SQL
SELECT citus_disable_node(nodename, nodeport, force boolean DEFAULT false, sync boolean DEFAULT false)
```

Where we can achieve all the following:

| Mode  | Data loss possibility | Can run in 2PC | Handle multiple node failures | Immediately effective |
| --- |--- |--- |--- |--- |
| force:false, sync: false  | false   | true  | true  | false |
| force:false, sync: true   | false  | false | false | true |
| force:true, sync: false   | true   | true  | true   | false |
| force:true, sync: true    | false  | false | false  | true |
2022-05-18 17:21:12 +02:00
Onder Kalaci 2cc4053fc1 Fixes a bug that prevents dropping/altering indexes
There are two problems in this area. First, when there are expressions
on the index name, we should call `transformIndexExpression()` before
generating the index name. That is what Postgres does.

Second, because of 40c24bfef9
PG 13 and PG 14 generates different names for indexes with function calls even for local PG tables.
Assume we have:
```SQL
create table t(id int);
select create_distributed_table('t', 'id');
create index ON t (my_very_boring_function(id));
```

On PG 13, the name of the index is `t_expr_idx`
```SQL
\d t
Table "public.t"
┌────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │  Type   │ Collation │ Nullable │ Default │
├────────┼─────────┼───────────┼──────────┼─────────┤
│ id     │ integer │           │          │         │
└────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "t_expr_idx" btree (my_very_boring_function(id::bigint))
```

On PG 14, the name of the index is `t_my_very_boring_function_idx`
```SQL
\d t
 Table "public.t"
┌────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │  Type   │ Collation │ Nullable │ Default │
├────────┼─────────┼───────────┼──────────┼─────────┤
│ id     │ integer │           │          │         │
└────────┴─────────┴───────────┴──────────┴─────────┘
Indexes:
    "t_my_very_boring_function_idx" btree (my_very_boring_function(id::bigint))

```

The second issue is not very critical. The important part is that
we adjust regression tests to drop all the indexes, which ensures
the index names are sane on any version.
2022-05-18 16:35:17 +02:00
Nils Dijk b71a08955a
Refactor: reduce complexity and code duplication for Object Propagation
Over time we have added significantly improved the support for objects to be propagated by Citus as to make scaling out the database more seamless. It became evident that there was a lot of code duplication that got into the codebase to implement the propagation.

This PR tries to reduce the amount of repeated code that is at most only slightly different. To make things worse, most of the differences were actually oversights instead of correct.

This Patch introduces 3 reusable sets of pre/post processing steps for respectively
 - create
 - alter
 - drop

With the use of the common functionality we should have more coherent behaviour between different supported object by Citus.

Some steps either omit the Pre or Post processing step if they would not make sense to include.

All tests pass, only 1 test needed changing, foreign servers, as the dropping of foreign servers didn't implement support for dropping multiple foreign servers at once. Given the common approach correctly supports dropping of multiple objects, either distributed or not, the test that assumed it wouldn't work was now obsolete.
2022-05-18 15:58:28 +02:00
Onder Kalaci ee45e7bfbf Mark existing views as distributed when upgrade to 11.0+
We have a mechanism which ensures that newly distributed
objects are recorded during `alter extension citus update`.

However, the logic was lacking "view"s. With this commit, we make
sure that existing views are also marked as distributed during
upgrade.
2022-05-18 15:43:17 +02:00
Halil Ozan Akgul d171a736ab Revert "Creates new colocation for colocate_with:='none' too"
This reverts commit f74447b3b7.
2022-05-17 15:32:22 +03:00
Ahmet Gedemenli aa8f46ead0
Fix schema name bug for sequences (#5937) 2022-05-16 18:11:57 +03:00
Halil Ozan Akgul f74447b3b7 Creates new colocation for colocate_with:='none' too 2022-05-16 13:39:05 +03:00
Teja Mupparti e56fc34404 Fixes: #5787 In prepared statements, map any unused parameters
to a generic type.
2022-05-13 19:31:05 -07:00
Burak Velioglu 1875516ae9 Add ALTER VIEW support
Adds support for propagation ALTER VIEW commands to
- Change owner of view
- SET/RESET option
- Rename view and view's column name
- Change schema of the view

Since PG also supports targeting views with ALTER TABLE
commands, related code also added to direct such ALTER TABLE
commands to ALTER VIEW commands while sending them to workers.
2022-05-13 13:21:53 +03:00
Marco Slot 6fad5dc207 Add a citus_is_coordinator function 2022-05-13 10:02:52 +02:00
Ahmet Gedemenli 00e0f4d8e6 Fix alter statistics namespace name 2022-05-11 18:44:37 +03:00
Gledis Zeneli 4c6f62efc6
Switch to using LOCK instead of lock_relation_if_exists in TRUNCATE (#5930)
Breaking down #5899 into smaller PR-s

This particular PR changes the way TRUNCATE acquires distributed locks on the relations it is truncating to use the LOCK command instead of lock_relation_if_exists. This has the benefit of using pg's recursive locking logic it implements for the LOCK command instead of us having to resolve relation dependencies and lock them explicitly. While this does not directly affect truncate, it will allow us to generalize this locking logic to then log different relations where the pg recursive locking will become useful (e.g. locking views).

This implementation is a bit more complex that it needs to be due to pg not supporting locking foreign tables. We can however, still lock foreign tables with lock_relation_if_exists. So for a command:

TRUNCATE dist_table_1, dist_table_2, foreign_table_1, foreign_table_2, dist_table_3;

We generate and send the following command to all the workers in metadata:
```sql
SEL citus.enable_ddl_propagation TO FALSE;
LOCK dist_table_1, dist_table_2 IN ACCESS EXCLUSIVE MODE;
SELECT lock_relation_if_exists('foreign_table_1', 'ACCESS EXCLUSIVE');
SELECT lock_relation_if_exists('foreign_table_2', 'ACCESS EXCLUSIVE');
LOCK dist_table_3 IN ACCESS EXCLUSIVE MODE;
SEL citus.enable_ddl_propagation TO TRUE;
```

Note that we need to alternate between the lock command and lock_table_if_exists in order to preserve the TRUNCATE order of relations.
When pg supports locking foreign tables, we will be able to massive simplify this logic and send a single LOCK command.
2022-05-11 18:38:48 +03:00
Burak Velioglu 1460452442 Introduce CREATE/DROP VIEW
Adds support for propagating create/drop view commands and views to
worker node while scaling out the cluster. Since views are dropped while
converting the table type, metadata connection will be used while
propagating view commands to not switch to sequential mode.
2022-05-10 13:07:14 +03:00
Burak Velioglu 06a94d167e Use object address instead of relation id on DDLJob to decide on syncing metadata 2022-05-05 17:59:44 +03:00
Onder Kalaci f193e16a01 Refrain reading the metadata cache for all tables during upgrade
First, it is not needed. Second, in the past we had issues regarding
this: https://github.com/citusdata/citus/pull/4344

When I create 10k tables, ~120K shards, this saves
40Mb of memory during ALTER EXTENSION citus UPDATE.

Before the change:  MetadataCacheMemoryContext: 41943040 ~ 40MB
After the change:  MetadataCacheMemoryContext: 8192
2022-05-04 16:44:06 +02:00
Marco Slot ceb593c9da Convert citus.hide_shards_from_app_name_prefixes to citus.show_shards_for_app_name_prefixes 2022-05-03 14:22:13 +02:00
Jeff Davis 3e1180de78 PG15: handle extra argument to parse_analyze_varparams().
From PG commit 25751f54b8.
2022-05-02 10:12:03 -07:00
Jeff Davis b6a5617ea8 PG15: handle pg_analyze_and_rewrite_* renaming.
From PG commit 791b1b71da.
2022-05-02 10:12:03 -07:00
Jeff Davis 33ee4877d4 PG15: rename pgstat_initstats() -> pgstat_init_relation().
From PG commits bff258a273 and be902e2651.
2022-05-02 10:12:03 -07:00
Jeff Davis 033f9cfff7 PG15: update copied pg_get_object_address() code.
Account for PG commits 5a2832465fd8 and a0ffa885e478.
2022-05-02 10:12:03 -07:00
Jeff Davis bd455f42e3 PG15: handle change to SeqScan structure.
Account for PG commit 2226b4189b. The one site dependent on it can do
just as well with a Scan instead of a SeqScan.
2022-05-02 10:12:03 -07:00
Jeff Davis 3799f95742 PG15: Value -> String, Integer, Float.
Handle PG commit 639a86e36a.
2022-05-02 10:12:03 -07:00
Jeff Davis 26f5e20580 PG15: update integer parsing APIs.
Account for PG commits 3c6f8c011f and cfc7191dfe.
2022-05-02 10:12:03 -07:00
Jeff Davis 70c915a0f2 PG15: Handle data type changes in pg_collation.
Account for PG commit 54637508f8.
2022-05-02 10:12:03 -07:00
Jeff Davis 9915fe8a1a PG15: Handle different ways to get publication actions.
Account for PG commit 52e4f0cd47.
2022-05-02 10:12:03 -07:00
Jeff Davis 1c1ef7ab8d PG15: Handle extra argument to RelationCreateStorage.
Account for PG commit 9c08aea6a309. Introduce
RelationCreateStorage_compat.
2022-05-02 10:12:03 -07:00
Jeff Davis ac952b2cc2 PG15: Handle extra argument to ExecARDeleteTriggers.
Account for PG commit ba9a7e3921. Introduce
ExecARDeleteTriggers_compat.
2022-05-02 10:12:03 -07:00
Jeff Davis f944722c6a PG15: Use RelationGetSmgr() instead of RelationOpenSmgr().
Handle PG commit f10f0ae420.
2022-05-02 10:12:03 -07:00
Onder Kalaci 5fc7661169 Do not set coordinator's metadatasynced column to false
After a disable_node
2022-04-25 09:25:59 +02:00
Onder Kalaci a2debe0f02 Do not assign distributed transaction ids for local execution
In the past, for all modifications on the local execution,
we enabled 2PC (with 6a7ed7b309).

This also required us to enable coordinated transactions
via https://github.com/citusdata/citus/pull/4831 .

However, it does have a very substantial impact on the
distributed deadlock detection. The distributed deadlock
detection is designed to avoid single-statement transactions
because they cannot lead to any actual deadlocks.

The implementation is to skip backends without distributed
transactions are assigned. Now that we assign single
statement local executions in the lock graphs, we are
conflicting with the design of distributed deadlock
detection.

In general, we should fix it. However, one might
think that it is not a big deal, even if the processes
show up in the lock graphs, the deadlock detection
should not be causing any false positives. That is
false, unless https://github.com/citusdata/citus/issues/1803
is fixed. Now that local processes are considered as a single
distributed backend, the lock graphs might find:

    local execution 1 [tx id: 1] -> any local process [tx id: 0]
    any local process [tx id: 0] -> local execution 2 [tx id: 2]

And, decides that there is a distributed deadlock.

This commit is:
   (a) right thing to do, as local execuion should not need any
       distributed tx id
   (b) Eliminates performance issues that might come up with
       deadlock detection does a lot of unncessary checks
   (c) After moving local execution after the remote execution
       via https://github.com/citusdata/citus/pull/4301, the
       vauge requirement for assigning distributed tx ids are
       already gone.
2022-04-13 13:25:12 +02:00
Burak Velioglu 5d9599f964
Create function in transaction according to create object propagation guc 2022-04-08 17:15:31 +03:00
Nils Dijk 8897361f95
Implement DOMAIN propagation for citus 2022-04-08 15:25:39 +02:00
Onder Kalaci b0b91bab04 Rename metadata sync to node metadata sync where applicable 2022-04-07 17:51:31 +02:00
Marco Slot 2304815356 Allow adding a unique constraint with an index 2022-04-07 16:00:31 +02:00
Marco Slot c0827703ec Fix EXPLAIN ANALYZE JSON format for subplans 2022-04-07 11:38:20 +02:00
Marco Slot 544dce919a Handle user-defined type parameters in EXPLAIN ANALYZE 2022-04-07 11:14:32 +02:00
Marco Slot 9476f377b5 Remove old re-partitioning functions 2022-04-04 18:11:52 +02:00
Marco Slot 8c8c3b665d Add TABLESAMPLE support 2022-04-01 15:51:40 +02:00
jeff-davis c485a04139
Separate build of citus.so and citus_columnar.so. (#5805)
* Separate build of citus.so and citus_columnar.so.

Because columnar code is statically-linked to both modules, it doesn't
make sense to load them both at once.

A subsequent commit will make the modules entirely separate and allow
loading them both simultaneously.

Author: Yanwen Jin

* Separate citus and citus_columnar modules.

Now the modules are independent. Columnar can be loaded by itself, or
along with citus.

Co-authored-by: Jeff Davis <jefdavi@microsoft.com>
2022-03-31 19:47:17 -07:00
Onder Kalaci 9043a1ed3f Only hide shards from client backends and pg bg workers
The aim of hiding shards is to hide shards from client applications.

Certain bg workers (such as pg_cron or Citus maintanince daemon)
should be treated like client applications because users can run
queries from such bg workers. And, these bg workers should follow
the similar application_name checks as client backeends.

Certain other bg workers, such as logical replication or postgres'
parallel workers, should never hide shards. They are internal
operations.

Similarly the other backend types like the walsender or
checkpointer or autovacuum should never hide shards.
2022-03-30 16:56:12 +02:00
Ahmet Gedemenli b52823f8b4
Fix typo in error message for truncating foreign tables (#5864) 2022-03-29 13:14:16 +03:00
Jelte Fennema 3a44fa827a
Add versions of forboth that don't need ListCell (#5856)
We've had custom versions of Postgres its `foreach` macro which with a
hidden ListCell for quite some time now. People like these custom
macros, because they are easier to use and require less boilerplate.
This adds similar custom versions of Postgres its `forboth` macro. Now
you don't need ListCells anymore when looping over two lists at the same
time.
2022-03-23 14:50:36 +03:00
Ahmet Gedemenli b5448e43e3
Fix aggregate signature bug (#5854) 2022-03-23 13:42:03 +03:00
Burak Velioglu db9f0d926c
Add support for deparsing ALTER FUNCION ... SUPPORT ... commands 2022-03-22 21:55:55 +03:00
Onder Kalaci af4ba3eb1f Remove citus.enable_cte_inlining GUC
In Postgres 12+, users can adjust whether to inline/not inline CTEs
by [NOT] MATERIALIZED keywords. So, this GUC is already useless.
2022-03-22 17:14:44 +01:00
Halil Ozan Akgul 4690c42121 Fixes ALTER COLLATION encoding does not exist bug 2022-03-22 17:42:20 +03:00
Marco Slot 32c23c2775 Disallow re-partition joins when no hash function defined 2022-03-22 13:42:53 +01:00
Onur Tirtir 11433ed357 Create DDL job for create enum command in postprocess as we do for composite types
Since now we don't throw an error for enums that user attempts creating
in temp schema, the preprocess / DDL job that contains the prepared
statement (to idempotently create the enum type) gets executed. As a
result, we were emitting the following warning because of the error the
underlying worker connection throws:

```sql
WARNING:  cannot PREPARE a transaction that has operated on temporary objects
CONTEXT:  while executing command on localhost:xxxxx
WARNING:  connection to the remote node localhost:xxxxx failed with the following error: another command is already in progress
ERROR:  cannot PREPARE a transaction that has operated on temporary objects
CONTEXT:  while executing command on localhost:xxxxx
```
2022-03-22 15:09:23 +03:00
Onur Tirtir dc31102630 Locally create objects having a dependency that we cannot distribute
We were already doing so for functions & types believing that
this cannot be the case for other object types.

However, as in #5830, we cannot distribute an object that user
attempts creating in temp schema. Even more, this doesn't only
apply to functions and types but also to many other object types.

So with this commit, we teach preprocess/postprocess functions
(that need to create dependencies on worker nodes) how to skip
trying to distribute such objects.

We also start identifying temp schemas as the objects that we
don't know how to propagate to worker nodes so that we can
simply create objects locally if user attempts creating them
in a temp schema.

There are 36 callers of `EnsureDependenciesExistOnAllNodes` in
the codebase atm and for the most we still need to throw a hard
error (i.e.: not use `DeferErrorIfHasUnsupportedDependency`
beforehand), such as:

i) user explicitly wants to create a distributed object
* CreateCitusLocalTable
* CreateDistributedTable
* master_create_worker_shards
* master_create_empty_shard
* create_distributed_function
* EnsureExtensionFunctionCanBeDistributed

ii) we don't want to skip altering distributed table on worker nodes
* PostprocessIndexStmt
* PostprocessCreateTriggerStmt
* PostprocessCreateStatisticsStmt

iii) object is already distributed / handled by Citus before, so we
aren't okay with not propagating the ALTER command
* PostprocessAlterTableSchemaStmt
* PostprocessAlterCollationOwnerStmt
* PostprocessAlterCollationSchemaStmt
* PostprocessAlterDatabaseOwnerStmt
* PostprocessAlterExtensionSchemaStmt
* PostprocessAlterFunctionOwnerStmt
* PostprocessAlterFunctionSchemaStmt
* PostprocessAlterSequenceOwnerStmt
* PostprocessAlterSequenceSchemaStmt
* PostprocessAlterStatisticsSchemaStmt
* PostprocessAlterStatisticsOwnerStmt
* PostprocessAlterTextSearchConfigurationSchemaStmt
* PostprocessAlterTextSearchDictionarySchemaStmt
* PostprocessAlterTextSearchConfigurationOwnerStmt
* PostprocessAlterTextSearchDictionaryOwnerStmt
* PostprocessAlterTypeSchemaStmt
* PostprocessAlterForeignServerOwnerStmt

iv) we already cannot create those objects in temp schemas, so skipping
for now
* PostprocessCreateExtensionStmt
* PostprocessCreateForeignServerStmt

Also note that there are 3 more callers of
`EnsureDependenciesExistOnAllNodes` in enterprise in addition to those
36 but we don't need to do anything specific about them due to the same
reasoning given in iii).
2022-03-22 15:09:23 +03:00
Halil Ozan Akgul 50bace9cfb Fixes the type names that start with underscore bug 2022-03-22 14:24:30 +03:00
Halil Ozan Akgul 4dbc760603 Introduces citus_coordinator_node_id 2022-03-22 10:34:22 +03:00
Hanefi Onaldi 9f204600af
Allow all possible option types for text search objects (#5838) 2022-03-21 20:01:53 +01:00
Burak Velioglu d4625ec6a1
Add support for zero-argument polymorphic aggregates 2022-03-21 16:10:40 +03:00
Ahmet Gedemenli 46c6630328
Qualify CREATE AGGREGATE stmts in Preprocess (#5834) 2022-03-21 13:55:09 +03:00
Burak Velioglu 2c2064bf36
Create type locally if it has undistributable dependency 2022-03-18 18:23:32 +03:00
Marco Slot 055bbd6212 Use coordinated transaction when there are multiple queries per task 2022-03-18 15:04:27 +01:00
Marco Slot cab243218d Avoid locks in relation_is_a_known_shard 2022-03-18 14:37:39 +01:00
Marco Slot 5bb5359da0 Fix worker node version check 2022-03-17 13:23:02 +01:00
Marco Slot 22a18fc1f2 Fix typo in upgrade function 2022-03-17 13:23:02 +01:00
Burak Velioglu 333c73a53c
Drop distributed table on worker with ProcessUtilityParseTree 2022-03-15 17:42:01 +03:00
Gledis Zeneli 56ab64b747
Patches #5758 with some more error checks (#5804)
Add error checks to detect failed connection and don't ping secondary nodes to detect self reference.
2022-03-15 15:02:47 +03:00
Marco Slot e42a798707 Always use RowShareLock in pg_dist_node when syncing metadata 2022-03-15 10:28:51 +01:00
Onder Kalaci 338752d96e Guard against hard wait event set errors
Similar to https://github.com/citusdata/citus/pull/5158, but this
time instead of the executor, use this in all the remaining places.
2022-03-14 14:35:56 +01:00
Onder Kalaci 953951007c Move wait event error checks to connection manager 2022-03-14 14:35:56 +01:00
Onur Tirtir 216b9b5b7a
Fix an incorrect error message related with fkeys between replicated dist tables (#5796)
This is not supported in enterprise too.
2022-03-14 14:34:09 +01:00
Hanefi Onaldi b24e1dfccc
Propagate text search commands to all worker nodes (#5797)
Here is a list of some functions, and the `TargetWorkerSet` parameters
they supply to `NodeDDLTaskList`:

PostprocessCreateTextSearchConfigurationStmt - 
NON_COORDINATOR_NODES

PreprocessDropTextSearchConfigurationStmt -
NON_COORDINATOR_METADATA_NODES

PreprocessAlterTextSearchConfigurationSchemaStmt -
NON_COORDINATOR_METADATA_NODES 

I guess this means that, if metadata
syncing is disabled on the node, we may have some issues. Consider the
following:

Let's assume the user has metadata syncing disabled. 2 workers.

`CREATE TEXT SEARCH CONFIGURATION ...` will get propagated to all
workers. `ALTER ... CONFIGURATION ...` will not get propagated to
workers.

After adding a new non-metadata node, the new node will get the altered
configuration as it reads from catalog. At this point CONFIGURATION
definitions got diverged in the cluster.

I suggest that we always use `NON_COORDINATOR_METADATA_NODES` in all the
TEXT SEARCH operations here.
2022-03-14 14:44:34 +03:00
Onder Kalaci db529facab Only change the sequence types if the target column type is a supported sequence type
Before this commit, we erroneously converted the sequence
type to the column's type it is used. However, it is possible
that the sequence is used in an expression which then converted
to a type that cannot be a sequence, such as text.

With this commit, we only try this conversion if the column
type is a supported sequence type (e.g., smallint, int and bigint).

Note that we do this conversion because if the column type is a
bigint and the sequence is NOT a bigint, users would be in trouble
because sequences would generate values that are out of the range
of the column. (The other ways are already not supported such as
the column is int and the sequence is bigint would fail on the worker.)

In other words, with this commit, we scope this optimization only
when the target column type is a supported sequence type. Otherwise,
we let users to more freely use the sequences.
2022-03-11 16:06:00 +01:00
Ahmet Gedemenli d06146360d
Support GRANT ON SCHEMA commands in CREATE SCHEMA statements (#5789)
* Support GRANT ON SCHEMA commands in CREATE SCHEMA statements

* Add test

* add comment

* Rename to GetGrantCommandsFromCreateSchemaStmt
2022-03-11 14:47:45 +03:00
Jelte Fennema e5d5c7be93
Start erroring out for unsupported lateral subqueries (#5753)
With the introduction of #4385 we inadvertently started allowing and
pushing down certain lateral subqueries that were unsafe to push down.
To be precise the type of LATERAL subqueries that is unsafe to push down
has all of the following properties:
1. The lateral subquery contains some non recurring tuples
2. The lateral subquery references a recurring tuple from
   outside of the subquery (recurringRelids)
3. The lateral subquery requires a merge step (e.g. a LIMIT)
4. The reference to the recurring tuple should be something else than an
   equality check on the distribution column, e.g. equality on a non
   distribution column.


Property number four is considered both hard to detect and probably not
used very often. Thus this PR ignores property number four and causes
query planning to error out if the first three properties hold.

Fixes #5327
2022-03-11 11:59:18 +01:00
Hanefi Onaldi b0eb685101
Add support for TEXT SEARCH DICTIONARY objects
TEXT SEARCH DICTIONARY objects depend on TEXT SEARCH TEMPLATE objects.
Since we do not yet support distributed TS TEMPLATE objects, we skip
dependency checks for text search templates, similar to what we do for
roles.

The user is expected to manually create the TEXT SEARCH TEMPLATE objects
before a) adding new nodes, b) creating TEXT SEARCH DICTIONARY objects.
2022-03-11 03:40:20 +03:00
Marco Slot 49467e27e6
Ensure worker_save_query_explain_analyze always fully qualifies types (#5776)
Co-authored-by: Onur Tirtir <onurcantirtir@gmail.com>
Co-authored-by: Marco Slot <marco.slot@gmail.com>
2022-03-10 07:30:11 -08:00
Gledis Zeneli 2cb02bfb56
Fix node adding itself with citus_add_node leading to deadlock (Fix #5720) (#5758)
If a worker node is being added, a command is sent to get the server_id of the worker from the pg_dist_node_metadata table. If the worker's id is the same as the node executing the code, we will know the node is trying to add itself. If the node tries to add itself without specifying `groupid:=0` the operation will result in an error.
2022-03-10 17:46:33 +03:00
Burak Velioglu 547f6b18ef
Ensure dependencies exists for all alter owner commands 2022-03-10 16:37:55 +03:00
Ahmet Gedemenli 4312486141
Remove unnecessary schema name from CREATE SCHEMA stmts (#5785) 2022-03-10 15:19:14 +03:00
Hanefi Onaldi d153c2de0d Fix some typos in comments 2022-03-10 15:03:26 +03:00
Ahmet Gedemenli 551a7d1383
Support CREATE SCHEMA without name (#5782) 2022-03-10 13:38:00 +03:00
Marco Slot 8e43c8094d Fix CREATE EXTENSION propagation with custom version 2022-03-09 17:40:50 +01:00
Marco Slot 7559ad12ba Change create_object_propagation default to immediate 2022-03-09 17:40:50 +01:00
Burak Velioglu bbe1b16125
Check whether the object has unsupported or circular dependency 2022-03-09 16:37:53 +03:00
Jelte Fennema c8839de68b
Don't use cascading deletes in Citus 11 migration script (#5767)
Using CASCADE in a DELETE can inadvertently delete things we don't
intend to. It's safer to fail hard and make the user delete depending
things manually.
2022-03-09 14:35:23 +01:00
Halil Ozan Akgül 333bcc7948
Global PID Helper Functions (#5768)
* Introduces citus_nodename_for_nodeid and citus_nodeport_for_nodeid functions

* Introduces citus_nodeid_for_gpid and citus_pid_for_gpid functions

* Add tests
2022-03-09 13:15:59 +03:00
Onder Kalaci c32b2de1a7 Improve citus_lock_waits
1) Remove useless columns
2) Show backends that are blocked on a DDL even before
   gpid is assigned
3) One minor bugfix, where we clear distributedCommandOriginator
   properly.
2022-03-07 11:10:44 +01:00
Ahmet Gedemenli 2a3c0c1914
Revert upgrade script changes (#5757) 2022-03-07 13:04:58 +03:00
Onder Kalaci 24fcd2a88c Handle dropping the partitioned tables properly
Before this commit, we might be leaving some metadata on the workers.
Now, we handle DROP SCHEMA .. CASCADE properly to avoid any metadata
leakage.
2022-03-07 10:02:54 +01:00
Nils Dijk 3801576dfb
Move pg_dist_object to pg_catalog (#5765)
DESCRIPTION: Move pg_dist_object to pg_catalog

Historically `pg_dist_object` had been created in the `citus` schema as an experiment to understand if we could move our catalog tables to a branded schema. We quickly realised that this interfered with the UX on our managed services and other environments, where users connected via a user with the name of `citus`.

By default postgres put the username on the search_path. To be able to read the catalog in the `citus` schema we would need to grant access permissions to the schema. This caused newly created objects like tables etc, to default to this schema for creation. This failed due to the write permissions to that schema.

With this change we move the `pg_dist_object` catalog table to the `pg_catalog` schema, where our other schema's are also located. This makes the catalog table visible and readable by any user, like our other catalog tables, for debugging purposes.

Note: due to the change of schema, we had to disable 1 test that was running into a discrepancy between the schema and binary. Secondly, we needed to make the lookup functions for the `pg_dist_object` relation and their indexes less strict on the fallback of the naming due to an other test that, due to an unfortunate cache invalidation, needed to lookup the relation again. This makes that we won't default to _only_ resolving from `pg_catalog` outside of upgrades.
2022-03-04 17:40:38 +00:00
Halil Ozan Akgul 0500a62515 Updates citus_dist_stat_activity to use citus_stat_activity 2022-03-04 17:28:17 +03:00
Ahmet Gedemenli b8eedcd261
Notice when create_distributed_function called without params (#5752)
* Notice when create_distributed_function called without params

* Move variable comments to top

* Add valid check for cache entry

* add objtype to notice msg

* update test outputs

* Add more tests

* Address feedback
2022-03-04 17:26:39 +03:00
Önder Kalacı bd6a6563ff
Merge branch 'master' into calculate_gpid 2022-03-04 11:34:12 +01:00
Burak Velioglu cb6d67a9a9
Make sure that all dependencies of citus tables can be distributed 2022-03-03 20:08:09 +03:00
Onder Kalaci c7b67ba0ea Add citus_backend_gpid()
And also citus_calculate_gpid(nodeId,pid). These UDFs are just
wrappers for the existing functions. Useful for testing and simple
manipulation of citus_stat_activity.
2022-03-03 15:29:40 +01:00
Halil Ozan Akgul 06a0509b1a Introduces citus_stat_activity view 2022-03-03 16:19:20 +03:00
Marco Slot ddf7cf29f3 Sync pg_dist_colocation as a batch 2022-03-03 12:48:48 +01:00
Marco Slot 3ba61244b8 Synchronize pg_dist_colocation metadata 2022-03-03 11:01:59 +01:00
Marco Slot 43e4dd3808 Add a citus.internal_reserved_connections setting 2022-03-02 19:13:53 +01:00
Onder Kalaci e80a36c4b6 Improve visibility rules for non-priviledge roles
It seems like our approach is way too restrictive and some places
are wrong. Now, we follow very similar approach to pg_stat_activity.

Some of the changes are pre-requsite for implementing citus_dist_stat_activity
via citus_stat_activity.
2022-03-02 18:04:01 +01:00
Onder Kalaci 35ec9721b4 Add a new API for enabling Citus MX for clusters upgrading from earlier versions
Clusters created pre-Citus 11 mostly didn't have metadata sync enabled.
For those clusters, we add a utility UDF which fixes some minor issues
and sync the necessary objects to the workers.
2022-03-02 17:02:55 +01:00
Marco Slot dcfbb51b6b Revert "Build Columnar.so and make Citus depends on it (#5661)"
This reverts commit a4133c69e8.
2022-03-02 11:33:15 +01:00
Ahmet Gedemenli e1809af376 Propagate CREATE AGGREGATE commands 2022-03-02 10:52:43 +03:00
ywj a4133c69e8
Build Columnar.so and make Citus depends on it (#5661)
* [Columnar] Build columnar.so and let citus depends on it


Co-authored-by: Yanwen Jin <yanwjin@microsoft.com>
Co-authored-by: Ying Xu <32597660+yxu2162@users.noreply.github.com>
Co-authored-by: jeff-davis <Jeffrey.Davis@microsoft.com>
2022-03-01 23:31:14 +03:00
Nils Dijk 65bd540943
Feature: configure object propagation behaviour in transactions (#5724)
DESCRIPTION: Add GUC to control ddl creation behaviour in transactions

Historically we would _not_ propagate objects when we are in a transaction block. Creation of distributed tables would not always work in sequential mode, hence objects created in the same transaction as distributing a table that would use the just created object wouldn't work. The benefit was that the user could still benefit from parallelism.

Now that the creation of distributed tables is supported in sequential mode it would make sense for users to force transactional consistency of ddl commands for distributed tables. A transaction could switch more aggressively to sequential mode when creating new objects in a transaction.

We don't change the default behaviour just yet.

Also, many objects would not even propagate their creation when the transaction was already set to sequential, leaving the probability of a self deadlock. The new policy checks solve this discrepancy between objects as well.
2022-03-01 17:29:31 +03:00
Burak Velioglu f17872aed4
Expand functions while resolving dependencies 2022-03-01 17:08:46 +03:00
Gledis Zeneli b825232ecb
Handle rebalance / replication when a node is disabled (Fix #5664) (#5729)
The issue in question is caused when rebalance / replication call `FullShardPlacementList` which returns all shard placements (including those in disabled nodes with `citus_disable_node`).  Eventually, `FindFillStateForPlacement` looks for the state across active workers and fails to find a state for the placements which are in the disabled workers causing a seg fault shortly after.

Approach:
* `ActivePlacementHash` was not using the status of the shard placement's node to determine if the node it is active. Initially, I just fixed that.
* Additionally, I refactored the code which handles active shards in replication / rebalance to:
	* use a single function to determine if a shard placement is active. 
	* do the shard active shard filtering before calling `RebalancePlacementUpdates` and `ReplicationPlacementUpdates`, so test methods like `shard_placement_rebalance_array` and `shard_placement_replication_array` which have different shard placement active requirements can do their own filtering while using the same rebalance / replicate logic that `rebalance_table_shards` and `replicate_table_shards` use. 

Fix #5664
2022-02-25 19:54:30 +03:00
Hanefi Onaldi 6c25eea62f Fix some typos in comments 2022-02-24 19:48:52 +03:00
Onder Kalaci df95d59e33 Drop support for CitusInitiatedBackend
CitusInitiatedBackend was a pre-mature implemenation of the whole
GlobalPID infrastructure. We used it to track whether any individual
query is triggered by Citus or not.

As of now, after GlobalPID is already in place, we don't need
CitusInitiatedBackend, in fact it could even be wrong.
2022-02-24 12:12:43 +01:00
Marco Slot 0c4e3cb69c Drop worker_partition_query_result on downgrade 2022-02-24 10:18:56 +01:00
Hanefi Onaldi f4e8af2c22
Do not acquire locks on node metadata explicitly 2022-02-24 03:19:56 +03:00
Hanefi Onaldi b70949ae8c
Lock nodes when building ddl task lists 2022-02-24 03:19:56 +03:00
Marco Slot ef1ceb3953 Only use a single placement for map tasks 2022-02-23 19:40:21 +01:00
Marco Slot 490765a754 Enable re-partition joins after local execution 2022-02-23 19:40:21 +01:00
Marco Slot 3cd9aa655a Stop using citus.binary_worker_copy_format 2022-02-23 19:40:21 +01:00
Marco Slot 5ac0d31e8b Fix re-partition hash range generation 2022-02-23 19:40:21 +01:00
Marco Slot 72d8fde28b Use intermediate results for re-partition joins 2022-02-23 19:40:21 +01:00
Nils Dijk 1fb970224e
Fix: partitioned index dependencies (#5741)
#5685 introduced the resolution of dependencies for indices. This missed support for indices on partitioned tables. This change adds support for partitioned indices to the dependency resolution code.
2022-02-23 17:53:26 +03:00
Teja Mupparti a62901396b Allow unsafe triggers via a GUC 2022-02-21 22:45:17 -08:00
Onder Kalaci 95d5918967 Properly set worker_query and use 2022-02-21 18:22:33 +01:00
Onder Kalaci dffcafc096 Use global pids in citus_lock_waits 2022-02-21 17:46:34 +01:00
Onder Kalaci 331af3dce8 Dumping wait edges becomes optionally scan all backends
Before this commit, dumping wait edges can only be used for
distributed deadlock detection purposes. With this commit,
we open the possibility that we can use it for any backend.
2022-02-21 17:37:07 +01:00
Halil Ozan Akgul f6cd4d0f07 Overrides pg_cancel_backend and pg_terminate_backend to accept global pid 2022-02-21 16:41:35 +03:00
Ahmet Gedemenli c1d5ca9896 Do distributed check first, for DropSchema stmts 2022-02-21 14:43:04 +03:00
Ahmet Gedemenli 2bc6a00408 Refactor CreateDistributedTable to take column name 2022-02-21 12:07:17 +03:00
yxu2162 8974b2de66 Copied CheckCitusVersion over to Columnar to handle dependency issue. If we split columnar into two extensions, this will later be changed tl CheckColumnarVersion. 2022-02-18 09:47:39 -08:00
Burak Velioglu fa6866ed36
Start to propagate functions to worker nodes with
CREATE FUNCTION command together with it's dependencies.

If the function depends on any nondistributable object,
function will be created only locally. Parameterless
version of create_distributed_function becomes obsolete
with this change, it will deprecated from the code with a subsequent PR.
2022-02-18 13:56:51 +03:00
gledis69 a14fada153 Prevent Deadlocks When a Worker Tries to Create Collation (Fix #5583)
* When a worker tried to create a collation which had a dependency in the same worker node,
it would cause a deadlock, now it throws the correct "not a coordinator" error.
2022-02-18 12:28:02 +03:00
Teja Mupparti 46fa47beea Force-delegated functions' distribution argument must be reset as soon as the routine completes execution,
and not wait until the top level Executor ends. This fixes issue #5687
2022-02-17 10:48:30 -08:00
Nils Dijk 768b320470
reuse GetRoleSpecObjectForUser 2022-02-17 13:16:10 +01:00
Nils Dijk ea86f9f94e
Add support for TEXT SEARCH CONFIGURATION objects (#5685)
DESCRIPTION: Implement TEXT SEARCH CONFIGURATION propagation

The change adds support to Citus for propagating TEXT SEARCH CONFIGURATION objects. TSConfig objects cannot always be created in one create statement, and instead require a create statement followed by many alter statements to get turned into the object they should represent.

To support this we add functionality to the worker to create or replace objects based on a list of statements. When the lists of the local object and the remote object correspond 1:1 we skip the creation of the object and simply mark it distributed. This is especially important for TSConfig objects as initdb pre-populates databases with a dozen configurations (for many different languages).

When the user creates a new TSConfig based on the copy of an existing configuration there is no direct link to the object copied from. Since there is no link we can't simply rely on propagating the dependencies to the worker and send a qualified
2022-02-17 13:12:46 +01:00
Ahmet Gedemenli a1c3580c64 Support TRUNCATE for foreign tables 2022-02-17 09:59:53 +03:00
Onder Kalaci abd5b1c506 Prevent any monitoring view/udf to show already exited backends
The low-level StoreAllActiveTransactions() function filters out
backends that exited.

Before this commit, if you run a pgbench, after that you'd still
see the backends show up:
```SQL
 select count(*) from get_global_active_transactions();
┌───────┐
│ count │
├───────┤
│   538 │
└───────┘
```

After this patch, only active backends show-up:

```SQL
 select count(*) from get_global_active_transactions();
┌───────┐
│ count │
├───────┤
│    72 │
└───────┘
```
2022-02-14 17:34:32 +01:00
Ahmet Gedemenli 0411a98c99
Refactor EnsureSequentialMode functions (#5704) 2022-02-14 18:38:21 +03:00
Gledis Zeneli badfd561b2
Prevent Citus table functions from being called on shards (Fix #5610) (#5694)
DESCRIPTION: Prevent Citus table functions from being called on shards

The operations that guard against using shards are:
* Create Local Table
* Create distributed table (which affects reference table creation as well).

* I used a `ErrorIfRaltionIsKnownShard` instead of `ErrorIfIllegallyChangingKnownShard`.
`ErrorIfIllegallyChangingKnownShard` allows the operation if `citus.enable_manual_changes_to_shards`,
but I am not sure if it ever makes sense to create a distributed, reference, or citus local table out of a shard.

I tried to go over the code to identify other UDF-s where shards could be illegaly changed, but I could not find any other.
My knowledge of the codebase is not solid enough for me to say for sure.

Fixes #5610
2022-02-14 16:06:48 +03:00
Önder Kalacı dc6c194916
Show IDLE backends in citus_dist_stat_activity (#5700)
* Break the dependency to CitusInitiatedBackend infrastructure

With this change, we start to show non-distributed backends as well
in citus_dist_stat_activity. I think that
  (a) it is essential for making citus_lock_waits to work for blocked
      on DDL commands.
  (b) it is more expected from the user's perspective. The name of
      the view is a little inconsistent now (e.g., citus_dist_stat_activity)
      but we are already planning to improve the names with followup
      PRs.

Also, we have global pids assigned, the CitusInitiatedBackend
becomes obsolete.
2022-02-10 08:59:28 -08:00
Ahmet Gedemenli 76b63a307b Propagate create/drop schema commands 2022-02-10 14:58:09 +03:00
Marco Slot d0711ea9b4 Delegate function calls in FROM outside of transaction block 2022-02-09 20:56:25 +01:00
Onder Kalaci 1c30f61a70 Prevent citus.node_conninfo to use "application_name"
With https://github.com/citusdata/citus/pull/5657, Citus uses
a fixed application_name while connecting to remote nodes
for internal purposes.

It means that we cannot allow users to override it via
citus.node_conninfo.
2022-02-09 13:22:04 +01:00
Teja Mupparti 1e3c8e34c0 Allow create_distributed_function() on a function owned by an extension
Implement #5649
Allow create_distributed_function() on functions owned by extensions

1) Only update pg_dist_object, and do not propagate CREATE FUNCTION.
2) Ensure corresponding extension is in pg_dist_object.
3) Verify if dependencies exist on the function they should resolve to the extension.
4) Impact on node-scaling: We build a list of ddl commands based on all objects in
   pg_dist_object. We need to omit the ddl's for the extension-function, as it
   will get propagated by the virtue of the extension creation.
5) Extra checks for functions coming from extensions, to not propagate changes
   via ddl commands, even though the function is marked as distributed in pg_dist_object
2022-02-08 11:52:56 -08:00
Halil Ozan Akgul 8ee02b29d0 Introduce global PID 2022-02-08 16:49:38 +03:00
Burak Velioglu ab248c1785
Check object ownership while creating pg_dist_object entries on remote 2022-02-07 17:50:48 +03:00
Burak Velioglu 8ae7577581
Use superuser connection while syncing dependent objects' pg_dist_object tuples 2022-02-07 17:50:45 +03:00
Marco Slot 872f0a79db Remove random shard placement policy 2022-02-06 21:55:58 +01:00
Marco Slot 0cae8e7d6b Remove local-node-first shard placement 2022-02-06 21:36:34 +01:00
Teja Mupparti c8e504dd69 Fix the issue #5673
If the expression is simple, such as, SELECT function() or PEFORM function()
in PL/PgSQL code, PL engine does a simple expression evaluation which can't
interpret the Citus CustomScan Node. Code checks for simple expressions when
executing an UDF but missed the DO-Block scenario, this commit fixes it.
2022-02-04 15:44:53 -08:00
Ying Xu b5c116449b
Removed dependency from EnsureTableOwner (#5676)
Removed dependency for EnsureTableOwner. Also removed pg_fini() and columnar_tableam_finish() Still need to remove CheckCitusVersion dependency to make Columnar_tableam.h dependency free from Citus.
2022-02-04 12:45:07 -08:00
Onur Tirtir 79442df1b7
Fix coordinator/worker query targetlists for agg. that we cannot push-down (#5679)
Previously, we were wrapping targetlist nodes with Vars that reference
to the result of the worker query, if the node itself is not `Const` or
not a `Param`. Indeed, we should not do that unless the node itself is
a `Var` node or contains a `Var` within it (e.g.: `OpExpr(Var(column_a) > 2)`).
Otherwise, when worker query returns empty result set, then combine
query exec would crash since the `Var` would be pointing to an empty
tuple slot, which is not desirable for the node-executor methods.
2022-02-04 05:37:25 -08:00
Onder Kalaci 72d7d92611 Apply code review feedback 2022-02-04 10:52:57 +01:00
Onder Kalaci ff234fbfd2 Unify old GUCs into a single one
Replaces citus.enable_object_propagation with citus.enable_metadata_sync

Also, within Citus 11 release cycle, we added citus.enable_metadata_sync_by_default,
that is also replaced with citus.enable_metadata_sync.

In essence, when citus.enable_metadata_sync is set to true, all the objects
and the metadata is send to the remote node.

We strongly advice that the users never changes the value of
this GUC.
2022-02-04 10:52:56 +01:00
Teja Mupparti f31bce5b48 Fixes the issue seen in https://github.com/citusdata/citus-enterprise/issues/745
With this commit, rebalancer backends are identified by application_name = citus_rebalancer
and the regular internal backends are identified by application_name = citus_internal
2022-02-03 09:40:46 -08:00
jeff-davis b072b9235e
Columnar: fix checksums, broken in a4067913. (#5669)
Checksums must be set directly before writing the page. log_newpage()
sets the page LSN, and therefore invalidates the checksum.
2022-02-02 13:22:11 -08:00
Onder Kalaci 650243927c Relax some transactional limications on activate node
We already enforce EnsureSequentialModeMetadataOperations(), and given that all activate node is transaction, we should be fine
2022-02-01 15:56:55 +01:00
Onder Kalaci 34d91009ed Update outdated comment
As of the current HEAD, we support sequences as first class objects
2022-02-01 15:37:10 +01:00
Marco Slot 63c6896716 Enable function call pushdown from workers 2022-02-01 14:13:25 +01:00
Burak Velioglu f88cc230bf
Handle tables and objects as metadata. Update UDFs accordingly
With this commit we've started to propagate sequences and shell
tables within the object dependency resolution. So, ensuring any
dependencies for any object will consider shell tables and sequences
as well. Separate logics for both shell tables and sequences have
been removed.

Since both shell tables and sequences logic were implemented as a
part of the metadata handling before that logic, we were propagating
them while syncing table metadata. With this commit we've divided
metadata (which means anything except shards thereafter) syncing
logic into multiple parts and implemented it either as a part of
ActivateNode. You can check the functions called in ActivateNode
to check definition of different metadata.

Definitions of start_metadata_sync_to_node and citus_activate_node
have also been updated. citus_activate_node will basically create
an active node with all metadata and reference table shards.
start_metadata_sync_to_node will be same with citus_activate_node
except replicating reference tables. stop_metadata_sync_to_node
will remove all the metadata. All of those UDFs need to be called
by superuser.
2022-01-31 16:20:15 +03:00
Önder Kalacı f68ac4a7cf
Consider foreign keys between reference tables (#5659)
On #5071, we avoid edge cases, but below there are foreign key constraints as well

This commit makes sure we cover those as well
2022-01-28 13:38:14 +01:00
Heikki Linnakangas a40679139b
Use smgrextend() when extending relation, and WAL-log first. (#5654)
When creating a new table, we bypass the buffer cache and write the
initial pages directly with smgrwrite(). However, you're supposed to
use smgrextend() when extending a relation, rather than smgrwrite().
There isn't much difference between them, but smgrextend() updates the
relation size cache, which seems important, although I haven't seen
any real bugs caused by that.

Also, write the block to disk only after WAL-logging it, so that we
can include the LSN of the WAL record in the version that we write
out. Currently, the page as written to disk has LSN 0. That doesn't
cause any user-visible issues either, at worst it could make us
WAL-log a full page image of the page earlier than necessary, but that
doesn't matter currently because we WAL-log full page images of all
changes anyway.

I bumped into that issue with LSN 0 in the page header when testing
Citus with Zenith (https://github.com/zenithdb/zenith/issues/1176).
Zenith contains a check that PANICs if you write a block to disk
without WAL-logging it, and it works by checking the LSN of the page
that's written out. In this case, we are WAL-logging the page even
though the LSN on the page is 0, so it was a false alarm, but I'd love
to get this changed in Citus to keep the check in Zenith simple.

A downside of WAL-logging the page first is that if you run out of
disk space, you have already created the WAL record. So if you then
crash and restart, WAL recovery will likely run out of disk space,
too, which is bad. In practice, we have the same problem in other
places, like rewriteheap.c. Also, if you are on the brink of running
out of disk space, you will probably run out at WAL replay anyway,
regardless of which order we write these few pages. But if we wanted
to fix that, we could first extend the relation with zeros, and then
WAL-log the pages. That's how heap extension works.

It would be even nicer to use the buffer cache for this, and skip the
smgrimmedsync() on the relation. However, that would require more
work, because we don't have the Relation struct for the relation here.
We could use ReadBufferWithoutRelcache(), but that doesn't work for
unlogged tables. Unlogged tables are currently not supported
(https://github.com/citusdata/citus/issues/4742), but that would
become a problem if we want to support them in the future.
CreateFakeRelcacheEntry() also doesn't work with unlogged tables. We
could do things differently for logged and unlogged tables, but that
complicates the code further.

Co-authored-by: jeff-davis <Jeffrey.Davis@microsoft.com>
2022-01-27 12:04:08 -08:00
Onder Kalaci b26eeaecd3 Use a fixed application_name while connecting to remote nodes
Citus heavily relies on application_name, see
`IsCitusInitiatedRemoteBackend()`.

But if the user set the application name, such as export PGAPPNAME=test_name,
Citus uses that name while connecting to the remote node.

With this commit, we ensure that Citus always connects with
the "citus" user name to the remote nodes.
2022-01-27 10:46:25 +01:00
Onder Kalaci b9b419ef16 Allow creating distributed tables in sequential mode
With https://github.com/citusdata/citus/pull/2780, we allow
COPY to use any number of connections that the executor used
in a tx block.

Meaning that, while COPYing data to the shards, create_distributed_table
could allow sequential mode.
2022-01-26 12:58:18 +01:00
Onur Tirtir 8c8d696621
Not fail over to local execution when it's not supported (#5625)
We fall back to local execution if we cannot establish any more
connections to local node. However, we should not do that for the
commands that we don't know how to execute locally (or we know we
shouldn't execute locally). To fix that, we take localExecutionSupported
take into account in CanFailoverPlacementExecutionToLocalExecution too.

Moreover, we also prompt a more accurate hint message to inform user
about whether the execution is failed because local execution is
disabled by them, or because local execution wasn't possible for given
command.
2022-01-25 16:43:21 +01:00
Onur Tirtir ff3913ad99
Copy errmsg for distributed deadlock error into heap (#5641)
multi_log_hook() hook is called by EmitErrorReport() when emitting the
ereport either to frontend or to the server logs. And some callers of
EmitErrorReport() (e.g.: errfinish()) seems to assume that string fields
of given ErrorData object needs to be freed. For this reason, we copy the
message into heap here.

I don't think we have faced with such a problem before but it seems worth
fixing as it is theoretically possible due to the reasoning above.
2022-01-24 06:27:41 -08:00
Ahmet Gedemenli c838fb428f Refactor GenerateGrantOnSchemaStmtForRights 2022-01-24 11:31:59 +03:00
Onur Tirtir 4dc38e9e3d
Use EnsureCompatibleLocalExecutionState instead (#5640) 2022-01-21 15:37:59 +01:00
Ahmet Gedemenli 8647682c11 Fix typo: taget/target 2022-01-21 10:35:56 +03:00
Onur Tirtir 181111b84f Drop ruleutils copied for statistics 2022-01-20 17:28:19 +03:00
Onur Tirtir 7b59295af2 Drop ruleutils copied for triggers 2022-01-20 17:28:19 +03:00
Teja Mupparti 54862f8c22 (1) Functions will be delegated even when present in the scope of an explicit
BEGIN/COMMIT transaction block or in a UDF calling another UDF.
(2) Prohibit/Limit the delegated function not to do a 2PC (or any work on a
remote connection).
(3) Have a safety net to ensure the (2) i.e. we should block the connections
from the delegated procedure or make sure that no 2PC happens on the node.
(4) Such delegated functions are restricted to use only the distributed argument
value.

Note: To limit the scope of the project we are considering only Functions(not
procedures) for the initial work.

DESCRIPTION: Introduce a new flag "force_delegation" in create_distributed_function(),
which will allow a function to be delegated in an explicit transaction block.

Fixes #3265

Once the function is delegated to the worker, on that node during the planning

distributed_planner()
TryToDelegateFunctionCall()
CheckDelegatedFunctionExecution()
EnableInForceDelegatedFuncExecution()
Save the distribution argument (Constant)
ExecutorStart()
CitusBeginScan()
IsShardKeyValueAllowed()
Ensure to not use non-distribution argument.

ExecutorRun()
AdaptiveExecutor()
StartDistributedExecution()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the remoteTaskList.
NonPushableInsertSelectExecScan()
InitializeCopyShardState()
EnsureNoRemoteExecutionFromWorkers()
Ensure all the shards are local to the node in the placementList.

This also fixes a minor issue: Properly handle expressions+parameters in distribution arguments
2022-01-19 16:43:33 -08:00
Onur Tirtir 4a53967bdd
Remove an outdated comment from RelationIsAKnownShard (#5629) 2022-01-19 11:24:10 +01:00
Marco Slot 33bfa0b191 Hide shards from application_name's with a specific prefix 2022-01-18 15:20:55 +04:00
Ahmet Gedemenli e564220dd5
Fix typo: GetRelationTriggerFunctionDependencyList (#5626) 2022-01-17 18:17:07 +03:00
Ahmet Gedemenli 8936543b80
Create wrapper function CreateObjectAddressDependencyDefList (#5623) 2022-01-17 15:35:40 +03:00
Ying Xu 4dca662e97
Making Columnar Dependency Free from Citus (#5622)
* Removed distributed dependency in columnar_metadata.c

* Changed columnar_debug.c so that it no longer needed distributed/tuplestore and made it return a record instead of a tuplestore

* removed distributed/commands.h dependency

* Made columnar_tableam.c dependency-free

* Fixed spacing for columnar_store_memory_stats function

* indentation fix

* fixed test failures
2022-01-14 09:43:05 -08:00
Onur Tirtir 70d8e1fe97
Assert that we will create indexes on shards via local execution (#5620) 2022-01-13 17:09:57 +01:00
Halil Ozan Akgul 63cd90e5dd Add missing library to dependencies.c 2022-01-11 18:36:43 +03:00
Önder Kalacı 885601c02c
Require superuser while activating a node (#5609)
* Require superuser while activating a node

With this change, we require ActiveNode() (hence citus_add_node(),
citus_activate_node()) explicitly require for a superuser.

Before this commit, these functions were designed to work with
non-superuser roles with the relevent GRANTs given.

However, that is not a widely used way for calling the functions
above.

Due to possibility of non-super user calling the UDFs, they were
designed in a way that some commands were using some additional
short-lived superuser connections. That is:
	(a) breaking transactional behavior (e.g., ROLLBACK
 	    wouldn't fully rollback the whole transaction)
        (b) Making it very complicated to reason about which
	    parts of the node activation goes over which connections,
	    and becoming vulnerable to deadlocks / visibility issues.
2022-01-10 08:30:13 -08:00
Onur Tirtir 3cc44ed8b3
Tell other backends it's safe to ignore the backend that concurrently built the shell table index (#5520)
In addition to starting a new transaction, we also need to tell other
backends --including the ones spawned for connections opened to
localhost to build indexes on shards of this relation-- that concurrent
index builds can safely ignore us.

Normally, DefineIndex() only does that if index doesn't have any
predicates (i.e.: where clause) and no index expressions at all.
However, now that we already called standard process utility, index
build on the shell table is finished anyway.

The reason behind doing so is that we cannot guarantee not grabbing any
snapshots via adaptive executor, and the backends creating indexes on
local shards (if any) might block on waiting for current xact of the
current backend to finish, which would cause self deadlocks that are not
detectable.
2022-01-10 10:23:09 +03:00
Marco Slot ee3b50b026 Disallow remote execution from queries on shards 2022-01-07 17:46:21 +01:00
Ahmet Gedemenli 3c834e6693
Disable foreign distributed tables (#5605)
* Disable foreign distributed tables
* Add warning for existing distributed foreign tables
2022-01-07 18:12:23 +03:00
Onder Kalaci 7cb1d6ae06 Improve metadata connections
With https://github.com/citusdata/citus/pull/5493 we introduced
metadata specific connections.

With this connection we guarantee that there is a single metadata connection.
But note that this connection can be used for any other operation.
In other words, this connection is not only reserved for metadata
operations.

However, as https://github.com/citusdata/citus-enterprise/issues/715 showed
us that the logic has a flaw. We allowed ineligible connections to be
picked as metadata connections: such as exclusively claimed connections
or not fully initialized connections.

With this commit, we make sure that we only consider eligable connections
for metadata operations.
2022-01-07 10:36:32 +01:00
Onder Kalaci 9f2d9e1487 Move placement deletion from disable node to activate node
We prefer the background daemon to only sync node metadata. That's
why we move placement metadata changes from disable node to
activate node. With that, we can make sure that disable node
only changes node metadata, whereas activate node syncs all
the metadata changes. In essence, we already expect all
nodes to be up when a node is activated. So, this does not change
the behavior much.
2022-01-07 09:56:03 +01:00
Hanefi Onaldi 9edfbe7718 Fix the default value for DeferShardDeleteOnMove
The default for GUC citus.defer_drop_after_shard_move is true. However
we initialize the global variable with a false value.
2022-01-07 11:01:49 +03:00
Ahmet Gedemenli 45e423136c
Support foreign tables in MX (#5461) 2022-01-06 18:50:34 +03:00
Önder Kalacı 5305aa4246
Do not drop sequences when dropping metadata (#5584)
Dropping sequences means we need to recreate
and hence losing the sequence.

With this commit, we keep the existing sequences
such that resyncing wouldn't drop the sequence.

We do that by breaking the dependency of the sequence
from the table.
2022-01-06 09:48:34 +01:00
jeff-davis 2e03efd91e
Columnar: move DDL hooks to citus to remove dependency. (#5547)
Add a new hook ColumnarTableSetOptions_hook so that citus can get
control when the columnar table options change.
2022-01-04 23:26:46 -08:00
jeff-davis c9292cfad1
Make pg_version_compat.h and listutils.c dependency-free. (#5548)
Split distributed/version_compat.h into dependency-free
pg_version_compat.h, and the original which still has
dependencies. The original doesn't have much purpose, but until other
files have better discipline about including the correct header files,
then it's still needed.

Also make distributed/listutils.h dependency-free. Should be moved
outside of 'distributed' subdirectory, but that will cause significant
code churn, so leave for another cleanup patch.

Now both files can be included in columnar without creating a
dependency on citus.
2022-01-04 23:02:08 -08:00
jeff-davis 1546aa0d9f
Columnar: use proper generic WAL interface. (#5543)
Previously, we cheated by using the RM_GENERIC_ID record type, but not
actually using the generic WAL API. This worked because we always took
a full page image, and saved the extra work of allocating and copying
to a temporary page.

But it introduced complexity, and perhaps fragility, so better to just
use the API properly. The performance penalty for a serial data load
seems to be less than 1%.
2022-01-04 22:42:21 -08:00
Önder Kalacı 0a8b0b06c6
Do not allow distributed functions on non-metadata synced nodes (#5586)
Before this commit, Citus was triggering metadata syncing
in the background when a function is distributed. However,
with Citus 11, we expect all clusters to have metadata synced
enabled. So, we do not expect any nodes not to have the metadata.

This change:
	(a) pro: simplifies the code and opens up possibilities
		 to simplify futher by reducing the scope of
		 bg worker to only sync node metadata
        (b) pro: explicitly asks users to sync the metadata such that
  	    any unforseen impact can be easily detected
        (c) con: For distributed functions without distribution
		 argument, we do not necessarily require the metadata
		 sycned. However, for completeness and simplicity, we
		 do so.
2022-01-04 13:12:57 +01:00
Önder Kalacı d33650d1c1
Record if any partitioned Citus tables during upgrade (#5555)
With Citus 11, the default behavior is to sync the metadata.
However, partitioned tables created pre-Citus 11 might have
index names that are not compatiable with metadata syncing.

See https://github.com/citusdata/citus/issues/4962 for the
details.

With this commit, we record the existence of partitioned tables
such that we can fix it later if any exists.
2021-12-27 03:33:34 -08:00
Önder Kalacı c9127f921f
Avoid round trips while fixing index names (#5549)
With this commit, fix_partition_shard_index_names()
works significantly faster.

For example,

32 shards, 365 partitions, 5 indexes drop from ~120 seconds to ~44 seconds
32 shards, 1095 partitions, 5 indexes drop from ~600 seconds to ~265 seconds

`queryStringList` can be really long, because it may contain #partitions * #indexes entries.

Before this change, we were actually going through the executor where each command
in the query string triggers 1 round trip per entry in queryStringList.

The aim of this commit is to avoid the round-trips by creating a single query string.

I first simply tried sending `q1;q2;..;qn` . However, the executor is designed to
handle `q1;q2;..;qn` type of query executions via the infrastructure mentioned
above (e.g., by tracking the query indexes in the list and doing 1 statement
per round trip).

One another option could have been to change the executor such that only track
the query index when `queryStringList` is provided not with queryString
including multiple `;`s . That is (a) more work (b) could cause weird edge
cases with failure handling (c) felt like coding a special case in to the executor
2021-12-27 10:29:37 +01:00
Ahmet Gedemenli 042d45b263 Propagate foreign server ops 2021-12-23 17:54:04 +03:00
Talha Nisanci e196d23854
Refactor AttributeEquivalenceId (#5006) 2021-12-23 13:19:02 +03:00
Hanefi Onaldi 76176caea7 Fix typo s/exlusive/exclusive/ 2021-12-23 01:35:01 +03:00
Hanefi Onaldi 1af8ca8f7c
Fix statical analysis findings (#5550) 2021-12-22 18:16:11 +03:00
Ahmet Gedemenli 8e4ff34a2e Do not include return table params in the function arg list
(cherry picked from commit 90928cfd74)

Fix function signature generation

Fix comment typo

Add test for worker_create_or_replace_object

Add test for recreating distributed functions with OUT/TABLE params

Add test for recreating distributed function that returns setof int

Fix test output

Fix comment
2021-12-21 19:01:42 +03:00
Marco Slot 2eef71ccab Propagate SET TRANSACTION commands 2021-12-18 11:31:39 +01:00
Onder Kalaci fc98f83af2 Add citus.grep_remote_commands
Simply applies

```SQL
SELECT textlike(command, citus.grep_remote_commands)
```
And, if returns true, the command is logged. Else, the log is ignored.

When citus.grep_remote_commands is empty string, all commands are
logged.
2021-12-17 11:47:40 +01:00
Onur Tirtir cc4c83b1e5
HAVE_LZ4 -> HAVE_CITUS_LZ4 (#5541) 2021-12-16 16:21:52 +03:00
Hanefi Onaldi 9d4d73898a
Move healthcheck logic into new file (#5531)
and add a missing `CheckCitusVersion(ERROR)` call
2021-12-15 15:58:20 -08:00
Hanefi Onaldi 29e4516642 Introduce citus_check_cluster_node_health UDF
This UDF coordinates connectivity checks accross the whole cluster.

This UDF gets the list of active readable nodes in the cluster, and
coordinates all connectivity checks in sequential order.

The algorithm is:

for sourceNode in activeReadableWorkerList:
    c = connectToNode(sourceNode)
    for targetNode in activeReadableWorkerList:
        result = c.execute(
            "SELECT citus_check_connection_to_node(targetNode.name,
                                                   targetNode.port")
        emit sourceNode.name,
             sourceNode.port,
             targetNode.name,
             targetNode.port,
             result

- result -> true  ->  connection attempt from source to target succeeded
- result -> false -> connection attempt from source to target failed
- result -> NULL  -> connection attempt from the current node to source node failed

I suggest you use the following query to get an overview on the connectivity:

SELECT bool_and(COALESCE(result, false))
FROM citus_check_cluster_node_health();

Whenever this query returns false, there is a connectivity issue, check in detail.
2021-12-15 01:41:51 +03:00
Hanefi Onaldi 13fff9c37a Remove NOOP tuplestore_donestoring calls
PostgreSQL does not need calling this function since 7.4 release, and it
is a NOOP.

For more details, check PostgreSQL commit below :

commit dd04e958c8b03c0f0512497651678c7816af3198
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Sun Mar 9 03:34:10 2003 +0000

    tuplestore_donestoring() isn't needed anymore, but provide a no-op
    macro definition so as not to create compatibility problems.

diff --git a/src/include/utils/tuplestore.h b/src/include/utils/tuplestore.h
index b46babacd1..76fe9fb428 100644
--- a/src/include/utils/tuplestore.h
+++ b/src/include/utils/tuplestore.h
@@ -17,7 +17,7 @@
  * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $Id: tuplestore.h,v 1.8 2003/03/09 02:19:13 tgl Exp $
+ * $Id: tuplestore.h,v 1.9 2003/03/09 03:34:10 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -41,6 +41,9 @@ extern Tuplestorestate *tuplestore_begin_heap(bool randomAccess,

 extern void tuplestore_puttuple(Tuplestorestate *state, void *tuple);

+/* tuplestore_donestoring() used to be required, but is no longer used */
+#define tuplestore_donestoring(state)  ((void) 0)
+
 /* backwards scan is only allowed if randomAccess was specified 'true' */
 extern void *tuplestore_gettuple(Tuplestorestate *state, bool forward,
                                        bool *should_free);
2021-12-14 18:55:02 +03:00
Halil Ozan Akgul a951e52ce8 Fix drop index trying to drop coordinator local indexes on metadata worker nodes 2021-12-14 11:28:08 +03:00
Burak Velioglu e8534c1dd5
Drop sequence metadata from workers explicitly 2021-12-06 19:25:51 +03:00
Burak Velioglu 21194c3b9d
Mark sequence distributed explicitly while syncing metadata
Since sequences are not marked as distributed while creating table if no
metadata worker node exists, we are marking all sequences distributed
while syncing metadata explicitly.
2021-12-06 19:25:51 +03:00
Burak Velioglu 6d849cf394
Allow delegating function from worker nodes
We've both allowed delegating functions and procedures from worker nodes
and also prevented delegation if a function/procedure has already been
propagated from another node.
2021-12-06 19:25:51 +03:00
Burak Velioglu a8b1ee87f7
Increment command counter after altering the sequence type 2021-12-06 19:25:51 +03:00
Burak Velioglu ed8e32de5e
Sync pg_dist_object on an update and propagate while syncing to a new node
Before that PR we were updating citus.pg_dist_object metadata, which keeps
the metadata related to objects on Citus, only on the coordinator node. In
order to allow using those object from worker nodes (or erroring out with
proper error message) we've started to propagate that metedata to worker
nodes as well.
2021-12-06 19:25:50 +03:00
Hanefi Onaldi 56e9b1b968 Introduce UDF to check worker connectivity
citus_check_connection_to_node runs a simple query on a remote node and
reports whether this attempt was successful.

This UDF will be used to make sure each worker node can connect to all
the worker nodes in the cluster.

parameters:
nodename: required
nodeport: optional (default: 5432)

return value:
boolean success
2021-12-03 02:30:28 +03:00
Onder Kalaci 549edcabb6 Allow disabling node(s) when multiple failures happen
As of master branch, Citus does all the modifications to replicated tables
(e.g., reference tables and distributed tables with replication factor > 1),
via 2PC and avoids any shardstate=3. As a side-effect of those changes,
handling node failures for replicated tables change.

With this PR, when one (or multiple) node failures happen, the users would
see query errors on modifications. If the problem is intermitant, that's OK,
once the node failure(s) recover by themselves, the modification queries would
succeed. If the node failure(s) are permenant, the users should call
`SELECT citus_disable_node(...)` to disable the node. As soon as the node is
disabled, modification would start to succeed. However, now the old node gets
behind. It means that, when the node is up again, the placements should be
re-created on the node. First, use `SELECT citus_activate_node()`. Then, use
`SELECT replicate_table_shards(...)` to replicate the missing placements on
the re-activated node.
2021-12-01 10:19:48 +01:00
Onder Kalaci d405993b57 Make sure to use a dedicated metadata connection
With this commit, we make sure to use a dedicated connection per
node for all the metadata operations within the same transaction.

This is needed because the same metadata (e.g., metadata includes
the distributed table on the workers) can be modified accross
multiple connections.

With this connection we guarantee that there is a single metadata connection.
But note that this connection can be used for any other operation.
In other words, this connection is not only reserved for metadata
operations.
2021-11-26 14:36:28 +01:00
Onder Kalaci 38b08ebde9 Generalize the error checks while removing node
The checks for preventing to remove a node are very much reference
table centric. We are soon going to add the same checks for replicated
tables. So, make the checks generic such that:
	 (a) replicated tables fit naturally
	 (b) we can the same checks in `citus_disable_node`.
2021-11-26 14:25:29 +01:00
Onder Kalaci 121f5c4271 Active placements can only be on active nodes
We re-define the meaning of active shard placement. It used
to only be defined via shardstate == SHARD_STATE_ACTIVE.

Now, we also add one more check. The worker node that the
placement is on should be active as well.

This is a preparation for supporting citus_disable_node()
for MX with multiple failures at the same time.

With this change, the maintanince daemon only needs to
sync the "node metadata" (e.g., pg_dist_node), not the
shard metadata.
2021-11-26 09:14:33 +01:00
Onder Kalaci b4931f7345 Do not acquire locks on reference tables when a node is removed/disabled
Before this commit, we acquire the metadata locks on the reference
tables while removing/disabling a node on all the MX nodes.

Although it has some marginal benefits, such as a concurrent
modification during remove/disable node blocks, instead of erroring
out, the drawbacks seems worse. Both citus_remove_node and citus_disable_node
are not tolerant to multiple node failures.

With this commit, we relax the locks. The implication is that while
a node is removed/disabled, users might see query errors. On the
other hand, this change becomes removing/disabling nodes more
tolerant to multiple node failures.
2021-11-26 09:08:25 +01:00
Onur Tirtir 76b8006a9e
Allow overwriting columnar storage pages written by aborted xacts (#5484)
When refactoring storage layer in #4907, we deleted the code that allows
overwriting a disk page previously written but not known by metadata.

Readers can see the change that introduced the code allows doing so in
commit a8da9acc63.

The reasoning was that; as of 10.2, we started aligning page
reservations (`AlignReservation`) for subsequent writes right after
allocating pages from disk. That means, even if writer transaction
fails, subsequent writes are guaranteed to allocate a new page and write
to there. For this reason, attempting to write to a page allocated
before is not possible for a columnar table that user created when using
v10.2.x.

However, since the older versions of columnar doesn't do that, following
example scenario can still result in writing to such disk page, even if
user now upgraded to v10.2.x. This is because, when upgrading storage to
2.0 (`ColumnarStorageUpdateIfNeeded`), we calculate `reservedOffset` of
the metapage based on the highest used address known by stripe
metadata (`GetHighestUsedAddressAndId`). However, stripe metadata
doesn't have entries for aborted writes. As a result, highest used
address would be computed by ignoring pages that are allocated but not
used.

- User attempts writing to columnar table on Citus v10.0x/v10.1x.
- Write operation fails for some reason.
- User upgrades Citus to v10.2.x.
- When attempting to write to same columnar table, they hit to "attempt
  to write columnar data .." error since write operation done in the
  older version of columnar already allocated that page, and now we are
  overwriting it.

For this reason, with this commit, we re-do the change done in
a8da9acc63.

And for the reasons given above, it wasn't possible to add a test for
this commit via usual code-paths. For this reason, added a UDF only for
testing purposes so that we can reproduce the exact scenario in our
regression test suite.
2021-11-26 07:51:13 +01:00
Onur Tirtir 85da4fc2e0
Merge branch 'master' into col/pg-upgrade-dependency 2021-11-26 09:34:43 +03:00
Onur Tirtir 81af605e07
Fix typo: "no sharding pruning constraints" -> "no shard pruning constraints" (#5490) 2021-11-25 21:00:44 +01:00
Onur Tirtir 73f06323d8 Introduce dependencies from columnarAM to columnar metadata objects
During pg upgrades, we have seen that it is not guaranteed that a
columnar table will be created after metadata objects got created.
Prior to changes done in this commit, we had such a dependency
relationship in `pg_depend`:

```
columnar_table ----> columnarAM ----> citus extension
                                           ^  ^
                                           |  |
columnar.storage_id_seq --------------------  |
                                              |
columnar.stripe -------------------------------
```

Since `pg_upgrade` just knows to follow topological sort of the objects
when creating database dump, above dependency graph doesn't imply that
`columnar_table` should be created before metadata objects such as
`columnar.storage_id_seq` and `columnar.stripe` are created.

For this reason, with this commit we add new records to `pg_depend` to
make columnarAM depending on all rel objects living in `columnar`
schema. That way, `pg_upgrade` will know it needs to create those before
creating `columnarAM`, and similarly, before creating any tables using
`columnarAM`.

Note that in addition to inserting those records via installation script,
we also do the same in `citus_finish_pg_upgrade()`. This is because,
`pg_upgrade` rebuilds catalog tables in the new cluster and that means,
we must insert them in the new cluster too.
2021-11-23 13:14:00 +03:00
Burak Velioglu 6590f12de4
Merge branch 'master' into velioglu/make_object_lock_explicit 2021-11-22 13:55:36 +03:00
Burak Velioglu 12e05ad196
Sorted addresses before getting lock 2021-11-22 11:43:32 +03:00
Marco Slot 56eae48daf Stop updating shard range in citus_update_shard_statistics 2021-11-19 10:51:15 +01:00